Add documentation about the available metrics #73

Open
bdattoma opened this issue Sep 5, 2023 · 1 comment
Labels: kind/documentation (Improvements or additions to documentation), rhods-2.5

Comments


bdattoma commented Sep 5, 2023

Based on watsonx requirements, we should make at least these metrics available:

  • # of inference requests over a defined time period
  • Avg. response time over a defined time period
  • # of successful / failed inference requests over a defined time period
  • Compute utilization (CPU, GPU, memory)

However, users won't find metrics with these exact names, and some of them have to be computed by combining other metrics. Examples:

  • Failed inference requests over a defined time period: you must compute something like tgi_batch_inference_count - tgi_batch_inference_success, plus the time-range syntax.
  • Memory consumption: there doesn't seem to be a specific istio/tgi/caikit metric for it (at least, I didn't find one). Users could compute it with something similar to: sum(container_memory_working_set_bytes{pod='<isvc_predictor_pod_name>',namespace='<isvc_namespace>',container=''}) BY (pod, namespace)
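
The two examples above can be sketched as PromQL strings built in Python, which may be a convenient form for the documentation. This is only a sketch under assumptions: the helper names `failed_requests_expr` and `memory_usage_expr` are hypothetical, and it assumes tgi_batch_inference_count and tgi_batch_inference_success are Prometheus counters, so that `increase()` over a range gives a per-period value. The pod and namespace arguments stand in for the placeholders used above.

```python
def failed_requests_expr(window: str = "5m") -> str:
    """Build a PromQL expression for failed inference requests over `window`.

    Assumption: both metrics are counters, so increase() over the range
    approximates how much each one grew during the period.
    """
    total = f"increase(tgi_batch_inference_count[{window}])"
    succeeded = f"increase(tgi_batch_inference_success[{window}])"
    # Sum each side first so label differences between the two series
    # do not prevent the subtraction from matching.
    return f"sum({total}) - sum({succeeded})"


def memory_usage_expr(pod: str, namespace: str) -> str:
    """Build the memory query from the issue for one predictor pod."""
    selector = f"pod='{pod}',namespace='{namespace}',container=''"
    # Doubled braces emit literal { } around the label selector.
    return (
        f"sum(container_memory_working_set_bytes{{{selector}}}) "
        "BY (pod, namespace)"
    )


if __name__ == "__main__":
    print(failed_requests_expr("1h"))
    print(memory_usage_expr("<isvc_predictor_pod_name>", "<isvc_namespace>"))
```

The resulting strings can be pasted into any Prometheus-compatible query UI; the window argument (for example "5m" or "1h") is the "defined time period" from the requirements list.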

Moreover, there are additional metrics that deserve to be documented, such as tgi_request_generated_tokens_count.
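
As an illustration of how such a metric might be documented, here is a hedged sketch of an "average generated tokens per request" query. It assumes that tgi_request_generated_tokens_count is the `_count` series of a Prometheus histogram and that a matching tgi_request_generated_tokens_sum series exists; the `_sum` name is an assumption that this issue does not confirm, and the helper name `avg_generated_tokens_expr` is hypothetical.

```python
def avg_generated_tokens_expr(window: str = "1h") -> str:
    """Build a PromQL expression for mean generated tokens per request.

    Assumption (not confirmed in this issue): the metric is a histogram,
    so a tgi_request_generated_tokens_sum series accompanies the _count one.
    """
    tokens = f"sum(rate(tgi_request_generated_tokens_sum[{window}]))"
    requests = f"sum(rate(tgi_request_generated_tokens_count[{window}]))"
    # Dividing the two rates gives the average tokens per request
    # over the chosen window.
    return f"{tokens} / {requests}"


if __name__ == "__main__":
    print(avg_generated_tokens_expr("30m"))
```

If the metric turns out not to be a histogram, the `_sum` series will not exist and this query would return no data, so the assumption is worth checking against the actual /metrics output before documenting it.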


bdattoma commented Sep 5, 2023

Projects
Status: No status
Status: No status
Status: In Progress
Development

No branches or pull requests

4 participants