Add Support for Querying Metrics from Google Managed Prometheus and Additional PromQL Filters now Configurable #121

Open. Wants to merge 2 commits into base: main

@Bslabe123 (Contributor) commented Jun 16, 2025

Addresses: #76

Adds metrics.prometheus.google_managed. When set to true, the Prometheus URL is automatically set to the Google Cloud Managed Service for Prometheus endpoint for the currently configured project. Additional PromQL label filters are now also configurable via metrics.prometheus.filters. Each filter is appended to the label selector of every query. Example config:

metrics:
  type: prometheus
  prometheus:
    google_managed: True
    scrape_interval: 15
    filters:
      - 'namespace="test-ns"'
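
For reviewers, here is a minimal sketch of how the query URL could be resolved from the config above. It is not the code in this PR. The endpoint pattern is copied from the log output in this description; `PrometheusConfig`, `resolve_query_url`, the `url` field, and passing `project_id` explicitly are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Endpoint pattern as it appears in the log output of this PR description.
GMP_QUERY_URL = (
    "https://monitoring.googleapis.com/v1/projects/{project_id}"
    "/location/global/prometheus/api/v1/query"
)


@dataclass
class PrometheusConfig:
    """Hypothetical mirror of the metrics.prometheus config block."""

    google_managed: bool = False
    scrape_interval: int = 15
    # Assumption: a self-hosted URL is only consulted when google_managed is False.
    url: Optional[str] = None
    filters: List[str] = field(default_factory=list)


def resolve_query_url(config: PrometheusConfig, project_id: str) -> str:
    """Return the instant-query endpoint the metrics client should call."""
    if config.google_managed:
        # The managed endpoint is derived from the project, not from config.url.
        return GMP_QUERY_URL.format(project_id=project_id)
    if not config.url:
        raise ValueError("url is required when google_managed is false")
    return config.url.rstrip("/") + "/api/v1/query"


if __name__ == "__main__":
    cfg = PrometheusConfig(google_managed=True, filters=['namespace="test-ns"'])
    print(resolve_query_url(cfg, "xxxxx"))
```

Deriving the URL in one place keeps the rest of the client unaware of whether it is talking to a self-hosted or a managed Prometheus.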

Newly added logs:

2025-06-17 16:00:15,994 - inference_perf.client.metricsclient.prometheus_client.base - INFO - Prometheus metrics client configured, querying metrics from 'https://monitoring.googleapis.com/v1/projects/xxxxx/location/global/prometheus/api/v1/query'

...

2025-06-17 16:04:23,924 - inference_perf.client.metricsclient.prometheus_client.google_managed_prometheus_client - INFO - Making PromQL query: 'avg_over_time(vllm:num_requests_waiting{model_name='meta-llama/Meta-Llama-3-8B',namespace="test-ns"}[238s])'
2025-06-17 16:04:24,208 - inference_perf.client.metricsclient.prometheus_client.google_managed_prometheus_client - INFO - Making PromQL query: 'sum(rate(vllm:time_to_first_token_seconds_sum{model_name='meta-llama/Meta-Llama-3-8B',namespace="test-ns"}[238s])) / (sum(rate(vllm:time_to_first_token_seconds_count{model_name='meta-llama/Meta-Llama-3-8B',namespace="test-ns"}[238s])) > 0)'
2025-06-17 16:04:24,490 - inference_perf.client.metricsclient.prometheus_client.google_managed_prometheus_client - INFO - Making PromQL query: 'histogram_quantile(0.5, sum(rate(vllm:time_to_first_token_seconds_bucket{model_name='meta-llama/Meta-Llama-3-8B',namespace="test-ns"}[238s])) by (le))'
2025-06-17 16:04:24,757 - inference_perf.client.metricsclient.prometheus_client.google_managed_prometheus_client - INFO - Making PromQL query: 'histogram_quantile(0.9, sum(rate(vllm:time_to_first_token_seconds_bucket{model_name='meta-llama/Meta-Llama-3-8B',namespace="test-ns"}[238s])) by (le))'
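
The logged queries show each configured filter appended to the label selector after `model_name`. A minimal sketch of query builders that reproduce those strings follows. The function names and signatures are hypothetical and are not taken from this PR; only the resulting query shapes come from the logs above.

```python
from typing import Sequence


def selector(metric: str, model_name: str, filters: Sequence[str], window_s: int) -> str:
    """Build a range selector: model_name first, then any extra filters."""
    labels = ",".join([f"model_name='{model_name}'", *filters])
    return f"{metric}{{{labels}}}[{window_s}s]"


def avg_over_time_query(metric: str, model_name: str, filters: Sequence[str], window_s: int) -> str:
    """Average of a gauge over the window."""
    return f"avg_over_time({selector(metric, model_name, filters, window_s)})"


def mean_query(metric: str, model_name: str, filters: Sequence[str], window_s: int) -> str:
    """Mean of a histogram: rate of _sum divided by rate of _count."""
    total = selector(f"{metric}_sum", model_name, filters, window_s)
    count = selector(f"{metric}_count", model_name, filters, window_s)
    # The "> 0" guard drops the series instead of dividing by zero.
    return f"sum(rate({total})) / (sum(rate({count})) > 0)"


def quantile_query(q: float, metric: str, model_name: str, filters: Sequence[str], window_s: int) -> str:
    """Quantile estimated from histogram buckets."""
    buckets = selector(f"{metric}_bucket", model_name, filters, window_s)
    return f"histogram_quantile({q}, sum(rate({buckets})) by (le))"


if __name__ == "__main__":
    extra = ['namespace="test-ns"']
    model = "meta-llama/Meta-Llama-3-8B"
    print(avg_over_time_query("vllm:num_requests_waiting", model, extra, 238))
    print(quantile_query(0.9, "vllm:time_to_first_token_seconds", model, extra, 238))
```

With an empty filter list the selector contains only `model_name`, so in this sketch the pre-existing queries are unchanged when metrics.prometheus.filters is not set.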

@k8s-ci-robot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Jun 16, 2025

@k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Bslabe123
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) labels Jun 16, 2025
@Bslabe123 changed the title from "[WIP] Add Support for Querying Metrics from Google Managed Prometheus" to "Add Support for Querying Metrics from Google Managed Prometheus" Jun 17, 2025
@k8s-ci-robot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Jun 17, 2025
first commit

fix import

fix url

fix import

fix imports

missing __init__.py

fix import

fix typo

move super call to end

set config not self

add debug

debug -> info

start of queries fix

nit

add additional filters

missing spread

debug

remove newline

nit

revert

adjust logs
@Bslabe123 changed the title from "Add Support for Querying Metrics from Google Managed Prometheus" to "Add Support for Querying Metrics from Google Managed Prometheus and Additional PromQL Filters now Configurable" Jun 17, 2025
Labels: cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.), size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)
2 participants