Improve metrics log #4297
Pull request overview
This PR improves the clarity of metrics logging by:
- Renaming metrics methods and fields for better clarity (increment→increase, finished→completed)
- Adding tracking for API-level request states (running/waiting at the API server level)
- Reorganizing the log output to clearly separate API server metrics from engine core metrics
- Making the prefix cache hit rate conditional in the output (only shown when non-zero)
Changes:
- Renamed metrics methods from `increment_*` to `increase_*` and `num_finished_reqs` to `num_completed_reqs`
- Added `num_api_running_reqs` and `num_api_waiting_reqs` tracking to distinguish API-level from engine-level request states
- Improved log message format to show API server and Engine core metrics separately with clearer labels
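To illustrate the separated log output and the conditional prefix cache hit rate described above, here is a minimal sketch. Only the four `num_*` field names come from this PR; the class `RequestStatsSketch`, the `prefix_cache_hit_rate` field, the `format_log` helper, and the exact log wording are assumptions made for illustration, so the real lmdeploy output may look different.

```python
from dataclasses import dataclass


@dataclass
class RequestStatsSketch:
    """Hypothetical stand-in for the stats object; not lmdeploy's actual class."""
    num_total_reqs: int = 0
    num_completed_reqs: int = 0
    num_api_running_reqs: int = 0
    num_api_waiting_reqs: int = 0
    prefix_cache_hit_rate: float = 0.0  # assumed field name


def format_log(stats: RequestStatsSketch) -> str:
    """Build one log line; the wording is illustrative only."""
    parts = [
        "API server: "
        f"total={stats.num_total_reqs}, "
        f"completed={stats.num_completed_reqs}, "
        f"running={stats.num_api_running_reqs}, "
        f"waiting={stats.num_api_waiting_reqs}"
    ]
    # Only mention the prefix cache when the hit rate is non-zero.
    if stats.prefix_cache_hit_rate > 0:
        parts.append(f"prefix cache hit rate={stats.prefix_cache_hit_rate:.1%}")
    return " | ".join(parts)


if __name__ == "__main__":
    print(format_log(RequestStatsSketch(num_total_reqs=3, num_api_running_reqs=2)))
    print(format_log(RequestStatsSketch(num_total_reqs=3, prefix_cache_hit_rate=0.25)))
```

Omitting the hit rate when it is zero keeps the line short for deployments that do not use prefix caching.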
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lmdeploy/serve/async_engine.py | Updated to track API-level running requests in the model_inst context manager and use renamed metrics methods |
| lmdeploy/metrics/stats.py | Added new fields for API-level request tracking and updated documentation to explain the metric relationships |
| lmdeploy/metrics/metrics_processor.py | Renamed methods and added increase/decrease methods for API running requests tracking |
| lmdeploy/metrics/loggers.py | Reorganized log output format, made prefix cache conditional, and updated Prometheus metric names |
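The table mentions tracking API-level running requests inside the `model_inst` context manager. A minimal sketch of that pattern follows; `ApiRequestCounter`, `model_inst_sketch`, and the method names `increase_api_running_reqs` / `decrease_api_running_reqs` are hypothetical, chosen to match the PR's `increase_*` naming, and the real code in `async_engine.py` is likely asynchronous and structured differently.

```python
from contextlib import contextmanager


class ApiRequestCounter:
    """Hypothetical counter; not lmdeploy's actual metrics processor."""

    def __init__(self) -> None:
        self.num_api_running_reqs = 0

    def increase_api_running_reqs(self) -> None:
        self.num_api_running_reqs += 1

    def decrease_api_running_reqs(self) -> None:
        self.num_api_running_reqs -= 1


@contextmanager
def model_inst_sketch(counter: ApiRequestCounter):
    """Count a request as running for as long as the block is active."""
    counter.increase_api_running_reqs()
    try:
        yield counter
    finally:
        # Runs on normal exit and on exceptions, so the count cannot leak.
        counter.decrease_api_running_reqs()


if __name__ == "__main__":
    counter = ApiRequestCounter()
    with model_inst_sketch(counter):
        print(counter.num_api_running_reqs)  # inside the block
    print(counter.num_api_running_reqs)  # after the block
```

Placing the decrement in `finally` is the design point: a request that fails midway still releases its slot in the running count.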
lmdeploy/metrics/stats.py
Outdated
num_total_reqs: API server, the number of all requests received since server start.
num_completed_reqs: API server, the number of successfully completed requests since server start.
num_api_running_reqs: API server, the number of requests being assigned to engine instances.
num_api_waiting_reqs: API server, the number of requests waiting for free engine instances.
num_api_routed_reqs: API server, the number of requests routed to request handles.
num_api_waiting_reqs: API server, the number of requests waiting for free request handles.
lmdeploy/metrics/stats.py
Outdated
num_total_reqs: int = 0
num_finished_reqs: int = 0
num_completed_reqs: int = 0
num_api_running_reqs: int = 0
num_api_routed_reqs: int = 0
Please merge the latest main to resolve the conflicts.
Try to make the metrics log clearer. Now, it looks like: