[Log] Add trace log and add loggingInstrumentor tool #4692
Jiang-Jia-Jun merged 6 commits into PaddlePaddle:develop
Pull Request Overview
This PR adds comprehensive tracing and logging capabilities to the FastDeploy system for tracking request lifecycle events and improving observability.
- Introduces a new trace logging module with event-based logging across preprocessing, scheduling, inference, and postprocessing stages
- Adds OpenTelemetry logging instrumentation support to integrate with distributed tracing systems
- Implements a custom log formatter that supports structured attributes and OTEL span/trace ID injection
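The formatter behaviour described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's `CustomFormatter`: the class name `AttributeFormatter`, the `attributes` record field, the helper `render`, and the output layout are assumptions. The `otelTraceID` and `otelSpanID` record fields are the ones that OpenTelemetry's logging instrumentation adds to log records; treating a value of `"0"` as "no active span" is likewise an assumption.

```python
import logging


class AttributeFormatter(logging.Formatter):
    """Hypothetical formatter: appends OTEL IDs and structured attributes."""

    def format(self, record: logging.LogRecord) -> str:
        base = super().format(record)
        parts = []
        # These fields exist only when logging instrumentation has set them.
        trace_id = getattr(record, "otelTraceID", None)
        span_id = getattr(record, "otelSpanID", None)
        if trace_id and trace_id != "0":
            parts.append(f"trace_id={trace_id}")
        if span_id and span_id != "0":
            parts.append(f"span_id={span_id}")
        # "attributes" is an assumed field, passed via logger.info(..., extra=...).
        attributes = getattr(record, "attributes", None)
        if isinstance(attributes, dict):
            parts.extend(f"{key}={value}" for key, value in attributes.items())
        return f"{base} [{' '.join(parts)}]" if parts else base


def render(message: str, **fields) -> str:
    """Build a record by hand and format it, to show the output shape."""
    record = logging.LogRecord("demo", logging.INFO, "demo.py", 0, message, None, None)
    for name, value in fields.items():
        setattr(record, name, value)
    return AttributeFormatter("%(levelname)s %(message)s").format(record)


print(render("request received", attributes={"request_id": "r1"}))
```

Reading the fields with `getattr` and a default keeps the formatter usable whether or not tracing is enabled, which matches the "trace disabled" and "trace enabled" output cases listed in the PR description.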
Reviewed Changes
Copilot reviewed 19 out of 21 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| fastdeploy/trace/trace_logger.py | New trace logging function that records events with request context |
| fastdeploy/trace/constants.py | Defines logging event names, stage names, and event-to-stage mappings |
| fastdeploy/logger/formatters.py | Adds CustomFormatter with attribute expansion and OTEL field support |
| fastdeploy/logger/logger.py | Implements get_trace_logger method for creating trace-specific loggers |
| fastdeploy/utils.py | Initializes global trace_logger instance |
| fastdeploy/metrics/trace_util.py | Integrates OpenTelemetry logging instrumentation |
| fastdeploy/output/token_processor.py | Adds trace logging for token generation events |
| fastdeploy/entrypoints/engine_client.py | Adds trace logging at preprocessing start |
| fastdeploy/engine/common_engine.py | Adds trace logging throughout scheduling and resource allocation |
| fastdeploy/entrypoints/openai/serving_completion.py | Adds trace logging for completion endpoint postprocessing end |
| fastdeploy/entrypoints/openai/serving_chat.py | Adds trace logging for chat endpoint postprocessing end |
| requirements*.txt | Adds opentelemetry-instrumentation-logging dependency |
| tests/trace/test_trace_logger.py | Tests for trace_print function |
| tests/trace/test_constants.py | Tests for event and stage enumerations |
| tests/output/test_token_processor_trace_print.py | Tests trace logging in token processor |
| tests/logger/test_logger.py | Tests for get_trace_logger method |
| tests/logger/test_formatters.py | Extensive tests for CustomFormatter and ColoredFormatter |
Comments suppressed due to low confidence (3)
fastdeploy/logger/formatters.py:59
- Except block directly handles BaseException.
except:
fastdeploy/logger/formatters.py:125
- Except block directly handles BaseException.
except:
fastdeploy/trace/trace_logger.py:23
- Except block directly handles BaseException.
except:
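A bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, which is why the three comments above flag it. A common remedy is to catch only the exceptions the guarded code is expected to raise. The sketch below uses a hypothetical helper, `expand_attributes`, that is not taken from the PR; it only illustrates the pattern.

```python
import json


def expand_attributes(raw):
    """Hypothetical helper: parse a JSON attribute string, falling back to {}."""
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        # json.JSONDecodeError subclasses ValueError; TypeError covers
        # non-string input such as None. Ctrl-C and interpreter shutdown
        # are no longer swallowed.
        return {}


print(expand_attributes('{"request_id": "r1"}'))
print(expand_attributes("not json"))
```

Where the set of possible failures is genuinely open-ended, `except Exception:` is the usual fallback, since it still lets `BaseException`-only signals propagate.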
Motivation
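The inference path currently lacks fine-grained timing instrumentation, so the time spent in each internal stage of inference cannot be queried. FastDeploy's inference flow therefore needs to be divided into finer stages, with log instrumentation points added.
In addition, the existing logging system has the following problems:
To make problems quicker to locate, this PR introduces the OpenTelemetry LoggingInstrumentor tool, which correlates logs with traces and thereby improves the system's observability and debuggability.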
Modifications
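1. Add a trace logger
2. Add a custom formatter
3. Introduce LoggingInstrumentor
4. FastDeploy stage division and instrumentation
Log instrumentation points are inserted at each major FastDeploy stage to support latency analysis and tracing.
Mapping of instrumentation events to stages:
5. Instrumentation utility implementation
To standardize and automate the recording of trace log information, the following core components are defined:
Core enum classes (Enums)
These enums define the standard instrumentation events and stages in the FastDeploy request-handling flow and are the basis for fine-grained tracing.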
LoggingEventName:
StageName:
EVENT_TO_STAGE_MAP:
trace_logger print function (print)
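How the enums, the event-to-stage map, and the print function might fit together is sketched below. The names `LoggingEventName`, `StageName`, `EVENT_TO_STAGE_MAP`, and `trace_print` come from this PR, but every enum member, the function signature, the attribute keys, and the logger name here are assumptions made for illustration; the actual definitions live in `fastdeploy/trace/constants.py` and `fastdeploy/trace/trace_logger.py`.

```python
import logging
import time
from enum import Enum


class StageName(str, Enum):
    # Assumed members, based on the four stages named in the review overview.
    PREPROCESSING = "PREPROCESSING"
    SCHEDULE = "SCHEDULE"
    INFERENCE = "INFERENCE"
    POSTPROCESSING = "POSTPROCESSING"


class LoggingEventName(str, Enum):
    # Assumed members; the real event list is defined in the PR.
    PREPROCESSING_START = "PREPROCESSING_START"
    INFERENCE_START = "INFERENCE_START"
    POSTPROCESSING_END = "POSTPROCESSING_END"


EVENT_TO_STAGE_MAP = {
    LoggingEventName.PREPROCESSING_START: StageName.PREPROCESSING,
    LoggingEventName.INFERENCE_START: StageName.INFERENCE,
    LoggingEventName.POSTPROCESSING_END: StageName.POSTPROCESSING,
}

trace_logger = logging.getLogger("trace_logger_sketch")  # assumed logger name


def trace_print(event: LoggingEventName, request_id: str, user: str = "") -> dict:
    """Assumed signature: record one event with its request context."""
    attributes = {
        "request_id": request_id,
        "user_id": user,
        "event": event.value,
        "stage": EVENT_TO_STAGE_MAP[event].value,  # stage is derived, not passed in
        "timestamp": int(time.time() * 1000),  # milliseconds since the epoch
    }
    # The attributes travel on the record so a formatter can expand them.
    trace_logger.info("", extra={"attributes": attributes})
    return attributes


attrs = trace_print(LoggingEventName.PREPROCESSING_START, "req-1")
```

Deriving the stage from the event through a single map means call sites only name the event, so an event cannot be logged under the wrong stage.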
Usage or Command
Instrumentation example:
Accuracy Tests
Output example (trace disabled):
Output example (trace enabled):
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.