New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
modelindexer: improve observability #6630
Conversation
6103147
to
5308caf
Compare
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
🤖 GitHub commentsTo re-run your PR in the CI, just comment with:
|
5308caf
to
20fe646
Compare
This pull request is now in conflicts. Could you fix it @axw? 🙏
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great
* model/modelindexer: capture more metrics * model/modelindexer: trace flush operations * model/modelindexer: inject trace ID into logs * Update changelog (cherry picked from commit 3064a5d)
Verified using 9bb5e4c, with one caveat, I had to enable debug logging since that's the level those logs use: {"log.level":"debug","@timestamp":"2021-12-21T15:21:10.672+0800","log.logger":"modelindexer","log.origin":{"file.name":"modelindexer/indexer.go","file.line":400},"message":"bulk request completed: 4 indexed, 0 failed (0 exceeded capacity)","service.name":"apm-server","trace.id":"05aae453c235fa7efbcd3b72f3666339","transaction.id":"05aae453c235fa7e","ecs.version":"1.6.0"} Metrics are correctly added to the monitoring docs: {"log.level":"debug","@timestamp":"2021-12-21T15:21:18.966+0800","log.logger":"monitoring","log.origin":{"file.name":"processing/processors.go","file.line":203},"message":"Publish event: {\n \"@timestamp\": \"2021-12-21T07:21:18.964Z\",\n \"@metadata\": {\n \"beat\": \"apm-server\",\n \"type\": \"_doc\",\n \"version\": \"8.0.0\",\n \"params\": {\n \"interval\": \"10s\"\n },\n \"cluster_uuid\": \"nE_NVzmNRAWVnk5BzWVSJw\",\n \"type\": \"beats_stats\",\n \"interval_ms\": 10000\n },\n \"metrics\": {\n \"beat\": {\n \"memstats\": {\n \"memory_total\": 39181400,\n \"memory_alloc\": 24287288,\n \"memory_sys\": 38142984,\n \"gc_next\": 31763216,\n \"rss\": 38535168\n },\n \"cpu\": {\n \"total\": {\n \"ticks\": 2,\n \"time\": {\n \"ms\": 97\n },\n \"value\": 2\n },\n \"user\": {\n \"time\": {\n \"ms\": 54\n },\n \"ticks\": 1\n },\n \"system\": {\n \"time\": {\n \"ms\": 43\n },\n \"ticks\": 1\n }\n },\n \"runtime\": {\n \"goroutines\": 51\n },\n \"info\": {\n \"uptime\": {\n \"ms\": 30032\n },\n \"ephemeral_id\": \"93e72de4-f0c8-4fb9-b3f3-3beb681456db\",\n \"version\": \"8.0.0\"\n }\n },\n \"apm-server\": {\n \"jaeger\": {\n \"grpc\": {\n \"sampling\": {\n \"response\": {\n \"count\": 0,\n \"errors\": {\n \"count\": 0\n },\n \"valid\": {\n \"count\": 0\n }\n },\n \"event\": {\n \"received\": {\n \"count\": 0\n }\n },\n \"request\": {\n \"count\": 0\n }\n },\n \"collect\": {\n \"request\": {\n \"count\": 0\n },\n \"response\": {\n \"errors\": {\n \"count\": 0,\n \"ratelimit\": 0,\n \"timeout\": 0,\n \"unauthorized\": 0\n },\n \"valid\": {\n \"count\": 0\n },\n \"count\": 0\n }\n }\n }\n },\n \"server\": {\n \"request\": {\n \"count\": 4\n },\n \"response\": {\n \"count\": 3,\n \"errors\": {\n \"decode\": 0,\n \"unavailable\": 0,\n \"closed\": 0,\n \"validate\": 0,\n \"notfound\": 0,\n \"method\": 0,\n \"toolarge\": 0,\n \"unauthorized\": 0,\n \"internal\": 0,\n \"timeout\": 0,\n \"count\": 0,\n \"queue\": 0,\n \"ratelimit\": 0,\n \"forbidden\": 0,\n \"invalidquery\": 0\n },\n \"valid\": {\n \"count\": 3,\n \"accepted\": 3,\n \"ok\": 0,\n \"notmodified\": 0\n }\n },\n \"unset\": 0\n },\n \"decoder\": {\n \"uncompressed\": {\n \"content-length\": 0,\n \"count\": 0\n },\n \"reader\": {\n \"count\": 4\n },\n \"missing-content-length\": {\n \"count\": 4\n },\n \"deflate\": {\n \"content-length\": 0,\n \"count\": 0\n },\n \"gzip\": {\n \"content-length\": 0,\n \"count\": 0\n }\n },\n \"root\": {\n \"request\": {\n \"count\": 0\n },\n \"response\": {\n \"count\": 0,\n \"errors\": {\n \"unauthorized\": 0,\n \"toolarge\": 0,\n \"invalidquery\": 0,\n \"closed\": 0,\n \"notfound\": 0,\n \"queue\": 0,\n \"timeout\": 0,\n \"decode\": 0,\n \"internal\": 0,\n \"method\": 0,\n \"ratelimit\": 0,\n \"count\": 0,\n \"forbidden\": 0,\n \"validate\": 0,\n \"unavailable\": 0\n },\n \"valid\": {\n \"count\": 0,\n \"notmodified\": 0,\n \"accepted\": 0,\n \"ok\": 0\n }\n },\n \"unset\": 0\n },\n \"profile\": {\n \"request\": {\n \"count\": 0\n },\n \"response\": {\n \"errors\": {\n \"closed\": 0,\n \"count\": 0,\n \"unavailable\": 0,\n \"timeout\": 0,\n \"queue\": 0,\n \"decode\": 0,\n \"method\": 0,\n \"invalidquery\": 0,\n \"validate\": 0,\n \"toolarge\": 0,\n \"forbidden\": 0,\n \"ratelimit\": 0,\n \"internal\": 0,\n \"notfound\": 0,\n \"unauthorized\": 0\n },\n \"valid\": {\n \"accepted\": 0,\n \"count\": 0,\n \"ok\": 0,\n \"notmodified\": 0\n },\n \"count\": 0\n },\n \"unset\": 0\n },\n \"otlp\": {\n \"grpc\": {\n \"metrics\": {\n \"request\": {\n \"count\": 0\n },\n \"response\": {\n \"valid\": {\n \"count\": 0\n },\n \"count\": 0,\n \"errors\": {\n \"unauthorized\": 0,\n \"count\": 0,\n \"ratelimit\": 0,\n \"timeout\": 0\n }\n },\n \"consumer\": {\n \"unsupported_dropped\": 0\n }\n },\n \"traces\": {\n \"response\": {\n \"count\": 0,\n \"errors\": {\n \"timeout\": 0,\n \"unauthorized\": 0,\n \"count\": 0,\n \"ratelimit\": 0\n },\n \"valid\": {\n \"count\": 0\n }\n },\n \"request\": {\n \"count\": 0\n }\n },\n \"logs\": {\n \"request\": {\n \"count\": 0\n },\n \"response\": {\n \"valid\": {\n \"count\": 0\n },\n \"count\": 0,\n \"errors\": {\n \"ratelimit\": 0,\n \"timeout\": 0,\n \"unauthorized\": 0,\n \"count\": 0\n }\n }\n }\n }\n },\n \"sampling\": {\n \"transactions_dropped\": 1\n },\n \"acm\": {\n \"request\": {\n \"count\": 1\n },\n \"response\": {\n \"errors\": {\n \"notfound\": 0,\n \"closed\": 0,\n \"timeout\": 0,\n \"unauthorized\": 0,\n \"validate\": 0,\n \"internal\": 0,\n \"decode\": 0,\n \"ratelimit\": 0,\n \"method\": 0,\n \"queue\": 0,\n \"count\": 1,\n \"invalidquery\": 0,\n \"unavailable\": 0,\n \"toolarge\": 0,\n \"forbidden\": 1\n },\n \"valid\": {\n \"notmodified\": 0,\n \"accepted\": 0,\n \"count\": 0,\n \"ok\": 0\n },\n \"count\": 1\n },\n \"unset\": 0\n },\n \"processor\": {\n \"transaction\": {\n \"transformations\": 5\n },\n \"metric\": {\n \"transformations\": 5\n },\n \"stream\": {\n \"accepted\": 18,\n \"errors\": {\n \"invalid\": 0,\n \"toolarge\": 0\n }\n },\n \"span\": {\n \"transformations\": 9\n }\n },\n \"aggregation\": {\n \"txmetrics\": {\n \"active_groups\": 3,\n \"overflowed\": 0\n }\n }\n },\n \"libbeat\": {\n \"config\": {\n \"scans\": 0,\n \"reloads\": 0,\n \"module\": {\n \"starts\": 0,\n \"stops\": 0,\n \"running\": 0\n }\n },\n \"output\": {\n \"write\": {\n \"bytes\": 0,\n \"errors\": 0\n },\n \"read\": {\n \"errors\": 0,\n \"bytes\": 0\n },\n \"type\": \"elasticsearch\",\n \"events\": {\n \"toomany\": 0,\n \"total\": 18,\n \"acked\": 11,\n \"active\": 7,\n \"batches\": 2,\n \"failed\": 0\n }\n },\n \"pipeline\": {\n \"events\": {\n \"dropped\": 0,\n \"retry\": 0,\n \"active\": 0,\n \"total\": 0,\n \"filtered\": 0,\n \"published\": 0,\n \"failed\": 0\n },\n \"queue\": {\n \"max_events\": 4096,\n \"acked\": 0\n },\n \"clients\": 1\n }\n },\n \"system\": {\n \"cpu\": {\n \"cores\": 8\n },\n \"load\": {\n \"5\": 5.6587,\n \"15\": 6.3672,\n \"norm\": {\n \"1\": 0.5795,\n \"5\": 0.7073,\n \"15\": 0.7959\n },\n \"1\": 4.6357\n }\n }\n },\n \"beat\": {\n \"name\": \"Marcs-MacBook-Pro-2.local\",\n \"host\": \"Marcs-MacBook-Pro-2.local\",\n \"uuid\": \"bf2a6d26-c5ed-4b41-a507-feab5b75c095\",\n \"type\": \"apm-server\",\n \"version\": \"8.0.0\"\n }\n}","service.name":"apm-server","ecs.version":"1.6.0"} |
Motivation/summary
Improve observability of modelindexer:
libbeat.output.events.toomany
)libbeat.output.events.acked
)libbeat.output.events.batches
)Checklist
- [ ] Documentation has been updatedHow to test these changes
-e -E output.elasticsearch.experimental=true -E instrumentation.enabled=true -E monitoring.enabled=true
trace.id
andtransaciton.id
fields, which should correspond to a trace and transaction with the same IDs.Related issues
#6360