Refactor and fix network tracing integration in controllerconn #5648
eriknordmark merged 1 commit into lf-edge:master
- Update eve-libs/nettrace to include the fix where HTTPClient.GetTrace() flushes all in-memory data and blocks until all asynchronous batch offloads are completed. This guarantees that exported traces are complete.
- Extract tracing setup and teardown logic into helper methods (prepareNetworkTracing and processNetworkTraces) to remove duplication and unify trace lifecycle handling.
- Fix trace handling in handleHTTPReqFailure. After introducing Bolt-based batch offloading, this path was not updated accordingly and continued to append in-memory traces to netdump. Since traces are offloaded to Bolt in batch mode, the in-memory structures were empty, resulting in missing network traces precisely when HTTP requests failed. The failure path now uses the same processing logic as the success path, ensuring that traces are properly flushed, exported, and attached to SendRetval.
- Fix netdump tar path to properly use the per-session directory name
Signed-off-by: Milan Lenco <milan@zededa.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##           master    #5648      +/-   ##
==========================================
+ Coverage   19.52%   27.96%   +8.43%
==========================================
  Files          19       18       -1
  Lines        3021     2417     -604
==========================================
+ Hits          590      676      +86
+ Misses       2310     1591     -719
- Partials      121      150      +29
eriknordmark left a comment:
LGTM
Do we have a test where we can verify that the memory usage doesn't increase (and maybe even decreases as you use Bolt differently)?
I’m not aware of any test specifically verifying that memory usage remains stable or decreases with the updated Bolt usage. Perhaps @kperakis-zededa has created or experimented with something along those lines.
Description
- Update `eve-libs/nettrace` to include the fix where `HTTPClient.GetTrace()` flushes all in-memory data and blocks until all asynchronous batch offloads are completed. This guarantees that exported traces are complete.
- Extract tracing setup and teardown logic into helper methods (`prepareNetworkTracing` and `processNetworkTraces`) to remove duplication and unify trace lifecycle handling.
- Fix trace handling in `handleHTTPReqFailure`. After introducing Bolt-based batch offloading, this path was not updated accordingly and continued to append in-memory traces to netdump. Since traces are offloaded to Bolt in batch mode, the in-memory structures were empty, resulting in missing network traces precisely when HTTP requests failed (and when traces are the most needed to troubleshoot). The failure path now uses the same processing logic as the success path, ensuring that traces are properly flushed, exported, and attached to `SendRetval`.
- Fix netdump tar path to properly use the per-session directory name
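The trace lifecycle described above can be illustrated with a minimal, self-contained Go sketch. This is not the controllerconn or nettrace code: `batchTracer`, `sendResult`, `processTraces`, `send`, and the slice standing in for the Bolt store are all hypothetical names chosen for illustration. The sketch shows why reading only the in-memory buffer loses traces once batches are offloaded, and why sending both the success and the failure path through one flush-and-export helper yields complete traces.

```go
package main

import (
	"errors"
	"fmt"
)

// batchSize is an arbitrary value for this sketch, not a real nettrace setting.
const batchSize = 2

// batchTracer is a hypothetical tracer that offloads records in batches.
// The offloaded slice stands in for a persistent store such as Bolt.
type batchTracer struct {
	inMemory  []string
	offloaded []string
}

// record buffers one event and offloads the buffer once a batch is full.
func (t *batchTracer) record(event string) {
	t.inMemory = append(t.inMemory, event)
	if len(t.inMemory) >= batchSize {
		t.flush()
	}
}

// flush moves whatever is still buffered into the offload store.
func (t *batchTracer) flush() {
	t.offloaded = append(t.offloaded, t.inMemory...)
	t.inMemory = nil
}

// GetTrace flushes first, then returns a copy of everything recorded.
func (t *batchTracer) GetTrace() []string {
	t.flush()
	return append([]string(nil), t.offloaded...)
}

// sendResult is a hypothetical stand-in for a return value carrying traces.
type sendResult struct {
	Err    error
	Traces []string
}

// processTraces is the single teardown helper shared by both paths.
func processTraces(t *batchTracer, res *sendResult) {
	res.Traces = t.GetTrace()
}

// send simulates one request. It also returns how many events were still
// in memory just before teardown, to show what an in-memory-only reader
// would have seen.
func send(fail bool) (sendResult, int) {
	t := &batchTracer{}
	var res sendResult
	t.record("dns")
	t.record("tcp")
	if fail {
		res.Err = errors.New("connection reset")
	} else {
		t.record("tls")
		t.record("http-request")
		t.record("http-response")
	}
	inMem := len(t.inMemory)
	processTraces(t, &res) // same call on success and on failure
	return res, inMem
}

func main() {
	for _, fail := range []bool{false, true} {
		res, inMem := send(fail)
		fmt.Printf("fail=%v err=%v in-memory=%d exported=%d\n",
			fail, res.Err, inMem, len(res.Traces))
	}
}
```

In this sketch the in-memory buffer is already empty when the failure is detected, so a failure handler that appended only in-memory data would attach nothing, while the shared helper still exports both recorded events.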
How to test and validate this PR
To test and validate this PR, deploy an edge node running an EVE version that includes the fix. SSH into the node and navigate to `/persist/netdump`, where collected netdump tarballs are stored. Extract one of the tarballs and inspect the `requests/<req-name>/` directories. Verify that these directories are not empty and that each contains a valid `nettrace.json` file with non-empty network trace data, including HTTP, TLS, TCP, and (if applicable) DNS traces. It is preferred to check NIM netdumps for both successful and failed HTTP requests if available (i.e. `nim-ok-*` and `nim-fail-*` tarballs), since these were most affected by the previous issues and are typically smaller and easier to review than larger downloader netdumps. Both successful and failed requests should now contain complete and properly exported traces.
Changelog notes
Fix nettrace handling to ensure complete trace export (including failed requests)
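The manual check described under "How to test and validate this PR" could be partly scripted. The Go sketch below is one possible approach, not a tool shipped with EVE: `checkTraceJSON` and `checkNetdump` are hypothetical helpers, and the sketch assumes the tarball has already been extracted to a local directory. Because the `nettrace.json` schema is not described here, it assumes only that the file is a JSON object and treats any non-empty array or object value as trace data. It does not verify that HTTP, TLS, TCP, or DNS traces are each present, so that part still needs manual review.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// checkTraceJSON returns nil if data is a JSON object holding at least one
// non-empty array or object. No nettrace.json field names are assumed.
func checkTraceJSON(data []byte) error {
	var doc map[string]any
	if err := json.Unmarshal(data, &doc); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}
	for _, v := range doc {
		switch x := v.(type) {
		case []any:
			if len(x) > 0 {
				return nil
			}
		case map[string]any:
			if len(x) > 0 {
				return nil
			}
		}
	}
	return errors.New("no non-empty trace data found")
}

// checkNetdump inspects every requests/<req-name>/ entry under an extracted
// netdump directory and validates its nettrace.json.
func checkNetdump(root string) error {
	dirs, err := filepath.Glob(filepath.Join(root, "requests", "*"))
	if err != nil {
		return err
	}
	if len(dirs) == 0 {
		return errors.New("no requests/<req-name>/ directories found")
	}
	for _, dir := range dirs {
		data, err := os.ReadFile(filepath.Join(dir, "nettrace.json"))
		if err != nil {
			return err
		}
		if err := checkTraceJSON(data); err != nil {
			return fmt.Errorf("%s: %w", dir, err)
		}
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkdump <extracted-netdump-dir>")
		os.Exit(2)
	}
	if err := checkNetdump(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, "FAIL:", err)
		os.Exit(1)
	}
	fmt.Println("OK")
}
```

Running it against extracted `nim-ok-*` and `nim-fail-*` netdumps gives a quick first pass before opening the trace files by hand.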