fix(client): avoid redundant metadata fetch in task log accessors #3214
When loglines() or _log_size() already have meta_dict, resolve the log attempt from that dict instead of calling current_attempt, which re-fetches metadata_dict.
Fixes Netflix#3034
Co-authored-by: Cursor <cursoragent@cursor.com>
Greptile Summary
This PR eliminates a redundant
Confidence Score: 5/5
Safe to merge: the change is narrowly scoped to the log-accessor hot path and preserves all existing attempt-resolution semantics. The new
No files require special attention.
Important Files Changed
Reviews (3): Last reviewed commit: "test(client): align log metadata regress..."
Add a regression test to ensure _resolve_log_attempt delegates to current_attempt when meta_dict is None.
Refs: Netflix#3034
Co-authored-by: Cursor <cursoragent@cursor.com>
ynachiket left a comment
Good call, let me know if there is any other feedback.
If tests pass, I will merge it. Thanks for the PR.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff            @@
##             master    #3214   +/-   ##
=========================================
  Coverage          ?   28.36%
=========================================
  Files             ?      381
  Lines             ?    52359
  Branches          ?     9244
=========================================
  Hits              ?    14853
  Misses            ?    36564
  Partials          ?      942
When loglines() or _log_size() already have meta_dict, resolve the log attempt from that dict instead of calling current_attempt, which re-fetches metadata_dict.
Fixes #3034
PR Type
Summary
Accessing task logs via stdout/stderr (or log size) no longer triggers a redundant metadata_dict fetch when the log path already loaded metadata once.
Issue
Fixes #3034
Reproduction
Runtime: local (Client API + metadata provider; no flow execution required)
Commands to run:
cd test/unit
python -m pytest test/unit/test_task_log_metadata_fetch.py -v
Where evidence shows up: unit test assertions on Task.metadata_dict access count (proxy for metadata HTTP fetches in service-backed deployments)
Before (error / log snippet)
After (evidence that fix works)
Root Cause
_load_log() and _get_logsize() each call self.metadata_dict once, then pass that dict into loglines() / _log_size(). Both methods then resolved the log attempt via self.current_attempt. When Task was constructed without an explicit attempt, current_attempt reads self.metadata_dict again to get the "attempt" key, duplicating the metadata load (and, with a remote metadata service, an extra HTTP round-trip per log access).
Why This Fix Is Correct
_resolve_log_attempt(meta_dict) preserves the same attempt resolution order as current_attempt: explicit Task(..., attempt=N) wins; otherwise use meta_dict["attempt"] when meta_dict is already available; only fall back to current_attempt when no dict was supplied. The change is limited to the log accessor hot path and does not alter metadata semantics.
Failure Modes Considered
Task(..., attempt=N) still takes precedence over meta_dict["attempt"] via the _attempt is not None check.
If meta_dict omits "attempt", behavior matches current_attempt (defaults to 0).
If meta_dict is None, loglines() still loads metadata_dict once as before, then uses current_attempt only in that fallback path.
Tests
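The redundant fetch and the resolution order described in this PR can be illustrated with a small self-contained sketch. It is not Metaflow's code: FakeTask, stored_attempt, metadata_fetches, log_attempt_before, and log_attempt_after are hypothetical stand-ins, and the bodies of metadata_dict, current_attempt, and _resolve_log_attempt are assumptions reconstructed from the description above rather than copied from the diff.

```python
class FakeTask:
    """Hypothetical stand-in for a client Task; not Metaflow's real class."""

    def __init__(self, stored_attempt, attempt=None):
        self._stored_attempt = stored_attempt  # value held by the metadata provider
        self._attempt = attempt                # explicit Task(..., attempt=N), if any
        self.metadata_fetches = 0              # counts simulated metadata loads

    @property
    def metadata_dict(self):
        # Each access stands in for one metadata load (assumed to be an
        # HTTP round-trip when a remote metadata service is in use).
        self.metadata_fetches += 1
        return {"attempt": self._stored_attempt}

    @property
    def current_attempt(self):
        # An explicit attempt wins; otherwise metadata is read again.
        if self._attempt is not None:
            return self._attempt
        return self.metadata_dict.get("attempt", 0)

    def _resolve_log_attempt(self, meta_dict=None):
        # Same order as current_attempt, but reuses a dict that is already loaded.
        if self._attempt is not None:
            return self._attempt
        if meta_dict is not None:
            return meta_dict.get("attempt", 0)
        return self.current_attempt

    def log_attempt_before(self):
        # Old path: load metadata once, then current_attempt loads it again.
        meta_dict = self.metadata_dict
        return self.current_attempt

    def log_attempt_after(self):
        # New path: load metadata once and resolve the attempt from that dict.
        meta_dict = self.metadata_dict
        return self._resolve_log_attempt(meta_dict)


before = FakeTask(stored_attempt=2)
before.log_attempt_before()
after = FakeTask(stored_attempt=2)
after.log_attempt_after()
print(before.metadata_fetches, after.metadata_fetches)  # prints: 2 1
```

Counting property accesses on a stand-in mirrors the proxy used in the Reproduction section: one access per metadata load, so the difference between the two paths shows up as a count of 2 versus 1.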
Non-Goals
latest_successful_run (Client API N+1 HTTP Query Storm: Unbounded Fetches + Client-Side-Only Filtering Cause O(n) Request Cascades #2942).
log_location_* code paths beyond shared attempt resolution.
AI Tool Usage
[ ] No AI tools were used in this contribution
[x] AI tools were used (describe below)
Tool(s): Cursor (AI assistant)
Used for: identifying the redundant fetch from issue #3034 (Redundant metadata_dict fetch in loglines() and _log_size() log loading path), drafting _resolve_log_attempt, and initial unit test scaffolding
Review: All code was reviewed, run locally (pytest test/unit/test_task_log_metadata_fetch.py -v, 7 passed), and adjusted before submitting