chore(agent-builder): diagnostic logs for /daily 403 root-cause hunt#420
Merged
After PR #418 merged to dev, production `/daily` from a Lark DM still fails with `github_proxy_access_denied`. Manual reproduction with the same NyxID account (session-token-minted child key, both narrow `allow_all_services=false` and broad scopes) succeeds end-to-end. The production-only failure means the deployed `POST /api-keys` payload or the preflight response shape carries information we can't see in the existing logs.

Adds three diagnostic log lines on the daily-report create path:

- Right before `nyxClient.CreateApiKeyAsync`: the resolved per-user `UserService.id`s and the literal payload JSON. Lets us verify whether the deployed binary actually emits `allow_all_services=false` and whether the resolver returned `UserService.id`s vs catalog ids.
- Right after a successful create: the new api-key id, so we can correlate to NyxID-side audit logs (`proxy_request_denied`).
- Inside `PreflightGitHubProxyAsync`: the *raw* probe response. The parsed `proxy_body` we return only carries the inner Lark/GitHub body string when SendAsync wraps non-2xx; the unparsed probe is the only place we see what NyxID actually returned. Distinguishes:
  - NyxID `ApiKeyScopeForbidden` (`error_code:9000`, our payload still wrong)
  - GitHub upstream 403 (`Bad credentials`, OAuth grant revoked)
  - other shapes
- And on preflight failure: the structured envelope we send back to the user (`github_proxy_access_denied`), so the failure record is self-contained in one log line.

Information-only, behavior unchanged. Tests pass (30/30 in AgentBuilderToolTests). Intended to be reverted once the production log captures the real failure shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## dev #420 +/- ##
==========================================
+ Coverage 70.37% 70.39% +0.01%
==========================================
Files 1175 1175
Lines 84453 84453
Branches 11124 11124
==========================================
+ Hits 59438 59447 +9
+ Misses 20723 20715 -8
+ Partials 4292 4291 -1
eanzhao added a commit that referenced this pull request on Apr 25, 2026
The four `Diagnostic[#417]` log lines added in PR #420 captured the production failure shape (GitHub 403 with body "Request forbidden by administrative rules ... User-Agent header"), which led to the fix in this PR (default User-Agent injection in `NyxIdApiClient`). The logs were always intended to be temporary; they include full payload JSON including api-key prefixes and don't have a long-term place in the hot path.

Also tightens the comment block at the preflight call site to reflect the new understanding: preflight catches misconfigurations that surface at request time, not just OAuth revocation. Original case (#421) was a missing User-Agent header.

421/421 ChannelRuntime tests + 12/12 AI tests passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
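The "default User-Agent injection" mentioned in the commit message above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the `NyxIdApiClient` code, which is C# and not shown on this page. The function name `with_default_user_agent` and the default value are hypothetical; the only behavior taken from the source is that a request without a User-Agent header was rejected by GitHub with a 403.

```python
# Hypothetical default; the real value used by NyxIdApiClient is not shown in this PR.
DEFAULT_USER_AGENT = "example-agent-runtime/1.0"


def with_default_user_agent(headers=None):
    """Return a copy of `headers` that is guaranteed to carry a User-Agent.

    A caller-supplied User-Agent always wins; the default is only injected
    when none is present. Header names are compared case-insensitively,
    because HTTP header names are case-insensitive.
    """
    merged = dict(headers or {})
    if not any(name.lower() == "user-agent" for name in merged):
        merged["User-Agent"] = DEFAULT_USER_AGENT
    return merged


if __name__ == "__main__":
    # No User-Agent supplied: the default is injected.
    print(with_default_user_agent({"Accept": "application/json"}))
    # Caller-supplied value is left untouched.
    print(with_default_user_agent({"user-agent": "custom/2.0"}))
```

Injecting the default at the client layer, rather than at each call site, means every proxied request picks it up without callers having to remember the header.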
Summary
After #418 merged, production `/daily` still fails with `github_proxy_access_denied`. Independent manual reproduction (session-token-minted child key, both narrow and broad scopes) succeeds end-to-end against the same NyxID account, GitHub UserService, and proxy slug. The production-only failure has to live in either the deployed `POST /api-keys` payload or the preflight response shape, neither of which is visible in current logs.

This PR adds diagnostic-only log lines on the daily-report create path so the next failed `/daily` captures ground truth; the PR is then reverted.

What's logged
- Before `CreateApiKeyAsync`: resolved per-user `UserService.id`s and the literal payload JSON. Tells us whether the deployed binary actually carries `allow_all_services=false` and whether `ResolveProxyServiceIdsAsync` returned per-user ids vs catalog ids.
- After a successful create: the new api-key id, for correlation with NyxID-side `proxy_request_denied` audit records.
- Inside `PreflightGitHubProxyAsync` (Information level): the raw probe response. The parsed `proxy_body` we return only carries the inner Lark/GitHub body string when SendAsync wraps non-2xx; the unparsed probe is the only place we see exactly what NyxID returned. Distinguishes:
  - NyxID `ApiKeyScopeForbidden` (`error_code:9000`, our payload is still wrong somehow)
  - GitHub upstream 403 (`Bad credentials`, OAuth grant revoked)

Why this PR exists separately
These logs are temporary debugging: once the production log captures the real failure shape, this PR (or just the four log lines) gets reverted. Splitting it off keeps the revert clean and avoids re-opening the #418 conversation.

Test plan
- `dotnet build agents/Aevatar.GAgents.ChannelRuntime/Aevatar.GAgents.ChannelRuntime.csproj`: clean (40 pre-existing warnings, 0 errors).
- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --filter AgentBuilderToolTests`: 30/30 passing.
- Run `/daily alice` from Lark, capture log lines tagged `Diagnostic[#417]`. Revert this PR after.

Follow-up after capture
Depending on what `Diagnostic[#417]: GitHub preflight probe response` shows:
- `error_code:9000` → resolver still returns wrong ids despite the apparent fix; need to revisit `ResolveProxyServiceIdsAsync` or how the relay-token auth path lists user-services
- `Bad credentials` → OAuth grant on the bot owner's GitHub binding is unhealthy on the relay-token path even though it's healthy on session-token reads; need to look at how NyxID picks the credential to inject for relay-scoped api-keys
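
The triage described above can be sketched as a small classifier over the raw probe body. This is an illustrative Python sketch, not code from the PR (the project itself is C#). It assumes the NyxID denial arrives as a JSON object with an `error_code` field and that the GitHub case can be recognised by the literal text `Bad credentials`; the function name and return labels are hypothetical.

```python
import json


def classify_probe_response(raw: str) -> str:
    """Map a raw preflight probe body to one of the expected failure shapes."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError:
        body = None  # Not JSON: fall through to the substring checks below.

    # Assumed shape: NyxID scope denials carry error_code 9000 (number or string).
    if isinstance(body, dict) and str(body.get("error_code")) == "9000":
        return "nyxid_scope_forbidden"

    # GitHub upstream 403 caused by a revoked or unhealthy OAuth grant.
    if "Bad credentials" in raw:
        return "github_bad_credentials"

    # Anything else needs a human to read the raw log line.
    return "other"


if __name__ == "__main__":
    print(classify_probe_response('{"error_code": 9000}'))
    print(classify_probe_response('{"message": "Bad credentials"}'))
    print(classify_probe_response("Request forbidden by administrative rules"))
```

Keeping an explicit `other` bucket matters here: a response that matches neither expected shape still has to be read from the raw log line.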