SHARED-2644: Downgrade node auth context-cancel errors#2086
Conversation
|
👋 dvashchuk, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks! |
📊 API Diff Results
|
There was a problem hiding this comment.
Pull request overview
Adjusts NodeJWTAuthenticator.AuthenticateJWT logging to avoid emitting ERROR logs for expected shutdown / request-cancellation scenarios when IsNodePubKeyTrusted returns context.Canceled or context.DeadlineExceeded, while preserving ERROR logs for real provider failures.
Changes:
- Downgrade log level from
ERRORtoWARNwhenIsNodePubKeyTrustedfails due to context cancellation/deadline. - Add tests asserting log level behavior for provider errors vs context cancellation.
- Introduce a minimal
slog.Handlerto capture log records in tests.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/nodeauth/jwt/node_jwt_authenticator.go | Downgrades logging to WARN for context-canceled/deadline errors from the provider. |
| pkg/nodeauth/jwt/node_jwt_authenticator_test.go | Adds log-capture handler and new tests verifying log levels for provider error vs context cancellation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "error", err, | ||
| ) | ||
| if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) { | ||
| v.logger.Warn("Node validation skipped: context cancelled", |
There was a problem hiding this comment.
Addressed: message now reads "Node validation skipped: context canceled or deadline exceeded" (covers both).
| if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) { | ||
| v.logger.Warn("Node validation skipped: context cancelled", | ||
| "csaPubKey", hex.EncodeToString(publicKey), | ||
| "error", err, | ||
| ) |
There was a problem hiding this comment.
Addressed: added TestNodeJWTAuthenticator_AuthenticateJWT_ProviderDeadlineExceededError (line 545) covering context.DeadlineExceeded path.
| func (h *captureHandler) WithAttrs([]slog.Attr) slog.Handler { return h } | ||
| func (h *captureHandler) WithGroup(string) slog.Handler { return h } |
There was a problem hiding this comment.
Addressed: WithAttrs and WithGroup now return a new captureHandler with merged attrs/groups (not the same handler). See lines in node_jwt_authenticator_test.go.
Key Decisions (SHARED-2644)
Related PRs:
|
f95c05b to
f35e344
Compare
11eadf3 to
7b7656a
Compare
9c02c64 to
5a7b58c
Compare
…thenticator SHARED-2644: De-noise context cancellation errors in node auth path. - Log WARN (not ERROR) when IsNodePubKeyTrusted returns context.Canceled or context.DeadlineExceeded - Keep ERROR for genuine provider failures - Add contextErr structured field for observability - Add test cases for both Canceled and DeadlineExceeded paths - Fix captureHandler.WithAttrs/WithGroup to satisfy slog.Handler contract
5a7b58c to
c225417
Compare
| @@ -0,0 +1,52 @@ | |||
| package beholder | |||
There was a problem hiding this comment.
Is this a leftover from stale branch head? Does not seem relevant to this change
There was a problem hiding this comment.
Not a leftover — intentional, but separable from the log-noise fix.
Why it is here: PR #2081 moved DurableEvent, DurableEventStore, BatchInserter, DurableQueueObserver, DurableQueueStats from pkg/beholder to pkg/durableemitter. The chainlink node (core/services/workflows/) still imports these from pkg/beholder. Without this shim, Build Chainlink picks up additional compile errors on top of the pre-existing proto failures from #2080.
Why not type aliases: would require pkg/beholder to import pkg/durableemitter, which closes a cycle: durableemitter (tests) → servicetest → utils/tests → beholdertest → beholder → durableemitter.
To reproduce without the shim:
git revert HEAD -- pkg/beholder/durable_event_store.go
go get github.com/smartcontractkit/chainlink-common@<this PR SHA>
go mod tidy
make install-chainlink # in chainlink repo at develop
You will see undefined: beholder.DurableEvent, beholder.DurableEventStore, etc.
Options:
- Keep as-is — unblocks both issues in one merge
- Remove —
Build Chainlinkgains more failures on top of Bump chainlink-protos/cre/go for ConfidentialWorkflow proto restructure #2080 proto errors; proper fix tracked in chainlink PR #22562 - Split into a separate PR — cleanest separation of concerns
There was a problem hiding this comment.
Let's raise a separate PR. This is a completely different codeowner / maintainer than the owners of pkg/nodeauth.
There was a problem hiding this comment.
Done — split into #2095 (separate PR, separate codeowner review).
The backward-compat type declarations for DurableEventStore are split into their own PR (fix/beholder-backward-compat) per reviewer request to keep this PR focused on the log-noise fix.
|
Split done — removed This PR is now focused solely on the nodeauth log-noise fix. |
|
Hey @elatoskinas - all feedback addressed. Need one more approval to merge (2 required). SonarQube coverage metric is re-running - tests cover all new code paths (verified locally). |
Why is this change necessary?
AuthenticateJWTlogged ERROR forNode validation failedwhenIsNodePubKeyTrustedreturned context cancellation/deadline. These are expected during shutdown/timeout paths and should not page as high-severity failures.What is changing?
pkg/nodeauth/jwt/node_jwt_authenticator.go, log WARN (not ERROR) when provider error is context cancellation/deadlineHow was this tested?
go test -timeout 30s -run TestNodeJWTAuthenticator ./pkg/nodeauth/jwt/...