chore: gather spans from both ev-node and ev-reth in spamoor test #3099
Conversation
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
Claude finished @chatton's task in 1m 53s. View job. Code Review: Overall, this is a solid addition that brings real observability to the spamoor smoke test. The Jaeger integration is well-structured. Issues: 1. Invalid OTLP protocol value —
Summary of Changes: Hello @chatton, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the
Highlights
Changelog
Activity
```go
}

// printTraceReport aggregates spans by operation name and prints a timing breakdown.
func printTraceReport(t testing.TB, label string, spans []traceSpan) {
```
Adapted the existing function to work with both the Jaeger spans and the OTel ones.
test/e2e/go.mod
Outdated
```
)

replace (
	github.com/celestiaorg/tastora => ../../../../celestiaorg/tastora
```
Code Review
This pull request is a great improvement to the spamoor smoke test. It introduces tracing with Jaeger for both ev-node and ev-reth, and adds assertions to ensure key spans are being generated. This will be very valuable for observability and debugging. The refactoring to create a common printTraceReport function and a traceSpan interface is well-executed and improves code reuse. I have one minor suggestion for improvement.
test/e2e/evm_spamoor_smoke_test.go
Outdated
```diff
 	WithPrivateKey(TestPrivateKey)

-	ctx := t.Context()
+	ctx = t.Context()
```
Codecov Report: ✅ All modified and coverable lines are covered by tests. Additional details and impacted files:

```
@@ Coverage Diff @@
##             main    #3099      +/-   ##
==========================================
- Coverage   60.93%   60.90%   -0.04%
==========================================
  Files         113      113
  Lines       11617    11617
==========================================
- Hits         7079     7075       -4
- Misses       3740     3743       +3
- Partials      798      799       +1
```
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
📝 Walkthrough
Adds Jaeger-based tracing for e2e tests, refactors trace reporting to use a traceSpan adapter, adds optional reth full-node support to test environments, and updates test module dependencies (tastora, zap).
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Test as Test Harness
    participant Docker as Docker Network
    participant Reth as Reth Node
    participant EvNode as Ev Node
    participant Jaeger as Jaeger Collector
    Test->>Docker: start network (reth, evnode, jaeger)
    Docker->>Reth: start (OTLP/HTTP endpoint env set to Jaeger)
    Docker->>EvNode: start (tracing -> Jaeger endpoint)
    Reth->>Jaeger: export traces (OTLP/HTTP)
    EvNode->>Jaeger: export traces (OTLP/HTTP)
    Test->>Jaeger: poll/wait for traces
    Jaeger-->>Test: return traces (JSON)
    Test->>Test: parse traces -> jaegerSpan -> traceSpan
    Test->>Test: printTraceReport(spans)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (4)
test/e2e/evm_test_common.go (1)
859-926: Span count in the header includes zero-duration spans excluded from the stats table.

Line 907 prints `len(spans)` as the span count, but the aggregation loop skips any span where `d <= 0`. If there are many zero-duration spans (e.g., spans with missing timestamps), the header count and the table rows diverge silently.

🔧 Suggested fix: track the included count separately

```diff
+	var includedCount int
 	for _, span := range spans {
 		d := span.SpanDuration()
 		if d <= 0 {
 			continue
 		}
+		includedCount++
 		name := span.SpanName()
 		...
 	}
-	t.Logf("\n--- %s Trace Breakdown (%d spans) ---", label, len(spans))
+	t.Logf("\n--- %s Trace Breakdown (%d spans, %d with positive duration) ---", label, len(spans), includedCount)
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_test_common.go` around lines 859-926, printTraceReport's header uses len(spans) while the loop skips spans with d <= 0; maintain a separate includedCount (incremented when a span is included in the stats map) and use it in the header instead of len(spans).
test/e2e/evm_spamoor_smoke_test.go (2)

69-72: `ethCli` and `rpcCli` are not closed after use.

Both `ethclient.Client` and `rpc.Client` hold open network connections. In tests, unclosed clients accumulate until the process exits. Since these are test-scoped and containers are removed by cleanup, the practical impact is low, but registering a cleanup is cleaner.

💡 Suggested cleanup registration

```diff
 ethCli, err := env.RethNode.GetEthClient(ctx)
 require.NoError(t, err, "failed to get ethclient")
+t.Cleanup(ethCli.Close)
 rpcCli, err := env.RethNode.GetRPCClient(ctx)
 require.NoError(t, err, "failed to get rpc client")
+t.Cleanup(rpcCli.Close)
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 69-72, register t.Cleanup(ethCli.Close) and t.Cleanup(rpcCli.Close) immediately after each client is created successfully so the connections are closed when the test finishes.
138-140: Unconditional 60-second sleep is fragile for CI timing.

`time.Sleep(60 * time.Second)` waits a fixed duration regardless of actual transaction throughput. If spans accumulate faster on a beefy CI runner, the sleep is wasteful; on a slow runner, 60 s may be insufficient. The downstream `WaitForTraces` already polls, so consider polling block height instead of sleeping to bound the wait tightly.

💡 Example polling approach

```diff
-	// allow spamoor enough time to generate transaction throughput
-	// so that the expected tracing spans appear in Jaeger.
-	time.Sleep(60 * time.Second)
+	// Wait until reth has produced enough blocks for meaningful trace coverage.
+	require.Eventually(t, func() bool {
+		h, err := ethCli.BlockNumber(ctx)
+		return err == nil && h >= 50
+	}, 3*time.Minute, 2*time.Second, "reth did not reach block 50 in time")
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 138-140, replace the unconditional time.Sleep with a bounded polling loop (context/timeout plus a short poll interval) that waits until block height or transaction count reaches the expected threshold, failing if the overall timeout elapses, so the test is resilient on both fast and slow CI runners.
test/docker-e2e/go.mod (1)

7-7: Minor: `test/docker-e2e` pins tastora at v0.15.0 while `test/e2e` uses v0.16.0.

If they share any interop paths (e.g., shared Docker helpers), the version skew could surface API drift. Worth keeping in sync unless there's an intentional reason to lag here.

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/docker-e2e/go.mod` at line 7, update github.com/celestiaorg/tastora to v0.16.0 to keep both test suites aligned (or, if intentional, add a brief comment explaining why docker-e2e must remain at v0.15.0 to avoid unintended API drift).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
`test/docker-e2e/go.sum` is excluded by `!**/*.sum`; `test/e2e/go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (5)
`test/docker-e2e/go.mod`, `test/e2e/evm_contract_bench_test.go`, `test/e2e/evm_spamoor_smoke_test.go`, `test/e2e/evm_test_common.go`, `test/e2e/go.mod`
```go
traceCtx, cancel := context.WithTimeout(ctx, 3*time.Minute)
defer cancel()
ok, err := jg.External.WaitForTraces(traceCtx, "ev-node-smoke", 1, 2*time.Second)
require.NoError(t, err, "error while waiting for Jaeger traces; UI: %s", jg.External.QueryURL())
require.True(t, ok, "expected at least one trace in Jaeger; UI: %s", jg.External.QueryURL())

// Also wait for traces from ev-reth and print a small sample.
ok, err = jg.External.WaitForTraces(traceCtx, "ev-reth", 1, 2*time.Second)
require.NoError(t, err, "error while waiting for ev-reth traces; UI: %s", jg.External.QueryURL())
require.True(t, ok, "expected at least one trace from ev-reth; UI: %s", jg.External.QueryURL())
```
traceCtx is shared between both WaitForTraces calls — budget is not reset between them.
If ev-node-smoke traces arrive near the 3-minute limit, almost no budget remains for the ev-reth check, which could flake. Consider giving each service its own derived context with a fresh timeout.
🛡️ Suggested fix

```diff
-	traceCtx, cancel := context.WithTimeout(ctx, 3*time.Minute)
-	defer cancel()
-	ok, err := jg.External.WaitForTraces(traceCtx, "ev-node-smoke", 1, 2*time.Second)
+	evNodeCtx, evNodeCancel := context.WithTimeout(ctx, 3*time.Minute)
+	defer evNodeCancel()
+	ok, err := jg.External.WaitForTraces(evNodeCtx, "ev-node-smoke", 1, 2*time.Second)
 	require.NoError(t, err, "error while waiting for Jaeger traces; UI: %s", jg.External.QueryURL())
 	require.True(t, ok, "expected at least one trace in Jaeger; UI: %s", jg.External.QueryURL())
-	ok, err = jg.External.WaitForTraces(traceCtx, "ev-reth", 1, 2*time.Second)
+	evRethCtx, evRethCancel := context.WithTimeout(ctx, 3*time.Minute)
+	defer evRethCancel()
+	ok, err = jg.External.WaitForTraces(evRethCtx, "ev-reth", 1, 2*time.Second)
 	require.NoError(t, err, "error while waiting for ev-reth traces; UI: %s", jg.External.QueryURL())
 	require.True(t, ok, "expected at least one trace from ev-reth; UI: %s", jg.External.QueryURL())
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| traceCtx, cancel := context.WithTimeout(ctx, 3*time.Minute) | |
| defer cancel() | |
| ok, err := jg.External.WaitForTraces(traceCtx, "ev-node-smoke", 1, 2*time.Second) | |
| require.NoError(t, err, "error while waiting for Jaeger traces; UI: %s", jg.External.QueryURL()) | |
| require.True(t, ok, "expected at least one trace in Jaeger; UI: %s", jg.External.QueryURL()) | |
| // Also wait for traces from ev-reth and print a small sample. | |
| ok, err = jg.External.WaitForTraces(traceCtx, "ev-reth", 1, 2*time.Second) | |
| require.NoError(t, err, "error while waiting for ev-reth traces; UI: %s", jg.External.QueryURL()) | |
| require.True(t, ok, "expected at least one trace from ev-reth; UI: %s", jg.External.QueryURL()) | |
| evNodeCtx, evNodeCancel := context.WithTimeout(ctx, 3*time.Minute) | |
| defer evNodeCancel() | |
| ok, err := jg.External.WaitForTraces(evNodeCtx, "ev-node-smoke", 1, 2*time.Second) | |
| require.NoError(t, err, "error while waiting for Jaeger traces; UI: %s", jg.External.QueryURL()) | |
| require.True(t, ok, "expected at least one trace in Jaeger; UI: %s", jg.External.QueryURL()) | |
| // Also wait for traces from ev-reth and print a small sample. | |
| evRethCtx, evRethCancel := context.WithTimeout(ctx, 3*time.Minute) | |
| defer evRethCancel() | |
| ok, err = jg.External.WaitForTraces(evRethCtx, "ev-reth", 1, 2*time.Second) | |
| require.NoError(t, err, "error while waiting for ev-reth traces; UI: %s", jg.External.QueryURL()) | |
| require.True(t, ok, "expected at least one trace from ev-reth; UI: %s", jg.External.QueryURL()) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/evm_spamoor_smoke_test.go` around lines 161 - 170, The test
currently reuses traceCtx/cancel created by context.WithTimeout for both
jg.External.WaitForTraces calls, which can exhaust the 3-minute budget on the
first call and flake the second; change the code to create a fresh derived
context for each service check (e.g., call context.WithTimeout(ctx,
3*time.Minute) separately for the "ev-node-smoke" WaitForTraces and again for
the "ev-reth" WaitForTraces), use separate cancel functions for each and ensure
each cancel is deferred immediately after creation, and replace usages of the
shared traceCtx with the new per-call contexts when calling
jg.External.WaitForTraces.
Actionable comments posted: 3
♻️ Duplicate comments (1)
test/e2e/evm_spamoor_smoke_test.go (1)
150-159: Shared `traceCtx` timeout budget between both `WaitForTraces` calls remains unaddressed.

If the first `WaitForTraces` call consumes most of the 3-minute budget, the second call for `ev-reth` will have insufficient time and may flake. Consider using separate contexts with independent timeouts as suggested previously.

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 150-159, create independent contexts for each WaitForTraces call via context.WithTimeout, defer each cancel, and pass them to jg.External.WaitForTraces so each call has its own timeout.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/e2e/evm_spamoor_smoke_test.go`:
- Line 10: The import "strings" in the evm_spamoor_smoke_test.go import block is
unused and causes a compile error; remove "strings" from the import list (or use
it in the test if intended) so the import block only contains used packages—look
for the import statement in evm_spamoor_smoke_test.go and delete the "strings"
entry.
- Around line 68-72: The variables ethCli and rpcCli returned from
env.RethNode.GetEthClient(ctx) and env.RethNode.GetRPCClient(ctx) are declared
and error-checked but never used, which causes Go compile errors; either remove
the calls and checks entirely if not needed, or retain the calls but assign the
results to blank identifiers (e.g., use _ instead of ethCli/rpcCli) while
keeping the require.NoError checks to only validate connectivity; update the
calls wherever ethCli or rpcCli appear (currently none) to avoid unused variable
errors.
- Around line 40-47: The environment variable OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
is set to an invalid value ("http") inside the WithRethOpts block; update the
string passed to b.WithEnv in the WithRethOpts(func(b *reth.NodeBuilder) { ...
}) call to use "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf" so the OTLP
HTTP exporter uses the correct http/protobuf protocol for the traces endpoint.
```go
WithRethOpts(func(b *reth.NodeBuilder) {
	b.WithEnv(
		"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
		"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
		"RUST_LOG=info",
		"OTEL_SDK_DISABLED=false",
	)
}),
```
🧩 Analysis chain
🌐 Web query:
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL valid values OpenTelemetry specification
💡 Result:
Per the OpenTelemetry specification, OTEL_EXPORTER_OTLP_TRACES_PROTOCOL MUST be one of these values:
`grpc`, `http/protobuf`, or `http/json` [1]
(These are the same valid values documented in the OTLP exporter SDK configuration docs.) [2]
🏁 Script executed:
# Let's explore the context around the code and see how this is used
find . -type f -name "*.go" | head -20
Repository: evstack/ev-node
Length of output: 603
🏁 Script executed:
# Search for other uses of OTEL_EXPORTER_OTLP_TRACES_PROTOCOL in the codebase
rg "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" -t go -B 2 -A 2
Repository: evstack/ev-node
Length of output: 435
🏁 Script executed:
# Check the file structure and context around the line
cat -n test/e2e/evm_spamoor_smoke_test.go | sed -n '25,55p'
Repository: evstack/ev-node
Length of output: 1530
Change OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http to http/protobuf.
Per the OpenTelemetry specification, OTEL_EXPORTER_OTLP_TRACES_PROTOCOL valid values are grpc, http/protobuf, and http/json. Plain http is not valid. Since this configuration uses an HTTP endpoint, change to http/protobuf, which is the standard protocol for OTLP over HTTP.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/evm_spamoor_smoke_test.go` around lines 40 - 47, The environment
variable OTEL_EXPORTER_OTLP_TRACES_PROTOCOL is set to an invalid value ("http")
inside the WithRethOpts block; update the string passed to b.WithEnv in the
WithRethOpts(func(b *reth.NodeBuilder) { ... }) call to use
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf" so the OTLP HTTP exporter
uses the correct http/protobuf protocol for the traces endpoint.
♻️ Duplicate comments (2)
test/e2e/evm_spamoor_smoke_test.go (2)
37-45 (⚠️ Potential issue | 🟡 Minor): Use a valid OTLP traces protocol value.

`OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http` is not a valid OTLP protocol value; use `http/protobuf` (or `http/json`) to ensure the exporter works as expected.

✅ Suggested fix

```diff
- "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
+ "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 37-45, change the OTEL_EXPORTER_OTLP_TRACES_PROTOCOL value passed to b.WithEnv from "http" to a valid protocol such as "http/protobuf" (or "http/json") and run the tests to verify.
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_spamoor_smoke_test.go` around lines 37 - 45, The OTLP traces protocol env var is invalid in the reth node setup; inside the setupCommonEVMEnv call where WithRethOpts configures reth.NodeBuilder you should change the OTEL_EXPORTER_OTLP_TRACES_PROTOCOL value from "http" to a valid protocol such as "http/protobuf" (or "http/json") so the exporter can negotiate correctly; update the string passed to b.WithEnv for the "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=" entry accordingly and run the tests to verify.
142-165 (⚠️ Potential issue | 🟡 Minor): Separate timeouts for each Jaeger wait to avoid budget bleed.

A shared `traceCtx` can exhaust the timeout on the first wait and make the second wait flaky.

✅ Suggested fix

```diff
-	traceCtx, cancel := context.WithTimeout(ctx, 3*time.Minute)
-	defer cancel()
-	ok, err := jg.External.WaitForTraces(traceCtx, "ev-node-smoke", 1, 2*time.Second)
+	evNodeCtx, evNodeCancel := context.WithTimeout(ctx, 3*time.Minute)
+	defer evNodeCancel()
+	ok, err := jg.External.WaitForTraces(evNodeCtx, "ev-node-smoke", 1, 2*time.Second)
 	require.NoError(t, err, "error while waiting for Jaeger traces; UI: %s", jg.External.QueryURL())
 	require.True(t, ok, "expected at least one trace in Jaeger; UI: %s", jg.External.QueryURL())
-	ok, err = jg.External.WaitForTraces(traceCtx, "ev-reth", 1, 2*time.Second)
+	evRethCtx, evRethCancel := context.WithTimeout(ctx, 3*time.Minute)
+	defer evRethCancel()
+	ok, err = jg.External.WaitForTraces(evRethCtx, "ev-reth", 1, 2*time.Second)
 	require.NoError(t, err, "error while waiting for ev-reth traces; UI: %s", jg.External.QueryURL())
 	require.True(t, ok, "expected at least one trace from ev-reth; UI: %s", jg.External.QueryURL())
-	evNodeTraces, err := jg.External.Traces(traceCtx, "ev-node-smoke", 10000)
+	evNodeTraces, err := jg.External.Traces(evNodeCtx, "ev-node-smoke", 10000)
 	require.NoError(t, err, "failed to fetch ev-node-smoke traces from Jaeger")
-	evRethTraces, err := jg.External.Traces(traceCtx, "ev-reth", 10000)
+	evRethTraces, err := jg.External.Traces(evRethCtx, "ev-reth", 10000)
 	require.NoError(t, err, "failed to fetch ev-reth traces from Jaeger")
```

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 142-165, create separate per-service contexts (each with its own deferred cancel) for the ev-node-smoke and ev-reth WaitForTraces and Traces calls instead of sharing traceCtx, so budget does not bleed between calls.
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_spamoor_smoke_test.go` around lines 142 - 165, The test currently reuses a single traceCtx (and cancel) for multiple Jaeger waits which can exhaust the timeout on the first call; replace that single shared context with separate per-service timeouts so each jg.External.WaitForTraces and the subsequent jg.External.Traces call get their own context. Specifically, create a context.WithTimeout(ctx, 3*time.Minute) + defer cancel() for "ev-node-smoke" and a separate context.WithTimeout(ctx, 3*time.Minute) + defer cancel() for "ev-reth", and use those dedicated contexts when calling jg.External.WaitForTraces and jg.External.Traces to prevent budget bleed between calls.
🧹 Nitpick comments (1)
test/e2e/evm_spamoor_smoke_test.go (1)
132-134: Consider replacing the fixed 60 s sleep with readiness polling. A polling approach (e.g., waiting for traces or metrics thresholds) can reduce test time and improve stability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_spamoor_smoke_test.go` around lines 132 - 134, Replace the fixed time.Sleep(60 * time.Second) in evm_spamoor_smoke_test.go with a polling readiness check: implement a helper (e.g., waitForTraces or waitForCondition) that repeatedly queries the target (Jaeger API, metrics endpoint, or a span count function) at a short interval until a threshold is met or a timeout elapses, and call that helper instead of time.Sleep in the test; ensure the helper accepts maxTimeout and pollInterval parameters and returns an error on timeout so the test fails deterministically if traces never appear.
♻️ Duplicate comments (2)
test/e2e/evm_spamoor_smoke_test.go (2)
39-45 (⚠️ Potential issue | 🟠 Major): Use a valid OTLP traces protocol value.

Line 42 sets `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http`, which is not a standard OTLP protocol value and may be rejected by the exporter. Please switch to a valid value such as `http/protobuf`.

🛠️ Proposed fix

```diff
- "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
+ "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",
```
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_spamoor_smoke_test.go` around lines 39 - 45, The OTLP protocol env var set in the WithRethOpts block is invalid: change the OTEL_EXPORTER_OTLP_TRACES_PROTOCOL value passed in b.WithEnv (currently "http") to a valid OTLP protocol such as "http/protobuf" (or "http/json"/"grpc") so the exporter accepts it; locate the WithRethOpts(func(b *reth.NodeBuilder) { ... b.WithEnv(... "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=...") }) and update that string accordingly.
142-165 (⚠️ Potential issue | 🟡 Minor): Give each Jaeger query its own timeout budget.

`traceCtx` is reused for both services and both fetches, so the first wait can consume most of the 3-minute budget and make later checks flaky. Use per-service contexts.

🛠️ Proposed fix: create a fresh `context.WithTimeout(ctx, 3*time.Minute)` (with its own deferred cancel) for the ev-node-smoke calls and another for the ev-reth calls, and pass the dedicated context to the matching `WaitForTraces` and `Traces` invocations.

🤖 Prompt for AI Agents: Verify each finding against the current code and only fix it if needed. In `test/e2e/evm_spamoor_smoke_test.go` around lines 142-165, replace the shared traceCtx with separate per-service timeouts so each WaitForTraces and Traces call gets its own context.
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_spamoor_smoke_test.go` around lines 142 - 165, traceCtx is reused for both Jaeger queries causing the first call to drain the shared 3-minute budget; create separate per-service contexts (e.g., traceCtxNode, traceCtxReth) using context.WithTimeout for each WaitForTraces and Traces call and use their own cancel functions (defer cancel() immediately after creation) so each service ("ev-node-smoke", "ev-reth") gets its own timeout budget for WaitForTraces and Traces instead of sharing traceCtx.
Overview
Note: depends on celestiaorg/tastora#181. This PR updates the spamoor test to send telemetry to a deployed Jaeger instance and performs an assertion that basic spans were detected.
Summary by CodeRabbit
Tests
Chores