@maru-ava maru-ava commented Sep 11, 2025

Why this should be merged

Setting STACK_TRACE_ERRORS=1 will configure tmpnet to include stack traces with errors it originates. This is intended to aid in debugging by indicating not just the location of the failure but also the chain of callers.

Inspired by: golang/go#63358

How this works

  • Add stacktrace package and ensure all tmpnet errors pass through its functions
  • Said functions ensure that the stack trace is collected at the deepest point in the call chain for maximum traceability

How this was tested

  • CI
  • tmpnet default behavior
$ tmpnetctl start-network --node-count=0
Error: --node-count must be greater than 0 but got 0
  • tmpnet with stacktraces
$ STACK_TRACE_ERRORS=1 tmpnetctl start-network --node-count=0
Error: --node-count must be greater than 0 but got 0
Stack trace:
/home/user/work/avalanchego/a_tmpnet-stacktrace/tests/fixture/tmpnet/tmpnetctl/main.go:76: main.main.func2
/home/user/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:985: github.com/spf13/cobra.(*Command).execute
/home/user/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1117: github.com/spf13/cobra.(*Command).ExecuteC
/home/user/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1041: github.com/spf13/cobra.(*Command).Execute
/home/user/work/avalanchego/a_tmpnet-reuse-state/tests/fixture/tmpnet/tmpnetctl/main.go:322: main.main
/home/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.9.linux-arm64/src/runtime/proc.go:272: runtime.main
/home/user/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.9.linux-arm64/src/runtime/asm_arm64.s:1223: runtime.goexit

Need to be documented in RELEASES.md?

N/A

@maru-ava maru-ava self-assigned this Sep 11, 2025
@maru-ava maru-ava added the testing and tooling labels Sep 11, 2025
@maru-ava maru-ava moved this to In Progress 🏗️ in avalanchego Sep 11, 2025
 switch {
 case errors.Is(err, ErrUnrecoverableNodeHealthCheck):
-	return fmt.Errorf("%w for node %q", err, n.NodeID)
+	return stacktrace.Errorf("node %q saw unrecoverable health check: %w", n.NodeID, err)
Contributor Author

Needed to reorder the args to ensure the error was last so that the stacktrace could be added to it.

@maru-ava maru-ava marked this pull request as ready for review September 11, 2025 05:44
@Copilot Copilot AI review requested due to automatic review settings September 11, 2025 05:44
@maru-ava maru-ava changed the title from "[tmpnet] Add optional stacktraces to errors originating from tmpnet" to "[tmpnet] Add optional stack traces to errors originating from tmpnet" Sep 11, 2025
Copilot AI left a comment

Pull Request Overview

Adds optional stacktraces to errors originating from tmpnet by introducing a new stacktrace package and integrating it throughout the tmpnet codebase. When STACK_TRACE_ERRORS=1 is set, errors will include stacktraces collected at the deepest point in the call chain for debugging assistance.

Key changes:

  • Introduces a new tests/fixture/stacktrace package with New, Errorf, and Wrap functions
  • Replaces standard error functions (fmt.Errorf, errors.New, err returns) with stacktrace equivalents throughout tmpnet
  • Updates documentation to explain the stacktrace feature

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/fixture/stacktrace/stacktrace.go | New stacktrace package implementing error wrapping with stack traces |
| tests/fixture/tmpnet/README.md | Documents the new stack trace error feature |
| Multiple tmpnet files | Replaces error handling with stacktrace equivalents |
Comments suppressed due to low confidence (1)

tests/fixture/tmpnet/start_kind_cluster.go:1

  • [nitpick] This TODO comment is unrelated to the stacktrace changes and appears to have been accidentally included in the diff. Consider removing it or addressing it in a separate PR.
// Copyright (C) 2019-2025, Ava Labs, Inc. All rights reserved.

@maru-ava maru-ava moved this from In Progress 🏗️ to In Review 🔎 in avalanchego Sep 11, 2025
Setting STACK_TRACE_ERRORS=1 will configure tmpnet to include stack traces
with errors it originates. This is intended to aid in debugging by
indicating not just the location of the failure but also the chain of
callers.
@samliok samliok left a comment

wrapping the error in stacktrace is nice 👍 but don't you sort of get the same result if you fmt the error before returning it? ex.

return nil, stacktrace.Wrap(err)

vs

return nil, fmt.Errorf("failed to do something %w", err)

joshua-kim commented Sep 11, 2025

but don't you sort of get the same result if you fmt the error before returning it? ex.

It is similar, but you just get a long error string and have to reconstruct the call stack by searching for error messages (e.g., failed to foo: failed to bar: failed to baz). It also doesn't handle deduping, so if you have similar error messages in multiple locations you might have some ambiguity. If you want a true stack trace dump (like the ones that other languages support), to my understanding you have to implement it yourself, and this is considered idiomatic although it may look odd at first glance.

var stackTraceErrors bool

func init() {
if os.Getenv("STACK_TRACE_ERRORS") == "1" {
Contributor

I don't know if tmpnet has other env vars it supports, but should we prefix them with something like TMPNET_ or TMPNET_DEBUG_?

Contributor Author

I intentionally avoided prefixing with TMPNET because it's not a tmpnet-specific thing. tmpnet is only the first adopter, but there's no reason for this library to be restricted to it (hence not putting it under tests/fixture/tmpnet).

Contributor Author

Not that I think this env var is ideal, just that I think it's good enough for now.

}
}

return StackTraceError{
Contributor

Do we have to export this type? It seems like we only use this type as the error interface downstream anyways.

Contributor Author

I prefer to err on the side of exporting when it comes to library functionality in support of testing to minimize friction with downstream repos like coreth and subnet-evm.

Co-authored-by: rodrigo <77309055+RodrigoVillar@users.noreply.github.com>
Signed-off-by: maru <maru.newby@avalabs.org>
@joshua-kim joshua-kim added this pull request to the merge queue Sep 11, 2025
Merged via the queue into master with commit a5ee7a4 Sep 11, 2025
35 checks passed
@joshua-kim joshua-kim deleted the tmpnet-stacktrace branch September 11, 2025 21:28
@github-project-automation github-project-automation bot moved this from In Review 🔎 to Done 🎉 in avalanchego Sep 11, 2025
felipemadero pushed a commit that referenced this pull request Sep 15, 2025
…4262)

Signed-off-by: maru <maru.newby@avalabs.org>
Co-authored-by: rodrigo <77309055+RodrigoVillar@users.noreply.github.com>