Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Correct errors.WithStack behaviour #984

Merged
merged 8 commits into from
Dec 20, 2022

Conversation

AndrewSisley
Copy link
Contributor

Relevant issue(s)

Resolves #983

Description

Were two bugs here - firstly the number of stack frames skipped was incorrect for WithStack regardless to compile flags. Secondly when compiling with optimisations the New function was inlined resulting in a different stack depth - the tests omitted the checks that caught this as the original author (me) falsely assumed it was the race flag doing weird stuff - this will likely have been visible in production builds since it went in (unlike WithStack which is newer).

@AndrewSisley AndrewSisley added bug Something isn't working area/errors Related to the internal management or design of our error handling system action/no-benchmark Skips the action that runs the benchmark. labels Dec 19, 2022
@AndrewSisley AndrewSisley added this to the DefraDB v0.4 milestone Dec 19, 2022
@AndrewSisley AndrewSisley requested a review from a team December 19, 2022 20:16
@AndrewSisley AndrewSisley self-assigned this Dec 19, 2022
Different entry points result in a different depth - this results in a failing test with or without a `race` flag.

Also, when compiling with optimizations enabled the compiler would inline the `New` function resulting in an incorrect result (the --race flag must do this).
@codecov
Copy link

codecov bot commented Dec 19, 2022

Codecov Report

Merging #984 (bf3d13d) into develop (664d510) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #984   +/-   ##
========================================
  Coverage    57.77%   57.77%           
========================================
  Files          172      172           
  Lines        19455    19455           
========================================
  Hits         11241    11241           
  Misses        7225     7225           
  Partials       989      989           
Impacted Files Coverage Δ
errors/errors.go 100.00% <100.00%> (ø)

They arent inlined now as they do stuff in them, and that seems to be all the go compiler cares about, but this should help prevent the introduction of new bugs should this change
errors/errors.go Outdated
// This function will not be inlined by the compiler as it will spoil any stacktrace
// generated.
//go:noinline
func withStackTrace(message string, additionalStackDepthToSkip int, keyvals ...KV) *defraError {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: I see why you've introduce this new parameter but I think we should avoid it. If we add error creation functions in the future, it requires knowledge of what this means which I think shouldn't be necessary.

What I would do instead is remove the newError function and use withStackTrace in its place. With a note on the function doc string that this has to be called within the body of the error creating function, we can leave the depth to a fixed number and avoid future dev confusion.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Dec 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wouldn't protect against the same problem being introduced by new functions, and hides the bug in the same way that it was hidden before. The new parameter forces visibility of the relationship between parent-child functions and the resultant number. It also gives us a natural route to expose this parameter if we ever find the need to.

If we add error creation functions in the future, it requires knowledge of what this means

This knowledge is required either way if they want a correct stacktrace (unless it accidentally has the same depth, which is exactly what caused this bug in the first place)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cognitive requirements are much lower if we don't need to calculate the depth where we are at when writing the function. Asking to have the withStackTrace function within the error creating function is trivial. Plus our tests should catch a wrong depths with the change you've made no?

newError can probably go regardless of keeping the parameter or not.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Dec 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus our tests should catch a wrong depths with the change you've made no?

Not for new public functions - they would require new tests that explicitly test the stacktrace. Assuming that those will always be written is wishful thinking IMO.

Cognitive requirements are much lower...

The cognitive requirements exist either way, they were just hidden in the old version. Which is pretty much the only reason the bug existed (plus poor testing ofc - somewhat validating my 'wishful thinking' point above).

newError could be dropped sure, and withStackTrace should probably be renamed, but that is out of scope from this PR, and would not prevent the bug from reappearing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newError could be dropped sure, and withStackTrace should probably be renamed, but that is out of scope from this PR, and would not prevent the bug from reappearing.

I would say adding the additionalStackDepthToSkip parameter doesn't prevent the bug from happening either. Someone new to the errors package creating a new error creating function would probably just copy what is written in another function and possibly have the wrong value for the parameter.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is really worth the effort, it is an extremely localized point of contention and I feel like the level of push back received is grossly disproportionate to any downsides of the approach taken in the PR

The thing is that although this is localized for the purpose of the discussion, I see the outcome of discussions like this to have a very broad impact as it will influence our (at least mine) approach to other implementations. I also think about new devs looking at this package (since it's small and approachable) to observe our coding style and them seeing either the opinionated approach or the flexible one and then copying the observed approach for their own work. This is why I'm pushing back and I'm not convinced by your explanation (doesn't mean it's not valid). If someone else prefers your approach, I'll be happy to approve the PR :)

Copy link
Member

@jsimnz jsimnz Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone through this thread a few times now, read the old code, the new PR code, the references pkg/errors code Fred linked, a few other go error pkgs that include stack behavior, etc.

In the majority of cases I prefer "explicit" convention that is documented, compared to complete freedom, however this package has a fairly small surface area, and although the cognitive load is certainly higher to understand the skipStack parameters, any time you touch the code in this package, it's a manageable price, for the clarity it provides. So I think its OK to keep as is.

However, I also think that its within scope to drop newError and rely soley on withStackTrace. Lastly I'd like to see better docstring on withStackTrace to make it very clear the current behavior of depthToSkip and how to use it as a caller (even if it may seem obvious)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get on board with that change.

Extra nitpick: depthToSkip would be better than additionalStackDepthToSkip if you don't mind changing it.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll add some docs.

depthToSkip feels like it looses a lot of info, but I guess additional is internal to withStackTrace and Stack is somewhat implicit given the function name - will change.

  • doc
  • rename

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also removed newError

errors/errors.go Outdated
// This function will not be inlined by the compiler as it will spoil any stacktrace
// generated.
//go:noinline
func newError(message string, additionalStackDepthToSkip int, keyvals ...KV) *defraError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: If newError will be kept, would be nice to have a line of documentation for the additionalStackDepthToSkip parameter.

Copy link
Collaborator

@fredcarle fredcarle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for apply the changes Andy :)

@AndrewSisley AndrewSisley merged commit 39348a6 into develop Dec 20, 2022
@AndrewSisley AndrewSisley deleted the sisley/fix/I983-with-stack branch December 20, 2022 21:24
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
* Correct typo

* Remove --race ommision from TestWithStack

* Correctly handle varying stack depth generated within this library

Different entry points result in a different depth - this results in a failing test with or without a `race` flag.

Also, when compiling with optimizations enabled the compiler would inline the `New` function resulting in an incorrect result (the --race flag must do this).

* Include ommited test assertions

* Block inlining of other stack-sensitive functions

They arent inlined now as they do stuff in them, and that seems to be all the go compiler cares about, but this should help prevent the introduction of new bugs should this change
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/no-benchmark Skips the action that runs the benchmark. area/errors Related to the internal management or design of our error handling system bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WithStack omits the calling line from callstack
4 participants