roachtest: use first transient error when checking for flakes #124403

Merged

Conversation

@renatolabs renatolabs commented May 20, 2024

Previously, roachtest would only look at the outermost error in a chain that matched a TransientError (or ErrorWithOwnership) when checking for flakes. However, that is in most cases not what we want: if a transient error wraps another transient error, the actual reason for the failure is the original (wrapped) error.

Informs: #123887

Release note: None
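
A minimal sketch of the behavior described above, not the actual roachtest code: `transientError` and `innermostTransient` are hypothetical stand-ins, and the standard library `errors` package is used here in place of `github.com/cockroachdb/errors`. The loop keeps unwrapping past each match so that the last (innermost) transient error in the chain is the one reported.

```go
package main

import (
	"errors"
	"fmt"
)

// transientError is a hypothetical stand-in for roachtest's TransientError.
type transientError struct {
	cause string // short label for the flake, e.g. "ssh flake"
	err   error  // wrapped error, may be nil
}

func (e *transientError) Error() string {
	if e.err == nil {
		return e.cause
	}
	return e.cause + ": " + e.err.Error()
}

func (e *transientError) Unwrap() error { return e.err }

// innermostTransient walks the chain and returns the innermost
// *transientError, i.e. the first transient error that occurred.
func innermostTransient(err error) (*transientError, bool) {
	var found *transientError
	matched := false
	for err != nil {
		var t *transientError
		if !errors.As(err, &t) {
			break // no further transient errors deeper in the chain
		}
		found = t
		matched = true
		// Continue the search below the match we just found.
		err = errors.Unwrap(t)
	}
	return found, matched
}

func main() {
	inner := &transientError{cause: "ssh flake"}
	outer := &transientError{cause: "cluster creation", err: inner}
	failure := fmt.Errorf("test failed: %w", outer)

	if t, ok := innermostTransient(failure); ok {
		fmt.Println(t.cause) // prints "ssh flake"
	}
}
```

Looking only at the outermost match would report "cluster creation" here, which hides the original reason for the failure.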


@renatolabs renatolabs force-pushed the rc/roachtest-multiple-transient-errors branch from 5efd8e3 to 84e0bb6 Compare May 20, 2024 05:34
@renatolabs renatolabs marked this pull request as ready for review May 20, 2024 09:03
@renatolabs renatolabs requested a review from a team as a code owner May 20, 2024 09:03
@renatolabs renatolabs requested review from nameisbhaskar and vidit-bhat and removed request for a team May 20, 2024 09:03
    matched = true
    err = errors.Unwrap(err)
    if err == nil {
        break
Member

Or keep going? What if the next occurrence can be unwrapped?

Contributor Author

errors.Unwrap(err) returning nil means the err passed doesn't wrap any other error, so there's no "next occurrence". But maybe I misunderstand what you're trying to say.

Member

I was thinking of "multi-errors" (more in the comment for UnwrapOnce). Either way, Unwrap returns nil, in this case, so I suppose those errors aren't very likely.
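
To illustrate the point about multi-errors, here is a small sketch using only the Go standard library (an assumption: the PR itself uses `github.com/cockroachdb/errors`, whose behavior is described in the comments above but not reproduced here). In the standard library, `errors.Unwrap` returns nil for an error built with `errors.Join`, so a loop based on `errors.Unwrap` stops at the multi-error, while `errors.Is` still looks inside it. The helper names `chainDepth`, `singleChain`, and `joinedErrors` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errLeaf = errors.New("disk stall")

// chainDepth counts how many errors a loop based on errors.Unwrap visits.
func chainDepth(err error) int {
	n := 0
	for err != nil {
		n++
		err = errors.Unwrap(err)
	}
	return n
}

// singleChain returns an ordinary wrapper around errLeaf.
func singleChain() error {
	return fmt.Errorf("step failed: %w", errLeaf)
}

// joinedErrors returns a multi-error (it implements Unwrap() []error).
func joinedErrors() error {
	return errors.Join(singleChain(), errors.New("cleanup failed"))
}

func main() {
	fmt.Println(chainDepth(singleChain()))         // 2: the wrapper, then the leaf
	fmt.Println(chainDepth(joinedErrors()))        // 1: Unwrap returns nil for the join
	fmt.Println(errors.Is(joinedErrors(), errLeaf)) // true: Is traverses the joined errors
}
```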

@srosenberg srosenberg left a comment

Thanks!

@renatolabs renatolabs force-pushed the rc/roachtest-multiple-transient-errors branch from 84e0bb6 to e24022b Compare May 21, 2024 05:21
@renatolabs

TFTR!

bors r=srosenberg

craig bot commented May 21, 2024

Build failed (retrying...):

@craig craig bot merged commit 7807ee2 into cockroachdb:master May 21, 2024
22 checks passed
@renatolabs renatolabs deleted the rc/roachtest-multiple-transient-errors branch May 23, 2024 06:51
@renatolabs

blathers backport 24.1 23.2

blathers-crl bot commented May 23, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from e24022b to blathers/backport-release-23.2-124403: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

renatolabs added a commit to renatolabs/cockroach that referenced this pull request Sep 11, 2024
This PR is very similar to cockroachdb#124403, but applying the same logic when
finding `ErrorWithOwnership` instances (instead of `TransientError`
instances). Specifically, we look for the innermost instance of an
error with ownership to decide what ownership to apply. The idea is
that ownership should be assigned based on the "first" error observed
during the test.

Fixes: cockroachdb#130469

Release note: None
craig bot pushed a commit that referenced this pull request Sep 11, 2024
130508: roachtest: find error ownership in the innermost error in the chain r=srosenberg a=renatolabs

This PR is very similar to #124403, but applying the same logic when finding `ErrorWithOwnership` instances (instead of `TransientError` instances). Specifically, we look for the innermost instance of an error with ownership to decide what ownership to apply. The idea is that ownership should be assigned based on the "first" error observed during the test.

Fixes: #130469

Release note: None

Co-authored-by: Renato Costa <renato@cockroachlabs.com>
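
The ownership variant described in this commit message can be sketched the same way. This is an illustration under stated assumptions, not the roachtest implementation: `ownedError`, `withOwner`, `ownerOf`, `sampleFailure`, and the team names are all hypothetical, and the standard library `errors` package stands in for `github.com/cockroachdb/errors`.

```go
package main

import (
	"errors"
	"fmt"
)

// ownedError is a hypothetical stand-in for roachtest's ErrorWithOwnership.
type ownedError struct {
	owner string
	err   error
}

func (e *ownedError) Error() string { return fmt.Sprintf("[owner=%s] %v", e.owner, e.err) }
func (e *ownedError) Unwrap() error { return e.err }

// withOwner tags err with an owning team.
func withOwner(owner string, err error) error {
	return &ownedError{owner: owner, err: err}
}

// ownerOf returns the owner attached to the innermost ownedError in the
// chain, or fallback if the chain carries no ownership information.
func ownerOf(err error, fallback string) string {
	owner := fallback
	for err != nil {
		var o *ownedError
		if !errors.As(err, &o) {
			break
		}
		owner = o.owner
		err = o.err // keep searching below this match
	}
	return owner
}

// sampleFailure builds a chain where the first error observed belongs to
// "test-eng" and a later wrapper assigns ownership to "kv".
func sampleFailure() error {
	root := withOwner("test-eng", errors.New("VM preempted"))
	mid := fmt.Errorf("workload failed: %w", root)
	return withOwner("kv", mid)
}

func main() {
	fmt.Println(ownerOf(sampleFailure(), "unowned")) // prints "test-eng"
}
```

Plain wrappers between the two owned errors do not interrupt the search, because `errors.As` skips over them when looking for the next match.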