-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cdctest: perform txn retries in validator #43215
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First commit is #43200
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/cmd/roachtest/test.go, line 277 at r4 (raw file):
// get any stack traces. if err, ok := args[0].(error); ok { msg = fmt.Sprintf("%+v", err)
@knz help me out here
I really want stack traces in these errors, but currently they look pretty bad (see example below). Without doing something about it, the output I think would be worse than the current one without the stack traces. Or perhaps you'll teach me to do something else.
For example, below is a pq: restart transaction: TransactionRetryWithProtoRefreshError
wrapped a couple of times.
I'd like to see the messages of all the wrapping errors plus the message of the original TransactionRetryWithProtoRefreshError
one on the first line, and then all the stack traces (perhaps with the respective messages again between them). I'd also get rid of the "error with attached stack trace:" lines; they add little.
printed with %s
:
monitor failure: monitor task failed: CDC failed: fingerprintValidator failed: pq: restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_TIMESTAMP_CACHE_REJECTED_POSSIBLE_REPLAY): "sql txn" meta={id=4b231b10 key=/Table/54/1/500/0 pri=0.01371671 epo=0 ts=1576532273.889446872,1 min=1576532273.877883066,0 seq=2} rw=true stat=PENDING rts=1576532273.877883066,0 wto=false max=1576532273.946122576,0
printed with %+v
:
error with attached stack trace:
main.(*monitor).WaitE
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2157
main.(*monitor).Wait
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2165
main.runCDCBank
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:378
main.registerCDC.func8
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:649
main.(*testRunner).runTest.func2
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:700
runtime.goexit
/Users/andrei/goroot/src/runtime/asm_amd64.s:1357
- monitor failure:
- error with attached stack trace:
main.(*monitor).wait.func2
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2210
runtime.goexit
/Users/andrei/goroot/src/runtime/asm_amd64.s:1357
- monitor task failed:
- error with attached stack trace:
main.runCDCBank.func2.1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:307
main.runCDCBank.func2
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:363
main.(*monitor).Go.func1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2145
github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
/Users/andrei/goroot/src/runtime/asm_amd64.s:1357
- CDC failed:
- pq: restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_TIMESTAMP_CACHE_REJECTED_POSSIBLE_REPLAY): "sql txn" meta={id=33aa1def key=/Table/54/1/994/0 pri=0.02000490 epo=0 ts=1576533090.201346685,1 min=1576533090.155239137,0 seq=2} rw=true stat=PENDING rts=1576533090.155239137,0 wto=false max=1576533090.201346685,0
fingerprintValidator failed
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).applyRowUpdate.func1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:351
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).applyRowUpdate
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:430
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).NoteResolved
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:484
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.Validators.NoteResolved
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:548
github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*CountValidator).NoteResolved
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:594
main.runCDCBank.func2
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:362
main.(*monitor).Go.func1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2145
github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1
/Users/andrei/work/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:57
runtime.goexit
/Users/andrei/goroot/src/runtime/asm_amd64.s:1357
Can you help a brother out? Or would you rather I figure it out myself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're going to need this:
diff --git a/pkg/cmd/roachtest/cluster_test.go b/pkg/cmd/roachtest/cluster_test.go
index 2d04694bd7..cc4075210e 100644
--- a/pkg/cmd/roachtest/cluster_test.go
+++ b/pkg/cmd/roachtest/cluster_test.go
@@ -20,7 +20,7 @@ import (
"time"
"github.com/cockroachdb/cockroach/pkg/testutils"
- "github.com/pkg/errors"
+ "github.com/cockroachdb/errors"
"github.com/stretchr/testify/require"
)
@@ -148,7 +148,7 @@ func TestClusterMonitor(t *testing.T) {
// If wait terminates, context gets canceled.
err := m.wait(`true`)
- if err != context.Canceled {
+ if !errors.Is(err, context.Canceled) {
t.Errorf(`expected context canceled, got: %+v`, err)
}
})
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 2 of 2 files at r3, 1 of 1 files at r4, 1 of 1 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/cmd/roachtest/test.go, line 277 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
@knz help me out here
I really want stack traces in these errors, but currently they look pretty bad (see example below). Without doing something about it, the output I think would be worse than the current one without the stack traces. Or perhaps you'll teach me to do something else.
For example, below is a
pq: restart transaction: TransactionRetryWithProtoRefreshError
wrapped a couple of times.
I'd like to see the messages of all the wrapping errors plus the message of the originalTransactionRetryWithProtoRefreshError
one on the first line, and then all the stack traces (perhaps with the respective messages again between them). I'd also get rid of the "error with attached stack trace:" lines; they add little.printed with
%s
:monitor failure: monitor task failed: CDC failed: fingerprintValidator failed: pq: restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_TIMESTAMP_CACHE_REJECTED_POSSIBLE_REPLAY): "sql txn" meta={id=4b231b10 key=/Table/54/1/500/0 pri=0.01371671 epo=0 ts=1576532273.889446872,1 min=1576532273.877883066,0 seq=2} rw=true stat=PENDING rts=1576532273.877883066,0 wto=false max=1576532273.946122576,0
printed with
%+v
:error with attached stack trace: main.(*monitor).WaitE /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2157 main.(*monitor).Wait /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2165 main.runCDCBank /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:378 main.registerCDC.func8 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:649 main.(*testRunner).runTest.func2 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:700 runtime.goexit /Users/andrei/goroot/src/runtime/asm_amd64.s:1357 - monitor failure: - error with attached stack trace: main.(*monitor).wait.func2 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2210 runtime.goexit /Users/andrei/goroot/src/runtime/asm_amd64.s:1357 - monitor task failed: - error with attached stack trace: main.runCDCBank.func2.1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:307 main.runCDCBank.func2 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:363 main.(*monitor).Go.func1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2145 github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:57 runtime.goexit /Users/andrei/goroot/src/runtime/asm_amd64.s:1357 - CDC failed: - pq: restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_TIMESTAMP_CACHE_REJECTED_POSSIBLE_REPLAY): "sql txn" meta={id=33aa1def key=/Table/54/1/994/0 pri=0.02000490 epo=0 ts=1576533090.201346685,1 min=1576533090.155239137,0 seq=2} rw=true stat=PENDING rts=1576533090.155239137,0 wto=false max=1576533090.201346685,0 fingerprintValidator failed github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).applyRowUpdate.func1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:351 github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).applyRowUpdate /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:430 github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*fingerprintValidator).NoteResolved /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:484 github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.Validators.NoteResolved /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:548 github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest.(*CountValidator).NoteResolved /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/ccl/changefeedccl/cdctest/validator.go:594 main.runCDCBank.func2 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cdc.go:362 main.(*monitor).Go.func1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/cluster.go:2145 github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup.(*Group).Go.func1 /Users/andrei/work/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:57 runtime.goexit /Users/andrei/goroot/src/runtime/asm_amd64.s:1357
Can you help a brother out? Or would you rather I figure it out myself?
my 2c is that I want the order of the errors reversed but I don't need the message grouping.
06f2a52
to
0ec6cc4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're going to need this:
Done, thanks.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @knz)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @andreimatei)
pkg/cmd/roachtest/test.go, line 277 at r4 (raw file):
I'd like to see [...]
If you file an issue on cockroachdb/errors
with an example of your desired output (like, copy-paste the error above and re-arrange it in the way you want it to look like) I can see what I can do. Ben suggested that we (I) invest in improving this anyway.
One way I'm seeing this move forward is to have %+v
print a "summary" that includes the things that you prefer to see in the common case (like your suggestion above) and keep %#v
for the more verbose report, or something like that.
Wrap and attach stack traces to various CDC testing code. Release note: None
It takes a while and it's nice to see in the UI. Release note: None
Before this patch, when an error was passed to t.Fatal(err), it was formatted with "%s". Now it will be formatted with "%+v", which includes stack traces. Without the stack trace, it's often hard to say where an error was generated. Note that a succinct stack for the t.Fatal() call was already printed, but that's different from the error's origin. Release note: None
Before this patch, fingerprintValidator.applyRowUpdate() was missing a retry loop around its sql txn. But turns out that the txn was not needed at all since we're running a single statement. Fixes cockroachdb#42773 Release note: None
0ec6cc4
to
4627894
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/cmd/roachtest/test.go, line 277 at r4 (raw file):
Previously, knz (kena) wrote…
I'd like to see [...]
If you file an issue on
cockroachdb/errors
with an example of your desired output (like, copy-paste the error above and re-arrange it in the way you want it to look like) I can see what I can do. Ben suggested that we (I) invest in improving this anyway.One way I'm seeing this move forward is to have
%+v
print a "summary" that includes the things that you prefer to see in the common case (like your suggestion above) and keep%#v
for the more verbose report, or something like that.
ok, we'll take it to a separate issue
43215: cdctest: perform txn retries in validator r=andreimatei a=andreimatei Before this patch, fingerprintValidator.applyRowUpdate() was missing a retry loop around its sql txn. But turns out that the txn was not needed at all since we're running a single statement. Fixes #42773 43250: cmd/roachtest: improve sqlsmith r=mjibson a=mjibson Co-authored-by: Andrei Matei <andrei@cockroachlabs.com> Co-authored-by: Matt Jibson <matt.jibson@gmail.com>
Build succeeded |
43304: release-19.2: cdctest: perform txn retries in validator r=andreimatei a=andreimatei Backport 4/4 commits from #43215. /cc @cockroachdb/release --- Before this patch, fingerprintValidator.applyRowUpdate() was missing a retry loop around its sql txn. But turns out that the txn was not needed at all since we're running a single statement. Fixes #42773 Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Before this patch, fingerprintValidator.applyRowUpdate() was missing a
retry loop around its sql txn. But turns out that the txn was not needed
at all since we're running a single statement.
Fixes #42773