roachtest: schemachange/index/tpcc/w=1000 failed #69100

Closed
cockroach-teamcity opened this issue Aug 18, 2021 · 9 comments
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@cockroach-teamcity
Member

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 472008d077a4282b7a5b0557ee6b6b5c12f75586:

The test failed on branch=master, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Aug 18, 2021
@blathers-crl blathers-crl bot added the T-sql-schema-deprecated Use T-sql-foundations instead label Aug 18, 2021
@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 4d6d79d8fcba33fc273297472d3c5e373b3c2b6c:

		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.makeIndexAddTpccTest.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/schemachange.go:312
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
		Wraps: (2) monitor failure
		Wraps: (3) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).wait.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:172
		Wraps: (4) monitor task failed
		Wraps: (5) attached stack trace
		  -- stack trace:
		  | main.init
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:81
		  | runtime.doInit
		  | 	/usr/local/go/src/runtime/proc.go:6309
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:208
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (6) t.Fatal() was called
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError

	cluster.go:1249,context.go:89,cluster.go:1237,test_runner.go:866: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3323450-1629354339-67-n5cpu16 --oneshot --ignore-empty-nodes: exit status 1 2: 11776
		4: 12223
		3: 12055
		1: dead (exit status 137)
		5: skipped
		Error: UNCLASSIFIED_PROBLEM: 1: dead (exit status 137)
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:225
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (3) 1: dead (exit status 137)
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 7897f24246bef3cb94f9f4bfaed474ecaa9fdee6:

The test failed on branch=master, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 11e0a4da82124e70e772a009011ca7a4007bff85:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/schemachange/index/tpcc/w=1000/run_1
	monitor.go:128,tpcc.go:279,schemachange.go:312,test_runner.go:777: monitor failure: unexpected node event: 1: dead (exit status 137)
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:116
		  | main.(*monitorImpl).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:124
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTPCC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:279
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.makeIndexAddTpccTest.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/schemachange.go:312
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (2) monitor failure
		Wraps: (3) unexpected node event: 1: dead (exit status 137)
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString

	cluster.go:1249,context.go:89,cluster.go:1237,test_runner.go:866: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3333872-1629526832-75-n5cpu16 --oneshot --ignore-empty-nodes: exit status 1 1: dead (exit status 137)
		4: 11622
		2: 11265
		3: 12000
		5: skipped
		Error: UNCLASSIFIED_PROBLEM: 1: dead (exit status 137)
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:225
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (3) 1: dead (exit status 137)
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ d18da6c092bf1522e7a6478fe3973817e318c247:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/schemachange/index/tpcc/w=1000/run_1
	monitor.go:128,tpcc.go:279,schemachange.go:312,test_runner.go:777: monitor failure: unexpected node event: 4: dead (exit status 137)
		(1) attached stack trace
		  -- stack trace:
		  | main.(*monitorImpl).WaitE
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:116
		  | main.(*monitorImpl).Wait
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/monitor.go:124
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTPCC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpcc.go:279
		  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.makeIndexAddTpccTest.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/schemachange.go:312
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (2) monitor failure
		Wraps: (3) unexpected node event: 4: dead (exit status 137)
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString

	cluster.go:1249,context.go:89,cluster.go:1237,test_runner.go:866: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-3335718-1629614492-71-n5cpu16 --oneshot --ignore-empty-nodes: exit status 1 5: skipped
		4: dead (exit status 137)
		1: 11528
		3: 11287
		2: 11502
		Error: UNCLASSIFIED_PROBLEM: 4: dead (exit status 137)
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1173
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:281
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:856
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:960
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:897
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2107
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:225
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1371
		Wraps: (3) 4: dead (exit status 137)
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 61bd543ba7288c8f0eed6cddded7b219c9d1fcd4:

The test failed on branch=master, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

craig bot pushed a commit that referenced this issue Aug 23, 2021
69049: sql: implement crdb_internal.transaction_statistics r=ajwerner,maryliag a=Azhng

Depends on #68715 and #69022

# First Commit 

roachpb,sql,server: add TransactionFingerprintID to
 roachpb.CollectedTransactionStatistics

Previously, roachpb.CollectedTransactionStatistics did not contain
TransactionFingerprintID. As a result, SQLStatusServer's
/_status/statements endpoint could not return the transaction
fingerprint ID in the response.
This commit adds transaction fingerprint id into CollectedTransactionStatistics
and simplifies the API in multiple layers.

Release note: None

# Second Commit 

sql: introduce crdb_internal.transaction_statistics virtual table

This commit introduces the crdb_internal.transaction_statistics virtual
table, which exposes both cluster-wide in-memory transaction statistics
and persisted transaction statistics. This new virtual table
will replace the crdb_internal.node_transaction_statistics
virtual table, which only surfaces node-local in-memory transaction
statistics.

Follow up to #68715

Release justification: Category 4

Release note (sql change): introduced new crdb_internal.transaction_statistics
 virtual table that surfaces both cluster-wide in-memory transaction statistics
 as well as persisted transaction statistics.

69151: sql/resolver: wrong error when db does not exist for virtual schemas r=ajwerner a=fqazi

Fixes: #68060

Previously, when a database did not exist under a virtual
schema, the code would fall through to a normal lookup.
A recent refactor of some of the internals
accidentally changed this behavior, so that an undefined
relation error was returned. To address this, this patch
adds a check inside the resolver layer to return the correct
error instead of falling through.

Release note: None

69212: workload: fix pgx error cast in kv95 and schemachange r=RichardJCai a=rafiss

fixes #69189
#69100 is failing, but I'm not sure if this is the cause. (cc @ajwerner)

This was done incorrectly after the recent upgrade to pgx4.
`pgconn.PgError` does not implement `error`, but `*pgconn.PgError`
does.

Release justification: test only change
Release note: None

69223: changefeedccl: Rework webhook sink flushing implementation. r=miretskiy a=miretskiy

Stop relying on a wait group to implement flush logic in the webhook sink.
The wait group does not respect context cancellation. Because of that,
it is possible that the caller blocks, waiting for Flush to complete,
while immediately after blocking, the context is cancelled.
When this happens, the goroutines responsible for processing
the messages may terminate prior to decrementing wait group counts.

Instead of relying on waitgroup, use synchronization provided by the
channels themselves, and introduce a new type of worker request (flush)
which correctly flushes and waits for flush to complete, while respecting
context cancellation.

Fixes #69175

Release Notes: None

Co-authored-by: Azhng <archer.xn@gmail.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@ajwerner
Contributor

double OOM on this last one 🤯

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 46cef2c6f0b36ba2f7d551c8ab017832c1b9d592:

The test failed on branch=master, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

See: CI job to stress roachtests

For the CI stress job, click the ellipsis (...) next to the Run button and fill in:
* Changes / Build branch: master
* Parameters / `env.TESTS`: `^schemachange/index/tpcc/w=1000$`
* Parameters / `env.COUNT`: <number of runs>

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@cockroach-teamcity
Member Author

roachtest.schemachange/index/tpcc/w=1000 failed with artifacts on master @ 967ed00f80981ce8848a5e8144ee6fbd29bc95bb:

The test failed on branch=master, cloud=gce:
test timed out (see artifacts for details)
Reproduce

See: roachtest README

Same failure on other branches

/cc @cockroachdb/sql-schema

This test on roachdash | Improve this report!

@rafiss
Collaborator

rafiss commented Aug 26, 2021

The test is no longer failing because #69313 went in, but we'll use #69317 to track the OOM and tag that as a release blocker.

@rafiss rafiss closed this as completed Aug 26, 2021
SQL Foundations automation moved this from Triage to Closed Aug 26, 2021
@exalate-issue-sync exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023
@blathers-crl blathers-crl bot added this to Triage in SQL Foundations May 10, 2023
Projects
SQL Foundations
  
Done [after migration]
Development

No branches or pull requests

3 participants