kvcoord: setting kv.range_descriptor_cache.size=0 results in range iterator to retry endlessly #101011

Closed
DerZc opened this issue Apr 9, 2023 · 18 comments · Fixed by #101129
Labels
A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-kv KV Team X-blathers-untriaged blathers was unable to find an owner

@DerZc

DerZc commented Apr 9, 2023

Describe the problem

When the cluster setting kv.range_descriptor_cache.size is set to 0 or a negative value, queries in SQL through the range iterator hang (e.g. retrieving the contents of crdb_internal.ranges_no_leases).

How to reproduce

>  SET CLUSTER SETTING kv.range_descriptor_cache.size=0;
> TABLE crdb_internal.ranges_no_leases;

Observe: the query hangs. A goroutine dump reveals the stack trace below. When extra logging is enabled, we see the following log trace:

I230410 16:38:07.938992 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 703  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
I230410 16:38:08.852249 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 704  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
I230410 16:38:09.817610 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 705  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
I230410 16:38:10.741478 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 706  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
I230410 16:38:11.864487 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 707  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
I230410 16:38:12.779213 33452 kv/kvclient/kvcoord/range_iter.go:218 ⋮ [T1,n1,client=[::1]:59407,user=root] 708  range descriptor lookup failed: could not create scan bounds for range lookup: ‹"\x02\xff\xff\x00" is not valid range metadata key: body of meta key range lookup is > KeyMax›
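
As a quick check on the log trace above, the gaps between consecutive entries can be computed from their timestamps. The sketch below is illustrative Python, not CockroachDB code; `ENTRIES` and `retry_intervals` are names introduced here, and the values are copied from the six log lines. The gaps come out at roughly one second each, which is consistent with a retry loop that has backed off to a cap of about a second, although the thread does not state the configured backoff.

```python
from datetime import datetime

# Timestamps and the number printed before each message, copied from the
# six log lines above.
ENTRIES = [
    ("16:38:07.938992", 703),
    ("16:38:08.852249", 704),
    ("16:38:09.817610", 705),
    ("16:38:10.741478", 706),
    ("16:38:11.864487", 707),
    ("16:38:12.779213", 708),
]


def retry_intervals(entries):
    """Return the gaps, in seconds, between consecutive log entries."""
    times = [datetime.strptime(ts, "%H:%M:%S.%f") for ts, _ in entries]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


intervals = retry_intervals(ENTRIES)
for gap in intervals:
    print(f"{gap:.3f}s")
```

Every line carries the same error text, so each attempt fails in the same way and waiting longer does not help.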

Stack trace:

github.com/cockroachdb/cockroach/pkg/util/retry.(*Retry).Next(0x8b0c5bd40)›
        github.com/cockroachdb/cockroach/pkg/util/retry/retry.go:128 +0x13e›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*RangeIterator).Seek(0x8b0c5c3a0, {0x2532c10, 0x8b1a4b170}, {0x85c894f18, 0x4, 0xe8}, 0x0)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/range_iter.go:204 +0x47c›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).divideAndSendBatchToRanges(0x85cac3400, {0x2532c10, 0x8b1a4b170}, 0x8b00a7b00, {{0x85c894f18, 0x4, 0xe8}, {0x9ea638a, 0x1, 0x1}}, ...)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:1244 +0x226›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*DistSender).Send(0x85cac3400, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go:871 +0x675›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnLockGatekeeper).SendLocked(0x85f798bb8, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_lock_gatekeeper.go:82 +0x1e2›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnMetricRecorder).SendLocked(0x85f798b80, {0x2532c10?, 0x8b1a4b140?}, 0x0?)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_metric_recorder.go:47 +0xe2›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnCommitter).SendLocked(0x85f798b50, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go:130 +0x63d›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).sendLockedWithRefreshAttempts(0x85f798a50, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00, 0x5)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:226 +0x283›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSpanRefresher).SendLocked(0x85f798a50, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00?)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go:154 +0xb3›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnPipeliner).SendLocked(0x85f798920, {0x2532c10, 0x8b1a4b140}, 0xe8?)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go:291 +0x125›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnSeqNumAllocator).SendLocked(0x85f798900?, {0x2532c10?, 0x8b1a4b140?}, 0x8b00a7b00?)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_seq_num_allocator.go:105 +0x82›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*txnHeartbeater).SendLocked(0x85f798850, {0x2532c10, 0x8b1a4b140}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_interceptor_heartbeater.go:246 +0x4a6›
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord.(*TxnCoordSender).Send(0x85f798680, {0x2532c10, 0x8bd8b3980}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:530 +0x582›
github.com/cockroachdb/cockroach/pkg/kv.(*DB).sendUsingSender(0x85c9fcc60, {0x2532c10, 0x8bd8b3980}, 0x8b00a7b00, {0x8a6516b40, 0x85f798680})›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:1001 +0xe7›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Send(0x85cbeba40, {0x2532c10, 0x8bd8b3980}, 0x8b00a7b00)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:1076 +0x209›
github.com/cockroachdb/cockroach/pkg/kv.sendAndFill({0x2532c10, 0x8bd8b3980}, 0x8b0789868, 0x85f799180)›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:831 +0xf8›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Run(0x85cbeba40, {0x2532c10, 0x8bd8b3980}, 0xcab340?)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:689 +0x74›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).scan(0x4?, {0x2532c10, 0x8bd8b3980}, {0xcab340, 0x8b176c408}, {0xcab340, 0x8b176c330}, 0x0, 0x8b?, 0x0)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:563 +0xdc›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Scan(...)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:577›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).Iterate(0x9ea638a?, {0x2532c10, 0x8bd8b3980}, {0xcab340?, 0x8b176c318?}, {0xcab340, 0x8b176c330}, 0x0, 0x8b0c5dbb8)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:629 +0xb0›
github.com/cockroachdb/cockroach/pkg/util/rangedesc.(*impl).Scan.func1({0x2532c10, 0x8bd8b3980}, 0x0?)›
        github.com/cockroachdb/cockroach/pkg/util/rangedesc/rangedesc.go:145 +0x38e›
github.com/cockroachdb/cockroach/pkg/kv.runTxn.func1({0x2532c10?, 0x8bd8b3980?}, 0x6bfe368?)›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:965 +0x27›
github.com/cockroachdb/cockroach/pkg/kv.(*Txn).exec(0x85cbeba40, {0x2532c10, 0x8bd8b3980}, 0x8b0c5dd60)›
        github.com/cockroachdb/cockroach/pkg/kv/txn.go:942 +0xa4›
github.com/cockroachdb/cockroach/pkg/kv.runTxn({0x2532c10, 0x8bd8b3980}, 0x8bd8b3980?, 0x50?)›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:964 +0x6b›
github.com/cockroachdb/cockroach/pkg/kv.(*DB).TxnWithAdmissionControl(0x8b0789e18?, {0x2532c10, 0x8bd8b3980}, 0xbd7480?, 0x0?, 0x0?, 0x1?)›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:927 +0xa7›
github.com/cockroachdb/cockroach/pkg/kv.(*DB).Txn(0x9ea6474?, {0x2532c10?, 0x8bd8b3980?}, 0x833f7a1d8?)›
        github.com/cockroachdb/cockroach/pkg/kv/db.go:902 +0x2d›
github.com/cockroachdb/cockroach/pkg/util/rangedesc.(*impl).Scan(0x85d13f6f0, {0x2532c10, 0x8bd8b3980}, 0x0, 0x8bdf1c7f0, {{0x0, 0x0, 0x0}, {0x9ea6474, 0x2, ...}}, ...)›
        github.com/cockroachdb/cockroach/pkg/util/rangedesc/rangedesc.go:119 +0x1c9›
github.com/cockroachdb/cockroach/pkg/util/rangedesc.(*impl).NewIterator(0x25b77a8?, {0x2532c10, 0x8bd8b3980}, {{0x0, 0x0, 0x0}, {0x9ea6474, 0x2, 0x2}})›
        github.com/cockroachdb/cockroach/pkg/util/rangedesc/rangedesc.go:210 +0x13b›
github.com/cockroachdb/cockroach/pkg/sql.glob..func72({0x2532c10, 0x8bd8b3980}, 0x8b2da8610, {0x40?, 0x0?}, 0x0?)›
        github.com/cockroachdb/cockroach/pkg/sql/crdb_internal.go:4101 +0x318›
github.com/cockroachdb/cockroach/pkg/sql.(*virtualDefEntry).getPlanInfo.func1({0x2532c10, 0x8bd8b3980}, 0x8b2da8610, {0x85e68f0c0?, 0xa?})›
        github.com/cockroachdb/cockroach/pkg/sql/virtual_schema.go:655 +0xa99›
github.com/cockroachdb/cockroach/pkg/sql.constructVirtualScan.func1({0x2532c10?, 0x8bd8b3980?}, 0x0?)›
        github.com/cockroachdb/cockroach/pkg/sql/exec_factory_util.go:250 +0x34›
github.com/cockroachdb/cockroach/pkg/sql.(*delayedNode).startExec(0x8b18568c0, {{0x2532c10?, 0x8bd8b3980?}, 0x85fd3d800?, 0x8b2da8610?})›
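
The top of the stack trace shows `RangeIterator.Seek` waiting in `retry.(*Retry).Next`, and the log shows the same lookup error on every attempt. The following is a minimal Python model of that pattern, assuming the error is treated as retryable and no attempt limit is set. It is not the CockroachDB implementation: `PermanentLookupError`, `lookup_descriptor`, `seek`, and `max_attempts` are hypothetical names used only for illustration.

```python
import itertools


class PermanentLookupError(Exception):
    """Stands in for the 'not valid range metadata key' error in the log."""


def lookup_descriptor(key):
    # Hypothetical stand-in for the range descriptor lookup. It fails on
    # every call, like the lookup in the log trace.
    raise PermanentLookupError(f"{key!r} is not valid range metadata key")


def seek(key, max_attempts=None):
    """Retry the lookup on every error; stop only if max_attempts is set."""
    if max_attempts is None:
        attempts = itertools.count(1)  # unbounded: the loop never ends
    else:
        attempts = range(1, max_attempts + 1)
    last_err = None
    for _ in attempts:
        try:
            return lookup_descriptor(key)
        except PermanentLookupError as err:
            last_err = err  # treated as retryable, so the loop continues
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_err
```

Calling `seek(key)` without `max_attempts` never returns, which mirrors the hang. With `max_attempts=5` it raises after five failed lookups.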

Original issue description

Describe the problem

CockroachDB hangs on the following program:

CREATE DATABASE database38;
USE database38;
CREATE TABLE t0 (c0 TIMESTAMP);
SHOW RANGES FROM TABLE t0;

I run the server on a single machine with the command ./cockroach start-single-node --insecure, and run the SQL file with the command cockroach sql --echo-sql --insecure --port 26257 --user root < database38.sql

Expected behavior
No hang.

Environment:

  • CockroachDB version [last commit version 85e41ca]
  • Server OS: [ubuntu 22.04]
  • Client app [CLI]

Jira issue: CRDB-26747

@DerZc DerZc added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Apr 9, 2023
@blathers-crl

blathers-crl bot commented Apr 9, 2023

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I was unable to automatically find someone to ping.

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Apr 9, 2023
@xinhaoz
Member

xinhaoz commented Apr 10, 2023

@knz I recall you've made a few improvements to SHOW RANGES recently -- any ideas on what this could be?

@knz
Contributor

knz commented Apr 10, 2023

@DerZc I just tried this on my machine, and it seems to work.

Can you provide step-by-step instructions on how to reproduce?

Maybe a screen recording could also help.

@DerZc
Author

DerZc commented Apr 10, 2023

@knz thank you for your response! This is a screen recording. Hope it can help.
https://drive.google.com/file/d/1OzpfF-49f3xo_mOLZBKtoFSt8W7F_Z39/view?usp=sharing

I run this program on Ubuntu 22.04.
I installed Go with apt: apt-get install golang-go
To build cockroach, I ran the following commands:

# download the bazelisk to cockroach root folder
wget https://github.com/bazelbuild/bazelisk/releases/download/v1.16.0/bazelisk-linux-amd64
chmod 700 bazelisk-linux-amd64
./bazelisk-linux-amd64
sudo mv bazelisk-linux-amd64 /usr/local/bin/bazel

Add the following content to .bazelrc.user in the root of the cockroach directory:

build --config=crosslinux
build --config nolintonbuild
test --test_tmpdir=/tmp/user

Run the command echo "build --remote_cache=http://127.0.0.1:9867/" >> ~/.bazelrc

Then run:

./dev doctor
./dev build short

@knz
Contributor

knz commented Apr 10, 2023

@DerZc Thank you.

When the SQL shell hangs, please do the following:

  1. send the SIGQUIT signal to the server process (not client)
  2. share a copy of the log file with us (cockroach-data/cockroach.log)

Thank you

@DerZc
Author

DerZc commented Apr 10, 2023

This is the log file I just generated:
cockroach.log

@knz
Contributor

knz commented Apr 10, 2023

I see in the log file that you have not sent the SIGQUIT signal to the server.

You need to do this before you stop the server. Use the command kill -QUIT or pkill -QUIT.

@DerZc
Author

DerZc commented Apr 10, 2023

So sorry, I misunderstood. However, I just deleted the folder cockroach-data, and now I cannot reproduce this.

@knz
Contributor

knz commented Apr 10, 2023

I just delete the folder cockroach-data, then I can not reproduce this.

Oh then I know exactly what was the cause of your issue:

  1. you initially created cockroach-data using a CockroachDB build from the master branch some time ago, perhaps 1-2 weeks ago
  2. then, you upgraded your crdb binary from the latest master branch
  3. then you observed the symptoms you reported above.

The real problem you are encountering is that it is not generally possible to reuse a cockroach-data directory across different versions of the master branch. Data reuse is only possible when upgrading from one release build to another. The master branch does not produce release builds.

@DerZc
Author

DerZc commented Apr 10, 2023

I just reproduced it. I have another program which triggered another assertion failure. If I run that program first and then run SHOW RANGES, it hangs. I have pasted the program here; could you please try it?

USE test;
DROP DATABASE IF EXISTS database4 CASCADE;
CREATE DATABASE database4;
USE database4;
--Don't send automatic bug reports
SET CLUSTER SETTING debug.panic_on_failed_assertions = true;
SET CLUSTER SETTING diagnostics.reporting.enabled    = false;
SET CLUSTER SETTING diagnostics.reporting.send_crash_reports = false;
-- Disable the collection of metrics and hope that it helps performance
SET CLUSTER SETTING sql.metrics.statement_details.enabled = 'off';
SET CLUSTER SETTING sql.metrics.statement_details.plan_collection.enabled = 'off';
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = 'off';
SET CLUSTER SETTING timeseries.storage.enabled = 'off';
set experimental_enable_hash_sharded_indexes='on';
CREATE TABLE t0 (c0 BIT(98), c1 BYTES[] DEFAULT (ARRAY['La']), FAMILY "primary" (c0, c1));
CREATE TABLE t1 (c0 BOOL UNIQUE  PRIMARY KEY);
CREATE INDEX ON t1(c0);
UPSERT INTO t1 (c0) VALUES(false);

UPSERT INTO t0 (c1, rowid, c0) VALUES(ARRAY[], 740902581, B'00000001100011000101000011111011011011011111111011110110000100110101100011000100000110011100111010'), (ARRAY[], 913545507, B'01000101000111010010001000111010011111111010101010011010101001011111110001101110110101100110101111');
INSERT INTO t1 (c0) VALUES(true);
CREATE UNIQUE INDEX ON t0(rowid DESC, c0);
EXPLAIN (OPT) SELECT DISTINCT MAX(B'1001011110011100') FROM t1, t0 WHERE false HAVING MIN(true) OFFSET -1437157946;
EXPLAIN SELECT DISTINCT MIN(B'01101110000111100010111001010000101001111010000010010001111011111111011110010111111011110101000000100101000001000001011000000110110') FROM t1, t0 GROUP BY t0.c1 HAVING BOOL_OR(true);
SET SESSION SERIAL_NORMALIZATION='rowid';
CREATE INDEX ON t1(c0);
UPSERT INTO t0 (c1) VALUES(ARRAY[]);
UPSERT INTO t1 (c0) VALUES(false), (true);

EXPLAIN SELECT DISTINCT MAX(B'011101011111001011111100011') FROM t0 WHERE ((0.5746270247934203) IS NOT NAN) GROUP BY CASE  WHEN ((t0.c1)<=(t0.c1)) THEN TIMESTAMP '1969-12-18T09:18:03'  WHEN (t0.c0) IN (t0.c0, t0.c0, t0.c0) THEN NULLIF(TIMESTAMP '1969-12-28T04:59:18', TIMESTAMP '1970-01-02T01:29:29') END HAVING MIN(((t0.c0) IS NOT NULL)) LIMIT NULL;
UPSERT INTO t0 (rowid) VALUES(-410353781);
CREATE INDEX ON t1(c0 ASC);
SET CLUSTER SETTING kv.range_descriptor_cache.size=-6285611420679191815;
INSERT INTO t0 (c1, rowid, c0) VALUES(NULL, -382358637, B'11010110010011110110011010010000000110111111000000111111000001101100011111001110001111001110010000'), (ARRAY[], 1834202064, B'00100011110000001011101101000111101100100001111100011100110011011000110000100100010011100111001010');
CREATE INDEX ON t1(c0);
UPSERT INTO t0 (rowid, c0) VALUES(-733569892, B'01011001000101111111110000101010100100001110100000010101001001101001111011101111110010010101011000');
DELETE FROM t0;
CREATE VIEW v0(c0) AS SELECT DISTINCT MIN(NULL) FROM t0 CROSS JOIN t1 @{FORCE_INDEX=t1_c0_idx2,DESC} WHERE true LIMIT 1917325762 OFFSET -1379351615;
UPDATE t1 SET c0=CASE  WHEN false THEN ((t1.c0) IN (t1.c0) < ALL (false, t1.c0, ((t1.c0) IS NOT NULL), t1.c0))  WHEN (TIMESTAMP '1969-12-29T04:37:26' > SOME (TIMESTAMP '1970-01-20T22:43:59', ((TIMESTAMP '1970-01-19T23:30:19') :::TIMESTAMP))) THEN NULLIF(true, NULL) END WHERE false;
UPSERT INTO t0 (rowid) VALUES(1449929209), (987686117);
EXPLAIN SELECT MAX(B'1001010010000000000000100010101000101101101011011110001101111111110011100000001000001010101111110000000') FROM t0;
UPSERT INTO t1 (c0) VALUES(true);
EXPLAIN (TYPES) SELECT DISTINCT MIN(B'0000010111010111110001110101011110001011100111110010010101011000110110111111011010000111001100011101011000100001111100010111011101110110010'), MIN(NULL), BIT_OR(699808671) FROM t0 WHERE ((t0.c0)!=(t0.c0)) HAVING MIN(false) LIMIT -290038478;
CREATE INDEX ON t1(c0);
INSERT INTO t1 (c0) VALUES(true) ON CONFLICT (c0) DO  NOTHING ;
UPSERT INTO t1 (c0) VALUES(false);
EXPLAIN SELECT DISTINCT MIN(B'0100000111111100000111100011011001110101101111010000111100001010001110101100111111011001100101010000001001010010000100010110011001000000011111000001011110') FROM t1, t0;
CREATE UNIQUE INDEX ON t1(c0 ASC);
EXPLAIN SELECT MAX(((CASE  WHEN false THEN B'101000110011000101110001101100000101010110100101010000011110011010110100011001101111111000111011100100' ELSE B'001111101111001010101000000110001101101101001111110111111001101101001111101001110111110111010010111100' END) :::BIT(102))) FROM t0 WHERE false GROUP BY ((((CASE  WHEN true THEN 1857525835 ELSE -1 END)/(((-1311620431)/(2057627817)))))/(((-453254286)*(IF(NULL, NULL, 1972544939))))) HAVING BOOL_AND(CASE  WHEN ((((true) :::BOOL))AND(true)) THEN false  WHEN (NOT ((t0.rowid) IN (t0.rowid))) THEN false END);
UPSERT INTO t1 (c0) VALUES(false), (true);
INSERT INTO t0 (rowid) VALUES(615867158) ON CONFLICT (c0, rowid) DO  NOTHING ;
SET SESSION BYTEA_OUTPUT=base64;
UPSERT INTO t1 (c0) VALUES(false);
TRUNCATE t1 RESTRICT;

(To send the SIGQUIT signal, do you mean killing the server with kill -QUIT from another command-line window?)

@knz
Contributor

knz commented Apr 10, 2023

do you mean kill the server with kill -QUIT from another command line window

yes this is what I mean

@DerZc
Author

DerZc commented Apr 10, 2023

cockroach.log
I just got a new log file, does this one make sense?

@knz
Contributor

knz commented Apr 10, 2023

yes thanks.

Also, I notice in your test input the following statement

SET CLUSTER SETTING kv.range_descriptor_cache.size=-6285611420679191815

what are you trying to achieve with this? This is not a supported statement.

@DerZc
Author

DerZc commented Apr 10, 2023

I am trying a new black-box method for finding logic bugs in DBMSs. This test case was generated by sqlancer.

@knz
Contributor

knz commented Apr 10, 2023

Ok with the example input you provided, I am also able to reproduce the problem.

Note that part of the cause is a different issue. The example input you have shared here results in a schema corruption error. I extracted it here: #101123.

@knz knz added A-kv-client Relating to the KV client and the KV interface. T-kv KV Team labels Apr 10, 2023
@blathers-crl blathers-crl bot added this to Incoming in KV Apr 10, 2023
@knz knz changed the title from "Hang on SHOW RANGES" to "kvcoord: setting kv.range_descriptor_cache.size=0 results in range iterator to retry endlessly" Apr 10, 2023
@knz
Contributor

knz commented Apr 10, 2023

I have found the cause. I updated the issue text above accordingly.

@DerZc Note that it is not generally safe to set cluster settings randomly. I recommend you configure sqlancer to either:

  • not generate SET CLUSTER SETTING at all
  • use a specific list of "safe" cluster settings and only change those.

@knz
Contributor

knz commented Apr 10, 2023

@DerZc let me change my opinion from the above, upon advice by @nvanbenschoten .

It would be interesting for us to know of bugs that happen only when sqlancer produces SET CLUSTER SETTING statements.

So perhaps you can keep this feature enabled, but when an issue happens please be sure to report all cluster setting changes that have been generated.
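
One way to collect the cluster setting changes for a report is to scan the generated SQL script for `SET CLUSTER SETTING` statements. The sketch below is an illustrative Python helper, not part of sqlancer or CockroachDB; `SET_CLUSTER_SETTING` and `cluster_setting_changes` are hypothetical names, and it assumes one statement per line, as in the script posted earlier in this thread.

```python
import re

# Matches lines of the form: SET CLUSTER SETTING <name> = <value>;
SET_CLUSTER_SETTING = re.compile(
    r"^\s*SET\s+CLUSTER\s+SETTING\s+([\w.]+)\s*=\s*(.+?)\s*;?\s*$",
    re.IGNORECASE,
)


def cluster_setting_changes(script):
    """Return (name, value) for each SET CLUSTER SETTING line in a script."""
    changes = []
    for line in script.splitlines():
        match = SET_CLUSTER_SETTING.match(line)
        if match:
            changes.append((match.group(1), match.group(2)))
    return changes


# A few lines taken from the script posted earlier in this thread.
script = """\
CREATE INDEX ON t1(c0 ASC);
SET CLUSTER SETTING kv.range_descriptor_cache.size=-6285611420679191815;
SET CLUSTER SETTING diagnostics.reporting.enabled    = false;
DELETE FROM t0;
"""

for name, value in cluster_setting_changes(script):
    print(f"{name} = {value}")
```

Listing these changes at the top of a bug report makes a setting such as `kv.range_descriptor_cache.size` visible without reading the whole script.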

craig bot pushed a commit that referenced this issue Apr 10, 2023
99719: server, ui: remove txn_fingerprint_id in stmts overview query, stop aggregating on UI r=xinhaoz a=xinhaoz

See individual commits.

Fixes: #99708
Fixes: #99390

https://www.loom.com/share/d1a3a45f3c08425daab3a1fb9af01e37

100921: multitenant: add can_check_consistency capability r=knz a=ecwall

Fixes #100951

This adds support for secondary tenants to use crdb_internal.check_consistency.

The functionality is guarded by a new can_check_consistency capability that is
disabled by default.

Release note: None

100976: roachtest: sometimes use pgwire cancellation in sqlsmith r=yuzefovich a=yuzefovich

This commit makes it so that we use the pgwire cancellation mechanism of timing out queries in 50% of cases. Previously, we relied on the statement timeout, but now that we support the context cancellation via the pgwire, we should use that too.

Epic: None

Release note: None

101128: clisqlclient: only print the warning when there are secondary tenants r=rafiss a=knz

The previous change in this area forgot to test the negative condition. Oops!

Release note: None
Epic: CRDB-23559

101129: kv: enforce minimum value for kv.range_descriptor_cache.size r=knz a=nvanbenschoten

Closes #101011.

This commit adds a minimum value for the kv.range_descriptor_cache.size cluster setting. The minimum value is set to 64, which avoids a cache that is too small to be useful and could thrash.

Release note: None

Co-authored-by: Xin Hao Zhang <xzhang@cockroachlabs.com>
Co-authored-by: Evan Wall <wall@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig craig bot closed this as completed in 9cddbc2 Apr 10, 2023
KV automation moved this from Incoming to Closed Apr 10, 2023
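
The commit message for #101129 above describes enforcing a minimum of 64 for kv.range_descriptor_cache.size. The idea can be sketched in Python as a validation step that rejects values below the minimum before they are applied. This is only an illustration of the approach, not the Go change itself; `MIN_CACHE_SIZE`, `validate_cache_size`, and the error text are hypothetical.

```python
MIN_CACHE_SIZE = 64  # minimum value named in the commit message for #101129


def validate_cache_size(value):
    """Reject cache sizes below the minimum instead of accepting them."""
    if value < MIN_CACHE_SIZE:
        raise ValueError(
            f"cache size must be at least {MIN_CACHE_SIZE}, got {value}"
        )
    return value


# The two values used in this thread are both rejected under this rule.
for candidate in (0, -6285611420679191815):
    try:
        validate_cache_size(candidate)
    except ValueError as err:
        print(err)
```

With a check like this, both the 0 from the reproduction steps and the negative value generated by sqlancer are refused when the setting is changed, so they never reach the range iterator.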
@DerZc
Author

DerZc commented Apr 11, 2023

I got it. Thank you very much!
