
SELECT query sometimes taking a lot of time in 3 node cluster #25151

Closed
debraj-manna opened this issue Apr 29, 2018 · 31 comments
Labels: C-performance Perf of queries or internals. Solution not expected to change functional behavior. O-community Originated from the community

debraj-manna commented Apr 29, 2018

Cross-posting from the forum.

I have a 3-node CockroachDB cluster in a dev setup.

On a table like the one below:

root@:26257/vnera> show create table config_str;
+------------+--------------------------------------------------------------------------+
|   Table    |                               CreateTable                                |
+------------+--------------------------------------------------------------------------+
| config_str | CREATE TABLE config_str (                                                |
|            |                                                                          |
|            |     customerid INTEGER NOT NULL,                                         |
|            |                                                                          |
|            |     objecttype INTEGER NOT NULL,                                         |
|            |                                                                          |
|            |     objectid BIGINT NOT NULL,                                            |
|            |                                                                          |
|            |     propertyname STRING NOT NULL,                                        |
|            |                                                                          |
|            |     timeinstant BIGINT NOT NULL,                                         |
|            |                                                                          |
|            |     updatetimestamp TIMESTAMP WITH TIME ZONE NULL DEFAULT now(),         |
|            |                                                                          |
|            |     value STRING NULL,                                                   |
|            |                                                                          |
|            |     CONSTRAINT "primary" PRIMARY KEY (customerid ASC, objecttype ASC,    |
|            | propertyname ASC, objectid ASC, timeinstant DESC),                       |
|            |                                                                          |
|            |     INDEX cid_ot_prop_oid_time_uts (customerid ASC, objecttype ASC,      |
|            | propertyname ASC, objectid ASC, timeinstant ASC, updatetimestamp ASC),   |
|            |                                                                          |
|            |     INDEX cid_uts (customerid ASC, updatetimestamp ASC),                 |
|            |                                                                          |
|            |     INDEX uts_desc_cid_ot (updatetimestamp DESC, customerid ASC,         |
|            | objecttype ASC),                                                         |
|            |                                                                          |
|            |     INDEX uts_cid_ot (updatetimestamp ASC, customerid ASC, objecttype    |
|            | ASC),                                                                    |
|            |                                                                          |
|            |     INDEX ot_uts_cid (objecttype ASC, updatetimestamp ASC, customerid    |
|            | ASC),                                                                    |
|            |                                                                          |
|            |     FAMILY "primary" (customerid, objecttype, objectid, propertyname,    |
|            | timeinstant, updatetimestamp, value)                                     |
|            |                                                                          |
|            | )                                                                        |
+------------+--------------------------------------------------------------------------+
(1 row)

Time: 97.108731ms

A query like the one below sometimes takes more than 20 seconds.

SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot where  updateTimestamp >= 1524899796::TIMESTAMPTZ;
+------------+------------+---------------------+--------------+---------------+-------------------------------+
| customerid | objecttype |      objectid       | propertyname |  timeinstant  |        updatetimestamp        |
+------------+------------+---------------------+--------------+---------------+-------------------------------+
[... output truncated ...]
|    1010735 |        515 |          1989444141 | _name        | 1524881400000 | 2018-04-28 07:16:40.312+00:00 |
|      10686 |        546 |          1520650600 | _name        | 1524880111147 | 2018-04-28 07:16:40.284+00:00 |
|    1010714 |        515 |          1128009733 | _name        | 1524876900000 | 2018-04-28 07:16:39.297+00:00 |
|    1010714 |        515 |          1128009733 | _denorm      | 1524876900000 | 2018-04-28 07:16:38.876+00:00 |
|      10686 |        546 |          1520650600 | _canonical   | 1524880111147 | 2018-04-28 07:16:36.336+00:00 |
+------------+------------+---------------------+--------------+---------------+-------------------------------+
(58877 rows)

Time: 20.65102417s

The pprof output and cockroach logs from each node are attached.

curl -X GET "http://localhost:8091/debug/pprof/profile?seconds=30" > pprof.ttf

Load on the cluster is about 600 reads per second and about 150 inserts per second.

Number of rows in the table:

root@:26257/vnera> select count(*) from config_str;
+---------+
|  count  |
+---------+
| 2028627 |
+---------+
(1 row)

Time: 51.387208194s

In my logs I sometimes see warnings with a trace like the one below:

W180427 22:23:11.568057 137 storage/store.go:3857  [n2,s2] handle raft ready: 0.6s [processed=0]
W180427 22:23:11.569376 144 storage/store.go:3857  [n2,s2] handle raft ready: 0.6s [processed=0]
W180427 22:23:11.583835 131 storage/engine/rocksdb.go:1755  batch [5/412/0] commit took 549.832341ms (>500ms):
goroutine 131 [running]:
runtime/debug.Stack(0xf2a33f, 0xed27599cf, 0x0)
        /usr/local/go/src/runtime/debug/stack.go:24 +0xa7
github.com/cockroachdb/cockroach/pkg/storage/engine.(*rocksDBBatch).commitInternal(0xc42a344000, 0x0, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:1756 +0x128
github.com/cockroachdb/cockroach/pkg/storage/engine.(*rocksDBBatch).Commit(0xc42a344000, 0xed2759900, 0x0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/engine/rocksdb.go:1679 +0x6fe
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).applyRaftCommand(0xc428e5fc00, 0x2612180, 0xc42c299380, 0xc425a00e38, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:5037 +0x4d5
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).processRaftCommand(0xc428e5fc00, 0x2612180, 0xc42c299380, 0xc425a00e38, 0x8, 0x7, 0x9558d, 0x100000001, 0x1, 0xd, ...)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:4742 +0x55a
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReadyRaftMuLocked(0xc428e5fc00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:3658 +0x12c8
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processRequestQueue.func1(0x2612180, 0xc434acd320, 0xc428e5fc00, 0x2612180)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3799 +0x109
github.com/cockroachdb/cockroach/pkg/storage.(*Store).withReplicaForRequest(0xc4204b2c00, 0x2612180, 0xc434acd320, 0xc427cab040, 0xc430cc9ed0, 0x0)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3121 +0x135
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processRequestQueue(0xc4204b2c00, 0x2612180, 0xc42090a030, 0x94)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3787 +0x229
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).worker(0xc4201c0e10, 0x2612180, 0xc42090a030)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:226 +0x21b
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).Start.func2(0x2612180, 0xc42090a030)
        /go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:166 +0x3e
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker.func1(0xc42031a2e0, 0xc420920000, 0xc42031a2d0)
        /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:192 +0xe9
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker
        /go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:185 +0xad

Environment

  • Cockroach Version 2.0.0
  • OS - Ubuntu 14
  • nproc - 4

Cockroach nodes were started as follows:

cockroach start --insecure --advertise-host=platform2 --port=26257 --http-port=8091 --http-host 0.0.0.0 --store=path=/var/lib/cockroach/data --log-dir=/var/log/cockroach --join=platform1:26257,platform2:26257,platform3:26257 --max-offset=5000ms --attrs=ram:2gb,cpu:1

Some sample data

root@:26257/vnera> select * from config_str limit 2;
+------------+------------+-----------+--------------+---------------+-------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| customerid | objecttype | objectid  | propertyname |  timeinstant  |        updatetimestamp        |                                                                                                                                                                                                              value                                                                                                                                                                                                               |
+------------+------------+-----------+--------------+---------------+-------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|          0 |        832 |  86272338 | _denorm      | 1524770211742 | 2018-04-26 19:16:51.812+00:00 | CmZbRDdDWFNEWiwgVlJOSS1ORVctREVWLCB2cm5pLXBsYXRmb3JtLjEwLjE1My4xOTAuMjA1LCBJVkI0SFZQLCB2cm5pLCBQTEFURk9STSwgZGVicmFqY29ja3JvYWNoY2x1c3Rlcl0SDjA6ODMyOjg2MjcyMzM4IgtkaWQ6RDdDWFNEWiIQZW52OlZSTkktTkVXLURFViIhaG9zdDp2cm5pLXBsYXRmb3JtLjEwLjE1My4xOTAuMjA1IgtpaWQ6SVZCNEhWUCIIaW5mOnZybmkiDXJvbGU6UExBVEZPUk0iHHNldHVwOmRlYnJhamNvY2tyb2FjaGNsdXN0ZXIiDHNrdTpwbGF0Zm9ybSIPc291cmNlOkxhdW5jaGVyIgt0eXBlOm9ucHJlbQ==                 |
|          0 |        832 | 128703658 | _denorm      | 1524805856498 | 2018-04-27 05:12:21.75+00:00  | CmxbRDdDWFNEWiwgVlJOSS1ORVctREVWLCBwbGF0Zm9ybTIuMTAuMTUzLjE4OS4xNDYsIElQN1RTR1osIHZybmksIFBMQVRGT1JNLCBkZWJyYWpfY29ja3JvYWNoX2NsdXN0ZXItY2x1c3Rlcl0SDzA6ODMyOjEyODcwMzY1OCILZGlkOkQ3Q1hTRFoiEGVudjpWUk5JLU5FVy1ERVYiHWhvc3Q6cGxhdGZvcm0yLjEwLjE1My4xODkuMTQ2IgtpaWQ6SVA3VFNHWiIIaW5mOnZybmkiDXJvbGU6UExBVEZPUk0iJnNldHVwOmRlYnJhal9jb2Nrcm9hY2hfY2x1c3Rlci1jbHVzdGVyIgxza3U6cGxhdGZvcm0iD3NvdXJjZTpMYXVuY2hlciILdHlwZTpvbnByZW0= |
+------------+------------+-----------+--------------+---------------+-------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(2 rows)

Time: 350.907612ms

cockroach_logs.zip

@jordanlewis jordanlewis added C-performance Perf of queries or internals. Solution not expected to change functional behavior. O-community Originated from the community labels May 1, 2018
@jordanlewis (Member)

What kind of disks are you running on? This message

W180427 22:23:11.583835 131 storage/engine/rocksdb.go:1755  batch [5/412/0] commit took 549.832341ms (>500ms):

means that RocksDB is taking a long time to sync to disk. Perhaps the IO load is too high for your disk configuration. You can use tools like iostat to dig into more detail on IO performance.


debraj-manna commented May 3, 2018

To verify this, I created a new setup with a 5-node cluster. I am no longer seeing the high-IO log message you are referring to, but the query is still slow.

root@:26257/vnera> SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot where  updateTimestamp >= 1525329807::TIMESTAMPTZ    order by updateTimestamp desc  limit 2147483647 ;
+------------+------------+----------+--------------+-------------+-----------------+
| customerid | objecttype | objectid | propertyname | timeinstant | updatetimestamp |
+------------+------------+----------+--------------+-------------+-----------------+
+------------+------------+----------+--------------+-------------+-----------------+
(0 rows)

Time: 5.674688445s

root@:26257/vnera> explain SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot where  updateTimestamp >= 1525329807::TIMESTAMPTZ    order by updateTimestamp desc  limit 2147483647 ;
+----------------+-------+----------------------------------+
|      Tree      | Field |           Description            |
+----------------+-------+----------------------------------+
| limit          |       |                                  |
|  └── render    |       |                                  |
|       └── scan |       |                                  |
|                | table | config_str@uts_desc_cid_ot       |
|                | spans | -/2018-05-03T06:43:26.999999999Z |
|                | limit | 2147483647                       |
+----------------+-------+----------------------------------+
(6 rows)

root@:26257/vnera> show trace for SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot where  updateTimestamp >= 1525329807::TIMESTAMPTZ    order by updateTimestamp desc  limit 2147483647 ;
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
|            timestamp             |        age        |                        message                        |                        tag                         | loc |             operation             | span |
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
| 2018-05-03 04:57:41.376251+00:00 | 0s                | === SPAN START: sql txn ===                           |                                                    |     | sql txn                           |    0 |
| 2018-05-03 04:57:41.377359+00:00 | 1ms107µs705ns     | === SPAN START: session recording ===                 |                                                    |     | session recording                 |    5 |
| 2018-05-03 04:57:41.377363+00:00 | 1ms111µs373ns     | === SPAN START: starting plan ===                     |                                                    |     | starting plan                     |    1 |
| 2018-05-03 04:57:41.37738+00:00  | 1ms128µs821ns     | === SPAN START: consuming rows ===                    |                                                    |     | consuming rows                    |    2 |
| 2018-05-03 04:57:41.377405+00:00 | 1ms154µs30ns      | Scan /Table/54/4{-/1921-08-31T17:16:32Z}              | [n2,client=127.0.0.1:48934,user=root]              |     | sql txn                           |    0 |
| 2018-05-03 04:57:41.377417+00:00 | 1ms165µs341ns     | === SPAN START: dist sender ===                       |                                                    |     | dist sender                       |    3 |
| 2018-05-03 04:57:41.378225+00:00 | 1ms973µs993ns     | querying next range at /Table/54/4                    | [client=127.0.0.1:48934,user=root,txn=241f1d52,n2] |     | dist sender                       |    3 |
| 2018-05-03 04:57:41.378253+00:00 | 2ms1µs983ns       | r199: sending batch 1 Scan to (n2,s2):5               | [client=127.0.0.1:48934,user=root,txn=241f1d52,n2] |     | dist sender                       |    3 |
| 2018-05-03 04:57:41.378279+00:00 | 2ms28µs139ns      | sending request to local server                       | [client=127.0.0.1:48934,user=root,txn=241f1d52,n2] |     | dist sender                       |    3 |
| 2018-05-03 04:57:41.378292+00:00 | 2ms41µs130ns      | === SPAN START: /cockroach.roachpb.Internal/Batch === |                                                    |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.3783+00:00   | 2ms48µs823ns      | 1 Scan                                                | [n2]                                               |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.378309+00:00 | 2ms58µs73ns       | read has no clock uncertainty                         | [n2]                                               |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.378317+00:00 | 2ms65µs385ns      | executing 1 requests                                  | [n2,s2]                                            |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.378325+00:00 | 2ms73µs697ns      | read-only path                                        | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.378333+00:00 | 2ms82µs272ns      | command queue                                         | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:41.378346+00:00 | 2ms94µs726ns      | waiting for 1 overlapping requests                    | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:46.816934+00:00 | 5s440ms682µs406ns | waited 5.438574924s for overlapping requests          | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:46.816939+00:00 | 5s440ms688µs204ns | waiting for read lock                                 | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:46.817016+00:00 | 5s440ms764µs355ns | read completed                                        | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:46.817096+00:00 | 5s440ms844µs791ns | plan completed execution                              | [n2,client=127.0.0.1:48934,user=root]              |     | consuming rows                    |    2 |
| 2018-05-03 04:57:46.817102+00:00 | 5s440ms850µs500ns | resources released, stopping trace                    | [n2,client=127.0.0.1:48934,user=root]              |     | consuming rows                    |    2 |
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
(21 rows)

Time: 5.447201908s

Number of rows in the table:

root@:26257/vnera> select count(*) from config_str;
+---------+
|  count  |
+---------+
| 7783342 |
+---------+
(1 row)

Time: 37.883083001s

The query is still slow.

@jordanlewis I can share the debug logs (cockroach debug zip) from all the nodes, but the archive is about 34 MB, which GitHub is not allowing me to upload. Is there somewhere I can send it?

@debraj-manna (Author)

@jordanlewis The nodes are m4.4xlarge AWS instances.

@jordanlewis (Member)

Thanks. You can email me, first name @ cockroachlabs.com. The lines

| 2018-05-03 04:57:41.378346+00:00 | 2ms94µs726ns      | waiting for 1 overlapping requests                    | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 04:57:46.816934+00:00 | 5s440ms682µs406ns | waited 5.438574924s for overlapping requests          | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |

are rather suspicious. It had to wait 5 seconds for overlapping requests before executing. After the wait, it finished very quickly.

What other load is on the system while you're doing this?

cc @nvanbenschoten anything else we can infer about what's going on from here?

@debraj-manna (Author)

@jordanlewis I have emailed the logs with Subject - Cockroach Debug Log. Let me know if you have received it.

@jordanlewis (Member)

Yep, got it. Thanks. What other queries are you running on the system when you experience these slow selects? Are the selects still slow without other load on the system?

@debraj-manna (Author)

@jordanlewis There are other queries executing on the system.

Is there a way I can get a snapshot of the queries being executed on CockroachDB so that I can share those?
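
One way to take such a snapshot, assuming the built-in SHOW QUERIES statement in this CockroachDB version is acceptable (a sketch, not something suggested in the thread):

-- Statements currently executing anywhere in the cluster.
SHOW CLUSTER QUERIES;
-- Only the statements running on the node you are connected to.
SHOW LOCAL QUERIES;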


debraj-manna commented May 3, 2018

@jordanlewis The load on the system:

screen shot 2018-05-03 at 8 23 11 pm

@debraj-manna (Author)

@jordanlewis - I stopped all SQL queries to the cluster, as you can see from the image below.

screen shot 2018-05-03 at 8 42 21 pm

The SELECT is still slow:

root@:26257/vnera> show trace for SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot where  updateTimestamp >= 1525349807::TIMESTAMPTZ    order by updateTimestamp desc  limit 2147483647 ;
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
|            timestamp             |        age        |                        message                        |                        tag                         | loc |             operation             | span |
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
| 2018-05-03 15:14:37.02014+00:00  | 0s                | === SPAN START: sql txn ===                           |                                                    |     | sql txn                           |    0 |
| 2018-05-03 15:14:37.020942+00:00 | 802µs422ns        | === SPAN START: session recording ===                 |                                                    |     | session recording                 |    5 |
| 2018-05-03 15:14:37.02095+00:00  | 809µs979ns        | === SPAN START: starting plan ===                     |                                                    |     | starting plan                     |    1 |
| 2018-05-03 15:14:37.020987+00:00 | 847µs317ns        | === SPAN START: consuming rows ===                    |                                                    |     | consuming rows                    |    2 |
| 2018-05-03 15:14:37.02103+00:00  | 890µs47ns         | Scan /Table/54/4{-/1921-08-31T11:43:12Z}              | [n2,client=127.0.0.1:59014,user=root]              |     | sql txn                           |    0 |
| 2018-05-03 15:14:37.021047+00:00 | 907µs465ns        | === SPAN START: dist sender ===                       |                                                    |     | dist sender                       |    3 |
| 2018-05-03 15:14:37.021065+00:00 | 925µs648ns        | querying next range at /Table/54/4                    | [client=127.0.0.1:59014,user=root,txn=1b3a3640,n2] |     | dist sender                       |    3 |
| 2018-05-03 15:14:37.021105+00:00 | 965µs92ns         | r199: sending batch 1 Scan to (n2,s2):5               | [client=127.0.0.1:59014,user=root,txn=1b3a3640,n2] |     | dist sender                       |    3 |
| 2018-05-03 15:14:37.021146+00:00 | 1ms5µs863ns       | sending request to local server                       | [client=127.0.0.1:59014,user=root,txn=1b3a3640,n2] |     | dist sender                       |    3 |
| 2018-05-03 15:14:37.021168+00:00 | 1ms28µs65ns       | === SPAN START: /cockroach.roachpb.Internal/Batch === |                                                    |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.021175+00:00 | 1ms35µs225ns      | 1 Scan                                                | [n2]                                               |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.021184+00:00 | 1ms43µs859ns      | read has no clock uncertainty                         | [n2]                                               |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.021194+00:00 | 1ms54µs761ns      | executing 1 requests                                  | [n2,s2]                                            |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.02121+00:00  | 1ms69µs880ns      | read-only path                                        | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.021225+00:00 | 1ms85µs479ns      | command queue                                         | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:37.021261+00:00 | 1ms120µs894ns     | waiting for 1 overlapping requests                    | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:40.130197+00:00 | 3s110ms57µs305ns  | waited 3.108919618s for overlapping requests          | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:40.130202+00:00 | 3s110ms62µs547ns  | waiting for read lock                                 | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:40.130644+00:00 | 3s110ms504µs790ns | read completed                                        | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-03 15:14:40.130738+00:00 | 3s110ms598µs84ns  | plan completed execution                              | [n2,client=127.0.0.1:59014,user=root]              |     | consuming rows                    |    2 |
| 2018-05-03 15:14:40.130747+00:00 | 3s110ms607µs30ns  | resources released, stopping trace                    | [n2,client=127.0.0.1:59014,user=root]              |     | consuming rows                    |    2 |
+----------------------------------+-------------------+-------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
(21 rows)

Time: 3.115785859s

root@:26257/vnera>

The number of rows in the table

root@:26257/vnera> select count(*) from config_str;
+---------+
|  count  |
+---------+
| 7783342 |
+---------+
(1 row)

Time: 1m6.89723258s

@debraj-manna (Author)

@jordanlewis @nvanbenschoten - Let me know if you need any more information from my side. Is there a way I can improve the performance of the cluster?


tbg commented May 4, 2018

@debraj-manna do you see the same slowness if you narrow down your select to a single row? I.e. you run the slow query and then run it again with a predicate that explicitly only selects the first row. Is it still slow in that case, every time?

@tbg tbg added this to Milestone 1 (4/30-5/21-5/28) in KV May 4, 2018
@debraj-manna (Author)

@tschottdorf - Yes

root@:26257/vnera> SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot   order by updateTimestamp desc  limit 1 ;
+------------+------------+-----------+--------------+---------------+-------------------------------+
| customerid | objecttype | objectid  | propertyname |  timeinstant  |        updatetimestamp        |
+------------+------------+-----------+--------------+---------------+-------------------------------+
|    1010032 |      20002 | 246670649 | _canonical   | 1525316798367 | 2018-05-03 04:48:13.492+00:00 |
+------------+------------+-----------+--------------+---------------+-------------------------------+
(1 row)

Time: 6.709128359s

root@:26257/vnera> SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot    where updatetimestamp = '2018-05-03 04:48:13.492+00:00' and customerid = 1010032 and timeinstant = 1525316798367  order by updateTimestamp desc ;
+------------+------------+-----------+--------------+---------------+-------------------------------+
| customerid | objecttype | objectid  | propertyname |  timeinstant  |        updatetimestamp        |
+------------+------------+-----------+--------------+---------------+-------------------------------+
|    1010032 |      20002 | 246670649 | _canonical   | 1525316798367 | 2018-05-03 04:48:13.492+00:00 |
+------------+------------+-----------+--------------+---------------+-------------------------------+
(1 row)

Time: 6.113664768s


debraj-manna commented May 4, 2018

@nvanbenschoten - Did you observe any misconfiguration of the cluster on my part?

@nvanbenschoten (Member)

@debraj-manna could you provide the output of SHOW TRACE FOR with that last point query? Like others have mentioned, the waiting for 1 overlapping requests trace looks like it's the culprit of the slowdown. You said that you no longer have any other SQL queries hitting the table, right?

If that's the case, we can take a look at tracing. To do this, first run SET CLUSTER SETTING trace.debug.enable = true. Then we can take a look at https://<node2-hostname>:8080/debug/requests?fam=tracing&b=-1&exp=1 to explore what requests are concurrently running on node 2.
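
A minimal sketch of that round trip from the SQL shell (disabling tracing again afterwards is an added assumption, not part of the suggestion above):

-- Enable verbose tracing so /debug/requests keeps recent request traces.
SET CLUSTER SETTING trace.debug.enable = true;
-- Reproduce the slow SELECT, then inspect
-- https://<node2-hostname>:8080/debug/requests?fam=tracing&b=-1&exp=1
-- Disable tracing again when done to avoid the extra overhead.
SET CLUSTER SETTING trace.debug.enable = false;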

@nvanbenschoten (Member)

To help us get a bit more information, could you also provide a screenshot of the page https://<node2-hostname>:8080/#/reports/range/199?

@debraj-manna (Author)

@nvanbenschoten - I have enabled all the SQL queries again. So all these reports are with other SQL queries running.

Please find attached the output of https://<node2-hostname>:8080/#/reports/range/199
Cockroach Console.pdf

Opening https://<node2-hostname>:8080/debug/requests?fam=tracing&b=-1&exp=1 showed me an HTML page like the one below.

screen shot 2018-05-04 at 11 24 00 pm

Below is the output on clicking >=1s
-debug-requests.pdf

@debraj-manna (Author)

@nvanbenschoten - I have stopped all queries again, so all of the reports below for node 2 are with no SQL queries hitting the cluster.

root@:26257/vnera> show trace for SELECT customerid, objecttype, objectid, propertyname, timeinstant, updateTimestamp FROM config_str@uts_desc_cid_ot    where updatetimestamp = '2018-05-03 04:48:13.492+00:00' and customerid = 1010032 and timeinstant = 1525316798367  order by updateTimestamp desc ;
+----------------------------------+------------------+--------------------------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
|            timestamp             |       age        |                                 message                                  |                        tag                         | loc |             operation             | span |
+----------------------------------+------------------+--------------------------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
| 2018-05-04 18:15:39.202504+00:00 | 0s               | === SPAN START: sql txn ===                                              |                                                    |     | sql txn                           |    0 |
| 2018-05-04 18:15:39.22116+00:00  | 18ms655µs490ns   | === SPAN START: session recording ===                                    |                                                    |     | session recording                 |    6 |
| 2018-05-04 18:15:39.221164+00:00 | 18ms660µs59ns    | === SPAN START: starting plan ===                                        |                                                    |     | starting plan                     |    1 |
| 2018-05-04 18:15:39.22121+00:00  | 18ms705µs795ns   | === SPAN START: consuming rows ===                                       |                                                    |     | consuming rows                    |    2 |
| 2018-05-04 18:15:39.22124+00:00  | 18ms735µs715ns   | Scan /Table/54/4/1921-08-31T19:11:45.507999999Z/101003{2-3}              | [n5,client=127.0.0.1:43220,user=root]              |     | sql txn                           |    0 |
| 2018-05-04 18:15:39.221254+00:00 | 18ms750µs296ns   | === SPAN START: dist sender ===                                          |                                                    |     | dist sender                       |    3 |
| 2018-05-04 18:15:39.221271+00:00 | 18ms767µs207ns   | querying next range at                                                   | [client=127.0.0.1:43220,user=root,txn=6eaed5f6,n5] |     | dist sender                       |    3 |
|                                  |                  | /Table/54/4/1921-08-31T19:11:45.507999999Z/1010032                       |                                                    |     |                                   |      |
| 2018-05-04 18:15:39.221302+00:00 | 18ms797µs990ns   | r199: sending batch 1 Scan to (n2,s2):5                                  | [client=127.0.0.1:43220,user=root,txn=6eaed5f6,n5] |     | dist sender                       |    3 |
| 2018-05-04 18:15:39.221327+00:00 | 18ms822µs854ns   | sending request to platform4:26257                                       | [client=127.0.0.1:43220,user=root,txn=6eaed5f6,n5] |     | dist sender                       |    3 |
| 2018-05-04 18:15:39.221346+00:00 | 18ms841µs542ns   | === SPAN START: /cockroach.roachpb.Internal/Batch ===                    |                                                    |     | /cockroach.roachpb.Internal/Batch |    4 |
| 2018-05-04 18:15:39.228307+00:00 | 25ms803µs284ns   | === SPAN START: /cockroach.roachpb.Internal/Batch ===                    |                                                    |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:39.228335+00:00 | 25ms830µs522ns   | 1 Scan                                                                   | [n2]                                               |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:39.22835+00:00  | 25ms846µs48ns    | executing 1 requests                                                     | [n2,s2]                                            |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:39.228361+00:00 | 25ms856µs465ns   | read-only path                                                           | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:39.22837+00:00  | 25ms866µs177ns   | command queue                                                            | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:39.22839+00:00  | 25ms885µs960ns   | waiting for 1 overlapping requests                                       | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:41.222462+00:00 | 2s19ms957µs730ns | plan completed execution                                                 | [n5,client=127.0.0.1:43220,user=root]              |     | consuming rows                    |    2 |
| 2018-05-04 18:15:41.222467+00:00 | 2s19ms962µs617ns | resources released, stopping trace                                       | [n5,client=127.0.0.1:43220,user=root]              |     | consuming rows                    |    2 |
| 2018-05-04 18:15:41.228174+00:00 | 2s25ms669µs923ns | waited 1.999774518s for overlapping requests                             | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:41.228178+00:00 | 2s25ms673µs896ns | waiting for read lock                                                    | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
| 2018-05-04 18:15:41.228611+00:00 | 2s26ms107µs274ns | read completed                                                           | [n2,s2,r199/5:/Table/54/{3/10101…-4/1921-…}]       |     | /cockroach.roachpb.Internal/Batch |    5 |
+----------------------------------+------------------+--------------------------------------------------------------------------+----------------------------------------------------+-----+-----------------------------------+------+
(21 rows)

Time: 2.025021119s

root@:26257/vnera>

-no-requests.pdf
Cockroach Console-no-req.pdf

Could you also explain what the waiting for 1 overlapping requests message means?

@nvanbenschoten (Member)

@debraj-manna how large are these config_str rows? Could you give a rough estimate of the size range of the value column?

@nvanbenschoten (Member)

Also, what is the workload running on this config_str table? Is it all SELECTs, or are you performing INSERTs/UPDATEs on it as well? What is the approximate frequency and distribution of these operations?

@debraj-manna (Author)

@nvanbenschoten - The values are not that large.

root@:26257/vnera> select length(value) from config_str order by updatetimestamp desc  limit 10;
+--------+
| length |
+--------+
|   2260 |
|   4132 |
|   2689 |
|   3764 |
|   2487 |
|   2356 |
|   6567 |
|   6566 |
|    737 |
|   2516 |
+--------+
(10 rows)

Time: 6.802122388s

root@:26257/vnera> select length(value) from config_str order by updatetimestamp   limit 10;
+--------+
| length |
+--------+
|   1688 |
|     14 |
|   1050 |
|      9 |
|   1813 |
|     26 |
|   NULL |
|   1814 |
|     26 |
|   NULL |
+--------+
(10 rows)

Yes, there are inserts and updates along with the selects.

@debraj-manna (Author)

@nvanbenschoten - Below is the current load on the 5-node cluster.

screen shot 2018-05-05 at 12 44 07 am

About 30% of these queries are on the config_str table.

@nvanbenschoten (Member)

Are these inserts and updates also taking multiple seconds? Also, have there been any recent logs on node 2 with the string r199/ in them?

@debraj-manna (Author)

@nvanbenschoten No, I am not seeing any recent logs with r199 in them, but my latest logs are flooded with messages like the ones below:

I180504 19:46:18.661566 451906746 util/stop/stopper.go:291  [n5,ts-srv] stopper throttling task from ts.Server: query due to semaphore
I180504 19:46:18.661602 451906746 util/stop/stopper.go:291  [n5,ts-srv] stopper throttling task from ts.Server: query due to semaphore
I180504 19:46:18.661790 451906746 util/stop/stopper.go:291  [n5,ts-srv] stopper throttling task from ts.Server: query due to semaphore
I180504 19:46:18.662072 451906746 util/stop/stopper.go:291  [n5,ts-srv] stopper throttling task from ts.Server: query due to semaphore

Inserts seem to be failing on node 2; I tried doing a random insert:

root@:26257/vnera> insert into config_str values (1, 2, 3, 'tuk', 4, now(), 'crdb');
pq: waiting on split that failed: split at key /Table/54/4/1921-08-31T23:34:53.345999999Z/1010079/1/"_denorm"/6867928089853697837/1525303307245 failed: command is too large: 180541100 bytes (max: 100000000)

@debraj-manna (Author)

@nvanbenschoten - I have emailed you the debug logs at nathan @ cockroachlabs.com, as the archive is about 43 MB, which GitHub is not allowing me to upload.

@nvanbenschoten (Member)

@debraj-manna is this forum post written by someone working on the same database as you: https://forum.cockroachlabs.com/t/trying-to-split-same-key-range/1592/31? I ask because I see that your kv.raft.command.max_size setting is set to the same value as what we suggested in that post, and increasing that value even further (SET CLUSTER SETTING kv.raft.command.max_size='200000000';) should help with this issue as well. I suspect that the root cause of this performance problem is related to #25233.
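
A sketch of applying that from the SQL shell; the failed split above needed roughly 180 MB (180541100 bytes), so 200 MB clears it:

-- Raise the maximum raft command size cluster-wide (value in bytes).
SET CLUSTER SETTING kv.raft.command.max_size = '200000000';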


debraj-manna commented May 4, 2018 via email

@debraj-manna (Author)

@nvanbenschoten - Do I have to restart the cluster after increasing the command size?

@nvanbenschoten (Member)

@debraj-manna, no, you don't need to. However, it's propagated through a gossip protocol, so it might take a few seconds to take effect.
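
To confirm the new value has propagated, it can be read back from a SQL shell on each node (a sketch; this verification step is not mentioned in the thread):

-- The new value appears once gossip has delivered the updated setting.
SHOW CLUSTER SETTING kv.raft.command.max_size;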

@debraj-manna (Author)

@nvanbenschoten - The performance seems to be improving. I will monitor this further and post back. One question, though: the index will grow over time, so will I have to increase the command size further as it does?

@nvanbenschoten (Member)

You shouldn't need to, but we've already seen this twice with your cluster, so I'm not confident that it won't come back up again. To clarify, this is only a problem with rare internal commands that perform range splits, and it's something we've only seen in your cluster. We'll continue to monitor this issue in #25233, so I'll close this thread. Please re-open if you see performance degrade again.

KV automation moved this from Milestone 1 (4/30-5/21-5/28) to Finished (milestone 1, ends 5/28) May 4, 2018

debraj-manna commented May 5, 2018

I am not able to reopen it, so I commented on #25233.
