nodetool cfstats hangs forever #5414

Closed
asias opened this issue Dec 4, 2019 · 13 comments

@asias
Contributor

asias commented Dec 4, 2019

Scylla: 3.1.2
Nodetool: 35906df

Start a cluster with two nodes.

Inject data with:
$ scylla-bench -workload sequential -mode write -nodes 127.0.0.1 -rows-per-request 1 -partition-count 1000 -clustering-row-count 100 -clustering-row-size 1024000 -keyspace myks1024000 -table t1 -replication-factor 2

Then run nodetool cfstats. It never returns:
$ nodetool cfstats

Other nodetool cmds work correctly:

$ nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns    Host ID                               Rack
UN  127.0.0.1  10.69 GB   256          ?       96aaf970-c8b8-414c-9eb4-71bd4c9be29b  rack1
UN  127.0.0.2  10.93 GB   256          ?       7a758917-1f30-4e35-9292-045ee29a0555  rack1
@slivne
Contributor

slivne commented Dec 15, 2019

@amoskong so it's clear:

  • did you run nodetool cfstats after scylla-bench had ended,
    or
  • did you run nodetool cfstats while scylla-bench was running and loading data into the cluster?

@asias
Contributor Author

asias commented Dec 16, 2019

It was after the bench had finished generating the load. When I ran nodetool cfstats, there was zero load on the cluster.

@slivne slivne added the bug label Jan 2, 2020
@slivne slivne added this to the 3.3 milestone Jan 2, 2020
@slivne
Contributor

slivne commented Jan 2, 2020

@amnonh can you please start the debugging and indicate where it's stuck on the Scylla end?

@slivne slivne modified the milestones: 3.3, 3.4 Feb 20, 2020
@amnonh
Contributor

amnonh commented Feb 23, 2020

It didn't reproduce for me. It does take a long time, though: on my laptop it took 25s.

@slivne
Contributor

slivne commented Feb 25, 2020

Closing the issue. If it reproduces, please open a new bug with the exact time it took and how long you waited (and whether there were stalls in Scylla).

@slivne slivne closed this as completed Feb 25, 2020
@let4be

let4be commented Mar 2, 2020

Same issue here: after I ran nodetool clearsnapshot and also dropped some no-longer-used keyspaces via cqlsh, nodetool cfstats ... hangs forever.

scylla --version
3.2.1-0.20200122.e3e301906d5

If the operation legitimately takes a long time, it would be nice to at least get some hint/warning from nodetool...

@slivne
Contributor

slivne commented Mar 2, 2020

@let4be - agreed.

Still, to understand whether it is taking a long time or is hung, can you please share how long you waited for nodetool cfstats to return?
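
One way to tell "slow" from "hung" and to report an exact wait time is to run the command under a deadline and record the elapsed time. The sketch below is only an illustration: the helper name `run_with_deadline` and the choice of timeout are assumptions, not part of nodetool or Scylla.

```python
import subprocess
import time


def run_with_deadline(cmd, timeout_s):
    """Run cmd and return (status, elapsed_seconds).

    status is the process exit code, or None if the command was still
    running when timeout_s expired (subprocess.run kills it in that case).
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        status = proc.returncode
    except subprocess.TimeoutExpired:
        status = None  # did not return within the deadline
    return status, time.monotonic() - start
```

For example, `run_with_deadline(["nodetool", "cfstats"], 600)` would return `(None, ~600)` if nodetool did not come back within ten minutes (the 600 s limit is an arbitrary choice), and `(0, elapsed)` on a normal run.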

@slivne slivne reopened this Mar 2, 2020
@let4be

let4be commented Mar 2, 2020

I waited over 20 min with zero success; the nodetool process was in the ps output. After waiting without success, I stopped it and started it again multiple times, and nothing helped (I didn't wait 20 min in the subsequent nodetool attempts, though).

When all this failed, I completely took the pressure off Scylla and stopped all data producers. Scylla's CPU usage fell basically to zero, but I still could not get a response from nodetool cfstats in a sane time (within several minutes).
Restarting Scylla solved the issue for me, but I think it will come back eventually... I'd really like to avoid restarting a production database (this was on a testing instance, so restarting was easy).

When it works, I get a nodetool cfstats ... response within 3-5 seconds on a database with almost 0.5TB of data, so hanging for minutes suggests something went horribly wrong.

(running single-node Scylla on very good hardware)

@slivne
Contributor

slivne commented Apr 15, 2020

@amnonh ping

@slivne slivne modified the milestones: 4.0, 4.1 Apr 15, 2020
@amnonh
Contributor

amnonh commented Apr 27, 2020

I'm still unable to reproduce. @let4be is this something you can reproduce easily?

If so, can you run scylla-jmx with logging enabled:

./scripts/scylla-jmx -Djava.util.logging.config.file=log.properties

$ cat log.properties


handlers=java.util.logging.FileHandler, java.util.logging.ConsoleHandler
.level=FINE
java.util.logging.ConsoleHandler.level=FINEST
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
confLogger.level=FINEST

java.util.logging.FileHandler.pattern=/home/amnon/scylla-jmx/log.log
java.util.logging.FileHandler.limit = 50000
java.util.logging.FileHandler.count = 1
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter

javax.level=FINEST
org.apache.cassandra.service.StorageService.level=FINEST
org.apache.cassandra.service.StorageProxy.level=FINEST
org.apache.cassandra.service.MessagingService.level=FINEST
org.apache.cassandra.db.commitlog.CommitLog.level=FINEST
org.apache.cassandra.gms.Gossiper.level=FINEST
org.apache.cassandra.locator.EndpointSnitchInfo.level=FINEST
org.apache.cassandra.gms.FailureDetector.level=FINEST
org.apache.cassandra.db.ColumnFamilyStore.level=FINEST
org.apache.cassandra.service.CacheService.level=FINEST
org.apache.cassandra.db.compaction.CompactionManager.level=FINEST

@slivne slivne modified the milestones: 4.1, 4.3 Jun 1, 2020
@slivne
Contributor

slivne commented Jan 24, 2021

No response from users. If this happens again, please provide the requested information: #5414 (comment)

@slivne slivne closed this as completed Jan 24, 2021
@yarongilor

yarongilor commented Mar 11, 2021

Reproduced in 4.4, in the 2TB longevity test, during the showTopPartitions nemesis.

nodetool cfstats got stuck while being executed by the show-top-partitions nemesis.

The nodetool cfstats commands executed up to the point where it got stuck are:

~/Downloads/logs/sct-runner-ff466b17$ grep nodetool sct.log | grep cfstats
< t:2021-03-07 20:20:47,397 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1"...
< t:2021-03-07 20:20:50,303 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1" finished with status 0
< t:2021-03-07 20:20:50,303 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1' duration -> 2.9059647249996488 s
< t:2021-03-07 20:20:50,303 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1"...
< t:2021-03-07 20:20:52,515 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1" finished with status 0
< t:2021-03-07 20:20:52,515 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1' duration -> 2.2120117339982244 s
< t:2021-03-07 20:20:58,115 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview"...
< t:2021-03-07 20:21:02,364 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview" finished with status 0
< t:2021-03-07 20:21:02,364 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview' duration -> 4.24927025299985 s
< t:2021-03-07 20:21:02,364 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview"...
< t:2021-03-07 20:21:05,159 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview" finished with status 0
< t:2021-03-07 20:21:05,159 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats mview' duration -> 2.794435585001338 s
< t:2021-03-07 21:30:45,107 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1"...
< t:2021-03-07 21:30:48,671 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1" finished with status 0
< t:2021-03-07 21:30:48,671 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1' duration -> 3.563766963998205 s
< t:2021-03-08 02:58:54,828 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1"...
< t:2021-03-08 02:58:57,765 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1" finished with status 0
< t:2021-03-08 02:58:57,765 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1' duration -> 2.9365833699994255 s
< t:2021-03-08 04:03:59,588 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1"...
< t:2021-03-08 04:04:02,058 f:base.py         l:140  c:RemoteCmdRunner      p:DEBUG > Command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1" finished with status 0
< t:2021-03-08 04:04:02,059 f:cluster.py      l:2577 c:sdcm.cluster_aws     p:DEBUG > Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False): Command '/usr/bin/nodetool -u cassandra -pw cassandra  cfstats keyspace1.standard1' duration -> 2.4705862659975537 s
< t:2021-03-08 11:01:49,145 f:remote_base.py  l:520  c:RemoteCmdRunner      p:DEBUG > Running command "/usr/bin/nodetool -u cassandra -pw cassandra  cfstats "...
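
In the excerpt above, every "Running command" line is followed by a matching "finished with status" line except the last one. That pairing can be checked mechanically on the full sct.log. The sketch below assumes the line layout shown in the excerpt; the function name `unfinished_commands` is made up for illustration and is not part of SCT.

```python
import re

# Patterns assume the sct.log layout shown in the excerpt above.
RUNNING = re.compile(
    r'^< t:(?P<ts>\S+ \S+) .*> Running command "(?P<cmd>.*)"\.\.\.$'
)
FINISHED = re.compile(
    r'^< t:(?P<ts>\S+ \S+) .*> Command "(?P<cmd>.*)" '
    r'finished with status (?P<rc>-?\d+)$'
)


def unfinished_commands(lines):
    """Return [(timestamp, command)] for commands that were started
    but have no later 'finished with status' line."""
    pending = []  # (timestamp, command) in start order
    for line in lines:
        line = line.rstrip()
        m = RUNNING.match(line)
        if m:
            pending.append((m["ts"], m["cmd"]))
            continue
        m = FINISHED.match(line)
        if m:
            # Close the oldest pending start of the same command.
            for i, (_, cmd) in enumerate(pending):
                if cmd == m["cmd"]:
                    del pending[i]
                    break
    return pending
```

Feeding it the output of `grep nodetool sct.log | grep cfstats` should list only the invocations that never returned, together with their start timestamps.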

The full nemesis list before nodetool got stuck is:

~/Downloads/logs/sct-runner-ff466b17$ grep -v 'ChaosMonkey on target' sct.log | grep -i 'found coredump\|Set current_disruption\|segmentation\|<<<<<<<<<\|>>>>>>>>>\|p:ERROR > Traceback'
< t:2021-03-07 20:21:05,737 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-07 20:21:05,737 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-07 20:21:05,738 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-07 20:21:05,738 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method multiple_hard_reboot_node
< t:2021-03-07 20:21:05,740 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> MultipleHardRebootNode Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-07 20:21:05,739 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-07 20:21:05,741 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-07 20:21:05,740 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate
< t:2021-03-07 20:21:05,741 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-07 20:22:54,126 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-07 20:25:56,523 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-07 20:32:28,112 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> MultipleHardRebootNode Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-07 20:36:06,131 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method multiple_hard_reboot_node
< t:2021-03-07 20:52:55,844 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-07 20:52:55,845 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-07 20:52:55,845 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-07 20:55:58,076 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-07 20:55:58,077 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method mgmt_backup_specific_keyspaces
< t:2021-03-07 20:55:58,078 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ManagementBackupWithSpecificKeyspaces
< t:2021-03-07 20:55:58,543 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 20:55:58,546 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-07 20:55:58,546 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 20:57:24,984 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate_large_partition
< t:2021-03-07 21:00:41,761 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate_large_partition
< t:2021-03-07 21:06:07,504 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-07 21:06:07,505 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method stop_wait_start_scylla_server
< t:2021-03-07 21:06:07,505 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> StopWaitStartService Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-07 21:13:36,221 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method stop_wait_start_scylla_server
< t:2021-03-07 21:27:25,461 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 21:27:25,462 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method run_cdcstressor_tool
< t:2021-03-07 21:27:25,462 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> RunCDCStressorTool
< t:2021-03-07 21:27:26,804 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method run_cdcstressor_tool
< t:2021-03-07 21:30:41,876 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-07 21:30:41,877 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_refresh
< t:2021-03-07 21:30:41,877 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> Refresh keyspace1.standard1 on longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5
< t:2021-03-07 21:31:52,958 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method nodetool_refresh
< t:2021-03-07 21:43:36,686 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-07 21:43:36,686 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method repair_streaming_err
< t:2021-03-07 21:43:36,687 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> RepairStreamingErr
< t:2021-03-07 21:57:27,293 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-07 21:57:27,293 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method abort_repair
< t:2021-03-07 21:57:27,294 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> AbortRepairMonkey
< t:2021-03-07 22:01:58,391 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 22:01:58,392 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method toggle_table_ics
< t:2021-03-07 22:01:58,392 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ToggleTableICS
< t:2021-03-07 22:02:03,789 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-07 22:02:03,790 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method add_drop_column
< t:2021-03-07 22:02:04,054 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> AddDropColumnMonkey table keyspace1.standard1
< t:2021-03-07 22:12:12,011 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method add_drop_column
< t:2021-03-07 22:42:12,471 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-07 22:42:12,472 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method abort_repair
< t:2021-03-07 22:42:12,473 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> AbortRepairMonkey
< t:2021-03-07 22:43:35,112 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-07 23:13:35,584 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 23:13:35,585 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method mgmt_backup_specific_keyspaces
< t:2021-03-07 23:13:35,585 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ManagementBackupWithSpecificKeyspaces
< t:2021-03-07 23:13:36,058 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-07 23:13:36,059 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method hot_reloading_internode_certificate
< t:2021-03-07 23:13:36,059 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ServerSslHotReloadingNemesis
< t:2021-03-07 23:13:38,173 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method hot_reloading_internode_certificate
< t:2021-03-07 23:43:38,299 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-07 23:43:38,300 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method mgmt_repair_cli
< t:2021-03-07 23:43:38,300 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ManagementRepair
< t:2021-03-07 23:45:03,029 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 00:15:03,505 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 00:15:03,506 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method hot_reloading_internode_certificate
< t:2021-03-08 00:15:03,506 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ServerSslHotReloadingNemesis
< t:2021-03-08 00:15:05,390 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method hot_reloading_internode_certificate
< t:2021-03-08 00:45:05,862 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 00:45:05,863 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method major_compaction
< t:2021-03-08 00:45:05,863 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> MajorCompaction Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 01:24:02,187 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 01:54:02,653 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 01:54:02,654 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method modify_table
< t:2021-03-08 01:54:02,654 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ModifyTablePropertiesCaching Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 01:54:12,389 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method modify_table
< t:2021-03-08 01:58:48,752 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method major_compaction
< t:2021-03-08 02:24:12,876 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 02:24:12,877 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method corrupt_then_scrub
< t:2021-03-08 02:28:49,216 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-08 02:28:49,217 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method toggle_cdc_feature_properties_on_table
< t:2021-03-08 02:28:49,217 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ToggleCDCProperties
< t:2021-03-08 02:28:50,419 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method toggle_cdc_feature_properties_on_table
< t:2021-03-08 02:53:26,814 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method repair_streaming_err
< t:2021-03-08 02:58:50,935 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 02:58:50,936 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method mgmt_backup_specific_keyspaces
< t:2021-03-08 02:58:50,936 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ManagementBackupWithSpecificKeyspaces
< t:2021-03-08 02:58:51,414 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 02:58:51,416 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_refresh
< t:2021-03-08 02:58:51,416 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> Refresh keyspace1.standard1 on longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4
< t:2021-03-08 02:59:35,062 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 03:23:27,288 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 03:23:27,288 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method hard_reboot_node
< t:2021-03-08 03:23:27,289 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> HardRebootNode Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 03:27:06,345 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method hard_reboot_node
< t:2021-03-08 03:29:35,178 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 03:29:35,179 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-08 03:29:35,179 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 03:33:54,164 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate_large_partition
< t:2021-03-08 03:57:06,812 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 03:57:06,812 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method grow_shrink_cluster
< t:2021-03-08 03:57:06,813 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> GrowCluster
< t:2021-03-08 04:03:54,619 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 04:03:54,620 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_refresh
< t:2021-03-08 04:03:54,620 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> Refresh keyspace1.standard1 on longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4
< t:2021-03-08 04:04:22,813 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 04:21:40,229 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 04:34:23,296 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 04:34:23,297 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method toggle_table_ics
< t:2021-03-08 04:34:23,297 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ToggleTableICS
< t:2021-03-08 04:34:33,158 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 04:34:33,159 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_cleanup
< t:2021-03-08 04:34:34,034 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NodetoolCleanupMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 04:51:40,722 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 04:51:40,722 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-08 04:51:40,722 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 04:56:19,921 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate_large_partition
< t:2021-03-08 05:26:20,403 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 05:26:20,405 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method mgmt_backup
< t:2021-03-08 05:26:20,405 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ManagementBackup
< t:2021-03-08 05:26:20,885 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 05:26:20,887 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method modify_table
< t:2021-03-08 05:26:20,889 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ModifyTablePropertiesCompaction Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 05:26:33,537 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method modify_table
< t:2021-03-08 05:38:43,491 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NodetoolCleanupMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-2 [13.51.55.149 | 10.0.2.153] (seed: True)
< t:2021-03-08 05:56:33,651 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 05:56:33,652 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_cleanup
< t:2021-03-08 05:56:34,648 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NodetoolCleanupMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 06:52:21,331 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> ShrinkCluster
< t:2021-03-08 07:02:58,615 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NodetoolCleanupMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 07:03:00,418 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 07:25:08,681 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 07:33:00,912 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-6 [13.49.77.66 | 10.0.2.62] (seed: False)
< t:2021-03-08 07:33:00,913 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method snapshot_operations
< t:2021-03-08 07:33:00,913 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> SnapshotOperations
< t:2021-03-08 07:34:03,949 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method snapshot_operations
< t:2021-03-08 07:55:09,185 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-6 [13.49.77.66 | 10.0.2.62] (seed: False)
< t:2021-03-08 07:55:09,186 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method run_cdcstressor_tool
< t:2021-03-08 07:55:09,186 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> RunCDCStressorTool
< t:2021-03-08 07:55:14,153 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method run_cdcstressor_tool
< t:2021-03-08 08:04:04,433 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 08:04:04,434 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_cleanup
< t:2021-03-08 08:04:05,366 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NodetoolCleanupMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 08:04:07,318 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 08:25:14,622 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 08:25:14,622 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate
< t:2021-03-08 08:25:14,623 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkey Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 08:27:05,812 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate
< t:2021-03-08 08:34:07,792 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 08:34:07,793 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method modify_table
< t:2021-03-08 08:34:07,793 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ModifyTablePropertiesCompression Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 08:34:22,288 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method modify_table
< t:2021-03-08 08:47:03,328 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method grow_shrink_cluster
< t:2021-03-08 09:04:22,832 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 09:04:22,833 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method no_corrupt_repair
< t:2021-03-08 09:04:22,834 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NoCorruptRepair Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 09:17:03,784 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 09:17:03,784 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method network_block
< t:2021-03-08 09:17:03,785 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> BlockNetwork
< t:2021-03-08 09:17:04,270 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> DisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 09:17:04,271 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.DisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method nodetool_decommission
< t:2021-03-08 09:17:04,272 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.DisruptiveMonkey: Set current_disruption -> Decommission Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-3 [13.48.13.176 | 10.0.3.108] (seed: True)
< t:2021-03-08 09:24:35,700 f:nemesis.py      l:2836 c:sdcm.nemesis         p:ERROR > Traceback (most recent call last):
< t:2021-03-08 09:54:36,342 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 09:54:36,343 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method truncate_large_partition
< t:2021-03-08 09:54:36,343 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> TruncateMonkeyLargePartition Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-1 [13.53.162.189 | 10.0.1.14] (seed: True)
< t:2021-03-08 10:01:42,134 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method truncate_large_partition
< t:2021-03-08 10:31:42,628 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-4 [13.48.190.35 | 10.0.2.114] (seed: False)
< t:2021-03-08 10:31:42,629 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method run_cdcstressor_tool
< t:2021-03-08 10:31:42,629 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> RunCDCStressorTool
< t:2021-03-08 10:31:47,327 f:nemesis.py      l:1018 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: <<<<<<<<<<<<<Finished random_disrupt_method run_cdcstressor_tool
< t:2021-03-08 11:01:47,800 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> NonDisruptiveMonkey on target node Node longevity-tls-2tb-4d-1dis-2nondis-4-db-node-ff466b17-5 [13.53.85.98 | 10.0.2.130] (seed: False)
< t:2021-03-08 11:01:47,801 f:nemesis.py      l:1008 c:sdcm.nemesis         p:INFO  > sdcm.nemesis.NonDisruptiveMonkey: >>>>>>>>>>>>>Started random_disrupt_method show_toppartitions
< t:2021-03-08 11:01:47,801 f:nemesis.py      l:566  c:sdcm.nemesis         p:DEBUG > sdcm.nemesis.NonDisruptiveMonkey: Set current_disruption -> ShowTopPartitions
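
For anyone reading the excerpt above, a small script can list which `Started random_disrupt_method` lines have no matching `Finished` line. This is only a sketch: it assumes the log lines have exactly the shape shown above, and the function name `unfinished_disruptions` is made up for illustration and is not part of SCT. A missing `Finished` line in a filtered excerpt does not prove the nemesis hung; it only marks where to look.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpt above:
# < t:<date> <time>,<ms> f:nemesis.py l:<n> c:sdcm.nemesis p:<LEVEL> > <message>
LINE_RE = re.compile(
    r"t:(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+"
    r".*?p:(?P<level>\w+)\s*>\s*(?P<msg>.*)$"
)
# Start/finish markers as they appear in the message part of the line.
EVENT_RE = re.compile(
    r"sdcm\.nemesis\.(?P<monkey>\w+): [<>]+(?P<kind>Started|Finished) "
    r"random_disrupt_method (?P<method>\w+)"
)


def unfinished_disruptions(lines):
    """Return sorted (monkey, method, start_ts) for each Started without a Finished."""
    open_events = defaultdict(list)  # (monkey, method) -> start timestamps still open
    for line in lines:
        parsed = LINE_RE.search(line)
        if not parsed:
            continue
        event = EVENT_RE.search(parsed.group("msg"))
        if not event:
            continue
        key = (event.group("monkey"), event.group("method"))
        if event.group("kind") == "Started":
            open_events[key].append(parsed.group("ts"))
        elif open_events[key]:
            # A Finished line closes the most recent open Started of the same method.
            open_events[key].pop()
    return sorted(
        (monkey, method, ts)
        for (monkey, method), stamps in open_events.items()
        for ts in stamps
    )
```

Feeding it the lines of a saved log file, for example `unfinished_disruptions(open(path))` with `path` pointing at a local copy of the log, returns one tuple per disruption that was started but never reported as finished.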

Installation details
Scylla version (or git commit hash): 4.4.rc3-0.20210304.c2d924757 with build-id 1d0929cbdc3fd39b79157e974cd8d08cd2610afb
Cluster size: 5 nodes (i3en.3xlarge)
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-07a91c0d7ed3e47be (aws: eu-north-1)

Test: longevity-2tb-4days-1Dis-2NonDis-Nemesises-test
Test name: longevity_test.LongevityTest.test_custom_time
Test config file(s):

Restore Monitor Stack command: $ hydra investigate show-monitor ff466b17-7870-41bb-ba3a-d5247f4827d7
Show all stored logs command: $ hydra investigate show-logs ff466b17-7870-41bb-ba3a-d5247f4827d7

Test id: ff466b17-7870-41bb-ba3a-d5247f4827d7

Logs:
db-cluster - https://cloudius-jenkins-test.s3.amazonaws.com/ff466b17-7870-41bb-ba3a-d5247f4827d7/20210309_154015/db-cluster-ff466b17.zip
loader-set - https://cloudius-jenkins-test.s3.amazonaws.com/ff466b17-7870-41bb-ba3a-d5247f4827d7/20210309_154015/loader-set-ff466b17.zip
monitor-set - https://cloudius-jenkins-test.s3.amazonaws.com/ff466b17-7870-41bb-ba3a-d5247f4827d7/20210309_154015/monitor-set-ff466b17.zip
sct-runner - https://cloudius-jenkins-test.s3.amazonaws.com/ff466b17-7870-41bb-ba3a-d5247f4827d7/20210309_154015/sct-runner-ff466b17.zip

Jenkins job URL

@yarongilor yarongilor reopened this Mar 11, 2021
@yarongilor

According to @roydahan, this issue can be re-closed, since there is a newer open issue for it (not sure which one).
