Validator crashes node with "Unexpected mutation fragment" on range tombstone change of clustering row #10553

Closed
fruch opened this issue May 12, 2022 · 15 comments

@fruch
Contributor

fruch commented May 12, 2022

Installation details

Kernel version: 5.13.0-1022-aws
Scylla version (or git commit hash): 5.1.dev-0.20220504.b26a3da584cc with build-id ab2a33a30756c1513f4c516cd272291e75acec0e
Cluster size: 6 nodes (i3.large)
Scylla running with shards number (live nodes):
longevity-harry-2h-fix-cass-db-node-eddd82cc-1 (16.171.62.87 | 10.0.3.241): 2 shards
longevity-harry-2h-fix-cass-db-node-eddd82cc-2 (13.53.37.177 | 10.0.1.86): 2 shards
longevity-harry-2h-fix-cass-db-node-eddd82cc-3 (13.48.26.58 | 10.0.3.223): 2 shards
longevity-harry-2h-fix-cass-db-node-eddd82cc-4 (13.48.71.161 | 10.0.3.236): 2 shards
longevity-harry-2h-fix-cass-db-node-eddd82cc-5 (16.16.27.153 | 10.0.1.98): 2 shards
longevity-harry-2h-fix-cass-db-node-eddd82cc-6 (13.48.1.47 | 10.0.3.109): 2 shards
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0f0e4c1a732cd9815 (aws: eu-north-1)

Test: longevity-harry-2h-test
Test name: longevity_test.LongevityTest.test_custom_time
Test config file(s):

  • longevity-harry-2h.yaml

Issue description

While running cassandra-harry (a new test for SCT), the run fails after ~1 hour with the following failure/abort:

2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !     ERR |  [shard 0] mutation_reader - [validator 0x60000024df18 for sstable writer /var/lib/scylla/data/harry/table0-c767e800d10511ec8ed64b4a3fb655b1/me-6-big-Data.db (harry.table0 c767e800-d105-11ec-8ed6-4b4a3fb655b1)] Unexpected mutation fragment: partition key {key: pk{000800000000e3104d8b01ea5a4848794142646941426469414264694142646941426469587971534a4e4371353131373635363130393130353232323231333131363133393137303231333234353132383135333936323036323332323332323633383230313438313830323231393832323431343830323132323130323336323231313231323031313931313932313430323330323339313633323035313031323238383032363536363532343131303331363038333837313232313539313238313339323131313838313939323331323138353437393234323138373130363333373831373330313337313031323330313538323237363931333732313432313133393538313637313930323438333631303739303536323236323032323439323632323332333533303230323139343232323131303234303230373139353432303833323832313934323533343931313331343631383534313135323831383433333132313136313934333032353131393731393231393031373432333331373331323231343032313231313933343135333934313734313637333332303631313131383632353531313531323333373233303135313535313833313537383631393832313931373131393031313832323436343230373130393231373232343134383139373331323432323535323232303604725a4848794142646941426469414264694142646941426469586d65456e79694d323335383439373531303532343232303931333130343831343036393230303230383134333131333232333135373137373233343136383136313235313431323731313635353932343031333531323231373237363131323131383235303231373131353233323331313736343234353132383832353431393531313633333932323232353332333931333230303235323937323535313938313130313135373230303136363138323436313430373031353135333232353233313139393134313234373136373736313831373331383232343632343431383131333631303130363139383537313130323231313232353130383231333533353532313231333233333131323437363731333731353639323233303335323138313330323138313
2393132363139363234343139323136333230353139323131393134313639313735313334323338343732313031323834363139313134373831313932313538333431383732333531393939323234333131363231313535323436323534363532313931393331393037353234353733313535393137323232363233353535313338363232323631323532323031333639373235383932303538333232363233353131333130383232393234393232323133383232373632313733393835363231383531353431373631373831363431353631383331393031303032303632323332313132323733363137363733383437313230313733323035313139313837313739323430373631393732343132353432333431353232323231343136363231333233313831313539333239323133323739393931333430313335373731303831383536313538383732323332383235353531303838383137303235323530313034313134313139313233363832303932353038343432343631393232323631333431313332323231393539323231313130313431313933323432353131333435363138343230373231313636373231303831313431333432323931383932313532353538383636323430353732333431363031323933313630313335313236323334363036303234313234373235303135323335313834323135363832313535313937323039313836313938313531323530363432323031333539393231393235353537343336313131323135383134353736313136313335373632313138323032363431303432323731393232303737313631373332333532313938353134353136303231373732313934323035333932313435383131333237323336343132383139343132313134363632313936373331333132323232303531393932353037303231363132353138383338373739383439313332343132313232323037333630313438313839323139323533323039313930313735333539313931313435313038313436313735353230373234383932323036333235353335323131313637313834393032343231343031333332353332303934323734313638313330323037313535363431383533363935}, token:-7626975985082462929}: 
previous clustering row:{position: clustered,ckp{0008000000fcb636757c000400e11035},0}, current range tombstone change:{position: clustered,ckp{},-1}, 
at: 0x49f54be 0x49f59b0 0x49f5cb8 0x463ee62 0x180b1e4 0x180bb5e 0x1d6089c 0x1c2f2cb 0x1c2cc54 0x1c2aeb7 0x1c21a1a 0x1c1d8a2 0x1c1d133 0x48db8d1
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO | Aborting on shard 0.
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO | Backtrace:
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x465ad78
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x468b792
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x7f490b105a1f
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   /opt/scylladb/libreloc/libc.so.6+0x3d2a1
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   /opt/scylladb/libreloc/libc.so.6+0x268a3
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x463eecb
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x180b1e4
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x180bb5e
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1d6089c
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c2f2cb
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c2cc54
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c2aeb7
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c21a1a
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c1d8a2
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x1c1d133
2022-05-11T09:13:00+00:00 longevity-harry-2h-fix-cass-db-node-264c7f6f-1 !    INFO |   0x48db8d1

It's 100% reproducible: it failed 3 times in a row with the exact same failure.
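
For readers unfamiliar with the validator message: it reports a range tombstone change at position `{clustered, ckp{}, -1}` arriving after a clustering row, which looks like a position that sorts before every clustering row showing up too late in the stream. Below is a minimal Python sketch of that kind of ordering check. It is a toy model, not Scylla's implementation: `Position`, `sort_key` and `StreamValidator` are hypothetical names, the key values are made up, and the real comparison is schema-aware and more subtle.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Position:
    """Toy stand-in for a position inside a partition (hypothetical model)."""
    prefix: Tuple[int, ...]  # clustering key prefix; () means "no key"
    weight: int              # -1 = before, 0 = at, 1 = after the prefix

    def sort_key(self):
        # Simplification: an empty prefix sorts before (weight -1) or after
        # (weight 1) every real key; other positions compare by prefix,
        # then by weight.
        if not self.prefix:
            return (self.weight, (), 0)
        return (0, self.prefix, self.weight)


class StreamValidator:
    """Rejects a fragment whose position sorts before the previous one."""

    def __init__(self) -> None:
        self._prev: Optional[Tuple[str, Position]] = None

    def on_fragment(self, kind: str, pos: Position) -> None:
        if self._prev is not None:
            prev_kind, prev_pos = self._prev
            if pos.sort_key() < prev_pos.sort_key():
                raise ValueError(
                    "Unexpected mutation fragment: "
                    f"previous {prev_kind}:{prev_pos}, current {kind}:{pos}"
                )
        self._prev = (kind, pos)
```

Feeding this toy validator a clustering row followed by a range tombstone change at `Position((), -1)` raises, which mirrors the shape of the error in the log; the reverse order is accepted.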

Ops made by cassandra-harry

CREATE TABLE IF NOT EXISTS harry.table0 (
    pk0000 bigint,
    pk0001 ascii,
    pk0002 ascii,
    ck0000 bigint,
    ck0001 float,
    static0000 double static,
    static0001 tinyint static,
    static0002 smallint static,
    static0003 tinyint static,
    static0004 tinyint static,
    regular0000 int,
    regular0001 smallint,
    regular0002 int,
    regular0003 float,
    PRIMARY KEY ((pk0000, pk0001, pk0002), ck0000, ck0001)
) WITH CLUSTERING ORDER BY (ck0000 ASC, ck0001 ASC);

Example of queries used to insert/update data:

LTS: 1986999. Pd -2097264832740263531. Cd 6919541666608472860. M 2. OpId: 5 Statement CompiledStatement{cql='UPDATE harry.table0 USING TIMESTAMP 1652289010295092 SET regular0000 = ?, regular0002 = ? WHERE pk0000 = ? AND pk0001 = ? AND pk0002 = ? AND ck0000 = ? AND ck0001 = ?;', bindings=1424096100,32238882,1659176127L,"ZHHyABdiABdiABdiABdiABdiEQRqdTHi2132017010324848531411821057234186170229421272554238140196211109701540249228117282318523725516170681891564517781160513918564100771825315719529351672321561247233130150232492321628422123106115147198155462471061923525211223937223632431611392282332431965192154125631271952551956276515011122477449687165249144691196518817659230196231202","ZHHyABdiABdiABdiABdiABdiygANchTA2382027819172111122242156117067118244207771331112306898145402077441921091032081353223716121418651751923811120243143128150213246115196997209587568711892011414416815534111107159521117620941931812381932316487392181641601003620930619813425514152461392427715383169205245014121793348211524420367157964209591381605621523111816524090221245306019216162692139615814258601765694231121158197208160392531251699318014994145210243235213111411721088176171575643180223759218241982546413010207392897921701510223819143991583945321661688913941613336203225169637040149167243163194262187015822318822363921932181130192242190144119824617014087311312187103192178401559837632066420919860128615410460206121114159268818639891013514624631262151520151472278816892081111591058231053814728165289622418201315518920517916816715610550154179161522498139",962192636934L,(float)1.6749614E-38}
LTS: 1986999. Pd -2097264832740263531. Cd 6591965067128628690. M 3. OpId: 6 Statement CompiledStatement{cql='INSERT INTO harry.table0 (pk0000,pk0001,pk0002,ck0000,ck0001,regular0002) VALUES (?, ?, ?, ?, ?, ?) USING TIMESTAMP 1652289010295092;', bindings=1659176127L,"ZHHyABdiABdiABdiABdiABdiEQRqdTHi2132017010324848531411821057234186170229421272554238140196211109701540249228117282318523725516170681891564517781160513918564100771825315719529351672321561247233130150232492321628422123106115147198155462471061923525211223937223632431611392282332431965192154125631271952551956276515011122477449687165249144691196518817659230196231202","ZHHyABdiABdiABdiABdiABdiygANchTA2382027819172111122242156117067118244207771331112306898145402077441921091032081353223716121418651751923811120243143128150213246115196997209587568711892011414416815534111107159521117620941931812381932316487392181641601003620930619813425514152461392427715383169205245014121793348211524420367157964209591381605621523111816524090221245306019216162692139615814258601765694231121158197208160392531251699318014994145210243235213111411721088176171575643180223759218241982546413010207392897921701510223819143991583945321661688913941613336203225169637040149167243163194262187015822318822363921932181130192242190144119824617014087311312187103192178401559837632066420919860128615410460206121114159268818639891013514624631262151520151472278816892081111591058231053814728165289622418201315518920517916816715610550154179161522498139",942667550085L,(float)2.3179071E-38,1840112489}

Full log of all operations cassandra-harry performed (it's deflated to 22Gb):
https://cloudius-jenkins-test.s3.amazonaws.com/77d9f946-9eff-455e-ba63-e4211ff9d8e0/20220511_183754/operation.log.tar.gz
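
Since the operation log is large, a small streaming filter may help narrow it down to the operations touching one partition. The sketch below assumes every log line follows the shape of the two example lines above (`LTS: ... Pd ... Cd ... M ... OpId: ... Statement CompiledStatement{cql='...'`), and that `Pd` identifies the partition; both are assumptions based only on those examples. `parse_line`, `operations_for_partition` and the `Operation` tuple are hypothetical helper names.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Matches the prefix of a cassandra-harry operation log line as shown above.
LINE_RE = re.compile(
    r"LTS: (?P<lts>-?\d+)\. Pd (?P<pd>-?\d+)\. Cd (?P<cd>-?\d+)\. "
    r"M (?P<m>\d+)\. OpId: (?P<op_id>\d+) "
    r"Statement CompiledStatement\{cql='(?P<cql>[^']*)'"
)
TIMESTAMP_RE = re.compile(r"USING TIMESTAMP (\d+)")


class Operation(NamedTuple):
    lts: int
    pd: int
    cd: int
    op_id: int
    cql: str
    timestamp: Optional[int]  # write timestamp from the CQL text, if present


def parse_line(line: str) -> Optional[Operation]:
    """Parse one log line; return None if it does not match the format."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ts = TIMESTAMP_RE.search(m["cql"])
    return Operation(
        lts=int(m["lts"]),
        pd=int(m["pd"]),
        cd=int(m["cd"]),
        op_id=int(m["op_id"]),
        cql=m["cql"],
        timestamp=int(ts.group(1)) if ts else None,
    )


def operations_for_partition(lines: Iterable[str], pd: int) -> Iterator[Operation]:
    """Lazily yield the operations whose Pd value equals `pd`."""
    for line in lines:
        op = parse_line(line)
        if op is not None and op.pd == pd:
            yield op
```

Because the filter is a generator, it can be fed an open file object line by line (for example the extracted log, whatever its file name is inside the archive) without loading the whole log into memory.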

Coredump:

2022-05-11 22:20:44.089 <2022-05-11 22:16:52.000>: (CoreDumpEvent Severity.ERROR) period_type=one-time event_id=e379b5cc-3c96-4aa7-a2a2-65f5c79ae473 node=Node longevity-harry-2h-fix-cass-db-node-eddd82cc-1 [16.171.62.87 | 10.0.3.241] (seed: True)
corefile_url=https://storage.cloud.google.com/upload.scylladb.com/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000.gz
backtrace=           PID: 16904 (scylla)
UID: 113 (scylla)
GID: 119 (scylla)
Signal: 6 (ABRT)
Timestamp: Wed 2022-05-11 22:16:52 UTC (2min 1s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 100 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-1 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
Unit: scylla-server.service
Slice: scylla-server.slice
Boot ID: 19b4df5c9a4a4b9e92f475f4af343758
Machine ID: 3415a6f419fe479a89ebb7cce7e15f2e
Hostname: longevity-harry-2h-fix-cass-db-node-eddd82cc-1
Storage: /var/lib/systemd/coredump/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000
Message: Process 16904 (scylla) of user 113 dumped core.
Stack trace of thread 16905:
#0  0x00007fbda50782a2 raise (libc.so.6 + 0x3d2a2)
#1  0x00007fbda5061950 abort (libc.so.6 + 0x26950)
#2  0x000000000463eecc _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla + 0x443eecc)
#3  0x000000000180b1e5 _ZN12_GLOBAL__N_119on_validation_errorERN7seastar6loggerERKNS0_13basic_sstringIcjLj15ELb1EEE (scylla + 0x160b1e5)
#4  0x000000000180bb5f _ZN42mutation_fragment_stream_validating_filterclEN20mutation_fragment_v24kindE26position_in_partition_view (scylla + 0x160bb5f)
#5  0x0000000001d6089d _ZN8sstables14sstable_writer7consumeEO22range_tombstone_change (scylla + 0x1b6089d)
#6  0x0000000001c2f2cc _ZN22compact_mutation_stateIL19emit_only_live_rows0EL20compact_for_sstables1EE10do_consumeIN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerEEN7seastar10bool_classINS7_18stop_iteration_tagEEEO22range_tombstone_changeRT_RT0_ (scylla + 0x1a2f2cc)
#7  0x0000000001c2cc55 _ZN23flat_mutation_reader_v24impl26consume_pausable_in_threadISt17reference_wrapperINS0_16consumer_adapterI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerEEEENS_9no_filterEEEvT_T0_ (scylla + 0x1a2cc55)
#8  0x0000000001c2aeb8 _ZN23flat_mutation_reader_v217consume_in_threadI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerENS_9no_filterEEEDaT_T0_ (scylla + 0x1a2aeb8)
#9  0x0000000001c21a1b _ZN23flat_mutation_reader_v217consume_in_threadI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerEEEDaT_ (scylla + 0x1a21a1b)
#10 0x0000000001c1d8a3 _ZZZN8sstables10compaction7consumeEvENUl23flat_mutation_reader_v2E_clES1_ENUlvE_clEv (scylla + 0x1a1d8a3)
#11 0x0000000001c1d134 _ZN7seastar20noncopyable_functionIFvvEE17direct_vtable_forIZNS_5asyncIZZN8sstables10compaction7consumeEvENUl23flat_mutation_reader_v2E_clES7_EUlvE_JEEENS_8futurizeINSt13invoke_resultIT_JDpT0_EE4typeEE4typeENS_17thread_attributesEOSC_DpOSD_EUlvE_E4callEPKS2_ (scylla + 0x1a1d134)
#12 0x00000000048db8d2 _ZN7seastar14thread_context4mainEv (scylla + 0x46db8d2)
Stack trace of thread 16907:
#0  0x00007fbda5c3994c read (libpthread.so.0 + 0x1294c)
#1  0x00000000046ae285 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla + 0x44ae285)
#2  0x00000000046ae5c0 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC1EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEE3$_0E9_M_invokeERKSt9_Any_data (scylla + 0x44ae5c0)
#3  0x000000000463fa8b _ZN7seastar12posix_thread13start_routineEPv (scylla + 0x443fa8b)
#4  0x00007fbda5c302a5 start_thread (libpthread.so.0 + 0x92a5)
#5  0x00007fbda513b323 __clone (libc.so.6 + 0x100323)
Stack trace of thread 16906:
#0  0x00007fbda5c3994c read (libpthread.so.0 + 0x1294c)
#1  0x00000000046ae285 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla + 0x44ae285)
#2  0x00000000046ae5c0 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC1EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEE3$_0E9_M_invokeERKSt9_Any_data (scylla + 0x44ae5c0)
#3  0x000000000463fa8b _ZN7seastar12posix_thread13start_routineEPv (scylla + 0x443fa8b)
#4  0x00007fbda5c302a5 start_thread (libpthread.so.0 + 0x92a5)
#5  0x00007fbda513b323 __clone (libc.so.6 + 0x100323)
Stack trace of thread 16904:
#0  0x000000000143a991 _ZNK16compound_wrapperI21clustering_key_prefix26clustering_key_prefix_viewE4sizeERK6schema (scylla + 0x123a991)
#1  0x00000000014393e7 _ZNK10bound_view11tri_compareclERK21clustering_key_prefixiS3_i (scylla + 0x12393e7)
#2  0x0000000001789756 _ZN22mutation_reader_mergerclEv (scylla + 0x1589756)
#3  0x000000000178937f _ZN22mutation_reader_mergerclEv (scylla + 0x158937f)
#4  0x0000000001794afb _ZZN14merging_readerI22mutation_reader_mergerE11fill_bufferEvENKUlvE_clEv (scylla + 0x1594afb)
#5  0x0000000001793c1b _ZN14merging_readerI22mutation_reader_mergerE11fill_bufferEv (scylla + 0x1593c1b)
#6  0x0000000001c2ca26 _ZN23flat_mutation_reader_v24impl26consume_pausable_in_threadISt17reference_wrapperINS0_16consumer_adapterI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerEEEENS_9no_filterEEEvT_T0_ (scylla + 0x1a2ca26)
#7  0x0000000001c2aeb8 _ZN23flat_mutation_reader_v217consume_in_threadI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerENS_9no_filterEEEDaT_T0_ (scylla + 0x1a2aeb8)
#8  0x0000000001c21a1b _ZN23flat_mutation_reader_v217consume_in_threadI25compact_for_compaction_v2IN8sstables26compacted_fragments_writerE33noop_compacted_fragments_consumerEEEDaT_ (scylla + 0x1a21a1b)
#9  0x0000000001c1d8a3 _ZZZN8sstables10compaction7consumeEvENUl23flat_mutation_reader_v2E_clES1_ENUlvE_clEv (scylla + 0x1a1d8a3)
#10 0x0000000001c1d134 _ZN7seastar20noncopyable_functionIFvvEE17direct_vtable_forIZNS_5asyncIZZN8sstables10compaction7consumeEvENUl23flat_mutation_reader_v2E_clES7_EUlvE_JEEENS_8futurizeINSt13invoke_resultIT_JDpT0_EE4typeEE4typeENS_17thread_attributesEOSC_DpOSD_EUlvE_E4callEPKS2_ (scylla + 0x1a1d134)
#11 0x00000000048db8d2 _ZN7seastar14thread_context4mainEv (scylla + 0x46db8d2)
download_instructions=gsutil cp gs://upload.scylladb.com/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.113.19b4df5c9a4a4b9e92f475f4af343758.16904.1652307412000000000000.gz
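
The stack traces in the coredump event above are easier to work with once split per thread. The following is a rough Python sketch that parses the frame lines (`#N  0xADDR SYMBOL (module + 0xOFFSET)`) and collects the addresses that belong to the `scylla` module, so they can be handed to an address-decoding tool or the mangled symbols piped through `c++filt`. The frame format is inferred from the lines above only, and `parse_stack_traces` and `scylla_addresses` are hypothetical helper names.

```python
import re
from typing import Dict, List, NamedTuple

THREAD_RE = re.compile(r"^Stack trace of thread (\d+):")
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+(?P<address>0x[0-9a-fA-F]+)\s+(?P<symbol>\S+)"
    r"\s+\((?P<module>\S+) \+ (?P<offset>0x[0-9a-fA-F]+)\)\s*$"
)


class Frame(NamedTuple):
    index: int
    address: int
    symbol: str   # still mangled; demangle separately if needed
    module: str
    offset: int


def parse_stack_traces(text: str) -> Dict[int, List[Frame]]:
    """Group frame lines by the thread header that precedes them."""
    traces: Dict[int, List[Frame]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = traces.setdefault(int(header.group(1)), [])
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current.append(Frame(
                index=int(frame["index"]),
                address=int(frame["address"], 16),
                symbol=frame["symbol"],
                module=frame["module"],
                offset=int(frame["offset"], 16),
            ))
    return traces


def scylla_addresses(frames: List[Frame]) -> List[str]:
    """Return hex addresses of frames that live in the scylla binary."""
    return [hex(f.address) for f in frames if f.module == "scylla"]
```

Lines that are neither a thread header nor a frame (the PID, signal, and command-line fields) are simply skipped, so the whole event text can be passed in as-is.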

Restore Monitor Stack command: $ hydra investigate show-monitor eddd82cc-d745-4a4f-afc2-d8ab979c84aa
Restore monitor on AWS instance using Jenkins job
Show all stored logs command: $ hydra investigate show-logs eddd82cc-d745-4a4f-afc2-d8ab979c84aa

Test id: eddd82cc-d745-4a4f-afc2-d8ab979c84aa

Logs

grafana - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220511_235515/grafana-screenshot-longevity-harry-2h-test-scylla-per-server-metrics-nemesis-20220511_235639-longevity-harry-2h-fix-cass-monitor-node-eddd82cc-1.png
grafana - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220511_235515/grafana-screenshot-overview-20220511_235515-longevity-harry-2h-fix-cass-monitor-node-eddd82cc-1.png
db-cluster - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220512_000756/db-cluster-eddd82cc.tar.gz
loader-set - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220512_000756/loader-set-eddd82cc.tar.gz
monitor-set - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220512_000756/monitor-set-eddd82cc.tar.gz
sct - https://cloudius-jenkins-test.s3.amazonaws.com/eddd82cc-d745-4a4f-afc2-d8ab979c84aa/20220512_000756/sct-runner-eddd82cc.tar.gz

Jenkins job URL

@fruch fruch added the triage/master Looking for assignee label May 12, 2022
@slivne slivne added this to the 5.1 milestone May 15, 2022
@slivne
Contributor

slivne commented May 15, 2022

We need a reproducer with a system up after this occurs - it's a corruption - https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/longevity-harry-2h-test/7/

@fruch
Contributor Author

fruch commented May 15, 2022

We need a reproducer with a system up after this occurs - it's a corruption - https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/longevity-harry-2h-test/7/

Here is a job (set to keep all the instances):
https://jenkins.scylladb.com/job/scylla-staging/job/fruch/job/longevity-harry-2h-test/9/

It takes ~1.5 hours to reach the failure.

@fruch
Contributor Author

fruch commented May 15, 2022

@slivne slivne added showstopper bug and removed triage/master Looking for assignee labels May 15, 2022
@fruch
Contributor Author

fruch commented May 17, 2022

@bhalevy
Member

bhalevy commented May 22, 2022

@mikolajsieluzycki please look into this.
Is this the same validation error you hit in unit testing?

@mikolajsieluzycki
Contributor

@bhalevy Seems like exactly the same error

@mikolajsieluzycki
Contributor

@fruch I've created PR #10643 for an error that happened during my unit testing of an unrelated change and that produces the same exception. What would be the easiest way to verify that the PR fixes this issue as well?

@fruch
Contributor Author

fruch commented May 24, 2022

@fruch I've created PR #10643 for an error that happened during my unit testing of an unrelated change and that produces the same exception. What would be the easiest way to verify that the PR fixes this issue as well?

Having an AMI or RPMs with this fix.

@benipeled does the build in PRs create RPMs? Are they uploaded to S3?

Also, can you point @mikolajsieluzycki to the jobs that can build RPMs or AMIs for him from forks?

@benipeled
Contributor

@benipeled does the build in PRs create RPMs? Are they uploaded to S3?

The CI job doesn't archive RPMs, only logs (build & tests).
There is an open issue for archiving some artifacts - https://github.com/scylladb/scylla-pkg/issues/2893

Also, can you point @mikolajsieluzycki to the jobs that can build RPMs or AMIs for him from forks?

BYO can be used for building RPM and AMI from a fork - https://jenkins.scylladb.com/view/master/job/scylla-master/job/byo/job/byo_build_tests_dtest/

@bhalevy
Member

bhalevy commented Jun 8, 2022

@mikolajsieluzycki / @fruch can we close this issue with #10643?

@mikolajsieluzycki
Contributor

Waiting for https://jenkins.scylladb.com/view/master/job/scylla-master/job/reproducers/job/longevity-harry-2h-test/lastBuild/console to finish (hopefully kicked it off correctly). According to the description the error should show up after 1.5h. It's over 2h since start so I'm cautiously optimistic.

@mikolajsieluzycki
Contributor

The test finished successfully on master, I think it can be closed.

@bhalevy
Member

bhalevy commented Jun 9, 2022

@fruch please consider closing this issue as per the above

@fruch
Contributor Author

fruch commented Jun 9, 2022

If that test passed with master, then yes, closing this one

@fruch fruch closed this as completed Jun 9, 2022
@DoronArazii DoronArazii modified the milestones: 5.1, 5.0 Jul 7, 2022
@DoronArazii DoronArazii removed this from the 5.0 milestone Nov 8, 2022
@DoronArazii DoronArazii added this to the 5.1 milestone Nov 8, 2022

7 participants