Scylla starts very slowly, spending a lot of time on `Loading repair history` #16774

zey1996 · 2024-01-15T06:03:05Z

Installation details
Scylla version: 5.2.11-arm64
Cluster size: 4 nodes
OS: CentOS
Hardware details
Platform: kubernetes containerd
Hardware:
memory=320G
cpu:

Architecture:          aarch64
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    1
Core(s) per socket:    64
Socket(s):             2
NUMA node(s):          4
Model:                 0
CPU max MHz:           2600.0000
CPU min MHz:           200.0000
BogoMIPS:              200.00
L1d cache:             64K
L1i cache:             64K
L2 cache:              512K
L3 cache:              32768K
NUMA node0 CPU(s):     0-31
NUMA node1 CPU(s):     32-63
NUMA node2 CPU(s):     64-95
NUMA node3 CPU(s):     96-127
Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

Disks: 4 SSD, raid0

I use scylla-manager for repairs, and my table is configured with tombstone_gc = {'mode':'repair'}.
The cluster had been running for several days when I did a rolling restart.
Scylla was very slow to start: it took more than an hour.
I checked the log, and it looks like Scylla is loading repair_history.

I then checked system.repair_history and found that the table has millions of records.
I read through the Scylla source code but could not find any code that cleans up this table.
I guess this table is used by GC, but it makes Scylla start far too slowly.
How can I fix this? Is there something I can do to clean this table?


mykaul · 2024-01-15T07:47:21Z

@asias - thoughts?

MyByte0 · 2024-01-22T09:38:37Z

@asias - thoughts?
Here are the two points I found.

repair_service::load_history() iterates with get_tables_metadata().for_each_table_gently. I replaced it with get_tables_metadata().parallel_for_each_table. Is that OK?
There is no cleanup logic for the repair_history table. Maybe it needs a TTL?
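
The difference between visiting tables one at a time and visiting them concurrently can be sketched outside Scylla. The following is a minimal Python asyncio sketch, not Scylla code: `load_table_history`, `load_sequential`, and `load_parallel` are hypothetical names, and `asyncio.sleep` merely stands in for reading one table's rows from system.repair_history. It assumes that `for_each_table_gently` handles tables one after another while `parallel_for_each_table` starts them concurrently.

```python
import asyncio
import time


async def load_table_history(table, delay, loaded):
    """Pretend to load the repair history of one table (hypothetical helper)."""
    await asyncio.sleep(delay)  # stands in for the per-table read
    loaded.append(table)


async def load_sequential(tables, delay):
    """Visit tables one at a time, waiting for each before starting the next."""
    loaded = []
    for table in tables:
        await load_table_history(table, delay, loaded)
    return loaded


async def load_parallel(tables, delay):
    """Start the load of every table at once and wait for all of them."""
    loaded = []
    await asyncio.gather(*(load_table_history(t, delay, loaded) for t in tables))
    return loaded


if __name__ == "__main__":
    tables = [f"ks.t{i}" for i in range(8)]
    for loader in (load_sequential, load_parallel):
        start = time.perf_counter()
        asyncio.run(loader(tables, 0.05))
        print(loader.__name__, f"{time.perf_counter() - start:.2f}s")
```

In this toy model the sequential variant pays the per-table delay once per table, while the concurrent variant pays it roughly once in total. Real gains depend on how much of the per-table work can actually overlap.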

Use `parallel_for_each_table` instead of `for_each_table_gently` in `repair_service::load_history` to reduce bootstrap time.
Use uuid_xor_to_uint32 when dispatching repair load_history work to shards.

Ref: #16774
Closes #16927

* github.com:scylladb/scylladb:
  repair: resolve load_history shard load skew
  repair: accelerate repair load_history time
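
The shard-dispatch idea can be pictured with a small sketch. The Python below is illustrative only and is not the Scylla implementation: it assumes that `uuid_xor_to_uint32` folds a 128-bit UUID into 32 bits by XOR-ing its four 32-bit words, and `shard_for` and `shard_histogram` are hypothetical helpers that choose a shard by taking that value modulo the shard count.

```python
import uuid
from collections import Counter


def uuid_xor_to_uint32(u):
    """XOR the four 32-bit words of a UUID into one 32-bit value (assumed behavior)."""
    value = u.int
    folded = 0
    for _ in range(4):
        folded ^= value & 0xFFFFFFFF
        value >>= 32
    return folded


def shard_for(u, shard_count):
    """Map a UUID to a shard index in [0, shard_count) (hypothetical helper)."""
    return uuid_xor_to_uint32(u) % shard_count


def shard_histogram(ids, shard_count):
    """Count how many UUIDs land on each shard."""
    return Counter(shard_for(u, shard_count) for u in ids)


if __name__ == "__main__":
    ids = [uuid.uuid4() for _ in range(100_000)]
    histogram = shard_histogram(ids, 128)
    print("busiest shard:", max(histogram.values()))
    print("quietest shard:", min(histogram.values()))
```

Because every bit of the UUID contributes to the folded value, entries are not tied to a single field of the UUID, which is one way to reduce skew across shards.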

mykaul · 2024-02-20T11:54:41Z

Now that #16927 is in - what's left here?

zey1996 · 2024-02-21T09:47:30Z

Now that #16927 is in - what's left here?

#16927 makes the records load faster, but I think we should also control the number of records. See #17103.

mykaul · 2024-03-10T16:19:27Z

I see there's work still on #17103 - I assume it might miss 6.0, shall I defer this to 6.1?

asias · 2024-04-08T01:44:54Z

Fixed by 99b7ccf. Closing.

mykaul added triage/oss area/repair labels Jan 15, 2024

MyByte0 mentioned this issue Jan 23, 2024

repair: accelerate repair load_history time #16927

Merged

mykaul added this to the 6.0 milestone Feb 20, 2024

mykaul modified the milestones: 6.0, 6.1 Mar 28, 2024

mykaul removed the triage/oss label Mar 28, 2024

asias closed this as completed Apr 8, 2024

mykaul modified the milestones: 6.1, 6.0 Apr 16, 2024
