Expired sstables remain after nodetool compaction #2253
one more,
@ban4785 Note that
@tgrabiec do you mean the read count from nodetool cfstats? That was zero.
On Mon, Apr 10, 2017 at 3:36 AM, ban4785 ***@***.***> wrote:
I did one more test with the same schema (SizeTieredCompactionStrategy):
1. I inserted data (TTL 3,600 s, gc_grace_seconds 0); the data size was 244 GB.
2. One day later I ran
nodetool compact foms
while a small data load was running (20,000 TPS).
3. The result was 244 GB -> 114 GB: almost all fully expired sstables were removed, but some remained.
The compaction info in /var/log/messages shows that all fully expired sstables were being compacted, yet some remained.
Below is the reactor & request graph (batch of 201, so 100 TPS * 201 = 20,100 op/s).
[image: compaction]
<https://cloud.githubusercontent.com/assets/20348151/24848594/6ecbe874-1e02-11e7-80ea-7faa506a1782.png>
The sstable files from 11:25 ~ 11:33 are fully expired sstables.
[image: compaction_size]
<https://cloud.githubusercontent.com/assets/20348151/24848643/aa3bbb00-1e02-11e7-883a-983866636281.png>
Below is the compaction information from /var/log/messages.
remain full expired sstable number : 19169 24025 26513 26865 27689 28681 28777 28857 28945 29049
Apr 10 13:19:31 swfosldb02 scylla: [shard 1] compaction - Compacting
[/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29265-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29257-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29249-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29057-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29049-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29009-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28945-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26513-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26865-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24025-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-19169-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27689-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28601-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28681-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28745-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28777-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28857-Data.db:level=0, ]
remain full expired sstable number : 21191 22591 23655 23951 23999
Apr 10 13:19:31 swfosldb02 scylla: [shard 7] compaction - Compacting
[/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29271-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23967-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23951-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-22591-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-21191-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29263-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23655-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29255-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23999-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24023-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24031-Data.db:level=0, ]
remain full expired sstable number : 28908 28996 29036 29164 29204 29228 29236
Apr 10 13:19:31 swfosldb02 scylla: [shard 4] compaction - Compacting
[/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29268-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29252-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29236-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29260-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29228-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29220-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29212-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29204-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29028-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28980-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28996-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28908-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29188-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29036-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29084-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29124-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29164-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29172-Data.db:level=0, ]
Lastly, I found that the live and total (on-disk) sizes differ:
Keyspace: foms
Read Count: 0
Read Latency: NaN ms.
Write Count: 93491385
Write Latency: 5.956687880920793E-6 ms.
Pending Flushes: 0
Table: fdc_unit_param_data
SSTable count: 72
SSTables in each level: [44/4]
Space used (live): 47705737940
Space used (total): 124197448276
Space used by snapshots (total): 0
Off heap memory used (total): 1301537788
SSTable Compression Ratio: 0.260821
Number of keys (estimate): 102226
Memtable cell count: 9153
Memtable data size: 1207578035
Memtable off heap memory used: 1299185664
Memtable switch count: 8
Local read count: 0
Local read latency: NaN ms
Local write count: 93491385
Local write latency: 0.000 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 0
Bloom filter off heap memory used: 1048672
Index summary off heap memory used: 1303452
Compression metadata off heap memory used: 0
Compacted partition minimum bytes: 447
Compacted partition maximum bytes: 14530764
Compacted partition mean bytes: 1747528
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0
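The gap between "Space used (live)" and "Space used (total)" in the cfstats output above can be quantified directly. A quick back-of-the-envelope check in Python, using the exact figures quoted above:

```python
# Space figures quoted from the nodetool cfstats output above.
space_live = 47_705_737_940    # bytes still reachable by reads
space_total = 124_197_448_276  # bytes actually occupied on disk

dead = space_total - space_live
print(dead)                          # bytes held by dead/expired data
print(round(dead / space_total, 2))  # fraction of disk that is dead
```

So roughly 76 GB, over 60% of the on-disk footprint, is data that compaction should have been able to purge.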
I want to know how to find live & dead sstable info.
You can use sstabledump (from scylla-tools-java) on each individual sstable, but I suppose you want stats that cover all sstables of a column family.
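To aggregate that per-sstable output, a small script can parse sstabledump's JSON and tally expired versus live rows. This is only a sketch: the field names (`rows`, `liveness_info`, `expired`) follow the sstabledump excerpts quoted later in this thread, and `count_rows` / `expired_row_count` are hypothetical helper names, not part of any tool.

```python
import json
import subprocess

def count_rows(partitions):
    """Tally (expired, live) rows in parsed sstabledump JSON output."""
    expired = live = 0
    for part in partitions:
        for row in part.get("rows", []):
            if row.get("liveness_info", {}).get("expired"):
                expired += 1
            else:
                live += 1
    return expired, live

def expired_row_count(sstable_path):
    """Run sstabledump (from scylla-tools-java) on one sstable and tally its rows."""
    out = subprocess.run(["sstabledump", sstable_path],
                         capture_output=True, text=True, check=True).stdout
    return count_rows(json.loads(out))
```

Running `expired_row_count` over every `*-Data.db` file in the table directory would give a crude live/dead breakdown per sstable.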
BTW, I sent an RFC patch to the mailing list, '[RFC] compaction: do not write expired cell as dead cell if it can be purged right away', that will probably solve the issue you're facing with expired data surviving after major compaction. The problem is that an expired cell is unconditionally converted into a tombstone cell, so an sstable full of expired cells results in an sstable full of tombstone cells after compaction. That's why two 'nodetool compact' runs were needed to actually get rid of the data.
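The purge check the patch proposes can be sketched roughly as follows. This is illustrative Python, not Scylla's actual code; `gc_before` and `max_purgeable_ts` follow the terms used in the patch description.

```python
def can_purge_expired_cell(local_delete_time, cell_timestamp,
                           gc_before, max_purgeable_ts):
    """An expired cell may be dropped at compaction time (instead of being
    rewritten as a tombstone) when its deletion time is already past the
    gc grace window AND no other sstable holds older data it might shadow."""
    return local_delete_time < gc_before and cell_timestamp < max_purgeable_ts
```

With gc_grace_seconds = 0, as in the schema in this issue, gc_before is effectively "now", so a cell is purgeable as soon as its TTL lapses, provided the max-purgeable-timestamp check passes.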
…d right away

When compacting a fully expired sstable, we're not allowing that sstable to be purged, because an expired cell is *unconditionally* converted into a dead cell. Why not instead check whether the expired cell can be purged, using gc_before and the max purgeable timestamp? Currently we need two compactions to get rid of a fully expired sstable whose cells could have been purged all along.

Look at this sstable with an expired cell:

    { "partition" : { "key" : [ "2" ], "position" : 0 },
      "rows" : [ {
        "type" : "row", "position" : 120,
        "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z", "ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
        "cells" : [ { "name" : "country", "value" : "1" }, ]

Now the same data after the first compaction:

    [shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79 (~65% of original) in 229ms = 0.000328997MB/s.
    { ...
      "rows" : [ {
        "type" : "row", "position" : 79,
        "cells" : [ { "name" : "country", "deletion_info" : { "local_delete_time" : "2017-04-09T17:07:12Z" }, "tstamp" : "2017-04-09T17:07:12.702597Z" }, ]

Now another compaction actually gets rid of the data:

    compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original) in 1ms = 0MB/s. ~2 total partitions merged to 0

NOTE: It's a waste of time to wait for the second compaction, because the expired cell could have been purged at the first compaction: it satisfied gc_before and the max purgeable timestamp.

Fixes #2249, #2253
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
(cherry picked from commit a6f8f4f)
Installation details
Scylla version (or git commit hash): 1.6.1
Cluster size: 3
OS (RHEL/CentOS/Ubuntu/AWS AMI): CentOS 7.2
Related issue: #2249
Platform (physical/VM/cloud instance type/docker): physical
Hardware: sockets= cores= hyperthreading= memory= (running with smp 8, memory 20G)
Disks (SSD/HDD, count): SSD
Test schema info:
CREATE TABLE foms.data (
part_key text,
param_index int,
ts timestamp,
lt text,
product frozen,
spec frozen,
st text,
vl text,
PRIMARY KEY ((part_key, param_index), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL","rows_per_partition":"ALL"}'
AND comment = ''
AND compaction = {'tombstone_threshold': '0.1', 'tombstone_compaction_interval': '600', 'unchecked_tombstone_compaction': 'true', 'class': 'SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 1.0
AND default_time_to_live = 3600
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 1
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
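Given this schema (default_time_to_live = 3600, gc_grace_seconds = 0), every cell becomes eligible for purging exactly one hour after it is written. A small sanity check, with a made-up write time for illustration:

```python
from datetime import datetime, timedelta

DEFAULT_TTL = 3600  # default_time_to_live from the schema above
GC_GRACE = 0        # gc_grace_seconds from the schema above

write_time = datetime(2017, 4, 4, 17, 25)  # hypothetical write time
purgeable_at = write_time + timedelta(seconds=DEFAULT_TTL + GC_GRACE)
print(purgeable_at)  # 2017-04-04 18:25:00
```

So an sstable written entirely before 17:25 should contain nothing but purgeable data by 18:25, which is why removing whole expired sstables during the compaction described below was a reasonable expectation.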
I inserted data (TTL 3,600 s, gc_grace_seconds 0); the data size was 244 GB.
Two hours later I loaded more data while running nodetool compact.
I expected the sstables containing only expired data (past the 3,600 s TTL) to be removed,
but the nodetool compact command could not remove those expired sstables.
After the load, I restarted Scylla and ran nodetool compact again;
most of the expired sstables were removed, but some sstables remained.
I want to know why compaction cannot remove expired sstables.
This is the compaction process while loading more data
(nodetool compact and the data load started at 18:25; nodetool compact ended at 19:30).
nodetool compact foms
was run while loading data -> compaction included the tombstoned sstables, but almost all tombstoned sstables remained.
ex) /var/log/messages log
-- shard 0 compaction start (13 of the sstables are expired sstables (TTL & gc_grace expired))
Apr 4 18:26:11 scylla: [shard 0] compaction - Compacting [
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-18944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24672-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24240-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24264-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26688-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27088-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27104-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27120-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27128-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27136-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27144-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27160-Data.db:level=0,
]
-- shard 0 compaction ends (13 of the sstables are expired sstables (TTL & gc_grace expired))
Apr 4 19:18:39 scylla: [shard 0] compaction - Compacted 13 sstables to [
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29248-Data.db:level=0,
]. 134866772819 bytes to 16991023631 (~12% of original) in 3147932ms = 5.14748MB/s. ~28272 total partitions merged to 5009.
-- but all of the expired-data sstables remained
(the 18:15~18:24 sstable files are the expired sstable files)
nodetool compact foms
after loading data -> compaction without the expired sstables; only the new sstables were compacted.
e.g. after the load, I ran nodetool compact, but only the new sstables were compacting
(foms-fdc_unit_param_data-ka-29000 and above were the new sstables).
Below is the /var/log/messages log.
Apr 5 09:47:44 scylla: [shard 0] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31784-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31608-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29912-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29248-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31776-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31440-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31272-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-30592-Data.db:level=0, ]
-> almost all expired sstables were deleted, but some expired sstables remained.
e.g. shard 0 compacted sstables (including expired sstables):
Apr 5 13:58:51 scylla: [shard 0] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27160-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27144-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27136-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24264-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24672-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31808-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-18944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31800-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24240-Data.db:level=0, ]
After compaction, the size went from 240 GB to 64 GB, but some expired sstables remained.
(The 18:15~18:24 sstable files are the expired sstable files.)
The full /var/log/messages is attached:
messages.txt