remain expired sstables after nodetool compaction #2253

Closed

ban4785 opened this issue Apr 5, 2017 · 6 comments

ban4785 commented Apr 5, 2017

Installation details
Scylla version (or git commit hash): 1.6.1
Cluster size: 3
OS (RHEL/CentOS/Ubuntu/AWS AMI): CentOS 7.2
related issue: #2249

Platform (physical/VM/cloud instance type/docker): physical
Hardware: sockets= cores= hyperthreading= memory= smp 8, m 20G
Disks: (SSD/HDD, count) SSD

test schema info
CREATE TABLE foms.data (
part_key text,
param_index int,
ts timestamp,
lt text,
product frozen,
spec frozen,
st text,
vl text,
PRIMARY KEY ((part_key, param_index), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL","rows_per_partition":"ALL"}'
AND comment = ''
AND compaction = {'tombstone_threshold': '0.1', 'tombstone_compaction_interval': '600', 'unchecked_tombstone_compaction': 'true', 'class': 'SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 1.0
AND default_time_to_live = 3600
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 1
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
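For context on the settings above: with default_time_to_live = 3600 and gc_grace_seconds = 0, every cell becomes purgeable 3,600 s after it is written. A minimal sketch of the "fully expired sstable" check that compaction relies on (illustrative only; the function name and fields are made up, this is not Scylla's actual code):

```python
import time

def is_fully_expired(max_local_deletion_time, gc_grace_seconds, now=None):
    """An sstable whose newest expiry time (seconds since epoch) plus
    gc_grace_seconds lies in the past holds no live data, so compaction
    may drop the whole file instead of rewriting it."""
    if now is None:
        now = time.time()
    return max_local_deletion_time + gc_grace_seconds < now

# With TTL = 3600 and gc_grace_seconds = 0 as in the schema above, an
# sstable last written at t=0 expires at t=3600 and is droppable right after.
assert is_fully_expired(3600, 0, now=7200)       # one hour past expiry
assert not is_fully_expired(3600, 0, now=1800)   # still within TTL
```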

I inserted data with a 3,600 s TTL and gc_grace_seconds = 0; the total size was 244 GB.
Two hours later I loaded more data while running nodetool compact.
I expected the sstables containing only expired data (past the 3,600 s TTL) to be removed,
but the nodetool compact command could not remove those expired sstables.
After the load I restarted ScyllaDB and ran nodetool compact again;
most of the expired sstables were removed, but some remained.
I want to know why compaction cannot remove expired sstables.

Below is the compaction process while the other data was loading.
nodetool compact and the data load started at 18:25; nodetool compact ended at 19:30.

  1. Ran nodetool compact foms while loading data.
    -> Compaction included the tombstoned sstables, but almost all of them remained.

Example /var/log/messages log.
-- shard 0 compaction start (the 13 sstables below are fully expired: both TTL and gc_grace_seconds have elapsed)
Apr 4 18:26:11 scylla: [shard 0] compaction - Compacting [
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-18944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24672-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24240-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24264-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26688-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27088-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27104-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27120-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27128-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27136-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27144-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27160-Data.db:level=0,
]

-- shard 0 compaction end (the same 13 fully expired sstables)
Apr 4 19:18:39 scylla: [shard 0] compaction - Compacted 13 sstables to [
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29248-Data.db:level=0,
]. 134866772819 bytes to 16991023631 (~12% of original) in 3147932ms = 5.14748MB/s. ~28272 total partitions merged to 5009.
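As a side note, the MB/s figure in that log line is derived from the output size, not the 134 GB input; the arithmetic can be checked directly (a plain reconstruction of the numbers above, not Scylla code):

```python
in_bytes = 134_866_772_819   # "134866772819 bytes"
out_bytes = 16_991_023_631   # "to 16991023631"
elapsed_ms = 3_147_932       # "in 3147932ms"

ratio = out_bytes / in_bytes                         # -> "~12% of original"
mib_per_s = out_bytes / (elapsed_ms / 1000) / 2**20  # -> "5.14748MB/s"

assert int(ratio * 100) == 12
assert abs(mib_per_s - 5.14748) < 0.001
```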

-- But all of the expired sstables remain on disk.
(the 18:15~18:24 sstable files are the fully expired ones)
[screenshot: remaining sstables]

  2. Ran nodetool compact foms after loading data.
    -> Compaction skipped the expired sstables; only the new sstables were compacted.

Example: after the load I ran nodetool compact, but only new sstables were being compacted
(generations after foms-fdc_unit_param_data-ka-29000 are the new sstables).
Below is the /var/log/messages log.
Apr 5 09:47:44 scylla: [shard 0] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31784-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31608-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29912-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29248-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31776-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31440-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31272-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-30592-Data.db:level=0, ]

  3. Restarted ScyllaDB and ran nodetool compact foms.
    -> Almost all expired sstables were deleted, but some expired sstables remained.

Example: shard 0 compacting sstables (including expired ones):
Apr 5 13:58:51 scylla: [shard 0] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27160-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27144-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27136-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24264-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24672-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31808-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-18944-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-31800-Data.db:level=0,
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24240-Data.db:level=0, ]

After compaction, 240 GB shrank to 64 GB, but some expired sstables remained.
(the 18:15~18:24 sstable files are the fully expired ones)
[screenshot: sstables remaining after compaction]

The full /var/log/messages is attached:
messages.txt

ban4785 commented Apr 5, 2017

One more thing: I ran truncate on the table and found the expired sstables still in the table directory; only the new sstables were moved to the snapshot directory.
I guess ScyllaDB cannot recognize those sstables, so the sstable metadata seems broken.

  1. Only 2 sstables were moved to the snapshot directory.
    [screenshot: snapshot directory]

  2. The other sstables remained in the table directory (after the truncate).
    [screenshot: table directory]

tgrabiec (Contributor) commented Apr 6, 2017

@ban4785 Note that unchecked_tombstone_compaction is currently ignored; Scylla always acts as if it were false. Did you have active reads at the time you ran nodetool compact?

ban4785 commented Apr 6, 2017

@tgrabiec Do you mean the read count in nodetool cfstats? That was zero:
Local read count: 0
Local read latency: NaN ms
There were also no read operations while nodetool compact was running.

slivne added this to the 1.8 milestone Apr 7, 2017
ban4785 commented Apr 10, 2017

I did one more test with the same schema (SizeTieredCompactionStrategy):

  1. Inserted data with a 3,600 s TTL and gc_grace_seconds = 0; total size was 244 GB.
  2. One day later, ran nodetool compact foms while loading a small amount of data (20,000 TPS).
  3. Result: 244 GB -> 114 GB. Almost all fully expired sstables were removed, but some remained.
     The compaction info in /var/log/messages shows all of the fully expired sstables being compacted, yet some remained on disk.

The image below is the reactor & request graph (batched load: 100 TPS × 201 batches ≈ 20,100 op/s).
[screenshot: reactor & request graph]

The 11:25 ~ 11:33 sstable files are the fully expired ones.
[screenshot: sstable files and sizes]

Below is the compaction information from /var/log/messages.

The log shows these fully expired sstables being compacted, yet they remain in the filesystem: 19169 24025 26513 26865 27689 28681 28777 28857 28945 29049

Apr 10 13:19:31 scylla: [shard 1] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29265-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29257-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29249-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29057-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29049-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29009-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28945-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26513-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26865-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24025-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-19169-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-27689-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28601-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28681-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28745-Data.db:level=0, 
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28777-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28857-Data.db:level=0, ]

The log shows these fully expired sstables being compacted, yet they remain in the filesystem: 21191 22591 23655 23951 23999

Apr 10 13:19:31 scylla: [shard 7] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29271-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23967-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23951-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-22591-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-21191-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29263-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23655-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29255-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-23999-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24023-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-24031-Data.db:level=0, ]

The log shows these fully expired sstables being compacted, yet they remain in the filesystem: 28908 28996 29036 29164 29204 29228 29236

Apr 10 13:19:31 scylla: [shard 4] compaction - Compacting [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29268-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29252-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29236-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29260-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29228-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29220-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29212-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29204-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29028-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28980-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28996-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28908-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29188-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29036-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29084-Data.db:level=0, 
/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29124-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29164-Data.db:level=0, /scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29172-Data.db:level=0, ]

Finally, I found that the live and total disk usage differ:
Keyspace: foms
Read Count: 0
Read Latency: NaN ms.
Write Count: 93491385
Write Latency: 5.956687880920793E-6 ms.
Pending Flushes: 0
Table: fdc_unit_param_data
SSTable count: 72
SSTables in each level: [44/4]
Space used (live): 47705737940
Space used (total): 124197448276

Space used by snapshots (total): 0
Off heap memory used (total): 1301537788
SSTable Compression Ratio: 0.260821
Number of keys (estimate): 102226
Memtable cell count: 9153
Memtable data size: 1207578035
Memtable off heap memory used: 1299185664
Memtable switch count: 8
Local read count: 0
Local read latency: NaN ms
Local write count: 93491385
Local write latency: 0.000 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 0
Bloom filter off heap memory used: 1048672
Index summary off heap memory used: 1303452
Compression metadata off heap memory used: 0
Compacted partition minimum bytes: 447
Compacted partition maximum bytes: 14530764
Compacted partition mean bytes: 1747528
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0

I want to know how to find which sstables are live and which are dead.
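Lacking an official tool for this, one rough way to compare the on-disk sstable set with what cfstats and the compaction log report is to list the Data.db components in the table directory (a hedged sketch; `sstable_data_sizes` is a helper I made up, and the path is the one from the logs above):

```python
import glob
import os

def sstable_data_sizes(table_dir):
    """Map each *-Data.db file in a table directory to its size in bytes.
    sum() of the values can be compared against cfstats 'Space used
    (total)', and the generation numbers against the compaction log."""
    return {os.path.basename(p): os.path.getsize(p)
            for p in glob.glob(os.path.join(table_dir, "*-Data.db"))}

sizes = sstable_data_sizes(
    "/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003")
print(len(sizes), "sstables on disk,", sum(sizes.values()), "bytes total")
```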

ban4785 commented Apr 10, 2017

In addition to the case above, I tested truncate followed by a ScyllaDB restart. The test process:

  1. truncate the table
  2. check the remaining expired sstables (whether they moved to the snapshot directory)
  3. restart ScyllaDB & run cfstats on the table
  4. select data from the truncated table

1. Truncate, and the /var/log/messages log. I found seastar exception logs:
Apr 10 16:17:11 swfosldb02 scylla: [shard 0] compaction - Compacting [/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1232-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1224-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1216-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1208-Data.db:level=0, ]
Apr 10 16:17:11 scylla: [shard 0] compaction - Compacted 4 sstables to [/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1240-Data.db:level=0, ]. 13849 bytes to 13192 (~95% of original) in 16ms = 0.786304MB/s. ~1024 total partitions merged to 1.
Apr 10 16:17:12 swfosldb02 scylla: [shard 2] seastar - Exceptional future ignored: std::system_error (error system:32, Broken pipe)
Apr 10 16:17:12 scylla: [shard 1] seastar - Exceptional future ignored: std::system_error (error system:32, Broken pipe)
Apr 10 16:17:12 scylla: [shard 0] seastar - Exceptional future ignored: std::system_error (error system:32, Broken pipe)

Apr 10 16:17:12 scylla: [shard 0] compaction - Compacting [/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1256-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1248-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1264-Data.db:level=0, /scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1240-Data.db:level=0, ]
Apr 10 16:17:12 scylla: [shard 0] compaction - Compacted 4 sstables to [/scylla/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-1272-Data.db:level=0, ]. 13849 bytes to 13192 (~95% of original) in 15ms = 0.838725MB/s. ~1024 total partitions merged to 1.

2. After the truncate, I found remaining sstables that were not moved to the snapshot directory.
[screenshot: table directory after truncate]

3. Restarted ScyllaDB and ran nodetool cfstats foms.
I can see ScyllaDB is still using the fully expired sstables:

Keyspace: foms
Read Count: 16
Read Latency: 0.0 ms.
Write Count: 162022
Write Latency: 2.487316537260372E-6 ms.
Pending Flushes: 0
Table: fdc_unit_param_data
SSTable count: 35
SSTables in each level: [35/4]
Space used (live): 76510685027
Space used (total): 76510685027

Space used by snapshots (total): 47897706766
Off heap memory used (total): 1631285
SSTable Compression Ratio: 0.266795
Number of keys (estimate): 63882
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 8
Local read count: 16
Local read latency: 0.000 ms
Local write count: 162022
Local write latency: 0.000 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 15440
Bloom filter off heap memory used: 1048672
Index summary off heap memory used: 582613
Compression metadata off heap memory used: 0
Compacted partition minimum bytes: 447
Compacted partition maximum bytes: 20924300
Compacted partition mean bytes: 4496254
Average live cells per slice (last five minutes): 0.0
Maximum live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0

4. I selected data from the truncated table and the query returned rows. That is not expired data but the data loaded recently.
I guess running nodetool compact foms while data is loading leaves some sstables behind (never deleted) and breaks truncate.

[screenshot: query result after truncate]

raphaelsc (Member) commented Apr 10, 2017 via email

avikivity pushed a commit that referenced this issue Apr 13, 2017
…d right away

When compacting a fully expired sstable, we're not allowing that sstable
to be purged because expired cell is *unconditionally* converted into a
dead cell. Why not check if the expired cell can be purged instead using
gc before and max purgeable timestamp?

Currently, we need two compactions to get rid of a fully expired sstable
which cells could have always been purged.

look at this sstable with expired cell:
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 120,
        "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z",
"ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
        "cells" : [
          { "name" : "country", "value" : "1" },
        ]

now this sstable data after first compaction:
[shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79
(~65% of original) in 229ms = 0.000328997MB/s.

  {
    ...
    "rows" : [
      {
        "type" : "row",
        "position" : 79,
        "cells" : [
          { "name" : "country", "deletion_info" :
{ "local_delete_time" : "2017-04-09T17:07:12Z" },
            "tstamp" : "2017-04-09T17:07:12.702597Z"
          },
        ]

now another compaction will actually get rid of data:
compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original)
in 1ms = 0MB/s. ~2 total partitions merged to 0

NOTE:
It's a waste of time to wait for second compaction because the expired
cell could have been purged at first compaction because it satisfied
gc_before and max purgeable timestamp.

Fixes #2249, #2253

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
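In other words, the fix moves the purge check into the first compaction. A simplified model of that decision (illustrative pseudologic with made-up names, not the actual Scylla implementation):

```python
def can_purge(cell_timestamp, local_deletion_time,
              gc_before, max_purgeable_timestamp):
    """A tombstone or expired cell may be dropped entirely when its
    deletion time has passed gc_before (now - gc_grace_seconds) and no
    sstable outside the compaction set holds older data it might still
    shadow (cell_timestamp < max_purgeable_timestamp)."""
    return (local_deletion_time < gc_before
            and cell_timestamp < max_purgeable_timestamp)

def compact_expired_cell(cell_timestamp, local_deletion_time,
                         gc_before, max_purgeable_timestamp):
    # Old behavior: unconditionally emit a dead cell, forcing a second
    # compaction to remove it. Fixed behavior, modeled here: purge the
    # cell right away when the check passes.
    if can_purge(cell_timestamp, local_deletion_time,
                 gc_before, max_purgeable_timestamp):
        return None          # purged: nothing written to the output sstable
    return "dead_cell"       # kept as a tombstone for now

# With gc_grace_seconds = 0, gc_before == now, so an already-expired cell
# whose timestamp is below max_purgeable_timestamp vanishes in one pass.
assert compact_expired_cell(100, 500, gc_before=1000,
                            max_purgeable_timestamp=200) is None
assert compact_expired_cell(100, 500, gc_before=400,
                            max_purgeable_timestamp=200) == "dead_cell"
```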
slivne closed this as completed Apr 26, 2017
avikivity pushed a commit that referenced this issue May 23, 2017
…d right away

(commit message identical to the Apr 13 commit above)
(cherry picked from commit a6f8f4f)