
nodetool compact #2249

Closed
infordb opened this issue Apr 4, 2017 · 8 comments

infordb commented Apr 4, 2017

version: scylla-server-1.6.1-20170218.2ea4da2.el7.centos.x86_64

We use the following settings:
default_time_to_live = 3600
gc_grace_seconds = 0
Size Tiered compaction strategy

After all the data had expired, I ran nodetool compact and expected all of the data to be removed. Instead, the compaction created one new sstable per core, and it is odd that not all of the files were removed. Querying the data with CQL returned nothing, so everything had indeed expired.

I then ran nodetool compact a second time, and this time all the sstables were removed.

<< log >>
Apr 4 08:56:24 scylla: [shard 5] compaction - Compacted 10 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-28653-Data.db:level=0, ]. 19405795738 bytes to 4553631298 (~23% of original) in 364809ms = 11.904MB/s. ~14602 total partitions merged to 2356.
Apr 4 08:58:05 scylla: [shard 2] compaction - Compacted 6 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-25578-Data.db:level=0, ]. 51278523482 bytes to 10655948306 (~20% of original) in 466527ms = 21.7829MB/s. ~13303 total partitions merged to 2060.
Apr 4 08:59:04 scylla: [shard 6] compaction - Compacted 6 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26870-Data.db:level=0, ]. 58855197953 bytes to 8495292168 (~14% of original) in 525020ms = 15.4313MB/s. ~14324 total partitions merged to 2650.
Apr 4 09:00:46 scylla: [shard 7] compaction - Compacted 8 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26695-Data.db:level=0, ]. 67943217383 bytes to 8656503250 (~12% of original) in 627232ms = 13.1618MB/s. ~17656 total partitions merged to 3363.
Apr 4 09:00:54 scylla: [shard 3] compaction - Compacted 9 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26987-Data.db:level=0, ]. 70316600835 bytes to 8740187086 (~12% of original) in 635000ms = 13.1264MB/s. ~17894 total partitions merged to 3448.
Apr 4 09:09:25 scylla: [shard 4] compaction - Compacted 8 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-26844-Data.db:level=0, ]. 105815522227 bytes to 12674166202 (~11% of original) in 1146437ms = 10.5431MB/s. ~20145 total partitions merged to 3796.
Apr 4 09:23:27 scylla: [shard 0] compaction - Compacted 10 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-21368-Data.db:level=0, ]. 106218801095 bytes to 11250135207 (~10% of original) in 1988265ms = 5.39614MB/s. ~22563 total partitions merged to 4840.
Apr 4 09:30:45 scylla: [shard 1] compaction - Compacted 13 sstables to [/scylla/data/foms/fdc_unit_param_data-87855d400ddc11e79d3c000000000003/foms-fdc_unit_param_data-ka-29441-Data.db:level=0, ]. 153162522008 bytes to 26840595066 (~17% of original) in 2426474ms = 10.5491MB/s. ~28695 total partitions merged to 6236.
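
For reference, a condensed sketch of the scenario above (keyspace, table, and columns here are placeholders; the real schema is posted in a later comment):

CREATE TABLE ks.t (pk text, ts timestamp, v text, PRIMARY KEY (pk, ts))
    WITH default_time_to_live = 3600
    AND gc_grace_seconds = 0
    AND compaction = {'class': 'SizeTieredCompactionStrategy'};

# write data, wait until all of it is past its TTL, then:
nodetool flush ks t
nodetool compact ks t
# expected: no sstables left for ks.t
# observed: one new sstable per shard, removed only by a second nodetool compact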

slivne commented Apr 4, 2017

It's not clear when you ran the compaction - was it more than 3600 seconds after the last write?

raphaelsc commented Apr 4, 2017 via email

infordb commented Apr 5, 2017

@raphaelsc @slivne

  1. "After all the data had expired I ran nodetool compact"
    -> this means I ran nodetool compact more than 3600 seconds after the last write.
    There were no read/write requests while nodetool compact was running.

  2. schema
    CREATE TABLE xxxxxxxx (
    part_key text,
    param_index int,
    ts timestamp,
    lt text,
    product frozen<fdc_product_udt>,
    spec frozen<fdc_spec_udt>,
    st text,
    vl text,
    PRIMARY KEY ((part_key, param_index), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL","rows_per_partition":"ALL"}'
    AND comment = ''
    AND compaction = {'tombstone_threshold': '0.1', 'tombstone_compaction_interval': '600', 'unchecked_tombstone_compaction': 'true', 'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 1.0
    AND default_time_to_live = 3600
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 1
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

  3. full compaction log
    compaction_log.txt

  4. we set smp to 8 and never changed it

  5. scylla version : scylla-server-1.6.1-20170218.2ea4da2.el7.centos.x86_64

infordb commented Apr 6, 2017

@slivne @raphaelsc
Simple Test

  1. create table
    CREATE TABLE test.users (
    user_id bigint PRIMARY KEY,
    country text,
    name text
    ) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL","rows_per_partition":"ALL"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

  2. insert data using ttl
    cqlsh:test> insert into users(user_id, country, name) values(1,'11','111') using ttl 20;

  3. flush & sstable2json after 20 seconds
    [root@users-e9655b300df211e79d21000000000002]# sstable2json test-users-ka-1917950-Data.db
    [
    {"key": "1",
    "cells": [["",1491452215,1491452214999171,"d"],
    ["country",1491452215,1491452214999171,"d"],
    ["name",1491452215,1491452214999171,"d"]]}
    ]

  4. nodetool compact
    Apr 6 13:17:35 scylla: [shard 6] compaction - Compacting [/home/scylla/data/test/users-e9655b300df211e79d21000000000002/test-users-ka-1917950-Data.db:level=0, ]
    Apr 6 13:17:35 scylla: [shard 6] compaction - Compacted 1 sstables to [/home/scylla/data/test/users-e9655b300df211e79d21000000000002/test-users-ka-1917958-Data.db:level=0, ]. 118 bytes to 79 (~66% of original) in 292ms = 0.000258015MB/s. ~256 total partitions merged to 1.

Unexpectedly, scylla generated a new sstable.

  5. sstable2json after compaction
    [root@fomstap04 users-e9655b300df211e79d21000000000002]# sstable2json test-users-ka-1917958-Data.db
    [
    {"key": "1",
    "cells": [["country",1491452215,1491452214999171,"d"],
    ["name",1491452215,1491452214999171,"d"]]}
    ]

Unexpectedly, scylla deleted only the CQL row marker.
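
For completeness, the original report noted that a second run of nodetool compact removes the remaining sstable; a quick way to check that here (data directory taken from the compaction log above):

nodetool compact test users
ls /home/scylla/data/test/users-e9655b300df211e79d21000000000002/
# expected after this second compaction: no test-users-ka-*-Data.db files remain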

tgrabiec self-assigned this Apr 6, 2017
tgrabiec added the bug label Apr 6, 2017
tgrabiec commented Apr 6, 2017

I was not able to reproduce the problem by following the steps from the previous comment on a clean database.

After step 3 I get the same:

[
{"key": "1",
 "cells": [["",1491473264,1491473264315097,"d"],
           ["country",1491473264,1491473264315097,"d"],
           ["name",1491473264,1491473264315097,"d"]]}
]

But after step 4 all sstables are removed.

@infordb After step 4, do you have any sstables on disk apart from the ones which are the result of nodetool compact?
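
For example, listing the table's data directory (path from the compaction log above) shows every sstable currently on disk:

ls -l /home/scylla/data/test/users-e9655b300df211e79d21000000000002/*-Data.db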

infordb commented Apr 6, 2017

@tgrabiec There is only one, which is the result of nodetool compact.
My scylla version is scylla-server-1.6.1-20170218.2ea4da2.el7.centos.x86_64

slivne commented Apr 12, 2017

@infordb I ran the test according to your instructions on the exact same version, 1.6.1-20170218.2ea4da2 (using docker), and am also not able to reproduce this: after nodetool compact there are no sstables left.
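
A sketch of that kind of docker-based check; the image tag, container name, and data path are assumptions and may differ from the setup actually used:

docker run --name scylla-test -d scylladb/scylla:1.6.1
docker exec -it scylla-test cqlsh                      # create test.users, insert USING TTL 20
docker exec scylla-test nodetool flush test users      # once the TTL has passed
docker exec scylla-test nodetool compact test users
docker exec scylla-test sh -c 'ls /var/lib/scylla/data/test/users-*/'   # expect no sstables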

raphaelsc commented Apr 12, 2017 via email

tzach added this to the 1.8 milestone Apr 13, 2017
avikivity pushed a commit that referenced this issue May 23, 2017
…d right away

When compacting a fully expired sstable, we're not allowing that sstable
to be purged, because an expired cell is *unconditionally* converted into
a dead cell. Why not instead check whether the expired cell can be purged,
using gc_before and the max purgeable timestamp?

Currently, we need two compactions to get rid of a fully expired sstable
whose cells could have been purged all along.

Look at this sstable with an expired cell:
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 120,
        "liveness_info" : { "tstamp" : "2017-04-09T17:07:12.702597Z",
"ttl" : 20, "expires_at" : "2017-04-09T17:07:32Z", "expired" : true },
        "cells" : [
          { "name" : "country", "value" : "1" },
        ]

Now the same sstable's data after the first compaction:
[shard 0] compaction - Compacted 1 sstables to [...]. 120 bytes to 79
(~65% of original) in 229ms = 0.000328997MB/s.

  {
    ...
    "rows" : [
      {
        "type" : "row",
        "position" : 79,
        "cells" : [
          { "name" : "country", "deletion_info" :
{ "local_delete_time" : "2017-04-09T17:07:12Z" },
            "tstamp" : "2017-04-09T17:07:12.702597Z"
          },
        ]

Now another compaction actually gets rid of the data:
compaction - Compacted 1 sstables to []. 79 bytes to 0 (~0% of original)
in 1ms = 0MB/s. ~2 total partitions merged to 0

NOTE:
It's a waste of time to have to wait for a second compaction, because the
expired cell could already have been purged at the first compaction: it
satisfied gc_before and the max purgeable timestamp.

Fixes #2249, #2253

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170413001049.9663-1-raphaelsc@scylladb.com>
(cherry picked from commit a6f8f4f)
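
A quick way to confirm the fixed behaviour on a build that contains this change (keyspace/table names below are only examples):

cqlsh> CREATE TABLE test.expiry_check (k int PRIMARY KEY, v text)
         WITH gc_grace_seconds = 0;
cqlsh> INSERT INTO test.expiry_check (k, v) VALUES (1, 'x') USING TTL 20;

# after the TTL has passed:
nodetool flush test expiry_check
nodetool compact test expiry_check
# a single major compaction should now leave no sstables for this table
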
elcallio pushed a commit to elcallio/scylla that referenced this issue Sep 6, 2022
As explained in scylladb#2249, slapd takes a lot of memory due to the high
max open files limit in the testing environment (which scylla needs).
This patch implements the proposed solution of running prlimit with
reduced limits only on slapd, so it consumes (a lot!!!) less memory and
no longer fails when an allocation fails.

Tests: ldap unit tests in debug and dev mode.

Fixes scylladb#2249

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes scylladb#2250
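
For illustration, the approach boils down to wrapping only the slapd invocation in prlimit; the limit values and slapd arguments below are placeholders rather than what the patch actually uses:

# cap slapd's max-open-files so it doesn't size its internal tables for scylla's huge limit
prlimit --nofile=1024:1024 -- slapd -h "ldap://localhost:389" -f slapd.conf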