Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data expired via TTL is returned with null values for a short period of time #5290

Closed
eyusim opened this issue Nov 16, 2019 · 4 comments
Closed
Assignees
Milestone

Comments

@eyusim
Copy link

@eyusim eyusim commented Nov 16, 2019

Installation details
Scylla version (or git commit hash): 2.1.6 / 3.1.1
OS (RHEL/CentOS/Ubuntu/AWS AMI): CentOS

We have noticed after upgrading from 2.0 -> 3.1.1 (starting with 2.1.6) that scylla will return rows that have been recently expired due to TTL with null values in non-key columns.

This issue happens in both 2.1.6 and 3.1.1.

Below is python code to reproduce the issue (just change the port in the Cluster constructor).

The code inserts 2 rows with same partition key, one row has a 5 second ttl and the other 1 hour. Continuous querying of the data will show that about 5 times the query returns 2 rows with the expired rows data having been wiped out but it's still being returned.

Running the code against a docker container for the latest cassandra or scylla 2.0.2 behaves as you would expect, with the result set eventually being 1 row and no missing data in-between.

from cassandra.cluster import Cluster
import time

cluster = Cluster(['localhost'], port=9043)

session = cluster.connect()
session.execute('drop keyspace if exists ttltest ')

session.execute("""create keyspace if not exists ttltest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }""")
session.set_keyspace('ttltest')

session.execute("""
create table if not exists data_with_ttl(
            thekey text,
            num bigint,
            data text,
            PRIMARY KEY (thekey, num)
          ) 
""")

session.execute("insert into data_with_ttl (thekey, num, data) values('key1', 123, 'long live data') using TTL 3600")
session.execute("insert into data_with_ttl (thekey, num, data) values('key1', 456, 'quick expire data') using TTL 5")

tries = 0
while tries < 100:
    tries = tries + 1
    rows = session.execute("select * from data_with_ttl where thekey = 'key1'")
    resultSize = len(rows.current_rows)
    print "try number:", tries, "result size:", resultSize
    for row in rows:
        print '\t',  row.num, row.data
    if resultSize == 1:
        break
    time.sleep(0.2)

session.execute('drop keyspace ttltest')
@tzach

This comment has been minimized.

Copy link
Contributor

@tzach tzach commented Nov 17, 2019

Look like #4263

@slivne slivne added the CQL label Nov 17, 2019
@slivne

This comment has been minimized.

Copy link
Contributor

@slivne slivne commented Nov 17, 2019

@haaawk please assign to have someone look at this

@slivne slivne added the bug label Nov 17, 2019
@slivne slivne added this to the 3.2 milestone Nov 17, 2019
@slivne slivne added the User Request label Nov 17, 2019
piodul added a commit to piodul/scylla that referenced this issue Nov 18, 2019
Corrects condition on which a row was considered expired by its TTL.
The previous, invalid logic caused a situation in which a row inserted
with TTL changed all its non-key cells to null one second before the row
disappeared.

Fixes: scylladb#4263, scylladb#5290.

Tests: unit(dev)
piodul added a commit to piodul/scylla that referenced this issue Nov 19, 2019
This change corrects condition on which a row was considered expired by
its TTL.

The logic that decides when a row becomes expired was inconsistent with
the logic that decides if a single cell is expired. A single cell
becomes expired when `expiry_timestamp <= now`, while a row became
expired when `expiry_timestamp < now` (notice the strict inequality).
For rows inserted with TTL, this caused non-key cells to expire (change
their values to null) one second before the row disappeared. Now, row
expiry logic uses non-strict inequality.

Fixes: scylladb#4263, scylladb#5290.

Tests:
- unit(dev)
- python test described in issue scylladb#5290
avikivity added a commit that referenced this issue Nov 20, 2019
Merged patch set by Piotr Dulikowski:

This change corrects condition on which a row was considered expired by its
TTL.

The logic that decides when a row becomes expired was inconsistent with the
logic that decides if a single cell is expired. A single cell becomes expired
when expiry_timestamp <= now, while a row became expired when
expiry_timestamp < now (notice the strict inequality). For rows inserted
with TTL, this caused non-key cells to expire (change their values to null)
one second before the row disappeared. Now, row expiry logic uses non-strict
inequality.

Fixes #4263,
Fixes #5290.

Tests:

    unit(dev)
    python test described in issue #5290

(cherry picked from commit 9b9609c)
avikivity added a commit that referenced this issue Nov 20, 2019
Merged patch set by Piotr Dulikowski:

This change corrects condition on which a row was considered expired by its
TTL.

The logic that decides when a row becomes expired was inconsistent with the
logic that decides if a single cell is expired. A single cell becomes expired
when expiry_timestamp <= now, while a row became expired when
expiry_timestamp < now (notice the strict inequality). For rows inserted
with TTL, this caused non-key cells to expire (change their values to null)
one second before the row disappeared. Now, row expiry logic uses non-strict
inequality.

Fixes #4263,
Fixes #5290.

Tests:

    unit(dev)
    python test described in issue #5290

(cherry picked from commit 9b9609c)
avikivity added a commit that referenced this issue Nov 20, 2019
Merged patch set by Piotr Dulikowski:

This change corrects condition on which a row was considered expired by its
TTL.

The logic that decides when a row becomes expired was inconsistent with the
logic that decides if a single cell is expired. A single cell becomes expired
when expiry_timestamp <= now, while a row became expired when
expiry_timestamp < now (notice the strict inequality). For rows inserted
with TTL, this caused non-key cells to expire (change their values to null)
one second before the row disappeared. Now, row expiry logic uses non-strict
inequality.

Fixes #4263,
Fixes #5290.

Tests:

    unit(dev)
    python test described in issue #5290

(cherry picked from commit 9b9609c)
(cherry picked from commit 95acf71)
@avikivity

This comment has been minimized.

Copy link
Contributor

@avikivity avikivity commented Nov 20, 2019

Backported to 3.0, 3.1, 3.2.

@avikivity

This comment has been minimized.

Copy link
Contributor

@avikivity avikivity commented Dec 24, 2019

Already backported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.