Secondary Indexes not working for clustering keys #4144

Closed
jwnx opened this issue Jan 24, 2019 · 3 comments

@jwnx (Contributor) commented Jan 24, 2019

Installation details
Scylla version: 3.0.1
Cluster size: 1
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0e408f1b170700dd0

Steps to Reproduce:

  1. Use this yaml file:
keyspace: ks
keyspace_definition: |
  CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

table: example
table_definition: |
  CREATE TABLE example (
    p int,
    c1 int,
    c2 int,
    v1 int,
    v2 int,
    PRIMARY KEY (p, c1, c2))
    WITH compression = {}

extra_definitions:
  - CREATE INDEX ON example (c2);

columnspec:
  - name: p
    population: uniform(1..20M)
  - name: c1
    cluster: fixed(1000)
    population: gaussian(0..1000, 500, 250)
  - name: c2
    cluster: fixed(1000)
    population: gaussian(0..1000, 500, 250)
  - name: v1
    size: fixed(100)
    population: gaussian(0..10000, 5000, 5000)
  - name: v2
    size: fixed(100)
    population: gaussian(0..10000, 5000, 5000)

insert:
  partitions: fixed(1)
  batchtype: UNLOGGED
  select: fixed(2000)/2000

queries:
   simple1:
      cql: select * from ks.example where c2 = ? and v1 = ? and v2 = ? ALLOW FILTERING;
      fields: samerow
  2. Run this on your favorite node:
cassandra-stress user profile=cassandra-example.yaml n=10 ops\(insert=1\) no-warmup -node NODE_IP -rate 'threads=450' -mode 'native cql3 connectionsPerHost=30'

What happens

Data is there:

cqlsh> select * from ks.example LIMIT 10;

 p        | c1 | c2 | v1    | v2
----------+----+----+-------+------
 12987323 |  0 |  0 |  7655 |    0
 12987323 |  0 |  1 |  6148 | 4136
 12987323 |  0 | 12 | 10000 |    0
 12987323 |  0 | 20 |  2815 | 2609
 12987323 |  0 | 32 |  1724 | 3378
 12987323 |  0 | 34 |  4455 | 3766
 12987323 |  0 | 38 | 10000 | 5040
 12987323 |  0 | 42 |  6724 |    0
 12987323 |  0 | 53 |  5678 | 2841
 12987323 |  0 | 54 |  6622 |    0

View is there:

cqlsh> select * from system.built_views;

 keyspace_name | view_name
---------------+----------------------
            ks | example_c2_idx_index

And something went wrong :(

cqlsh> select * from ks.example where c2=12;

 p | c1 | c2 | v1 | v2
---+----+----+----+----

(0 rows)
cqlsh> select * from ks.example_c2_idx_index;

 c2 | idx_token | p | c1
----+-----------+---+----

(0 rows)

Populating the table with the same schema and creating the secondary index afterwards doesn't work either.
Inserting a row by hand in cqlsh works, though:

cqlsh> create keyspace ks with replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> create table ks.example ( p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY (p, c1, c2));
cqlsh> insert into ks.example (p, c1, c2, v1, v2) values (1, 2, 3, 4, 5);
cqlsh> select * from ks.example;

 p | c1 | c2 | v1 | v2
---+----+----+----+----
 1 |  2 |  3 |  4 |  5

(1 rows)
cqlsh> create index on ks.example (c2);
cqlsh> select * from ks.example where c2=3;

 p | c1 | c2 | v1 | v2
---+----+----+----+----
 1 |  2 |  3 |  4 |  5

(1 rows)

@duarten (Member) commented Jan 25, 2019

The issue is that we're not adding virtual columns to the materialized view created from the index definition. cassandra-stress writes with UPDATE statements, which do not set a row marker, so the rows inserted into the view are not live.

Given:

create table t (p int, c int, v int, primary key (p, c));
create materialized view t_mv as select c, p from t where p is not null and c is not null primary key (c, p);
create index on t(c);

Using INSERT, things work as expected:

cqlsh:ks> insert into t (p, c, v) values (1, 1, 1);
cqlsh:ks> select * from t_mv ;

 c | p
---+---
 1 | 1

(1 rows)
cqlsh:ks> select * from t_c_idx_index  ;

 c | idx_token          | p
---+--------------------+---
 1 | 0xc78499982ae4e0cf | 1

(1 rows)

However, using UPDATE:

cqlsh:ks> update t set v = 2 where p = 2 and c = 2;
cqlsh:ks> select * from t_mv ;

 c | p
---+---
 1 | 1
 2 | 2

(2 rows)
cqlsh:ks> select * from t_c_idx_index  ;

 c | idx_token          | p
---+--------------------+---
 1 | 0xc78499982ae4e0cf | 1

(1 rows)
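
The difference between the two views can be sketched with a toy model of row liveness. This is only an illustration, not Scylla's code: the names `BaseRow`, `insert`, `update` and `view_row_is_live` are made up, and the rule it assumes is that a view row is live when the base row has a row marker or when at least one column the view carries (selected or virtual) has a live cell. It also assumes that `t_mv` gets a virtual column for the unselected column `v`, while the index's view (before the fix) carries none.

```python
from dataclasses import dataclass, field


@dataclass
class BaseRow:
    """Hypothetical model of a base-table row (not a Scylla type)."""
    key: tuple                       # primary key values (p, c)
    has_row_marker: bool = False     # set by INSERT, not by UPDATE
    cells: dict = field(default_factory=dict)  # live regular columns


def insert(key, **cells):
    # INSERT writes the cells and a row marker.
    return BaseRow(key, has_row_marker=True, cells=dict(cells))


def update(key, **cells):
    # UPDATE writes only the cells; no row marker.
    return BaseRow(key, has_row_marker=False, cells=dict(cells))


def view_row_is_live(base, selected=(), virtual=()):
    """Assumed rule: live if the base row has a row marker, or if any
    column carried by the view (selected or virtual) has a live cell."""
    if base.has_row_marker:
        return True
    return any(col in base.cells for col in (*selected, *virtual))


inserted = insert((1, 1), v=1)   # insert into t (p, c, v) values (1, 1, 1)
updated = update((2, 2), v=2)    # update t set v = 2 where p = 2 and c = 2

# Index view without virtual columns: only the INSERTed row is live.
index_view = [view_row_is_live(r) for r in (inserted, updated)]

# View with a virtual column for v (t_mv, or the index view after a fix).
mv_view = [view_row_is_live(r, virtual=("v",)) for r in (inserted, updated)]

print(index_view)  # [True, False]
print(mv_view)     # [True, True]
```

Under these assumptions the model matches the output above: the UPDATEd row (2, 2) shows up in `t_mv` but not in `t_c_idx_index`.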

/cc @slivne @eliransin @nyh @psarna

@duarten (Member) commented Jan 25, 2019

Btw, the idx_index suffix is weird and redundant: #4146.

duarten added this to the 3.1 milestone Jan 25, 2019

@duarten (Member) commented Jan 25, 2019

On the bright side, #3833 is trivially solved.

duarten added a commit to duarten/scylla that referenced this issue Jan 28, 2019

tests/secondary_index_test: Add reproducer for scylladb#4144
Signed-off-by: Duarte Nunes <duarte@scylladb.com>

avikivity added a commit that referenced this issue Feb 5, 2019

Merge "SI: Add virtual columns to underlying MV" from Duarte
"
Virtual columns are MV-specific columns that contribute to the
liveness of view rows. However, we were not adding those columns when
creating an index's underlying MV, causing indexes to miss base rows.

Fixes #4144
Branches: master, branch-3.0
"

Reviewed-by: Nadav Har'El <nyh@scylladb.com>

* 'sec-index/virtual-columns/v1' of https://github.com/duarten/scylla:
  tests/secondary_index_test: Add reproducer for #4144
  index/secondary_index_manager: Add virtual columns to MV

duarten added a commit that referenced this issue May 1, 2019

secondary index: expand test of secondary-index and UPDATE requests
The existing unit test test_secondary_index_contains_virtual_columns
reproduced a bug (issue #4144) with indexing of primary-key columns,
but we only actually tested clustering columns. In issue #4471 there
was a question whether we may still have a bug when indexing of
*partition-key* columns. This patch adds a test that verifies that
we don't, and this case works well too.

Refs #4144
Refs #4471

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190501113500.25900-1-nyh@scylladb.com>

duarten added a commit that referenced this issue May 1, 2019

Merge "SI: Add virtual columns to underlying MV" from Duarte

(cherry picked from commit ebf1793)

amoskong pushed a commit to amoskong/scylla that referenced this issue May 8, 2019

Merge "SI: Add virtual columns to underlying MV" from Duarte

(cherry picked from commit ebf1793)