Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternator: allow adding a GSI to a pre-existing table #11567

Open
nyh opened this issue Sep 18, 2022 · 1 comment
Open

Alternator: allow adding a GSI to a pre-existing table #11567

nyh opened this issue Sep 18, 2022 · 1 comment
Assignees
Labels
Milestone

Comments

@nyh
Copy link
Contributor

nyh commented Sep 18, 2022

This is a cleanup of issue #5022 which asked to implement the UpdateTable operation and its ability to add a GSI to an already-existing table

Whereas DynamoDB requires that any LSIs must be created up-front when the table is first created, DynamoDB does allow a user to add a GSI to a preexisting table - and also to remove a GSI from a table - using the UpdateTable operation. Alternator doesn't support these UpdateTable operations, and allowing them is the purpose of this issue.

We have xfailing tests for this issue in test_gsi.py: test_gsi_backfill and test_gsi_delete test adding a new GSI, and deleting a GSI, respectively.

The reason why we don't yet support this feature wasn't a small oversight, but rather a deep problem related to the interaction of Alternator's schema-less item representation, and our materialized-views implementation:

We implement a GSI as a materialized view. A materialized view's key must be a real Scylla column. But in Alternator, most attributes except the key attributes aren't a real Scylla column but rather elements of a single map :attrs, so such an attribute cannot be used as the GSI key. Today, when we create the GSI up-front, together with the table, we make the GSI key a real column. But when adding a GSI to an existing table, the GSI key would most likely be an element of the map and not a real column.

I think the best way to add this feature is as follows:

We can add to our materialized views implementation the ability to use as a key a specific member of a map, e.g., :attrs[hello]. This is related to the Cassandra 4.0 map value select feature #7751 (although Cassandra doesn't support this for materialized views) and to #5440 (although that was on CQL secondary-index, not materialized views). But in this issue there is one extra snag: In Alternator, the map's value isn't typed (e.g., it isn't an integer), it is a serialized JSON. So we'll need to add an ability for the materialized views code to run an Alternator-specific de-serialization function. This function will need deserialize the JSON and "skip" items whose value has the "wrong" type (the right type for this GSI's key is declared when adding the GSI).

We will need to use the new mechanism not just when adding a GSI to an existing table, but also when creating a new table with GSI. I.e., we cannot continue to create the GSI's key as a real Scylla column as we do today, not even in a new table. The reason is that if a GSI's key is a real column, it has a fixed type - but if we want to support later deleting the GSI, the user must be able after this deletion to use in this column values of any type, not just the type enforced while the GSI existed. Note that if we want, we can continue to use real Scylla columns for LSI keys, because unlike GSIs, LSIs cannot be added or deleted after creating the table.

Here is an alternative implementation idea, which I believe is worse: Adding a GSI to an existing table is already expected to be a fairly long process (it is called "backfilling" in DynamoDB, or "view building" in Scylla). So we can also spend time on moving data between the map and new a real column which will be the GSI key. The problem is that this process is slow, ugly, and race-prone (this copying process can race with expressions reading or modifying this column). We also need to reverse this process when a GSI is removed - because after a GSI is removed DynamoDB allows a user to set this column to any value of any type - something which isn't allowed on a real schema column that must have a pre-determined type.

When we implement adding a GSI to an existing table, we should remember that while the backfilling ("view building") is in progress, DescribeTable should return for this index IndexStatus=CREATING and Backfilling=True. We don't have a test for that yet. It's not obvious how to test for this, because the backfilling stage can be so short we won't notice it is happening. Perhaps we can use injection to ensure it is paused and only resumed when we want.

@nyh
Copy link
Contributor Author

nyh commented Sep 28, 2022

@fee-mendes noticed that if a user tries to add a GSI, using a GlobalSecondaryIndexUpdates parameter and also (as needed) a AttributeDefinitions, Alternator reports the unsupported "AttributeDefinitions" which is confusing and not explaining what is really not supported. We should improve the error message.

nyh added a commit to nyh/scylla that referenced this issue Sep 28, 2022
Due to issue scylladb#11567, Alternator do not yet support adding a GSI to an
existing table via UpdateTable with the GlobalSecondaryIndexUpdates
parameter.

However, currently, we print a misleading error message in this case,
complaining about the AttributeDefinitions parameter. This parameter
is also required with GlobalSecondaryIndexUpdates, but it's not the
main problem, and the user is likely to be confused why the error message
points to that specific paramter and what it means that this parameter
is claimed to be "not supported" (while it is supported, in CreateTable).
With this patch, we report that GlobalSecondaryIndexUpdates is not
supported.

This patch does not fix the unsupported feature - it just improves
the error message saying that it's not supported.

Refs scylladb#11567

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
denesb pushed a commit that referenced this issue Sep 29, 2022
Due to issue #11567, Alternator do not yet support adding a GSI to an
existing table via UpdateTable with the GlobalSecondaryIndexUpdates
parameter.

However, currently, we print a misleading error message in this case,
complaining about the AttributeDefinitions parameter. This parameter
is also required with GlobalSecondaryIndexUpdates, but it's not the
main problem, and the user is likely to be confused why the error message
points to that specific paramter and what it means that this parameter
is claimed to be "not supported" (while it is supported, in CreateTable).
With this patch, we report that GlobalSecondaryIndexUpdates is not
supported.

This patch does not fix the unsupported feature - it just improves
the error message saying that it's not supported.

Refs #11567

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11650
@DoronArazii DoronArazii added this to the 5.x milestone Jan 24, 2023
@nyh nyh self-assigned this May 4, 2023
nyh added a commit to nyh/scylla that referenced this issue May 6, 2023
We already have tests for the feature of adding or removing a GSI from
an existing table, which Alternator doesn't yet support (issue scylladb#11567).
In this patch we add another check, how after a GSI is added, you can
no longer add items with the wrong type for the indexed type, and after
removing a GSI, you can. The expanded tests pass on DynamoDB, and
obviously still xfail on Alternator because the feature is not yet
implemented.

Refs scylladb#11567.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue May 12, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue May 12, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue May 12, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue May 13, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue Jun 21, 2023
We already have tests for the feature of adding or removing a GSI from
an existing table, which Alternator doesn't yet support (issue scylladb#11567).
In this patch we add another check, how after a GSI is added, you can
no longer add items with the wrong type for the indexed type, and after
removing a GSI, you can. The expanded tests pass on DynamoDB, and
obviously still xfail on Alternator because the feature is not yet
implemented.

Refs scylladb#11567.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue Jun 21, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue Jul 6, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
xemul pushed a commit that referenced this issue Jul 13, 2023
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs #11567).

Fixes #13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #14556
MeepoYang pushed a commit to storage-scsg/scylladb that referenced this issue Apr 28, 2024
Add missing validation of the AttributeDefinitions parameter of the
CreateTable operation in Alternator. This validation isn't needed
for correctness or safety - the invalid entries would have been
ignored anyway. But this patch is useful for user-experience - the
user should be notified when the request is malformed instead of
ignoring the error.

The fix itself is simple (a new validate_attribute_definitions()
function, calling it in the right place), but much of the contents
of this patch is a fairly large set of tests covering all the
interesting cases of how AttributeDefinitions can be broken.
Particularly interesting is the case where the same AttributeName
appears more than once, e.g., attempting to give two different types
to the same key attribute - which is not allowed.

One of the new tests remains xfail even after this patch - it checks
the case that a user attempts to add a GSI to an existing table where
another GSI defined the key's type differently. This test can't
succeed until we allow adding GSIs to existing tables (Refs scylladb#11567).

Fixes scylladb#13870.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb#14556

Conflicts:
	alternator/executor.cc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants