Alternator: evaluate the need to validate table name on each request #12538
The link above doesn't work for me, so here is one that does seem to work (I don't know if it's the same thing you wanted to link to): #12445 (comment) |
Yes, looks like links to comments in diff view don't persist well |
Unfortunately, it turns out that DynamoDB does check the validity of the given table name in all requests, not just CreateTable. For example, consider the following test:

    def test_item_operations_improper_named_table(dynamodb):
        with pytest.raises(ClientError, match='ValidationException'):
            dynamodb.meta.client.put_item(TableName='non!existent!table',
                Item={'a':{'S':'b'}})
        with pytest.raises(ClientError, match='ValidationException'):
            dynamodb.meta.client.put_item(TableName='xx',
                Item={'a':{'S':'b'}})

This test attempts a [...]

It is arguable whether Alternator needs to be 100% compatible with the error handling of DynamoDB in esoteric error cases (this has no effect on successful requests). So if this extra check causes a significant slowdown of requests, by all means we should consider dropping it. But if the slowdown caused by the extra check is tiny, maybe it's better not to break even this minor error-case compatibility with DynamoDB, and leave the extra checking as is?

@nuivall do you have an estimate of how much of the request time in the worst case (e.g., a small read request from cache) can be attributed to this extra check? |
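For readers wondering why both names in the test above count as invalid: DynamoDB's documented naming rules require a table name to be 3 to 255 characters long and to use only `a-z`, `A-Z`, `0-9`, `_`, `-` and `.`. The following is a minimal sketch of such a check. The function name `is_valid_table_name` is hypothetical, and Alternator's real `validate_table_name` is C++ and may enforce different limits.

```python
import re

# Assumption: DynamoDB's documented rule (3-255 chars from a-z, A-Z, 0-9, '_', '-', '.').
# Alternator's own limits may differ; this is only an illustration.
_TABLE_NAME_RE = re.compile(r'[a-zA-Z0-9_.-]{3,255}')

def is_valid_table_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule above (hypothetical helper)."""
    return _TABLE_NAME_RE.fullmatch(name) is not None
```

Under this rule `'xx'` fails because it is too short and `'non!existent!table'` fails because of the `!` characters, which is why a ValidationException rather than a "table not found" error is expected in both cases.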
Issue scylladb#12538 suggested that maybe Alternator shouldn't bother reporting an invalid table name in item operations like PutItem, and that it's enough to report that the table doesn't exist. But the test added in this patch shows that DynamoDB, like Alternator, reports the invalid table name in this case - not just that the table doesn't exist. That should make us think twice before acting on issue scylladb#12538. If we do what this issue recommended, this test will need to be fixed (e.g., to accept as correct both types of errors). Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In #12608 (comment) @nuivall had a good idea: we can do the table-name validation only after (and if) we realize it doesn't exist, so in the usual case where the table exists, it isn't done. If we do that, we remain error-compatible with DynamoDB, but don't lose any performance (not even a tiny bit) in the usual successful case. Basically, we can remove the |
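A minimal sketch of the idea described in this comment, assuming a plain dictionary stands in for the schema lookup. All names here (`find_table`, the two exception classes, the `tables` dict) are illustrative and are not Alternator's actual C++ code. The naming rule is DynamoDB's documented one and is also an assumption as far as Alternator is concerned.

```python
import re

class ValidationException(Exception):
    """Stand-in for the error returned for a malformed table name."""

class ResourceNotFoundException(Exception):
    """Stand-in for the error returned for a well-formed but missing table."""

# Assumption: DynamoDB's documented naming rule; Alternator's limits may differ.
_NAME_RE = re.compile(r'[a-zA-Z0-9_.-]{3,255}')

def validate_table_name(name):
    if not _NAME_RE.fullmatch(name):
        raise ValidationException(f"invalid table name: {name!r}")

def find_table(tables, name):
    """Look the table up first; validate the name only if the lookup fails."""
    table = tables.get(name)
    if table is not None:
        return table  # common case: no validation cost at all
    # Slow path: decide which error to report, as DynamoDB does.
    validate_table_name(name)
    raise ResourceNotFoundException(f"table not found: {name!r}")
```

This relies on the fact that an existing table must have passed validation when it was created, so a successful lookup implies a valid name. Only the error path pays for the check.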
Currently This is nothing compared to |
Let's tackle the big ones first, then move back to this. I'm hopeful caching the query parse result will yield large improvements. |
btw, I see alternator grew its own expression language (primitive_condition/condition_expression). We should consider migrating to cql3 expressions (perhaps moved out to a top-level module), so that any improvements there (say, JIT compilation) are carried over. |
Yes, this is always a good approach in optimization, but sometimes if you already know how to easily shave off something small, there is no reason not to do it too. Of course it shouldn't have high priority.
Indeed. This is #5023. |
Right. We had this before CQL had its ;-)
This is easier said than done. You already saw how many changes you needed to do just to change the CQL expressions to support two variants of "= NULL". The DynamoDB expressions are different in almost every respect from CQL expressions. The terminals (constants, columns and nested pieces of these columns) and their types are different, the error handling is different, the "bound variable" support is different. See Maybe it's something we can and should consider in the future, but I don't think it's urgent. One downside of Alternator's different expressions is that today we can only use these expressions (e.g., filtering) on the coordinator - we can't push these expressions to replicas, and so on. Today it's not a big problem because CQL doesn't do this either, but if one day it would - it would be good to have just one expression class, not two. |
Issue #12538 suggested that maybe Alternator shouldn't bother reporting an invalid table name in item operations like PutItem, and that it's enough to report that the table doesn't exist. But the test added in this patch shows that DynamoDB, like Alternator, reports the invalid table name in this case - not just that the table doesn't exist. That should make us think twice before acting on issue #12538. If we do what this issue recommended, this test will need to be fixed (e.g., to accept as correct both types of errors). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12608
Hi @nuivall @nyh @avikivity, I am a new contributor to scylladb and would love to work on this. The summary that I gathered from all the above comments is:
So, the changes need to be:
That's the high-level implementation as I understood it. Please let me know if I am missing something. Meanwhile, I will dig into the code for more details! |
@harsh020 yes, your summary seems correct. Exception is in most (all?) places called |
@nuivall, Thank you for the info! Please let me have a dig at the code and get back to you! |
Prior to this, `table_name` was validated on every request in `find_table_name`, leading to unnecessary overhead (small, but unnecessary). Now `table_name` is validated only on creation requests, and on other requests only if the table does not exist (to stay compatible with DynamoDB's exceptions). Fixes: scylladb#12538
Currently we call `static void validate_table_name(const std::string& name)` on each request. In theory we should only check whether the table name is correct when creating the table; after that we don't need to validate it in `executor::find_table` calls.

This issue was brought up in https://github.com/scylladb/scylladb/pull/12445/files/0d89b21a1582e2da58083a8eccb2c25a37637ed2#r1062322506