Erroneous DynamoDB SDK Validation Exception on a Query with a start key and a key condition with 'greater than or equals to' #2456

Closed
3 tasks done
kalpitad opened this issue Dec 29, 2020 · 6 comments
Labels
guidance Question that needs advice or information.

Comments

@kalpitad


Describe the bug
I believe that I have hit an erroneous DynamoDB SDK Validation Exception on a Query with a start key and a key condition: Aws::DynamoDB::Errors::ValidationException: The provided starting key does not match the range key predicate. I think that the error is telling me that the start key does not meet the key condition, but it in fact does.

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version
aws-sdk-dynamodb

Version of Ruby, OS environment
Ruby 2.6; Amazon Linux EC2 Instance

To Reproduce (observed behavior)
The code queries a DynamoDB table looking for items that were created before a specific time. It pages through in reverse chronological order, getting 10 items at a time, and setting the exclusive_start_key appropriately to get the next set of items. The query stops when it hits the specified time @beginning.

def dynamodb_query_for_stuff(start_at)
  # Resume paging from the key handed back by the caller.
  query_start_key = {
    "hashkey" => { n: @id.to_s },
    "rangekey" => { n: start_at.to_s }
  }
  key_condition_expression = "hashkey = :h and rangekey >= :t"
  key_expression_values = {
    ":h" => { n: @id.to_s },
    ":t" => { n: @beginning.to_s }
  }
  Aws::DynamoDB::Client.new.query(
    table_name: "MyTable",
    limit: 10,
    key_condition_expression: key_condition_expression,
    expression_attribute_values: key_expression_values,
    scan_index_forward: false, # descending range-key order
    exclusive_start_key: query_start_key
  )
rescue => e
  Rails.logger.error { "Details: #{e.backtrace}" }
  nil
end

Expected behavior
The key condition specifies that the range key should be greater than or equal to a value @beginning, and yet I get this validation error when the range key specified in the exclusive_start_key is equal to @beginning. In general, this query works like a charm (because the range key in the start key is greater than @beginning while my code pages through the query results), but once in a while when those two values become equal, I see this error in my logs. Is the ">=" not being interpreted correctly by the SDK or is my expectation of the condition wrong?

Screenshots
n/a

Additional context
Relevant parts of the stack trace:
["/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/aws-sdk-core/plugins/jsonvalue_converter.rb:20:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/aws-sdk-core/plugins/idempotency_token.rb:17:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/aws-sdk-core/plugins/param_converter.rb:24:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/aws-sdk-core/plugins/response_paging.rb:10:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/seahorse/client/plugins/response_target.rb:23:in `call'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-core-3.97.1/lib/seahorse/client/request.rb:70:in `send_request'",
"/home/deploy/.bundler/myapp/ruby/2.6.0/gems/aws-sdk-dynamodb-1.48.0/lib/aws-sdk-dynamodb/client.rb:4068:in `query'", ...

@kalpitad kalpitad added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Dec 29, 2020
@mullermp mullermp added guidance Question that needs advice or information. and removed bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 5, 2021
@mullermp
Contributor

mullermp commented Jan 5, 2021

Sorry for the late response on this. Many of the SDK team were OOTO for vacation.

Is the ">=" not being interpreted correctly by the SDK or is my expectation of the condition wrong?

In general, these namespaced errors (Aws::DynamoDB::Errors::ValidationException) are returned by the service; the SDK is not applying any additional logic to your query. If I had to guess, your expectation of the condition is wrong.

I'm not an expert with DynamoDB, but the docs suggest that you should use the LastEvaluatedKey from the previous result as exclusive_start_key.

The code queries a DynamoDB table looking for items that were created before a specific time. It pages through in reverse chronological order, getting 10 items at a time, and setting the exclusive_start_key appropriately to get the next set of items. The query stops when it hits the specified time @beginning.

Looking at your use case, if I'm understanding correctly, you're making multiple calls after updating some instance variables (@id and @beginning). Why not query for all items and use the SDK pagination feature? You can do something like:

resp = dynamodb.scan(..., limit: 10)
resp.next_page? #true/false if more data
resp = resp.next_page
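
When next_page cannot be used, the docs' approach of feeding the previous LastEvaluatedKey back in as exclusive_start_key could be sketched roughly as follows. This is only an illustration: `fetch_page` and its parameters are hypothetical names, the table and key names are copied from the snippet in the issue, and `client` is assumed to respond to `query` the way Aws::DynamoDB::Client does (it is injected so the sketch runs without AWS credentials).

```ruby
# Fetch one page of results. `client` is assumed to behave like
# Aws::DynamoDB::Client; attribute values are passed through unchanged,
# in whatever form that client expects.
def fetch_page(client, id:, beginning:, start_key: nil, limit: 10)
  params = {
    table_name: "MyTable",
    limit: limit,
    key_condition_expression: "hashkey = :h and rangekey >= :t",
    expression_attribute_values: { ":h" => id, ":t" => beginning },
    scan_index_forward: false
  }
  # Only send exclusive_start_key when a previous page supplied one.
  params[:exclusive_start_key] = start_key if start_key

  resp = client.query(params)
  # last_evaluated_key is nil once there are no more pages.
  { items: resp.items, next_start_key: resp.last_evaluated_key }
end
```

The caller keeps passing `next_start_key` back as `start_key` until it comes back nil.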

@kalpitad
Author

kalpitad commented Jan 9, 2021

Thanks, Matt. I'm glad that the team was OOTO and I had no expectations of a reply during the holidays. :)

To answer your question re: the SDK's pagination feature, my code sends results (with the LastEvaluatedKey) to a client and the client returns the start key back to me if/when it wants another page. Hence, I'm not in a position to be able to use next_page.
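
As a rough illustration of that round trip (not the actual application code), the LastEvaluatedKey could be handed to the client as an opaque token and decoded when it comes back. The helper names are hypothetical, and the sketch assumes the key's values are JSON-safe strings; other numeric types would need explicit conversion.

```ruby
require "json"
require "base64"

# Encode a LastEvaluatedKey hash as an opaque, URL-safe page token.
# Assumes the key's values are JSON-safe (plain strings here).
def encode_page_token(last_evaluated_key)
  return nil if last_evaluated_key.nil?
  Base64.urlsafe_encode64(JSON.generate(last_evaluated_key))
end

# Decode a token from the client back into an exclusive_start_key hash.
def decode_page_token(token)
  return nil if token.nil?
  JSON.parse(Base64.urlsafe_decode64(token))
end
```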

Does the stack trace indicate to you that this error is coming from the DynamoDB service and not the SDK? If so, perhaps I should raise this issue with AWS Customer Support? Could it be that the service is not interpreting the ">=" correctly in the query's key condition when it checks if the exclusive_start_key itself meets the condition?

@mullermp
Contributor

Does the stack trace indicate to you that this error is coming from the DynamoDB service and not the SDK? If so, perhaps I should raise this issue with AWS Customer Support?

Yes - I believe this to be a service error. Although it's not defined in the model, we have a dynamic errors module that generates an error class on the fly based on the http response. If you are able to replicate locally, you can verify by checking the response after enabling http_wire_trace: true on your Client or using the executable aws-v3.rb -v. If this is the case, a customer support ticket is recommended.
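
A minimal sketch of turning on the wire trace, assuming the aws-sdk-dynamodb gem is installed. `http_wire_trace` and `logger` are client options of the AWS SDK for Ruby; the helper name `debug_client_options` is hypothetical.

```ruby
require "logger"

# Options to pass to Aws::DynamoDB::Client.new while debugging.
# With http_wire_trace enabled, the raw HTTP request and response are
# written to the given logger, so the service's error body is visible.
def debug_client_options(io = $stdout)
  { http_wire_trace: true, logger: Logger.new(io) }
end

# Usage (requires the aws-sdk-dynamodb gem and credentials):
#   client = Aws::DynamoDB::Client.new(**debug_client_options)
```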

@alextwoods
Contributor

alextwoods commented Jan 11, 2021

I believe the issue here may be related to your usage of the exclusive_start_key. How are you setting that value? I believe it should be set only from the last_evaluated_key (see the query docs). Based on the sample code above, it looks like the exclusive_start_key and the :h parameter of the query are both being set to @id, which looks incorrect. I believe if that is the case, then you don't need to be setting the exclusive_start_key.

Are you doing anything to modify the table during your queries? I believe the error means that DynamoDB cannot get the item that the start_key references in the query.

@kalpitad
Author

kalpitad commented Feb 2, 2021

Thanks to both of you for helping me think through this. As far as I can tell, I am doing everything correctly regarding the exclusive start key. Note: I can't take it immediately from the response because it round trips through my client (i.e. they tell me if they want another page).

In any case, I should be able to detect this specific situation and skip the query altogether. My key condition asks for results where the sort key is >= a specific value, and my paging is performed in descending order (scan_index_forward: false). So if the sort value in my exclusive start key is equal to that specific value, I've already hit the last page of my query results and there is no point in querying again.
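
A minimal sketch of that check, assuming integer sort keys (for example epoch timestamps); `last_page_reached?` is a hypothetical helper name, not something from my actual code:

```ruby
# Returns true when descending paging has already reached the lower
# bound of the key condition, so another query would be pointless.
# Assumes both values are integers or integer-like strings.
def last_page_reached?(start_key_sort_value, beginning)
  Integer(start_key_sort_value) <= Integer(beginning)
end
```

The caller would return an empty page instead of calling query whenever this is true.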

I think that we can close this issue for now and I can re-open in the future if I have something new to add.

