
Add ability to limit the count of matched items. #20

Open
Sazpaimon opened this issue Mar 25, 2014 · 3 comments

Comments

@Sazpaimon
Contributor

In a scan, limit restricts the number of items scanned. It would be nice if there were a way to limit the number of matched items a scan returns. Vogels should also probably recurse through each LastEvaluatedKey until the limit is met.

Query also appears to have a similar issue, but only when the query response is paginated (such as when it hits the 1MB per-response limit). If I do a limit(500) and only get, for example, 100 items back along with a LastEvaluatedKey, Vogels should recurse and keep fetching until the limit is met or no more results are available.

Right now the only workaround for these is loadAll, which does not respect limit and always fetches every available item (which can potentially waste throughput).
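For illustration, the behavior being requested might look roughly like the following. This is a minimal standalone sketch, not vogels code: `fetchPage` is a hypothetical stand-in for a single DynamoDB Scan/Query request, with pagination simulated by an offset instead of a real LastEvaluatedKey.

```javascript
// Hypothetical sketch: keep issuing requests, following LastEvaluatedKey,
// until `limit` matched items are collected or no more pages exist.
// `fetchPage` simulates one paginated DynamoDB response (page size 3).
function fetchPage(table, startKey) {
  var start = startKey || 0;
  var pageSize = 3;
  var page = table.slice(start, start + pageSize);
  var next = start + pageSize < table.length ? start + pageSize : null;
  return { Items: page, LastEvaluatedKey: next };
}

function collectUpToLimit(table, matches, limit) {
  var collected = [];
  var startKey = null;
  do {
    var res = fetchPage(table, startKey);
    res.Items.forEach(function (item) {
      if (collected.length < limit && matches(item)) collected.push(item);
    });
    startKey = res.LastEvaluatedKey;
  } while (startKey !== null && collected.length < limit);
  return collected;
}

// Example: find up to 2 items named 'bob' across paginated responses.
var table = [{name: 'alice'}, {name: 'bob'}, {name: 'carol'},
             {name: 'bob'}, {name: 'bob'}];
var bobs = collectUpToLimit(table, function (i) { return i.name === 'bob'; }, 2);
// bobs holds 2 matches; the loop stopped as soon as the limit was met.
```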

@ianmurrays

Any update on this? I need this too 😄

@zhiyelee

+1

@ryanfitz
Owner

Due to the way DynamoDB works, this would be difficult to implement correctly in a generic way. I'll give an example:

  // find all users named bob
  Account.scan().where('name').equals('bob').limit(500);

Let's say there are 500,000 accounts in total and 800 are named 'bob'.

The first issue: suppose all the accounts named 'bob' sit toward the end of the scan iterator. You are going to scan over all 500K items before you reach 500 Bobs, potentially using up all your provisioned throughput. If you enable loadAll() on your scan, vogels will scan 500 items at a time but will eventually iterate over the entire table; combining limit with loadAll is probably a bit confusing for new users. Limit in DynamoDB is really a way to cap the read throughput used per request, not a way to limit the number of items returned.
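To make that worst case concrete, here is a small illustrative simulation (not vogels code; table size and names are made up). When every match sits at the end of the scan order, every single item has to be read, and billed, before the limit is reached.

```javascript
// Sketch of the worst case: matching items are last in scan order, so the
// whole table is scanned (consuming read throughput for every item) even
// though only a handful of items match.
function scanForMatches(table, matches, wanted) {
  var scanned = 0;
  var found = [];
  for (var i = 0; i < table.length && found.length < wanted; i++) {
    scanned++;
    if (matches(table[i])) found.push(table[i]);
  }
  return { scanned: scanned, found: found };
}

// 1,000 accounts; the 5 'bob's are the last 5 items in scan order.
var accounts = [];
for (var n = 0; n < 1000; n++) {
  accounts.push({ name: n < 995 ? 'user' + n : 'bob' });
}
var result = scanForMatches(accounts, function (a) { return a.name === 'bob'; }, 5);
// result.scanned is 1000: every item was read just to find 5 matches.
```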

The second issue: suppose the first scan request returns 100 Bobs, and a subsequent request is made to try to fill out the full 500; that next request returns 450 Bobs (because many Bobs happen to sit near this part of the iterator). Should vogels return just 500 users, or the 550 it found in total? Data isn't sorted on scans, so it would have to arbitrarily return 500 users and drop the other 50. The LastEvaluatedKey returned from DynamoDB would then be invalid if you attempted to load the next 500 Bobs, because 50 of them were never returned. The same issue applies to queries on secondary indexes.
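The overshoot can be sketched as follows. This is an illustrative simulation with invented page contents mirroring the numbers above (100 matches on page one, 450 on page two): trimming to the requested 500 silently drops 50 items, and the final LastEvaluatedKey points past them, so resuming from it would skip them forever.

```javascript
// Sketch of the overshoot problem: merging paginated results past the
// limit forces either returning extra items or dropping some, and the
// pagination key no longer lines up with what was actually returned.
function mergePages(pages, limit) {
  var all = [];
  pages.forEach(function (p) { all = all.concat(p.Items); });
  var dropped = Math.max(0, all.length - limit);
  return { returned: all.slice(0, limit), dropped: dropped };
}

var pages = [
  { Items: new Array(100).fill({ name: 'bob' }), LastEvaluatedKey: 'key-1' },
  { Items: new Array(450).fill({ name: 'bob' }), LastEvaluatedKey: 'key-2' }
];
var merged = mergePages(pages, 500);
// merged.returned.length is 500 and merged.dropped is 50; resuming the
// scan from 'key-2' would never revisit those 50 dropped items.
```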

I'm open to suggestions on how to make this as user-friendly as possible. We need to work within the limitations of DynamoDB, and I'd want to make it as explicit as possible to developers that they might be consuming a lot of throughput when executing certain functions.
