Query/Scan Count/ScannedCount support and TableGenerator improvements #1181

Merged
merged 6 commits into from Jan 8, 2013

3 participants

@disruptek

Count/ScannedCount support for dynamodb.layer2.(scan|query)
add count argument to dynamodb.layer1.query()
fix consumed_units to match Amazon return value (float)
add value to get/set remaining generator iterations
add ability to jump ahead in result set pagination
add ability to query for meta-data values before iterating

This PR will need some testing as it may change the behavior of a few of subtle corner-cases that it tries to solve properly. A couple strange cases where it might break existing code:

  • access of the undocumented max_results value in the table generator
  • expectation of consumed_units to be an int, which it was (only) prior to iteration

An additional subtlety is that due to the way DynamoDB implements Count and ScannedCount, one needs to perform the usual result pagination iteration until the table has been fully scanned/queried. We leave this to the user for all the usual reasons, but it will likely trip up anyone who skips the documentation.

Andy Davidoff added some commits Dec 12, 2012
Andy Davidoff Count/ScannedCount support for dynamodb.layer2.(scan|query)
add count argument to dynamodb.layer1.query()
fix consumed_units to match Amazon return value (float)
add value to get/set remaining generator iterations
add ability to jump ahead in result set pagination
add ability to query for meta-data values before iterating
69ebc91
Andy Davidoff oops; fixing broken max_results=0 behavior 6ccbf61
@garnaat
the boto project member

The tests are currently failing on this branch.

======================================================================
FAIL: test_layer2_basic (tests.integration.dynamodb.test_layer2.DynamoDBLayer2Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/garnaat/projects/boto-dev/boto/tests/integration/dynamodb/test_layer2.py", line 251, in test_layer2_basic
    assert n == 2
AssertionError: 

Could just be some eventual consistency issue. Need to dig in a bit more.

@disruptek

The problem with leaving the consumed_units type as an integer is that Amazon actually does yield floating point values with sub-integer precision. I can patch the tests if you want...

Andy Davidoff added some commits Dec 16, 2012
Andy Davidoff make dynamodb.table.(scan|query) track layer2.(scan|query) perfectly
fix tests to correctly use keyword arguments in scan/query calls
PEP8
2a5258a
Andy Davidoff This change constrains requests made by the scan
and query TableGenerators according to any limit
of the generator.  This can result in dramatic
speed and read_unit efficiency gains.
65e390b
@disruptek

@garnaat, would you like me to modify this any further?

@jamesls jamesls commented on an outdated diff Dec 21, 2012
boto/dynamodb/table.py
:type item_class: Class
:param item_class: Allows you to override the class used
to generate the items. This should be a subclass of
:class:`boto.dynamodb.item.Item`
"""
- return self.layer2.query(self, hash_key, range_key_condition,
- attributes_to_get, request_limit,
- max_results, consistent_read,
- scan_index_forward, exclusive_start_key,
- item_class=item_class)
-
- def scan(self, scan_filter=None,
- attributes_to_get=None, request_limit=None, max_results=None,
- count=False, exclusive_start_key=None, item_class=Item):
+ return self.layer2.query(self, hash_key, **kw)
+
+ def scan(self, **kw):
@jamesls
the boto project member
jamesls added a note Dec 21, 2012

Won't this (and query) break anyone that's not using keyword args?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@jamesls
the boto project member

It would also be great, if possible, to add some basic tests showing off the new functionality.

@disruptek

I would argue that code not using keyword arguments for arguments defined using keywords is already broken. Witness the fact that adding another keyword argument broke existing code in the tests due to the omission of a keyword. I'm curious as to what the best practice is here and if I'm mistaken, I'm happy to learn a lesson on style and fix my commit.

Perhaps I simply don't understand the use of keyword arguments in this case. If we want a variable number of arguments that are used without keywords, it seems that *args is most appropriate for that purpose.

@garnaat
the boto project member

I don't think there is any debate about the style. This is an improvement. It's just us trying to keep from breaking existing code. And there is nothing in Python that prevents you from calling a method defined with keyword arguments using positional style calling convention. And it works, so it's not broken. But it is brittle.

@disruptek

There's nothing here for me to argue against. If it helps, I think removing a brittle interface in public code far trumps breaking the code of those that program using brittle style. After all, we're preventing future code from being written in this brittle fashion.

To be honest, I feel it was a mistake to have scan/query methods on both the table and layer2 objects, but as that's obviously not going to change, I felt this was a more future-proof patch.

@jamesls
the boto project member

Just want to make sure we're all on the same page (it seems that we are), my main concern is with breaking backwards compatibility. So, for example, one of the lines in the tests previously read:

results = table.query('Amazon DynamoDB', BEGINS_WITH('DynamoDB'))

This now had to be updated as:

results = table.query('Amazon DynamoDB', range_key_condition=BEGINS_WITH('DynamoDB'))

Same for scan, so any user with this code would now be broken:

results = table.scan({'Tags': CONTAINS('table')})

I'm not sure I'm following the argument of what makes positional arguments inheritently brittle, but can we not make this backwards compatible by using forwarding *args, **kwargs to Layer2.(query|scan)?

@disruptek

I don't like it, but you now have the option of supporting the old behavior. Forgive me for not including a test for this; I just couldn't bring myself to write it. ;-)

@garnaat garnaat merged commit 7504ee5 into boto:develop Jan 8, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment