Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Queries expect both primary key fields to be returned otherwise Item creation raises an exception #656

Closed
croach opened this Issue · 13 comments

4 participants

@croach

Performing a query in boto returns a list of Item objects. The Item class expects that both fields making up the primary key will be returned in the results from Amazon, but if one of the fields is not listed in the attributes_to_get parameter the Item class raises an exception.

Given constraints on reading capacity, it is sometimes important to limit the attributes to get to the bare minimum making it unnecessary in many circumstances to contain either of the primary key fields. Either the Item class should be updated to allow for these fields to be missing or the results of the query should be a dict. Given the latter solution, the user could then wrap each dict in the results in a Item class if they need to perform further DynamoDB operations on them, such as save, put, or delete. For example:

# Turn query results into Item objects
results = [Item(**item) for item in results]
@tmorgan

I encountered this earlier as well. Your proposed solution of returning a dict (I assume you mean to do this just for this case) would be a bit confusing if you say, mistakenly omit the range key in attributes_to_get, and then call save.

I think it'd probably be more elegant to allow Item construction to not require hash and range keys, but only for this case. Say by modifying layer2.query to set them in the attrs dict to a special None value (e.g. class NotRequested: pass), so the Item constructor doesn't have to change, and have save/put/delete throw an exception if those values are still present when they are called. Any thoughts? For my case I ended up just adding them to attributes_to_get for now, but will happily write that if anyone with a say green lights it.

@garnaat
Owner

To some extent, I am questioning the wisdom of my original idea of treating the hash_key and range_key separately. I thought it was important for people to realize that these were special fields and to treat them accordingly.

However, the alternative if just having get_item return a regular dict simplifies a lot of this. Thoughts?

@disruptek

The thin layer over the API should be as transparent as possible, and it currently isn't. I've commented on this issue elsewhere, but I feel that if the logic that is added doesn't cover all use cases or interacts poorly with the API, then it's ultimately hurting rather than helping. I'd like to be able to override the item class -- duck typing should be sufficient from boto's perspective -- but I don't think the logic has a place in Layer1.

@tmorgan

if you are saying it'd be ok to make a backwards incompatible change @garnaat, I think that undoing treating hash_key and range_key separately would be good, just by removing them from the constructor. It's what caused the confusion with this issue as well, #627, being allowed to give conflicting values in the parameter and attrs dictionary still isn't right either in my opinion. I wouldn't go as far as returning plain dicts though, having the reference to the table set up so you can put/update etc. is very useful (and consistent with e.g. SQS/SDB interfaces), It just shouldn't get in the way.

@croach

I can agree with tmorgan that returning an Item object does definitely have value to it. If anything we should probably simply remove the checks for the primary key fields in the init method and instead raise an exception in the methods (i.e., save, put, and delete) that need them when they don't exist as a key/value pair in the dict. Overall I find the Item class to be most useful when it looks, acts, and feels like a simple dict with a little extra functionality tacked on, so my vote would be to continue to return an Item object, but to strip down the Item class as much as possible so that it feels, for all intents and purposes, like a simple dict.

Given that, I would propose the following:

1) keep the table as the only real attribute that is saved on the Item object, and make everything else (primary key fields included) just another key/value pair within itself,
2) remove the hash key and range key properties, and finally
3) add a check for the primary key fields in the save, put, and delete methods that will raise an exception if they don't exist.

@tmorgan

yes, essentially I agree with croach on (1),(3), if you're prepared to make a backwards incompatible change. Neutral on (2), the properties are fairly harmless in my opinion, and could be useful. They'll raise an exception if they're not set, but that's what you'd expect.

@disruptek

It's not clear what problem removing the properties solves. It seems most useful to have them return None when they are unset. Maybe the range_key property wouldn't exist when the table doesn't require it. Then you can use simple "is None" and "hasattr()" for naive introspection.

@croach

@disruptek removing the properties doesn't really solve any problem per se, it just simplifies the code of the Item class a bit more and makes it act more like a simple python dict.

@tmorgan, I think the only change that would be backwards incompatible would be number 2 since it removes properties that users may depend on. The other changes should actually make the functionality that the Item class supports a superset of what it currently does since the new changes would allow for the creation of Item objects without primary key fields which you cannot do with the current codebase.

Given both your comments, I would revise my proposal to include only numbers 1 and 3 from above. It would probably be best to leave the properties since getting rid of them would break backwards compatibility.

@garnaat
Owner

Where do things currently stand with this?

@garnaat
Owner

Okay, I have made changes locally that allow an Item to be created even if no hash/range are supplied (either as explicit args or in the attrs dict). In that situation, a None value is stored in the Item object as the value of the hash/range. If you then try to do an update or put on that item, you will get an client error because None is not a valid value. I have not removed any of the properties.

Does this seem like a reasonable approach?

@tmorgan

Seems perfectly reasonable to me.

@croach

Sounds like the right solution to me.

@garnaat garnaat closed this issue from a commit
@garnaat garnaat Allow an Item to be created without a hash/range key to handle situat…
…ion where attributes_to_get does not include hash/range. Fixes #656.
08678ec
@garnaat garnaat closed this in 08678ec
@garnaat
Owner

Please check out latest version.

@msabramo msabramo referenced this issue from a commit in msabramo/boto
@garnaat garnaat Allow an Item to be created without a hash/range key to handle situat…
…ion where attributes_to_get does not include hash/range. Fixes #656.
ad0ccad
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.