Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speedup UTCDateTimeAttribute deserialization #277

Merged
merged 2 commits into from May 11, 2017

Conversation

rowillia
Copy link
Contributor

UTCDateTimeAttribute.deserialize spends most of it's time inside of dateutil which attempts to first
guess the format. In most cases PynamoDB will have written the data to begin with, so we can first
optimistically assume the data is stored in DATETIME_FORMAT before falling back to dateutil.

This ~4.5 times faster than always going through dateutil.

from dateutil.parser import parse
import datetime
from dateutil.tz import tzutc

DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f%z'
now_str = datetime.datetime.now().replace(tzinfo=tzutc()).strftime(DATETIME_FORMAT)

%timeit parse(now_str) # 116 µs ± 9.46 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit datetime.datetime.strptime(now_str, DATETIME_FORMAT) # 26 µs ± 7.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

`UTCDateTimeAttribute.deserialize` spends most of it's time inside of dateutil which attempts to first
guess the format.  In most cases PynamoDB will have written the data to begin with, so we can first
optimistically assume the data is stored in DATETIME_FORMAT before falling back to dateutil.

This ~4.5 times faster than always going through dateutil.

```python
from dateutil.parser import parse
import datetime
from dateutil.tz import tzutc

DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f%z'
now_str = datetime.datetime.now().replace(tzinfo=tzutc()).strftime(DATETIME_FORMAT)

%timeit parse(now_str) # 116 µs ± 9.46 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit datetime.datetime.strptime(now_str, DATETIME_FORMAT) # 26 µs ± 7.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
"""
expected_value = datetime(2047, 1, 6, 8, 21, tzinfo=tzutc())
attr = UTCDateTimeAttribute()
self.assertEqual(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would help me a lot if you changed this to assert expected_value == attr.deserialize('January 6, 2047 at 8:21:00AM UTC') as I would like to move this towards pytest style unit tests. but if it feels weird to have two different styles, I can do it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lita Agreed, but I would do that in a follow on PR. Having two different testing styles in the same file would be a bit weird.

@@ -394,7 +395,11 @@ def deserialize(self, value):
"""
Takes a UTC datetime string and returns a datetime object
"""
return parse(value)
# First attempt to parse the datetime with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is incomplete, did you accidentally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I a word.

Copy link
Member

@garrettheel garrettheel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for the contribution! Will merge after the comment fix

@garrettheel garrettheel changed the title Speedup UTCDateTimeAttribute.deserialize Speedup UTCDateTimeAttribute deserialization May 11, 2017
@garrettheel garrettheel merged commit 2b70946 into pynamodb:master May 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants