New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup UTCDateTimeAttribute deserialization #277
Speedup UTCDateTimeAttribute deserialization #277
Conversation
`UTCDateTimeAttribute.deserialize` spends most of it's time inside of dateutil which attempts to first guess the format. In most cases PynamoDB will have written the data to begin with, so we can first optimistically assume the data is stored in DATETIME_FORMAT before falling back to dateutil. This ~4.5 times faster than always going through dateutil. ```python from dateutil.parser import parse import datetime from dateutil.tz import tzutc DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f%z' now_str = datetime.datetime.now().replace(tzinfo=tzutc()).strftime(DATETIME_FORMAT) %timeit parse(now_str) # 116 µs ± 9.46 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each) %timeit datetime.datetime.strptime(now_str, DATETIME_FORMAT) # 26 µs ± 7.53 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each) ```
""" | ||
expected_value = datetime(2047, 1, 6, 8, 21, tzinfo=tzutc()) | ||
attr = UTCDateTimeAttribute() | ||
self.assertEqual( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would help me a lot if you changed this to assert expected_value == attr.deserialize('January 6, 2047 at 8:21:00AM UTC')
as I would like to move this towards pytest style unit tests. but if it feels weird to have two different styles, I can do it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lita Agreed, but I would do that in a follow on PR. Having two different testing styles in the same file would be a bit weird.
pynamodb/attributes.py
Outdated
@@ -394,7 +395,11 @@ def deserialize(self, value): | |||
""" | |||
Takes a UTC datetime string and returns a datetime object | |||
""" | |||
return parse(value) | |||
# First attempt to parse the datetime with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is incomplete, did you accidentally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I a word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for the contribution! Will merge after the comment fix
UTCDateTimeAttribute.deserialize
spends most of it's time inside of dateutil which attempts to firstguess the format. In most cases PynamoDB will have written the data to begin with, so we can first
optimistically assume the data is stored in DATETIME_FORMAT before falling back to dateutil.
This ~4.5 times faster than always going through dateutil.