Rate limited Scan Implementation. #205
# throttled_scan call.
elapsed_time_ms = round((current_time - start_time) * 1000)
# elapsed_time_ms can be 0 or negative if there is a clock drift
if elapsed_time_ms < 1:
Could collapse this to a single line on :1009 using max().
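The suggestion above could look roughly like the following. This is a minimal sketch, assuming the `if` branch simply resets the value to 1 (the excerpt does not show its body) and that both timestamps come from `time.time()`. The helper name `elapsed_ms` is hypothetical and not part of the PR.

```python
def elapsed_ms(start_time, current_time):
    """Return elapsed milliseconds, clamped to at least 1.

    Hypothetical helper (not from the PR): max() replaces the separate
    `if elapsed_time_ms < 1` check, so clock drift that yields a zero or
    negative difference can never produce a zero or negative divisor.
    """
    return max(1, round((current_time - start_time) * 1000))
```

Clamping in the same expression keeps the guard next to the computation it protects.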
if consecutive_provision_throughput_exceeded_ex == 0:
total_consumed_read_capacity += latest_scan_consumed_capacity
consumed_rate = total_consumed_read_capacity / elapsed_time_ms
I would probably add from __future__ import division
at the top so you don't need to worry about accidentally doing integer division.
Wow. Awesome this is. Never knew it existed.
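For context, a small sketch of what that import changes. The values are made up for illustration. Because a `from __future__` import has to be the first statement in a module, it is shown in a comment here, and the Python 2 behaviour without it is emulated with `//`.

```python
# At the top of the module (the reviewer's suggestion):
#     from __future__ import division
# On Python 3 this import is accepted but changes nothing.

total_consumed_read_capacity = 7  # illustrative values, not from the PR
elapsed_time_ms = 2

# Python 2 without the import: `/` on two ints floors, like `//` does here.
floored_rate = total_consumed_read_capacity // elapsed_time_ms

# With the import (or on Python 3): `/` is true division.
true_rate = total_consumed_read_capacity / elapsed_time_ms
```

With integer counters, the floored result would silently understate the consumed rate, which is the accident the reviewer is guarding against.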
exclusive_start_key=None,
segment=None,
total_segments=None,
model_cls=None,
the model should never make its way into the connection
if code == "ProvisionedThroughputExceededException":
consecutive_provision_throughput_exceeded_ex += 1
if consecutive_provision_throughput_exceeded_ex > max_consecutive_exceptions:
raise # Max threshold reached
PEP 8 says two spaces before an inline comment.
I would put them on the line above the raise, though.
Same for the two comments below.
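A sketch of the comment placement being suggested, in a self-contained form. `ThrottledError` and `call_with_limit` are hypothetical stand-ins, not names from the PR; the real code inspects an error code string instead of a dedicated exception class.

```python
class ThrottledError(Exception):
    """Stand-in for a ProvisionedThroughputExceededException (hypothetical)."""


def call_with_limit(operation, max_consecutive_exceptions):
    """Retry `operation` until it succeeds or throttling repeats too often."""
    consecutive_exceptions = 0
    while True:
        try:
            return operation()
        except ThrottledError:
            consecutive_exceptions += 1
            if consecutive_exceptions > max_consecutive_exceptions:
                # Max threshold reached
                raise
```

The bare `raise` re-raises the exception currently being handled, so the caller sees the original error and traceback.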
elapsed_time_s = math.ceil(elapsed_time_ms / 1000)
# Sleep proportional to the ratio of --consumed capacity-- to --capacity to consume--
time_to_sleep = max(1, round((total_consumed_read_capacity/ elapsed_time_s) \
/ (read_capacity_to_consume_per_second)))
extra set of parentheses around read_capacity_to_consume_per_second
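One way the expression might read without the redundant parentheses and the line continuation. This is a sketch that mirrors the excerpt's arithmetic, assuming Python 3 division semantics and an `elapsed_time_ms` already clamped to at least 1. The function name `seconds_to_sleep` is hypothetical.

```python
import math


def seconds_to_sleep(total_consumed_read_capacity, elapsed_time_ms,
                     read_capacity_to_consume_per_second):
    """Hypothetical helper mirroring the excerpt's sleep computation."""
    # Round elapsed time up to whole seconds, as the excerpt does.
    elapsed_time_s = math.ceil(elapsed_time_ms / 1000)
    # Observed consumption rate in capacity units per second.
    consumed_rate = total_consumed_read_capacity / elapsed_time_s
    # Sleep in proportion to how far the observed rate exceeds the target,
    # but never less than one second.
    return max(1, round(consumed_rate / read_capacity_to_consume_per_second))
```

Naming the intermediate rate removes the need for nested parentheses altogether.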
@@ -131,6 +131,37 @@ def get_item(self, hash_key, range_key=None, consistent_read=False, attributes_t
consistent_read=consistent_read,
attributes_to_get=attributes_to_get)

def rate_limited_scan(self,
attributes_to_get=None,
either move self down a line or indent everything to same level as self
Hey. I am following the format that was present throughout the file. Making this change would make this method look odd.
max_sleep_between_retry=max_sleep_between_retry,
max_consecutive_exceptions=max_consecutive_exceptions,
)

extra blank line
The rate-limited scan tries to pace the scan speed using sleeps, and also handles ProvisionedThroughputExceededException errors. To handle large item sizes, the application can use the page_size variable to pace the scan further if needed.
@danielhochman
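The pacing idea in the description can be illustrated with a simplified sketch. It is not the PR's implementation and uses a different, simpler formula: after each page it sleeps until the wall-clock time catches up with the capacity consumed so far. `paced_scan` and `fetch_page` are hypothetical names, and the page fetcher is a plain callable rather than a real DynamoDB call.

```python
import time


def paced_scan(fetch_page, read_capacity_to_consume_per_second,
               sleep=time.sleep, clock=time.time):
    """Yield items page by page, sleeping when consumption runs ahead of target.

    `fetch_page(start_key)` is a hypothetical callable returning
    (items, consumed_capacity, next_key); next_key is None on the last page.
    """
    start_time = clock()
    total_consumed = 0.0
    start_key = None
    while True:
        items, consumed, start_key = fetch_page(start_key)
        total_consumed += consumed
        for item in items:
            yield item
        if start_key is None:
            return
        # Time by which this much capacity should have been consumed.
        target_elapsed = total_consumed / read_capacity_to_consume_per_second
        behind = target_elapsed - (clock() - start_time)
        if behind > 0:
            sleep(behind)
```

Passing `sleep` and `clock` as arguments is a design choice for testability: a fake clock lets the pacing be checked without actually waiting.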