Reimplement retry mechanism on 429 at the http request level and for all databricks-cli operations#343
Conversation
…solation between decorated function calls. (databricks#327)" This reverts commit 6e2da08.
… next retry (databricks#326)" This reverts commit 37d51b1.
This reverts commit 0f5e5fa.
Codecov Report
@@ Coverage Diff @@
## master #343 +/- ##
==========================================
- Coverage 84.90% 84.53% -0.37%
==========================================
Files 39 39
Lines 2749 2703 -46
==========================================
- Hits 2334 2285 -49
- Misses 415 418 +3
Continue to review full report at Codecov.
|
nfx
left a comment
There was a problem hiding this comment.
Please check if 503 global handling is not going to affect things much. And replace deprecated constant name with newer one
|
|
||
| def put_file(self, src_path, dbfs_path, overwrite, headers=None): | ||
| handle = self.create(dbfs_path, overwrite, headers=headers)['handle'] | ||
| handle = self.client.create(dbfs_path.absolute_path, overwrite, headers=headers)['handle'] |
There was a problem hiding this comment.
more for SDK side of things - we should consider some day later the "context manager" for DBFS files:
with self.client.open(dbfs_path.absolute_path) as dbfs_file:
# alternatively ".write(contents)" - that will do base64 stuff
dbfs_file.add_block(base64encode(...), ...)
| recursive=recursive, headers=headers) | ||
| self.client.delete(dbfs_path.absolute_path, recursive=recursive, headers=headers) | ||
| except HTTPError as e: | ||
| if e.response.status_code == 503: |
There was a problem hiding this comment.
https://github.com/urllib3/urllib3/blob/3308d655a563e0b72e3856c4a4cb06ce2f4f0e8d/src/urllib3/util/retry.py#L217 HTTP 503 would be retried globally, please check conditions
There was a problem hiding this comment.
Why do you say there will be global retries on 503? I am overriding the default status codes with status_forcelist=[429] in databricks_cli/sdk/api_client.py to ensure retries are made only for 429. These are only the defaults:
#: Default status codes to be used for ``status_forcelist``
RETRY_AFTER_STATUS_CODES = frozenset([413, 429, 503])
| total=6, | ||
| backoff_factor=1, | ||
| status_forcelist=[429], | ||
| method_whitelist=set({'POST'}) | set(Retry.DEFAULT_METHOD_WHITELIST), |
There was a problem hiding this comment.
Please use DEFAULT_ALLOWED_METHODS, because DEFAULT_METHOD_WHITELIST is deprecated.
There was a problem hiding this comment.
The new constant is not available in urllib3 v1.25.10, which seems to be installed with databricks-cli by default. It was introduced in v1.26.0. Confirmed by trying to use it:
Error: AttributeError: type object 'Retry' has no attribute 'DEFAULT_ALLOWED_METHODS'
There was a problem hiding this comment.
ok, then we need to schedule version updates, as this code would have to be updated down the line anyway
|
LGTM |
Revert retry logic on 429 responses for DBFS API requests (#319, #326, #327) and reimplement it using urllib3.util.Retry, for all the requests made by
databricks-cli.Retry-Afterheader is respected if present.urllib3.util.Retry).I tested manually by enabling debug logging and validating that retries are performed. I don't see an easy way of adding automated tests and I don't think it's worth the effort to do it given that we are only using standard functionality.