[APIC-79] - reconfigure retry logic using tenacity #401
Conversation
Okay @skomanduri, it looks like you were right on the urllib part. Bifurcating the force lists for different verbs seems painful. This is at 60-70% done, so I wanted to check in on the approach before I started on the respect_retry_after_header part; there are some issues with retries on the CLI and in the tests. There's also still some tearout work, and I'm not sure how you want me to handle file buffering retries. It seems like the best course here is to leave what we have in place and add some jitter to the backoff calculation. I had a working backoff algorithm on a different branch that I can port back in.
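The "full jitter" backoff mentioned above can be sketched in a few lines. This is a hypothetical standalone helper, not the PR's code (tenacity ships its own `wait_random_exponential` for this):

```python
import random


def full_jitter_backoff(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter: sleep a uniformly random
    amount between 0 and the capped exponential bound, so that many
    clients retrying at once spread their requests out."""
    bound = min(cap, base * (2 ** attempt))
    return random.uniform(0, bound)
```

The cap keeps late retries from waiting unboundedly long; the uniform draw is what prevents thundering-herd retries.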
Okay @skomanduri, how does this look? Tests still need work, but otherwise I think this works. There's some funniness in the interaction of the retry conditions: if we raise a connection error for a status, the status codes will not be available as part of the exception object. The Retry-After header class, combined with checking only the status code, gets around this and makes the response headers available for inspection.
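A wait strategy along those lines might look like the following sketch. The class name and fallback mechanism are hypothetical; in the PR it would presumably subclass `tenacity.wait_base`, but this version is written dependency-free with `retry_state` duck-typed as a `tenacity.RetryCallState`:

```python
class WaitForRetryAfterHeader:
    """Wait-strategy sketch: use the Retry-After header from the last
    response when present, otherwise defer to a fallback wait strategy."""

    def __init__(self, fallback):
        self.fallback = fallback

    def __call__(self, retry_state):
        outcome = retry_state.outcome
        # outcome.failed is True when the wrapped call raised, in which
        # case there is no response (and no headers) to inspect.
        if outcome is not None and not outcome.failed:
            headers = getattr(outcome.result(), "headers", {}) or {}
            retry_after = headers.get("Retry-After")
            if retry_after is not None:
                try:
                    return float(retry_after)
                except ValueError:
                    pass  # e.g. an HTTP-date value; fall through to fallback
        return self.fallback(retry_state)
```

Note it only overrides the wait time; whether to retry at all is decided separately by the retry predicate.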
This looks good. I had one question, but otherwise let me know when you are ready for final review.
Okay @skomanduri, this should be ready for review.
Just a minor thing: could you please add a changelog entry?
A couple more questions, but this seems almost ready to go!
# retry_state is an instance of tenacity.RetryCallState.
# The .outcome property contains the result/exception
# that came from the underlying function.
result_headers = retry_state.outcome._result.headers
Do you need to account for `outcome` returning an exception here?
We're not throwing exceptions, so I don't think so, unless I'm missing something? The retry logic is using the status code alone to trigger a retry. The reason for this is that requests' default exception handling doesn't seem to pass back any of the response information. At one point I had a raise_for_status call in there, hoping that I could set the retry parameter to something like `lambda res: res.status_code in civis.civis.RETRY_CODES & StatusException`. That would never trigger, though, because `res.status_code` didn't exist in the retry object.
Also, to take a step back, this is sort of unrelated to the cause of the retry. All this class is doing is overriding the wait parameter with the header value if it exists. Does that make sense?
@skomanduri does that answer your question?
Wouldn't `retry_state.outcome` contain an exception if requests throws one? The situation I'm worried about is where that happens, and then trying to call `.headers` on it would throw its own exception and we would lose the requests exception. I don't know if this happens, but I want to either rule it out or handle it.
@skomanduri I'm unclear on what type of exception you are envisioning. I thought the only exceptions that requests would implicitly raise are ConnectionError, Timeout, and TooManyRedirects. I don't think any of these would come into play with the retry logic as it currently exists. Since we're purely using the status_code returned on the response object, that wait-time class will only be evaluated once the retry logic has been entered, afaik.
Is there another set of exceptions that I'm missing here?
That said, if an exception is thrown, the retry is currently not triggered.
That said, if an exception is thrown, the retry is currently not triggered.
Okay, that's the piece I was missing. I thought that if there was a ConnectionError it would still be evaluated here, but if not, then this is fine. Can you add a test for no retry on ConnectionError?
Okay @skomanduri, the test is the last one in the test_utils.py file.
That's what I had in mind for the test. WFM!
@skomanduri can I get the final approval so I can merge this?
@mheilman yeah, for sure. This is probably a good time to ask about the release process: do we just bump the version here and release, or is there a cadence that we follow?
…city * master:
* [APIC-314] Refactored File Cleaning Logic (#405)
* [APIC-307] fix a workflows usage example in docs/source/client.rst (#409)
* add some missing PR numbers to the changelog (#408)
* Update the API spec at `civis/tests/civis_api_spec.json` so that new endpoints are included (e.g., `/exports/files/csv`) (#407)
* Updated list of base API resources to include `aliases`, `git_repos`, `json_values`, `services`, and `storage_hosts` so that they show up in the sphinx docs (#406)
* [APIC-284] remove python 3.5 support since it is reaching EOL (#404)
* [APIC-305] Require sql_type for every column when table_columns is provided (CHANGELOG) (#403)
* [APIC-313] retain specific sql types in civis_file_to_table (#402)
Co-authored-by: Saranga Komanduri <skomanduri@civisanalytics.com>
There isn't a set cadence for releases. The list of unreleased changes is getting pretty long right now, so I was thinking of making one after this and maybe one more PR, so you can just leave the version number as is.
rad, thanks
LGTM
* initial implementation of tenacity
* added comment
* got docs on how to stash cert
* savepont
* retry working
* added class as trigger and propegated
* everything except client working
* cleanup
* cleanup
* cleanup
* working as wrapper on call
* added retry for client invoke
* moved max retries var and cleanup
* implemented retry_error_callback as a passthrough and some cleanup
* tearout
* tearout
* cleanup
* respect retry headers
* respect retry-after header
* respect retry-after header
* flake fixes
* add jitter to retry deco for files
* working on tests
* fixed some tests
* all tests passing
* renamed io retry tests for clarity
* fixed flak8
* fixed flak8
* tests for retry
* cleanup
* cleanup
* changlog
* rebased
* Update README.rst
* typo
* added connectionerror test
* cleanup
* cleanup

Co-authored-by: Saranga Komanduri <skomanduri@civisanalytics.com>
This reimplements retry logic using tenacity. Requirements are:
[X] Retry on status codes 429, 502, 503, and 504.
[X] Retry requests with the verbs 'HEAD', 'TRACE', 'GET', 'PUT', 'OPTIONS', and 'DELETE'.
[X] If a Retry-After header is present, use its value for the retry interval.
[X] Retry POSTs on 429s and 503s.
[X] Default max total retry time of 10 minutes.
[X] Default max of 10 tries.
[X] Exponential backoff with full jitter.