Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: non-retryable RPCs use totalTimeout #1149

Merged
merged 1 commit into from Jul 27, 2020

Conversation

noahdietz
Copy link
Contributor

@noahdietz noahdietz commented Jul 13, 2020

This PR essentially reverts #745, thus making non-retryable RPCs reference the totalTimeout as the single RPC timeout (which was done in #712), rather than choosing from (intial | max | total).

The impetus for #745 was googleapis/google-cloud-java#5555, where a gax-java change to using totalTimeout for non-retried RPCs resulted in an API returning validation errors for its specific-client, because that timeout value was too high. The Cloud Tasks library has since had its timeouts lowered to below the backend validation threshold.

Prioritizing initial over max over total, based on if the value was set or not, we effectively lowered the timeouts for all non-retryable RPCs to the (historically) lowest of the three config values. This is because historically all three values are always set to something. Service teams might simply change an RPC from retryable to non-retryable not knowing that they have lowered the user-perceived timeout of the method.

Furthermore, in order for a user to change the timeout of a non-retryable RPC, they have to change all three timeout config values because gax-java asserts that initialTimeout <= maxTimeout <= totalTimeout. Generated documentation in clients shows an example of increasing a method's timeout by changing the totalTimeout value alone. This would do nothing for a non-retryable RPC, because the initialTimeout value would still be set and chosen. Similarly for service producers, if they wanted to increase the timeout for a non-retryable RPC in the generation config, they'd need to change all three values to satisfy the gax-java assertions, or set initial && maxTimeout to 0 and setting totalTimeout to the desired value.

IMO, it is slightly more intuitive to change the totalTimeout when a user (at runtime via gax-java api) or service team (at generation time via config) wants to increase the timeout perceived by the GAPIC-caller, rather all three values. I believe this is generated as example documentation in Java GAPICs for the same reason. This also lines up better with other languages that use the method-level timeout_millis config value to specify timeout for non-retryable RPCs.

@googlebot googlebot added the cla: yes label Jul 13, 2020
@codecov
Copy link

@codecov codecov bot commented Jul 13, 2020

Codecov Report

Merging #1149 into master will increase coverage by 0.00%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1149   +/-   ##
=========================================
  Coverage     78.65%   78.66%           
+ Complexity     1170     1168    -2     
=========================================
  Files           204      204           
  Lines          5195     5188    -7     
  Branches        417      415    -2     
=========================================
- Hits           4086     4081    -5     
+ Misses          935      934    -1     
+ Partials        174      173    -1     
Impacted Files Coverage Δ Complexity Δ
...rc/main/java/com/google/api/gax/rpc/Callables.java 68.62% <50.00%> (-0.34%) 12.00 <0.00> (-2.00)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0512a15...4d1d891. Read the comment docs.

@noahdietz
Copy link
Contributor Author

@noahdietz noahdietz commented Jul 17, 2020

Friendly bump on this PR! Thanks folks.

@noahdietz
Copy link
Contributor Author

@noahdietz noahdietz commented Jul 23, 2020

Friendly bump here. If folks think this is the wrong approach, I'd appreciate some feedback.

Copy link
Collaborator

@igorbernstein2 igorbernstein2 left a comment

LGTM for the overall change.
LGTM from the bigtable side.
My only reservation is that for services where rpc timeout != totaltimeout for nonidempotent rpcs. But I dont have enough knowledge to say if its a real risk

Copy link
Contributor

@vam-google vam-google left a comment

LGTM

@noahdietz noahdietz merged commit b7646a3 into googleapis:master Jul 27, 2020
7 of 8 checks passed
@noahdietz noahdietz deleted the non-retry-timeout branch Jul 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants