Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix:shutdown stuck when there is error on the flush path #831

Merged
merged 2 commits into from Feb 16, 2021

Conversation

yirutang
Copy link
Contributor

@yirutang yirutang commented Feb 16, 2021

…l to release the requests in the queue, causing shutdown to wait forever

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> ☕️

…l to release the requests in the queue, causing shutdown to wait forever
@yirutang yirutang requested a review from as a code owner Feb 16, 2021
@yirutang yirutang requested a review from tswast Feb 16, 2021
@product-auto-label product-auto-label bot added the api: bigquerystorage label Feb 16, 2021
@google-cla google-cla bot added the cla: yes label Feb 16, 2021
@codecov
Copy link

@codecov codecov bot commented Feb 16, 2021

Codecov Report

Merging #831 (d1c2ecf) into master (94c7848) will increase coverage by 0.04%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #831      +/-   ##
============================================
+ Coverage     81.10%   81.14%   +0.04%     
  Complexity      996      996              
============================================
  Files            74       74              
  Lines          5346     5347       +1     
  Branches        415      415              
============================================
+ Hits           4336     4339       +3     
+ Misses          839      838       -1     
+ Partials        171      170       -1     
Impacted Files Coverage Δ Complexity Δ
...e/cloud/bigquery/storage/v1beta2/StreamWriter.java 85.03% <100.00%> (+0.44%) 38.00 <0.00> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 94c7848...a04598a. Read the comment docs.

@stephaniewang526 stephaniewang526 merged commit c2fd750 into googleapis:master Feb 16, 2021
15 of 16 checks passed
@@ -932,6 +932,7 @@ public void onComplete() {
public void onError(Throwable t) {
LOG.fine("OnError called");
if (streamWriter.shutdown.get()) {
abortInflightRequests(t);
Copy link
Contributor

@yayi-google yayi-google Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two more issues:

  1. Even if you abort the inflight request here, the shutdown can still add more requests through writeAllOutstanding();
  2. Besides, it is also possible that another thread is calling append, and will contribute another stuck request at https://screenshot.googleplex.com/7pmsyb2zyKDEaGF


ApiFuture<AppendRowsResponse> appendFuture1 = sendTestMessage(writer, new String[] {"A"});
ApiFuture<AppendRowsResponse> appendFuture2 = sendTestMessage(writer, new String[] {"B"});
Thread.sleep(5000L);
Copy link
Contributor

@yayi-google yayi-google Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Thread.sleep(5000L);
fakeExecutor.advanceTime(Duration.ofSeconds(20));
// Shutdown writer immediately and there will be some error happened when flushing the queue.
writer.shutdown();
Copy link
Contributor

@yayi-google yayi-google Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. You advance the time before shutdown. So even in the old code path before your fix, you can reduce the only inflight requests in the queue and unblock. Will this unit test block before your fix?

Please think carefully how to write a unit test.

@yirutang yirutang deleted the fix1 branch Feb 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquerystorage cla: yes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants