Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: write retry queue stops retrying #332

Merged
merged 1 commit into from Jun 20, 2022

Conversation

seth-hunter
Copy link
Contributor

@seth-hunter seth-hunter commented Jun 3, 2022

Fixes logic error in management of retry buffer that leads to deadlock scenario where write retries cease if server connection is lost for longer than MaxRetryTime

Copy link
Contributor

@vlastahajek vlastahajek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for finding the issue and the fix!

Just a few minor things:

  1. Fix tests, please. You can use this patch PR332TestFix.zip via git apply
  2. Change commit messages to be semantic.
  3. Small change in the error message to be consistent with other messages

internal/write/service.go Outdated Show resolved Hide resolved
…hes when server connection is lost for longer than MaxRetryTime

Design error, need to evict all expired batches prior to attempting write. Otherwise if server connection drops out for longer than MaxRetryTime, can get caught in an endless loop of evicting one expired batch each time a new batch is enqueued and thus never again actually retry writing to server.
@seth-hunter seth-hunter force-pushed the write-retry-fixes branch 2 times, most recently from 000351a to 2b26861 Compare June 11, 2022 20:42
@codecov-commenter
Copy link

Codecov Report

Merging #332 (2b26861) into master (7ca3d22) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #332   +/-   ##
=======================================
  Coverage   90.48%   90.48%           
=======================================
  Files          23       23           
  Lines        2490     2490           
=======================================
  Hits         2253     2253           
  Misses        178      178           
  Partials       59       59           
Impacted Files Coverage Δ
api/write.go 93.70% <100.00%> (ø)
api/writeAPIBlocking.go 84.21% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ca3d22...2b26861. Read the comment docs.

@seth-hunter
Copy link
Contributor Author

@vlastahajek: thank you for the review and suggestions. Updated branch should address your comments and also fix an additional closely-related issue I found during further testing. Requesting your review on that, as it may be a little more subjective from a design perspective.

@seth-hunter seth-hunter changed the title Write retry fixes fix: write retry queue timing errors Jun 13, 2022
@vlastahajek
Copy link
Contributor

Thanks for the update.
Actually, when a batch is discarded and the retry interval restarts, this is a designed concept, not an issue.
Could you, please, split this PR so the original fix can be merged and the newly introduced changes discussed?

@seth-hunter seth-hunter changed the title fix: write retry queue timing errors fix: write retry queue stops retrying Jun 18, 2022
@vlastahajek vlastahajek merged commit fe6c7cb into influxdata:master Jun 20, 2022
@vlastahajek
Copy link
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants