Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log number of remaining retry attempts for asset uploads #5280

Merged
merged 1 commit into from
Aug 11, 2023

Conversation

kraih
Copy link
Member

@kraih kraih commented Aug 11, 2023

While retries for asset chunks do work correctly, the messages being logged for failures can be a little confusing. By adding the number of remaining retries in a way similar to other log messages, it gets a lot easier to spot what is really going on. And that the "errors" are actually fairly harmless given the context.

Once this has proven itself in production, we might want to redesign the event interface a bit and get rid of a few events. There is a bit of redundancy right now in the logging. But i can see how that might help find issues in the upload logic as it stands right now.

I've also cleaned up the upload code a little bit to bring it closer to our current standards, without making changes to the API (yet).

Progress: https://progress.opensuse.org/issues/132167

While retries for asset chunks do work correctly, the messages being
logged for failures can be a little confusing. By adding the number of
remaining retries in a way similar to other log messages, it gets a
lot easier to spot what is really going on. And that the "errors" are
actually fairly harmless given the context.

Once this has proven itself in production, we might want to redesign
the event interface a bit and get rid of a few events. There is a bit of
redundancy right now in the logging. But i can see how that might help
find issues in the upload logic as it stands right now.

I've also cleaned up the upload code a little bit to bring it closer to
our current standards, without making changes to the API (yet).

Progress: https://progress.opensuse.org/issues/132167
@codecov
Copy link

codecov bot commented Aug 11, 2023

Codecov Report

Merging #5280 (b75c9f1) into master (75a911b) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #5280   +/-   ##
=======================================
  Coverage   98.31%   98.31%           
=======================================
  Files         389      389           
  Lines       37184    37195   +11     
=======================================
+ Hits        36558    36569   +11     
  Misses        626      626           
Files Changed Coverage Δ
lib/OpenQA/WebAPI/Controller/API/V1/Job.pm 96.75% <ø> (ø)
t/32-openqa_client.t 100.00% <ø> (ø)
lib/OpenQA/Client/Upload.pm 100.00% <100.00%> (ø)
lib/OpenQA/Worker/Job.pm 100.00% <100.00%> (ø)
t/24-worker-jobs.t 100.00% <100.00%> (ø)

@kraih
Copy link
Member Author

kraih commented Aug 11, 2023

I also want to increase the retry timeout or the number of attempts, but not sure yet which one would be better here.

@kalikiana
Copy link
Member

I also want to increase the retry timeout or the number of attempts, but not sure yet which one would be better here.

I feel like this may be nice to make configurable. Not necessarily in this branch. I'd leave it to you, though.

@mergify mergify bot merged commit 4fe4f04 into master Aug 11, 2023
18 checks passed
@mergify mergify bot deleted the k/log_retry_attempts branch August 11, 2023 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants