
fix: process flakes task redis lock and key #1101

Merged: 1 commit, Feb 27, 2025
Conversation

joseph-sentry
Contributor

Previously, I made the decision to use a redis lock to exclude multiple process_flakes tasks from running for the same repo at the same time, so that we would minimize database concurrency issues.

The idea was that I would set this redis key that represents "more flakes need to be processed for this repo" and that tasks that failed to acquire the lock could rely on the currently running task to handle their work so they wouldn't have to block. I never actually implemented the part where the currently running task takes the work of tasks that ceded their work to it.

This commit implements that by replacing the value of the current redis key with a list of commits that need to be processed; that way, the currently running task actually has access to the commits queued by the tasks that ceded their work to it.
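The scheme described above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the key name `flake_uploads:{repo_id}` and the function names are assumptions, and a tiny in-memory stub stands in for a real redis client (a real task would use redis-py's `rpush`/`lpop`, which behave the same way for this pattern).

```python
class FakeRedis:
    """Tiny in-memory stand-in for a redis client (only the calls we need)."""
    def __init__(self):
        self.data = {}
    def rpush(self, key, *values):
        self.data.setdefault(key, []).extend(values)
        return len(self.data[key])
    def lpop(self, key):
        lst = self.data.get(key)
        return lst.pop(0) if lst else None

def flake_list_key(repo_id):
    # Illustrative key name, not necessarily the one used in the PR.
    return f"flake_uploads:{repo_id}"

def enqueue_commit(client, repo_id, commit_id):
    """A task that failed to acquire the per-repo lock appends its commit
    to the list and exits, ceding its work to the running task."""
    client.rpush(flake_list_key(repo_id), commit_id)

def drain_commits(client, repo_id):
    """The task holding the lock drains the list, so it actually sees the
    commits queued by the tasks that ceded their work."""
    processed = []
    while (commit := client.lpop(flake_list_key(repo_id))) is not None:
        processed.append(commit)  # the real task would process flakes here
    return processed

client = FakeRedis()
enqueue_commit(client, 1, "abc123")
enqueue_commit(client, 1, "def456")
print(drain_commits(client, 1))  # → ['abc123', 'def456']
```

Because `lpop` returns `None` once the list is empty, the running task naturally stops after it has consumed every ceded commit, including ones queued while it was already draining.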

@joseph-sentry joseph-sentry requested a review from a team February 20, 2025 22:28


codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 96.55172% with 2 lines in your changes missing coverage. Please review.

Project coverage is 97.28%. Comparing base (d7b53d8) to head (259f76f).
Report is 10 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines: tasks/process_flakes.py (patch 92.30%, 2 lines missing ⚠️)
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1101   +/-   ##
=======================================
  Coverage   97.28%   97.28%           
=======================================
  Files         454      454           
  Lines       37355    37405   +50     
=======================================
+ Hits        36339    36390   +51     
+ Misses       1016     1015    -1     
Flag Coverage Δ
integration 43.11% <22.41%> (-0.04%) ⬇️
unit 89.79% <96.55%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.


Contributor

@Swatinem Swatinem left a comment

approving to unblock you here, but it would be nice to still rework this a bit to be more resilient towards races.


except ResponseError as e:
    if "WRONGTYPE" in str(e) and commit_id:
        while redis_client.getdel(list_key) is not None:

as the corresponding rpush has a similar compatibility code, this might run into a race condition:

  • this task tries to lpop, runs into an exception
  • the finisher tries to lpush, runs into an exception
  • the finisher completes its "delete + rpush" first
  • we then run the getdel here

I believe you can avoid this backwards compatibility code by just using a different redis key (though still make sure that the flake_uploads one is being deleted, as I believe it is not defined with a TTL so might hang around for eternity otherwise).
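The reviewer's suggestion can be sketched like this: write the commit list under a brand-new key so old-scheme and new-scheme writers never touch the same key (which eliminates the WRONGTYPE races listed above), and explicitly delete the legacy flag key, which has no TTL. Both key names here are assumptions for illustration, and a minimal in-memory stub replaces a real redis client.

```python
class FakeRedis:
    """Minimal in-memory stand-in for a redis client."""
    def __init__(self):
        self.data = {}
    def rpush(self, key, *values):
        self.data.setdefault(key, []).extend(values)
    def lpop(self, key):
        lst = self.data.get(key)
        return lst.pop(0) if lst else None
    def delete(self, *keys):
        return sum(1 for k in keys if self.data.pop(k, None) is not None)

def new_list_key(repo_id):
    # New, list-typed key: assumed name, distinct from the legacy key so
    # the two schemes can never collide on the same key.
    return f"flake_uploads_list:{repo_id}"

def legacy_flag_key(repo_id):
    # Old flag-style key; per the review it has no TTL, so it must be
    # deleted explicitly or it hangs around forever.
    return f"flake_uploads:{repo_id}"

def migrate_and_drain(client, repo_id):
    client.delete(legacy_flag_key(repo_id))  # clean up the legacy key
    commits = []
    while (c := client.lpop(new_list_key(repo_id))) is not None:
        commits.append(c)
    return commits

client = FakeRedis()
client.data[legacy_flag_key(5)] = "true"  # leftover flag from an old task
client.rpush(new_list_key(5), "c1", "c2")
print(migrate_and_drain(client, 5))  # → ['c1', 'c2']
```

With separate keys there is no type-compatibility exception path at all, so the lpop/lpush race between this task and the finisher cannot arise.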

Since it's possible there are old versions of the TA finisher running, our new process flakes task needs to handle both versions of the redis scheme.

If an old process flakes task receives a task queued up by the new task, it will do nothing, but the next time a new process flakes task runs for that repo, it will process the flakes for that commit.
@joseph-sentry joseph-sentry added this pull request to the merge queue Feb 27, 2025
Merged via the queue into main with commit 899d4dd Feb 27, 2025
26 of 29 checks passed
@joseph-sentry joseph-sentry deleted the joseph/fix-flakes branch February 27, 2025 18:57