
fix: process flakes task redis lock and key #1101

Merged: 1 commit, Feb 27, 2025
Conversation

joseph-sentry
Contributor

Previously, I made the decision to use a redis lock to exclude multiple process_flakes tasks from running for the same repo at the same time, so that we would minimize database concurrency issues.

The idea was that I would set this redis key that represents "more flakes need to be processed for this repo" and that tasks that failed to acquire the lock could rely on the currently running task to handle their work so they wouldn't have to block. I never actually implemented the part where the currently running task takes the work of tasks that ceded their work to it.

This commit implements that by replacing the value of the current redis key with a list of commits that need to be processed; that way, the currently running task actually has access to the commits queued by the tasks that ceded their work to it.
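The scheme described above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the key name `flake_uploads:{repo_id}` and the function names are assumptions, and a tiny in-memory stub stands in for a real redis client (a real task would use redis-py's `rpush`/`lpop`, which behave the same way for this pattern).

```python
class FakeRedis:
    """Tiny in-memory stand-in for a redis client (only the calls we need)."""
    def __init__(self):
        self.data = {}
    def rpush(self, key, *values):
        self.data.setdefault(key, []).extend(values)
        return len(self.data[key])
    def lpop(self, key):
        lst = self.data.get(key)
        return lst.pop(0) if lst else None

def flake_list_key(repo_id):
    # Illustrative key name, not necessarily the one used in the PR.
    return f"flake_uploads:{repo_id}"

def enqueue_commit(client, repo_id, commit_id):
    """A task that failed to acquire the per-repo lock appends its commit
    to the list and exits, ceding its work to the running task."""
    client.rpush(flake_list_key(repo_id), commit_id)

def drain_commits(client, repo_id):
    """The task holding the lock drains the list, so it actually sees the
    commits queued by the tasks that ceded their work."""
    processed = []
    while (commit := client.lpop(flake_list_key(repo_id))) is not None:
        processed.append(commit)  # the real task would process flakes here
    return processed

client = FakeRedis()
enqueue_commit(client, 1, "abc123")
enqueue_commit(client, 1, "def456")
print(drain_commits(client, 1))  # → ['abc123', 'def456']
```

Because `lpop` returns `None` once the list is empty, the running task naturally stops after it has consumed every ceded commit, including ones queued while it was already draining.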

@joseph-sentry joseph-sentry requested a review from a team February 20, 2025 22:28


codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 96.55172% with 2 lines in your changes missing coverage. Please review.

Project coverage is 97.28%. Comparing base (d7b53d8) to head (259f76f).
Report is 10 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines: tasks/process_flakes.py (patch 92.30%, 2 lines missing ⚠️)
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1101   +/-   ##
=======================================
  Coverage   97.28%   97.28%           
=======================================
  Files         454      454           
  Lines       37355    37405   +50     
=======================================
+ Hits        36339    36390   +51     
+ Misses       1016     1015    -1     
Flag Coverage Δ
integration 43.11% <22.41%> (-0.04%) ⬇️
unit 89.79% <96.55%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.


Contributor

@Swatinem Swatinem left a comment

approving to unblock you here, but it would be nice to still rework this a bit to be more resilient towards races.


except ResponseError as e:
    if "WRONGTYPE" in str(e) and commit_id:
        while redis_client.getdel(list_key) is not None:

as the corresponding rpush has a similar compatibility code, this might run into a race condition:

  • this task tries to lpop, runs into an exception
  • the finisher tries to lpush, runs into an exception
  • the finisher completes its "delete + rpush" first
  • we then run the getdel here

I believe you can avoid this backwards compatibility code by just using a different redis key (though still make sure that the flake_uploads one is being deleted, as I believe it is not defined with a TTL so might hang around for eternity otherwise).
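The reviewer's suggestion can be sketched like this: write the commit list under a brand-new key so old-scheme and new-scheme writers never touch the same key (which eliminates the WRONGTYPE races listed above), and explicitly delete the legacy flag key, which has no TTL. Both key names here are assumptions for illustration, and a minimal in-memory stub replaces a real redis client.

```python
class FakeRedis:
    """Minimal in-memory stand-in for a redis client."""
    def __init__(self):
        self.data = {}
    def rpush(self, key, *values):
        self.data.setdefault(key, []).extend(values)
    def lpop(self, key):
        lst = self.data.get(key)
        return lst.pop(0) if lst else None
    def delete(self, *keys):
        return sum(1 for k in keys if self.data.pop(k, None) is not None)

def new_list_key(repo_id):
    # New, list-typed key: assumed name, distinct from the legacy key so
    # the two schemes can never collide on the same key.
    return f"flake_uploads_list:{repo_id}"

def legacy_flag_key(repo_id):
    # Old flag-style key; per the review it has no TTL, so it must be
    # deleted explicitly or it hangs around forever.
    return f"flake_uploads:{repo_id}"

def migrate_and_drain(client, repo_id):
    client.delete(legacy_flag_key(repo_id))  # clean up the legacy key
    commits = []
    while (c := client.lpop(new_list_key(repo_id))) is not None:
        commits.append(c)
    return commits

client = FakeRedis()
client.data[legacy_flag_key(5)] = "true"  # leftover flag from an old task
client.rpush(new_list_key(5), "c1", "c2")
print(migrate_and_drain(client, 5))  # → ['c1', 'c2']
```

With separate keys there is no type-compatibility exception path at all, so the lpop/lpush race between this task and the finisher cannot arise.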

Since it's possible there are old versions of the TA finisher running, our new process flakes task needs to handle both versions of the redis scheme.

If an old process flakes task receives a task queued up by the new task, it will do nothing, but the next time a new process flakes task runs for that repo, it will process the flakes for that commit.
@joseph-sentry joseph-sentry added this pull request to the merge queue Feb 27, 2025
Merged via the queue into main with commit 899d4dd Feb 27, 2025
26 of 29 checks passed
@joseph-sentry joseph-sentry deleted the joseph/fix-flakes branch February 27, 2025 18:57