-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix FlowRunView
task run queries when all task runs are cached
#6572
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming my assumptions are right this looks good to me.
|
||
if load_static_tasks: | ||
task_run_data = TaskRunView._query_for_task_runs( | ||
where={ | ||
"map_index": {"_eq": -1}, | ||
"flow_run_id": {"_eq": flow_run_id}, | ||
"id": {"_nin": [view.task_run_id for view in _cached_task_runs]}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgive my v1 ignorance, but I assume that _nin
means not in
? So this is effectively saying "Get all of the task runs associated with this flow run, except for the ones that are cached."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! We only cache finished task runs so excluding them from the query results will reduce the amount of traffic with no change in behavior.
# Freeze the cached task runs we will return to prevent race conditions if | ||
# this object is retrieving task runs in multiple threads | ||
cached_task_runs = list(self._cached_task_runs.values()) | ||
cached_task_run_ids = {task_run.task_run_id for task_run in cached_task_runs} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bunchesofdonald A new change here that prevents a race condition if self._cached_task_runs
is updated by another thread while the query is running. We wouldn't include those task runs in the nin
call so they'd be returned by the API then on L759 we'd append the cached runs resulting in duplicates.
Closes bug reported in https://prefecthq.slack.com/archives/C028BPL85C3/p1661452093250219 where if all of the task runs are cached an error is thrown when retrieving cache runs
This is easily fixed by toggling the
error_on_empty
bool.Includes a change that excludes of cached task runs from the
load_static_tasks
query.Includes a fix that prevents duplicate task runs from being returned when the object is shared by multiple threads.
Example
Checklist
[ ] This pull request references any related issue by including "closes<link to issue>
"bug
,feature
,enhancement
,docs
, ormaintenance
label categorizing the change.