Change error handling for creating parent jobs #2313
Conversation
The inline review below is attached to this hunk (fragments recovered from the diff view):

```rust
// old:
match row_result {
    Ok(row) => (false, Ok(row.get::<_, i32>(0) as u32)),
// new:
match result {
```
We don't want to be able to duplicate jobs in the database. I think we agree on this.
If we inserted a duplicate, somehow, it would trigger a SqlState::FOREIGN_KEY_VIOLATION which, while correct, does not pick up the nuance that it could be a parent that we want to ignore.
Essentially I think we should leave the two functions distinct. Tempting as it is to refactor it, it makes it more confusing as to what is happening.
For reference, these are the constraint clauses:
```sql
CONSTRAINT job_queue_request_fk
    FOREIGN KEY (request_tag)
    REFERENCES benchmark_request(tag)
    ON DELETE CASCADE,
CONSTRAINT job_queue_collector
    FOREIGN KEY (collector_id)
    REFERENCES collector_config(id)
    ON DELETE CASCADE,
CONSTRAINT job_queue_unique
    UNIQUE (
        request_tag,
        target,
        backend,
        profile,
        benchmark_set
    )
```

Which we do want. I'm not sure how `JobEnqueueResult::RequestShaNotFound` being returned as an error makes sense when the error was a `SqlState::FOREIGN_KEY_VIOLATION`? That would imply that the `request_sha` already existed?
It doesn't seem clear anymore that we are ignoring parents who already had a job associated with them.
Foreign key violation won't be raised because of duplicates, because the query uses `ON CONFLICT DO NOTHING`.

> It doesn't seem clear anymore that we are ignoring parents who already had a job associated with them.
The function now has an explicit result type that encodes this much more clearly than before (where it returned a bool and a result). The commit vs parent job insertion mode is handled in the job queue; we could also put it into this function, but I don't think it's better.
> I'm not sure how `JobEnqueueResult::RequestShaNotFound` being returned as an error makes sense when the error was a `SqlState::FOREIGN_KEY_VIOLATION`?

In theory we could also get another FK violation (at this point I think only on the collector name), but right now `RequestShaNotFound` is the only case that actually matters. I included the error string, so we can find out from the log if something else ever happened, and rename it in that case.
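The explicit result type described here could be sketched roughly as follows. Apart from `JobEnqueueResult` and its `RequestShaNotFound` variant, which the discussion names, everything else (the other variant names, the `classify_db_error` helper, and matching on the constraint name in the error message) is an assumption for illustration, not the PR's actual code.

```rust
/// Sketch of an explicit enqueue result type; only `RequestShaNotFound`
/// is taken from the discussion, the other variants are hypothetical.
#[allow(dead_code)]
#[derive(Debug)]
enum JobEnqueueResult {
    Enqueued,           // a new row was inserted
    AlreadyExists,      // ON CONFLICT DO NOTHING matched an existing job
    RequestShaNotFound, // FK violation against benchmark_request(tag)
    Other(String),      // any other database error, kept for the log
}

/// Hypothetical mapping from a raw database error to the result type.
/// "23503" is the standard SQLSTATE code for foreign_key_violation.
fn classify_db_error(sqlstate: &str, message: &str) -> JobEnqueueResult {
    if sqlstate == "23503" && message.contains("job_queue_request_fk") {
        JobEnqueueResult::RequestShaNotFound
    } else {
        // Keep the original error string so unexpected cases show up in logs.
        JobEnqueueResult::Other(format!("{sqlstate}: {message}"))
    }
}

fn main() {
    let r = classify_db_error(
        "23503",
        "violates foreign key constraint \"job_queue_request_fk\"",
    );
    assert!(matches!(r, JobEnqueueResult::RequestShaNotFound));

    // A different violation (e.g. a unique-key error) falls through to Other.
    let r = classify_db_error("23505", "duplicate key value");
    assert!(matches!(r, JobEnqueueResult::Other(_)));
}
```

The point of an enum like this is that the caller (the job queue) is forced to enumerate every outcome explicitly, rather than decoding a `(bool, Result<...>)` pair.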
Yes, the `ON CONFLICT DO NOTHING` was not there for a parent, as that allowed us to detect foreign key violations, i.e. that a backfill did not need to take place. I stand corrected on that.

Tracing the logic: we would now fall through to `JobEnqueueResult::Other(e.into())` for any error other than the `request_tag` not existing in the `benchmark_request` table. The logic for saying that is okay for a parent is now in `job_queue/mod.rs`.

This feels a lot more complicated than what was there previously; it wasn't slick or terribly clever, but it was dead simple to follow. I'm not sure there is a compelling enough reason to fragment the logic and generalise just to consolidate one function? It feels like the fix was simply changing `query_one` to `query` for the parent insert?
To me it's easier to follow now - there is a single function that explicitly enumerates which error conditions can happen, and the job queue then explicitly handles these error conditions. As a plus we don't need to have two functions that we need to keep in sync.
I'm not sure I agree. Having modes passed to a function, and following a web of control-flow and fall-through logic (which could potentially grow) to figure out what is going on, seems more complicated than having two distinct code paths.
If we want to centralise things, the query could be a good candidate. We could have one function that does the insert, plus two wrappers named as the functions are currently? That way we have one central location that avoids drift, but two distinct functions that can handle things differently.
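The wrapper idea could look roughly like this. Every name here (`insert_job`, `enqueue_commit_job`, `enqueue_parent_job`, `InsertOutcome`) is hypothetical, and the database is simulated with a `Vec` so the sketch stays self-contained.

```rust
#[derive(Debug, PartialEq)]
enum InsertOutcome {
    Inserted(u32), // a new job row was created
    Skipped,       // ON CONFLICT DO NOTHING hit the unique constraint
}

/// The single shared insert. In real code this would run the SQL query;
/// here a Vec of tags stands in for the job_queue table.
fn insert_job(request_tag: &str, db: &mut Vec<String>) -> InsertOutcome {
    if db.iter().any(|t| t == request_tag) {
        InsertOutcome::Skipped
    } else {
        db.push(request_tag.to_string());
        InsertOutcome::Inserted(db.len() as u32)
    }
}

/// Wrapper for normal commit jobs: a duplicate is unexpected, surface it.
fn enqueue_commit_job(tag: &str, db: &mut Vec<String>) -> Result<u32, String> {
    match insert_job(tag, db) {
        InsertOutcome::Inserted(id) => Ok(id),
        InsertOutcome::Skipped => Err(format!("job for {tag} already enqueued")),
    }
}

/// Wrapper for parent jobs: an existing job just means no backfill is needed.
fn enqueue_parent_job(tag: &str, db: &mut Vec<String>) -> Option<u32> {
    match insert_job(tag, db) {
        InsertOutcome::Inserted(id) => Some(id),
        InsertOutcome::Skipped => None,
    }
}

fn main() {
    let mut db = Vec::new();
    assert_eq!(enqueue_commit_job("abc123", &mut db), Ok(1));
    // Parent path: the first insert succeeds, the second is silently skipped.
    assert_eq!(enqueue_parent_job("def456", &mut db), Some(2));
    assert_eq!(enqueue_parent_job("def456", &mut db), None);
    // Commit path: a duplicate is surfaced as an error instead.
    assert!(enqueue_commit_job("abc123", &mut db).is_err());
}
```

The query text lives in one place, so it cannot drift, while the two call sites keep distinct names and distinct duplicate-handling policies.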
Let's see if this really unblocks the queue; we can refactor later.
I missed this comment:

> // This will return zero rows if the job already exists

We used `query_one` in the parent version of the enqueue job function, which was failing if the parent job was already in the DB. Rather than having two separate functions whose functionality can slightly diverge, I unified the logic into a single function.
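To illustrate the `query_one` vs `query` distinction at the heart of the bug: with `ON CONFLICT DO NOTHING`, inserting a parent job that already exists returns zero rows, so an API that demands exactly one row fails while one that accepts zero or more does not. A minimal simulation, with no real database (`returned_rows` is a stand-in for executing the insert):

```rust
/// Stand-in for running `INSERT ... ON CONFLICT DO NOTHING RETURNING id`:
/// zero rows come back when the job already exists, one row otherwise.
fn returned_rows(already_exists: bool) -> Vec<u32> {
    if already_exists { vec![] } else { vec![42] }
}

/// query_one-style semantics: exactly one row or an error.
fn query_one(rows: Vec<u32>) -> Result<u32, String> {
    match rows.as_slice() {
        [row] => Ok(*row),
        [] => Err("expected one row, got zero".into()),
        _ => Err("expected one row, got many".into()),
    }
}

/// query-style semantics: zero rows is a perfectly valid outcome.
fn query(rows: Vec<u32>) -> Option<u32> {
    rows.first().copied()
}

fn main() {
    // Existing parent job: zero rows, so query_one fails where query does not.
    assert!(query_one(returned_rows(true)).is_err());
    assert_eq!(query(returned_rows(true)), None);
    // Fresh insert: both see the single returned row.
    assert_eq!(query_one(returned_rows(false)), Ok(42));
    assert_eq!(query(returned_rows(false)), Some(42));
}
```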