[Observability] Extend sys_status and sys_invocation_state with endpoint_id #913
Thanks @AhmedSoliman, I've looked at the datafusion side of things, looks 💯!
If @slinkydeveloper can give the invoker a quick 👀 and approves, then let's merge.
@@ -25,6 +25,8 @@ pub struct InvocationStatusReportInner {
    pub start_count: usize,
    pub last_start_at: SystemTime,
    pub last_retry_attempt_failure: Option<InvocationErrorReport>,
    pub next_retry_at: Option<SystemTime>,
    pub last_attempt_endpoint_id: Option<EndpointId>,
I'm not sure why you would add the `last_attempt_endpoint_id` both in `sys_invocation_state` and in `sys_status`. For this field, the source of truth is `sys_status` anyway.
Ah OK, I see. You can track the case where you tried to use the endpoint_id but it has not been committed yet. Makes sense. Perhaps let's add a comment about that somewhere in the schema of the sys_invocation_state table, to avoid me making the same comment in 3 months :)
That's correct. I'll add a comment on the schema.
self.status_store.on_endpoint_chosen(
    &partition,
    &full_invocation_id,
    endpoint_id.clone(),
);
// Notify only if we think this selected endpoint has been freshly picked; otherwise,
// we assume that we have stored it previously.
if has_changed {
    ism.notify_chosen_endpoint(endpoint_id);
}
Can we have a unit test for this `has_changed` behavior?
So next time I stumble on it, I can figure it out from the tests.
It's a good idea, but I'm not sure this will be easy to slot into the existing tests. I'm a bit swamped, and it'll be hard to add a new test only for this case right now, tbh.
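For reference, the requested test could look roughly like the sketch below. It does not use the real invoker types: `ChosenEndpoints` and its `record` method are hypothetical stand-ins, invented here, that model only the idea that a notification should happen when the chosen endpoint differs from the one recorded earlier. How `has_changed` is actually computed in the PR is not shown in the diff above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the invoker's per-invocation bookkeeping.
/// This is NOT the real status store; it only models "has the chosen
/// endpoint changed since the last time we recorded it?".
#[derive(Default)]
struct ChosenEndpoints {
    last_chosen: HashMap<String, String>,
}

impl ChosenEndpoints {
    /// Records the endpoint chosen for an invocation and returns `true`
    /// only if it differs from the endpoint previously recorded (or if
    /// nothing was recorded before).
    fn record(&mut self, invocation_id: &str, endpoint_id: &str) -> bool {
        let previous = self
            .last_chosen
            .insert(invocation_id.to_owned(), endpoint_id.to_owned());
        previous.as_deref() != Some(endpoint_id)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn reports_change_only_for_freshly_picked_endpoint() {
        let mut store = ChosenEndpoints::default();
        // First choice for an invocation counts as a change.
        assert!(store.record("inv-1", "ep-a"));
        // Choosing the same endpoint again is not a change.
        assert!(!store.record("inv-1", "ep-a"));
        // Switching to another endpoint is a change.
        assert!(store.record("inv-1", "ep-b"));
        // Invocations are tracked independently of each other.
        assert!(store.record("inv-2", "ep-b"));
    }
}
```

A test of this shape documents the intent of the `if has_changed` branch even if wiring it into the existing invoker test harness takes more work.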
…int_id

Summary:
This introduces two new columns in `sys_invocation_state`:
- `next_retry_at`: Points to the next retry timestamp if a retry is scheduled for an invocation.
- `last_attempt_endpoint_id`: The endpoint ID that was used for the last invocation attempt.

Also, a new column was added in `sys_status`:
- `pinned_endpoint_id`: Indicates the endpoint ID that this invocation is _pinned_ to. If set, the invocation must continue to use this endpoint ID even if other endpoints were installed that claim to host higher service revisions.
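The pinning rule described for `pinned_endpoint_id` can be illustrated with a minimal sketch. The `Endpoint` struct, its `revision` field, and the `resolve_endpoint` function are assumptions made up for this example; the actual endpoint selection logic in the invoker is not part of this summary and may differ.

```rust
/// Hypothetical, simplified view of an installed endpoint.
#[derive(Debug, Clone, PartialEq)]
struct Endpoint {
    id: String,
    revision: u32,
}

/// Picks the endpoint for the next attempt (illustrative only).
/// - If the invocation is pinned, only the pinned endpoint is eligible,
///   even when another endpoint advertises a higher revision.
/// - Otherwise, the endpoint with the highest revision is chosen.
fn resolve_endpoint<'a>(
    pinned_endpoint_id: Option<&str>,
    installed: &'a [Endpoint],
) -> Option<&'a Endpoint> {
    match pinned_endpoint_id {
        Some(pinned) => installed.iter().find(|e| e.id == pinned),
        None => installed.iter().max_by_key(|e| e.revision),
    }
}
```

In this sketch, a pinned invocation whose endpoint is no longer installed resolves to `None` instead of silently falling back to a newer endpoint; that fallback policy is a choice made for the example only.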