Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Improve streaming metrics #26299
Feature: Improve streaming metrics #26299
Changes from 3 commits
b46cc58
29114fb
029a042
b653a21
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this should be changed to
/health
, but that could break people's monitoring & proxying setups, so I've not made that change for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want this endpoint to be accessible from the end-user, and when the streaming server is on the main domain, it means it needs to be reachable under
/api/v1/streaming/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does a health check endpoint need to be publicly accessible? It's only used by monitoring..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent was at some point that apps could check it to know if it was working correctly, rather than attempting to retry a WS connection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this makes sense, especially in the load balanced environment: say you run multiple instances of streaming, and have nginx load balance them, then the
GET /api/v1/streaming/health
could hit the healthy instance, an the WS connection could reach an unhealthy instance.The correct strategy here is exponential reconnection backoff with jitter and max retries before failing to non-streaming connectivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but as I remember this was the intent behind making this public.
Changing this may be a breaking change, I dont know if this is used by some clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so, I'm leaving it as a public endpoint now, but I think it'd be very wise to announce that it will be deprecated in 4.3.0 or something — giving people time to upgrade away from it, if they're using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we also want to have
/health
answering the same response as this one, and people can move to it before its deprecatedThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could add in
/health
now, just wasn't sure if I'd get approval overall for that; hence just pointing it out now that it's technically an internal endpoint, not an external oneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was added when running a single server was considered the norm, so a single public-facing endpoint kinda made sense. I don't have a very strong opinion on this but having
/health
could make sense indeed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone with 400 here, instead of 404, though arguably it could be either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
404 is used for non-existing endpoints. 400 is generic. What about 410 (Gone)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not technically gone, it kind of never existed — basically you could try connecting to
/api/v1/streaming/foo
and that'd hit here, but only because "foo" is an unknown channel. I went with 400 because 404 didn't really feel like it made sense.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
400 seems fine to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally all of the following code would only happen for known eventsource streaming connections, but the spec doesn't allow us the ability to ensure that:
emphasis my own: https://html.spec.whatwg.org/multipage/server-sent-events.html
(so basically we can't add middleware that checks for the
Accept: text/event-stream
because the spec didn't mandate that clients send that header)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was in theory another case where you could attempt to send a message to a closed or closing websocket; This is the same check we have in
streamToWs
but with the addition of checking that the socket is alive based on the heartbeats.