
Conversation

@tsloughter
Member

I still need to add a paragraph about the startup probe. I'm not planning on using it in service_discovery.

@tsloughter tsloughter requested a review from ferd September 26, 2023 10:22
@github-actions
github-actions bot commented Sep 26, 2023

@github-actions github-actions bot temporarily deployed to pull request September 26, 2023 10:23 Inactive
@tsloughter tsloughter force-pushed the readiness-liveness-rm-update branch from 51d1c97 to 030a667 Compare September 26, 2023 10:25
@tsloughter tsloughter force-pushed the readiness-liveness-rm-update branch 2 times, most recently from b27d823 to 5794f72 Compare September 26, 2023 10:38
@tsloughter
Member Author

I'll also add a paragraph at the end about the values used for readiness.

Main thing with startup probe will be why we don't use it.

@tsloughter tsloughter force-pushed the readiness-liveness-rm-update branch from 5794f72 to b7cc6da Compare September 26, 2023 17:06
@tsloughter tsloughter force-pushed the readiness-liveness-rm-update branch from b7cc6da to f992617 Compare September 26, 2023 17:25
maintenance, it can return a =503= status response for readiness causing it to be
removed from the =Service= backends.

Nothing special is needed to have a graceful shutdown, for any reason, whether
@ferd
Member

Eeeeh, if you don't remove yourself from the load balancer pool, some gateways (like ALBs or ELBs) may still try to send traffic to a process that's halfway through shutting down, and that will cause interrupted connections that are forwarded to the client. This may however be stack dependent because not all stacks clean up as nicely and orderly as an OTP tree.

@tsloughter
Member Author

Removal should already be handled by k8s. There is always a delay, even when the readinessProbe is set up to return 503 on shutdown, since the check only runs every periodSeconds. But my point is that the removal is handled by k8s, not by the application itself. Maybe it needs rewording?
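As a sketch of the interval involved (the path, port, and timing values here are illustrative, not taken from the PR), a readinessProbe is only checked every periodSeconds, so an endpoint that starts returning 503 is withdrawn from the Service backends only after at most periodSeconds × failureThreshold:

```yaml
readinessProbe:
  httpGet:
    path: /ready        # hypothetical path; use whatever your app exposes
    port: 8080          # illustrative port
  periodSeconds: 5      # kubelet probes every 5 seconds
  failureThreshold: 3   # 3 consecutive failures mark the pod NotReady
```

With these example values, a pod that begins answering 503 can keep receiving traffic for up to roughly 15 seconds before its endpoint is removed.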

@ferd
Member

Yeah that could make it clearer. It's possible this was just my misunderstanding and I didn't realize that being in a shutting down state also automatically took you out.

@tsloughter
Member Author

Actually, I should probably mention the use of prep_stop. I haven't implemented this in service_discovery because it is a little tricky in its case: it has two applications, one for HTTP and one for gRPC, and both have to wait until active requests are handled before shutting down.

I suppose I should also mention the need for a proper boot order and supervision tree, so that things like the database pool aren't shut down before the HTTP app can pause to complete open requests.

I could just add a "see Graceful Shutdown section (to come)" for more...

@tsloughter
Member Author

Eh, maybe it isn't tricky; I was overthinking it. It is the same whether they are both in the release or not: each can wait for its active connections one after the other, and the SIGKILL takes care of it if they take too long.
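The prep_stop idea discussed here can be sketched roughly as follows (module and function names such as myapp_http:drain/1 are hypothetical, not code from service_discovery):

```erlang
-module(myapp_app).
-behaviour(application).
-export([start/2, prep_stop/1, stop/1]).

start(_Type, _Args) ->
    myapp_sup:start_link().

%% prep_stop/1 runs before the application's supervision tree is taken
%% down, so dependencies like the database pool are still up while we
%% drain. Blocking here until in-flight requests finish is safe: if
%% draining takes too long, the SIGKILL sent after the grace period
%% ends the wait for us.
prep_stop(State) ->
    ok = myapp_http:drain(timer:seconds(25)),  %% hypothetical drain helper
    State.

stop(_State) ->
    ok.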

@tsloughter
Member Author

Wait, yeah, this is just k8s-specific. I will add a callout to another section saying that in k8s you don't need to do anything special, but see that section for how to handle the shutdown in your Erlang application.
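For context on when the SIGKILL mentioned above arrives (value shown is the Kubernetes default, included here only as illustration): on pod deletion, Kubernetes sends SIGTERM and, once terminationGracePeriodSeconds elapses, follows up with SIGKILL.

```yaml
spec:
  terminationGracePeriodSeconds: 30  # default; SIGKILL follows SIGTERM after this
```

Any draining the application does on SIGTERM has to fit within this window.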

tsloughter and others added 4 commits September 28, 2023 03:46
Co-authored-by: Fred Hebert <mononcqc@ferd.ca>
@tsloughter tsloughter force-pushed the readiness-liveness-rm-update branch from 22ec8ec to 5144f31 Compare September 28, 2023 11:39
  readinessProbe:
    httpGet:
-     path: /healthz
+     path: /ready
@ferd
Member

might be interesting to eventually aim for /health/<type> but we're good to go here.

@tsloughter tsloughter merged commit aa3c135 into main Oct 2, 2023
@tsloughter tsloughter deleted the readiness-liveness-rm-update branch October 2, 2023 13:56