IntroToAKS: MongoDB should use StatefulSets instead of Deployments #282

Open
larryclaman opened this issue Aug 7, 2021 · 7 comments

@larryclaman
Contributor

To align with best practices for stateful applications, MongoDB should be deployed as a StatefulSet (of one replica) rather than a Deployment.
I have encountered some issues ('race conditions') when debugging and troubleshooting MongoDB deployed as a Deployment. Specifically, you can't do any sort of rolling update or restart (kubectl rollout restart deployment mongodb), because the new pod will never be able to claim the PVC while the old pod is still hanging around.
Using a StatefulSet overcomes this problem. I've tested it with a StatefulSet, and I will be pushing a PR with the revisions necessary to support a StatefulSet in this hack.
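
For illustration, a single-replica StatefulSet for MongoDB might look roughly like the sketch below. This is not the manifest from the PR: the resource names, labels, image tag, and storage size are all assumptions chosen for the example, and the storage class is left to the cluster default.

```yaml
# Sketch only: names, labels, image tag, and sizes are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  clusterIP: None          # headless Service referenced by serviceName below
  selector:
    app: mongodb
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb     # must name the headless Service above
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:4.4           # assumed tag
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db    # default data directory of the official image
  volumeClaimTemplates:              # one PVC is created per pod (data-mongodb-0)
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi             # assumed size
```

With a StatefulSet, an update or restart terminates the existing pod before creating its replacement under the same name, so the two pods never compete for the volume.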

@larryclaman
Contributor Author

see PR #283

@larryclaman
Contributor Author

Discussed this with @jrzyshr; to be considered.

@jrzyshr
Collaborator

jrzyshr commented Nov 10, 2021

This proposed update does align with best practices for stateful applications. It also introduces a new concept (stateful sets) not currently covered in this intro-level hack on AKS. This hack purposefully avoids some of the complexities of Kubernetes in order to teach those new to it the fundamentals such as pods, deployments, & services.

We could introduce StatefulSets and explain them with a little bit of lecture. Or we could just slip them in and tell students they need to use this "kind" in the YAML file, similar to how we slip in the kind "job" for the content-init daemon.

To keep it simple, in Challenge 6, we currently instruct students to deploy mongo from Dockerhub with minimal instructions. Coaches are encouraged to tell students to reuse what they just learned deploying Fab Medical in Challenge 4 when they deploy mongo.

I suggest a compromise here:

  1. Update the coach guide with a note that leaves it to the coach's discretion whether to instruct students to use a StatefulSet.
  2. Provide a completed YAML file in the coach/Solution folder that demonstrates how this is done.

If others have stronger feelings on this recommendation, please let us know here in the comments!

@lastcoolnameleft any thoughts here?

@larryclaman
Contributor Author

Happy to update per your suggestions if you think it would flow better.

@lastcoolnameleft
Contributor

IMO, it depends on the intent of the challenge:

  1. If the goal is to just show that you can deploy MongoDB in your cluster, then doing it as a deployment is fine. However, I HIGHLY recommend adding a disclaimer akin to: "The data inside this MongoDB instance will not persist past the lifecycle of the pod. If you want to persist data, please see 023-AdvK8s Hack"

  2. If the goal is to show how to properly deploy a service that will persist, then yes, use StatefulSets; however, the coach's guide & solution should then be updated to also include a PVC.

My hot take: Stick with option 1. I still remember the first time I killed a database pod and all my data was gone. It was a great learning experience for me about how & why, which later led me to explore volumes. If anything, I'd add a challenge to kill the DB and see what happens to the data so that the students learn that lesson.

@larryclaman
Contributor Author

larryclaman commented Nov 11, 2021

I agree with the "KISS" recommendations here, and I'm certainly not trying to overcomplicate things.
But I do want to point out that one of the reasons I was inspired to create this PR is what I wrote in the header of this issue, namely:

"I have encountered some issues ('race conditions') when debugging & troubleshooting mongodb if it has been deployed as a deployment. Specifically, you can't do any sort of rolling update (eg, deploy again with new parameters), or restart (kubectl rollout restart deployment mongodb), as the new pod will never be able to make a claim on the pvc while the old pod is hanging around.
Using a statefulset overcomes this problem. I've tested it using a stateful set and I will be pushing a PR with the revisions necessary to support a stateful set in this hack."

It's kind of nasty to be working with participants and have to tell them that the only way to fix the current issue is to delete the existing deployment and then redeploy. Using a StatefulSet overcomes this issue. But perhaps I'm nitpicking 😁
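
To make the failure mode concrete, the setup that runs into it can be sketched as a Deployment mounting a single PVC. This is only an illustration, not the hack's actual solution file: the names, image tag, and claim name are assumptions.

```yaml
# Sketch only: names, image tag, and the claim name are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  replicas: 1
  # No strategy is set, so the default RollingUpdate applies: on
  # "kubectl rollout restart deployment mongodb" a replacement pod is
  # created while the old pod is still running.
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:4.4             # assumed tag
          volumeMounts:
            - name: data
              mountPath: /data/db
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mongodb-data    # hypothetical ReadWriteOnce claim shared by old and new pod
```

Both the old and the replacement pod reference the same claim. The old pod is not removed until the replacement is ready, and the replacement cannot become ready while the old pod still holds the volume, so the rollout stalls. The exact symptom may vary with the storage class and with where the new pod is scheduled.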

Also, @lastcoolnameleft , re your suggestion "I'd add a challenge to kill the DB and see what happens to the data"; that's already in the hack, see challenge 8: https://github.com/microsoft/WhatTheHack/blob/master/001-IntroToKubernetes/Student/08-storage.md

@jrzyshr
Collaborator

jrzyshr commented Nov 16, 2021

@larryclaman I forgot to tell you that I had a student end up experiencing your issue in our last delivery. :) It actually ended up being a good discussion and learning experience for all.

@lastcoolnameleft C4 is exactly for the purpose you describe in your #1. As Larry points out, C8 is all about adding a PVC and learning how storage works.

I'm still leaning toward my compromise proposal above: put the StatefulSet "solution" files in the coach guide, along with a note that it is at the coach's discretion IF students should use a StatefulSet. No changes to the student guide.
