Possible race condition (?) when restarting master #73

Closed
justinsb opened this issue Jul 6, 2016 · 1 comment

justinsb commented Jul 6, 2016

If the master restarts after the disks were previously mounted, the etcd manifest may still be in /etc/kubernetes/manifests. After the restart we might no longer be able to mount the etcd volumes (another instance may have claimed them), and kubelet might also start etcd before the disks are mounted.

@justinsb justinsb added the P0 label Jul 7, 2016
justinsb added a commit to justinsb/kops that referenced this issue Jul 7, 2016
If the instance restarted but lost the volume mount, there might be a
short or indefinite delay before protokube can mount the volume again.
But the etcd manifest would probably still be in
/etc/kubernetes/manifests from the previous run.

To ensure that kubelet doesn't run etcd until the volume is actually
mounted, we use a symlink to a directory on the volume itself.  Thus
kubelet can't start etcd until we put the volume there.  We can also
delete the symlink before mounting, so we have full control.

Issue kubernetes#73

justinsb commented Jul 8, 2016

I believe this is now fixed; going to close but please reopen if anyone sees anything resembling it again!

@justinsb justinsb closed this as completed Jul 8, 2016
rifelpet pushed a commit to rifelpet/kops that referenced this issue Mar 23, 2018
cloudbow pushed a commit to cloudbow/kops that referenced this issue Jun 8, 2018