Have cluster running after the nodes restart #3

Closed
innobead opened this issue Aug 3, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@innobead
Owner

innobead commented Aug 3, 2020

For now, after the VMs are stopped, there is no way to restart them with the same configured IPs and keep the cluster running without issues.

Look for a solution that keeps cluster management working across node state changes.

@innobead innobead added the enhancement New feature or request label Aug 3, 2020
@innobead innobead changed the title Have cluster running w/o issues after the nodes restarts Have cluster running after the nodes restarts Aug 4, 2020
@rugwirobaker

Cluster lifecycle commands (start, stop) would be a great addition.

@innobead
Owner Author

innobead commented Aug 13, 2020

This is caused by the behavior of the default IPAM plugin (host-local): it does not let a restarted instance reuse the IP already reserved under its container ID (here, the VM ID).

✔ cat /var/lib/cni/networks/ignite-cni-bridge/10.61.0.104
ignite-527549c6aeab8e84
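
As the output above shows, host-local keeps one file per reserved IP, named after the address and containing the ID of its owner. The following is a minimal sketch for inspecting such a directory, not part of kubefire or ignite. The function names are hypothetical, and it assumes that the first line of each file holds the owner ID and that any file whose name is not an IP address is bookkeeping to be skipped.

```python
import ipaddress
from pathlib import Path


def read_reservations(data_dir):
    """Map each reserved IP to the container (VM) ID stored in its file."""
    reservations = {}
    for entry in sorted(Path(data_dir).iterdir()):
        if not entry.is_file():
            continue
        try:
            ipaddress.ip_address(entry.name)
        except ValueError:
            # Assumption: files not named after an IP are bookkeeping files.
            continue
        lines = entry.read_text().splitlines()
        # Assumption: the owner ID is on the first line of the file.
        reservations[entry.name] = lines[0].strip() if lines else ""
    return reservations


def find_ip_for(data_dir, container_id):
    """Return the IP reserved for container_id, or None if there is none."""
    for ip, owner in read_reservations(data_dir).items():
        if owner == container_id:
            return ip
    return None
```

For the directory shown above, `find_ip_for("/var/lib/cni/networks/ignite-cni-bridge", "ignite-527549c6aeab8e84")` would be expected to return the address that is still reserved for the stopped VM.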

If you try to start a VM that stopped unexpectedly (for example, the firecracker process was killed by a host restart or power outage), you will hit the error below, because host-local never released the assigned IP:

✔ ignite start 527549c6aeab8e8
ERRO[0000] failed to setup network for namespace "ignite-527549c6aeab8e84": failed to allocate for range 0: 10.61.0.104 has been allocated to ignite-527549c6aeab8e84, duplicate allocation is not allowed 
FATA[0000] failed to allocate for range 0: 10.61.0.104 has been allocated to ignite-527549c6aeab8e84, duplicate allocation is not allowed 

There are two possible solutions:

  • Create another CNI IPAM plugin based on host-local that lets an already allocated IP be reused when the VM it was assigned to requests it again. Then use this new IPAM plugin instead of host-local.
  • Use a DNS server to dynamically manage the mapping between VM domain names and IPs, so that all node management is based on domain names instead of IPs.
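
To make the first option concrete, here is a small in-memory model of the difference between the two allocation behaviors. It is only a sketch under stated assumptions, not the actual host-local or kubefire plugin code: the `Allocator` class, its `reuse_for_owner` flag, and `DuplicateAllocation` are hypothetical names, and the model ignores gateways, on-disk state, and locking.

```python
import ipaddress


class DuplicateAllocation(Exception):
    """Raised when an ID that already holds an IP asks for another one."""


class Allocator:
    def __init__(self, cidr, reuse_for_owner=False):
        self.network = ipaddress.ip_network(cidr)
        # False models the strict behavior described in this issue;
        # True models the proposed "reuse for the same owner" behavior.
        self.reuse_for_owner = reuse_for_owner
        self.reserved = {}  # IP string -> container (VM) ID

    def allocate(self, container_id):
        # If this ID already holds a reservation, either hand it back
        # or refuse, depending on the configured behavior.
        for ip, owner in self.reserved.items():
            if owner == container_id:
                if self.reuse_for_owner:
                    return ip
                raise DuplicateAllocation(
                    f"{ip} has been allocated to {container_id}, "
                    "duplicate allocation is not allowed"
                )
        # Otherwise reserve the first free host address in the range.
        for host in self.network.hosts():
            ip = str(host)
            if ip not in self.reserved:
                self.reserved[ip] = container_id
                return ip
        raise RuntimeError("no free addresses in range")

    def release(self, container_id):
        # Drop every reservation held by this ID.
        for ip in [ip for ip, o in self.reserved.items() if o == container_id]:
            del self.reserved[ip]
```

With `reuse_for_owner=True`, a second `allocate` call for the same ID returns the same address instead of raising, which is what a VM needs when it restarts without its earlier reservation having been released.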

@rugwirobaker

Which one of these do you prefer? Wouldn't a CNI plugin be much more work?

@innobead
Owner Author

innobead commented Aug 14, 2020

I would prefer going with the first solution, which has fewer dependencies and less overhead, before working on multi-node support in kubefire.

I will make some progress and post an update in the next few days :)

@innobead innobead changed the title Have cluster running after the nodes restarts Have cluster running after the nodes restart Aug 15, 2020
@innobead
Owner Author

innobead commented Aug 17, 2020

Supported in v0.1.0. Closing.

  • kubefire cluster start/stop
  • Requires the kubefire-cni-bridge plugin, which depends on host-local-rev. (This can be installed with kubefire install.)

@innobead
Owner Author

Regarding VMs stopped unexpectedly by a host event such as a power outage, this will be followed up in the upstream PR below.
weaveworks/ignite#660

@innobead
Owner Author

Regarding VMs stopped unexpectedly by a host event such as a power outage, this will be followed up in the upstream PR below.
weaveworks/ignite#660

Fixed and merged 💯

This issue was closed.