Persistent Volume and changing IPs #82

Closed
cha87de opened this issue Nov 8, 2017 · 14 comments

Comments

@cha87de cha87de commented Nov 8, 2017

In the "Best Practices" it states:

Persistence: Storing /opt/couchbase/var outside the container with the -v option allows you to delete the container and recreate it later without losing the data in Couchbase Server. [...]

Unfortunately, this won't work when the Docker container is recreated with a new IP, since the hosts' IP addresses are stored in this persistent volume. This scenario is very common when working with lots of containers in practice, so I would like to ask what the official Couchbase guidance is for avoiding broken Couchbase instances when, e.g., updating them (usually done by recreating containers) or migrating them (usually done by removing them and creating them in a different location).

Thank you in advance.

Author

@cha87de cha87de commented Nov 21, 2017

I moved to a (cloud capable) database. Thank you guys 👍

Contributor

@tleyden tleyden commented Nov 30, 2017

@cha87de Sorry to hear that, and I hope you stay tuned as there are projects in the works to make it easier to run Couchbase in a more container-native orchestration environment, starting with Kubernetes.

@4F2E4A2E 4F2E4A2E commented May 9, 2018

@cha87de which solution have you chosen?
@tleyden what about the IP change problem? Can't it be solved?

Author

@cha87de cha87de commented May 9, 2018

@4F2E4A2E my solution was https://github.com/rethinkdb - or any other database server that is "cloud native". For my use cases the scalability and performance of rethinkdb or couchdb is sufficient anyway.

Contributor

@tleyden tleyden commented May 9, 2018

@4F2E4A2E AFAIK it's going to be tackled in the Couchbase Kubernetes Operator. Last I heard it was still on the roadmap, but that project is progressing quickly.

Contributor

@tleyden tleyden commented May 9, 2018

@cha87de we're working on it!! Couchbase is betting heavily on the Kubernetes Operator as the best path forward for the fractured cloud native landscape.

Contributor

@hookenz hookenz commented May 16, 2018

I've found on AWS that the best approach is to add nodes to the cluster by internal IP address and not by name or external IP. We do not expose Couchbase Server nodes directly to the public Internet; they are reached through other services that use Couchbase as a back-end database.

AWS guarantees not to change your internal IP address when you stop/start or restart nodes which makes this approach viable. If you absolutely need to expose couchbase across the internet then attach an elastic IP to your instance and use that.

There are, I'm sure, other approaches, but this works well for us.

Author

@cha87de cha87de commented May 17, 2018

@hookenz are we still talking about containers?

Anyway, not changing the (internal) IP is obviously one solution, yet it feels like a dirty workaround and may not work in many cases (think of a very busy cluster where private IPs are assigned every few minutes or seconds to containers that come and go regularly).
Finally, the issue is not with restarts but with reused container volumes. This can happen in development setups (e.g. I reused the couchbase volume but created the container only when needed), when updating the couchbase container (which usually means pull image, stop old container, start new container, remove old container), when "forking/branching" volumes to create nodes in a couchbase cluster faster, etc.
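
To make the update flow above concrete, here is a minimal Python sketch that only builds the Docker CLI commands for it and does not execute anything. The container names, the host path and the image tag are placeholders and do not come from this thread. `/opt/couchbase/var` is the data directory named in the Best Practices quote.

```python
from typing import List

DATA_DIR = "/opt/couchbase/var"  # data directory named in the Best Practices quote


def recreate_commands(old_name: str, new_name: str, host_path: str, image: str) -> List[List[str]]:
    """Return the docker CLI invocations for: pull, stop old, start new, remove old."""
    volume = f"{host_path}:{DATA_DIR}"
    return [
        ["docker", "pull", image],
        ["docker", "stop", old_name],
        # The new container mounts the same host directory, so it inherits
        # whatever node address the old container wrote into that volume.
        ["docker", "run", "-d", "--name", new_name, "-v", volume, image],
        ["docker", "rm", old_name],
    ]


if __name__ == "__main__":
    # All four arguments are placeholder values for illustration only.
    for cmd in recreate_commands("cb-old", "cb-new", "/srv/couchbase/var", "couchbase:latest"):
        print(" ".join(cmd))
```

The new container may well come up with a different IP than the old one, while the reused volume still carries the old address. That mismatch is the situation this issue is about.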

To be honest, I am personally dreaming of a couchbase cluster where each node takes care of that, without having an additional operator, which may fail (SPOF), has to be set up first (requires orchestrating / wiring), and so on. Not having router/operator/management node types (like found in MySQL Cluster, MongoDB, ...) made couchbase very attractive.

@4F2E4A2E 4F2E4A2E commented May 17, 2018

@cha87de I do agree with you on that; having Couchbase do the scaling for you would be wonderful. I think Couchbase is not far from making that happen: it already takes care of a lot, and it handles failover clustering very well together with the Cluster-API. But yeah, the IP-change problem is a bit of a pain and an obstacle on the road to seamless scaling.

We are simply using the docker-engine (docker-compose-service) directly, no Kubernetes or AWS, and it works only in non-cluster mode.

@ab77 ab77 commented Dec 20, 2018

For posterity, we've managed to work around the scenario of changing container IPs on AWS ECS container instances using NetworkMode: awsvpc, a service registry linked with a private R53 namespace, a helper container and a bunch of init/bootstrap shell scripts.

Our ECS infrastructure is built entirely using CloudFormation, and updates to the tasks/services always necessitate container re-creation, which changes their IPs and effectively destroys any CB clusters running on them. Our ECS container instances are part of an ASG, with a Lambda function dynamically scaling them up and down based on demand for resources.

We always set the first node in the CB cluster to use a CFN generated private namespace DNS name, which is resolvable by all hosts in the VPC. We also employ a "helper" container and shared EFS storage between all nodes, where "state" is kept. The state includes the current cluster IPs, the shared hosts file, the list of servers to add and a list of servers to remove. We use plain text files for all of this.

The CB containers run a bootstrap script on startup which, using the state information, updates the ip, ip_start and couchbase-server.node files appropriately. The helper script automatically adds/removes nodes and rebalances the cluster, and also updates the shared hosts file to make sure the private namespace DNS name always resolves to the IP of the cluster's (first) master node. The hosts file is mapped into each CB container as well as the helper via Docker volume mounts from EFS/NFS shared storage.
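
The actual bootstrap and helper scripts are not shown in this thread, so the following is only a hypothetical Python sketch of the kind of bookkeeping described above. It assumes the plain-text state files hold one IP per line, and that the add/remove lists can be derived by comparing the current cluster IPs with a desired set. The function names, the example IPs and the DNS name are all made up for illustration.

```python
from typing import Set, Tuple


def parse_ip_list(text: str) -> Set[str]:
    """Parse a plain-text state file: one IP per line; blank lines and # comments are ignored."""
    ips = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ips.add(line)
    return ips


def plan_changes(current: Set[str], desired: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (servers_to_add, servers_to_remove) needed to move from current to desired."""
    return desired - current, current - desired


def hosts_entry(master_ip: str, dns_name: str) -> str:
    """Render one hosts-file line that points the cluster DNS name at the master node."""
    return f"{master_ip}\t{dns_name}"


if __name__ == "__main__":
    # Placeholder state, standing in for the contents of the shared files.
    current = parse_ip_list("10.0.1.10\n10.0.1.11\n")
    desired = parse_ip_list("10.0.1.10\n10.0.1.12\n")
    to_add, to_remove = plan_changes(current, desired)
    print("add:", sorted(to_add), "remove:", sorted(to_remove))
    print(hosts_entry("10.0.1.10", "cb-master.example.internal"))
```

A helper along these lines would still have to call Couchbase's own tooling to add or remove the nodes and rebalance. That part is deliberately left out here.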

It's not an ideal solution, since the bootstrap and the helper scripts are big enough to be brittle, but it seems to work in our case. We can scale our ASG from 1 to X nodes and back down without losing data.

We use scale-in protection to make sure the first "master" CB node is never destroyed as a result of the ASG terminating the container instance hosting it. We also protect the node by disabling EC2 API termination.

So in summary, this is possible to achieve, but it does require quite a bit of wiring:
https://anton.belodedenko.me/couchbase-aws-ecs-docker/

@jjaimon jjaimon commented Jul 31, 2019

I'm wondering if there is any work happening to free Couchbase from IP binding and make it truly "cloud native". In the context of auto scaling and moving workloads around within a cluster (dockerized containers), it is not possible to back up and restore whenever a workload moves from one node to another. While the solution provided above works, I'm a bit concerned about the complexity and possible breaking points in a production environment with large data sets. I'm using Docker Swarm in this case.

Contributor

@hookenz hookenz commented Aug 1, 2019

I forgot to mention in my original comment that I used host networking for the couchbase container instances. So the couchbase container got the IP of the host, with only one couchbase container per host. That worked for me. I understand everyone's use case is different.

@ab77 ab77 commented Oct 26, 2019

@hookenz how are you managing to maintain CB service after an IP change (i.e. EC2 instance stop/start, etc.)?

Contributor

@ceejatec ceejatec commented Sep 3, 2021

In case others come across this, here are the basic rules for Couchbase:

  1. You can use DNS entries rather than IPs for node names. They must be fully-qualified domain names, i.e., contain at least one dot.
  2. When adding a node to a cluster using a DNS name, the IP that it resolves to must be both routable from the originating server AND bindable by the target server. A number of IP masquerading scenarios, such as NAT or Docker Swarm's default overlay network, do not meet these requirements and hence will prevent setting up the cluster correctly.
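
As a rough illustration of the two rules above, here is a minimal Python sketch of pre-flight checks one could run on the target node before adding it by DNS name. This is not a Couchbase tool, and the function names are made up. Rejecting bare IP literals in the first check is an assumption added here. The check for "routable from the originating server" is not covered, since it has to be run from that server.

```python
import ipaddress
import socket
from typing import List


def is_ip_literal(name: str) -> bool:
    """True if the string is an IPv4/IPv6 address rather than a host name."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def looks_fully_qualified(name: str) -> bool:
    """Rule 1: a DNS node name must contain at least one dot (bare IPs are rejected here)."""
    return "." in name.strip(".") and not is_ip_literal(name)


def resolve_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the name resolves to, or an empty list on failure."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_bind(ip: str) -> bool:
    """Rule 2, target side: can this host bind a socket on the given address?"""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((ip, 0))  # port 0 lets the OS pick any free port
            return True
        except OSError:
            return False


def check_node_name(name: str) -> bool:
    """Run on the target node: name is fully qualified and resolves to a locally bindable IP."""
    return looks_fully_qualified(name) and any(can_bind(ip) for ip in resolve_ipv4(name))
```

Behind NAT or a masquerading overlay network, the name would typically resolve to an address the container itself cannot bind, so `check_node_name` would return False there.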

If you meet those requirements, though, you can set up a cluster which is not tied to specific IPs, and you can have nodes fall out of the cluster and re-join with a different IP so long as their data and configuration are still there. I created a demo of doing this with only Docker Swarm on #161. That's a useful setup for testing or small-scale deployments, although it's not exhaustively tested or supported for production deployments.

If you want to be "Cloud Native" with Couchbase, the Autonomous Operator (on top of Kubernetes, GKE, AKS, EKS, OpenShift, or similar environments) is the right production answer.
