Persistent data on nodes #150
Comments
+1 |
It would be awesome if direct attached storage can be implemented as a preview of some sort. One of the things that came to my mind is the idea of updating containers while keeping the storage: For example, let's say we have a MySQL 5.6.16 container running and it was allocated 20GB of storage on a client to store its data. If there's a new version of the MySQL container (5.6.17), we want to be able to swap out the container but still keep the storage (persistent data) and have it mounted into the updated container. This way, we can upgrade the infrastructure containers without data loss or having a complicated upgrade process that requires backing up the data, upgrading then restoring it. |
@F21 good use case too. |
Just to add some details to this, we need two overarching features. The first is the notion of persistent storage that is either mounted into or otherwise persisted on the node, and is accounted for in terms of total disk space available for allocations. The second is global tracking of where state is located in the cluster so jobs can be rescheduled with hard (mysql) / soft (riak) affinity for nodes that already have their data, and possibly a mechanism to reserve the resources even if the job fails or is not running. Since these features are quite large, we would implement them gradually, for example with floating persistence (a la EBS), node-based persistence, global storage state tracking, soft affinity, hard affinity, and offline reservations as independent milestones. I can't yet speak to when / if we will implement these features. |
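The hard versus soft affinity distinction in the comment above can be sketched in a few lines. This is not Nomad's scheduler: the `Node` type, the `place` function, and every field name are hypothetical, and the policy (prefer a node that already holds the volume, fall back to another node only under soft affinity) is an assumption drawn from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical view of a client node as tracked by the cluster."""
    name: str
    free_disk_gb: int
    volumes: set[str] = field(default_factory=set)  # names of volumes stored here


def place(nodes: list[Node], volume: str, size_gb: int, hard: bool) -> Optional[Node]:
    """Reschedule a task that previously wrote to `volume` (illustrative policy)."""
    # First choice in both modes: a node that already holds the data.
    for node in nodes:
        if volume in node.volumes:
            return node
    if hard:
        # Hard affinity (the mysql case): do not run without the existing data.
        return None
    # Soft affinity (the riak case): start fresh on the node with the most
    # free disk, provided the requested size fits.
    candidates = [n for n in nodes if n.free_disk_gb >= size_gb]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: n.free_disk_gb)
    best.volumes.add(volume)       # record where the state now lives
    best.free_disk_gb -= size_gb   # account for the allocation
    return best
```

Recording the volume on the chosen node stands in for the "global tracking of where state is located" mentioned above; a real implementation would also need the offline reservation step, which this sketch leaves out.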
@cbednarski sounds like you guys are going in the right direction! Cool! |
Hi @cbednarski, thanks for the explanation of your goals. This is something that we would also like to see. Other services that we would like to manage that require data storage are Redis Sentinel, Consul, and NSQ. Currently we dedicate nodes to these tasks, so affinity is something that we manage manually. I understand that other use cases might need/want some more magical targeting, but it would be interesting to see some way of manually deciding this before deciding on how this fits into the overall model. My point is that if Nomad provides some way of manually assigning data volumes to containers, and leaves the logic of making sure the containers only start on the correct hosts to manual configuration, then we could start to get a feeling of how it all works and, with that experience, design better models afterwards. I found this in the code: Line 103 in 628f395
Thank you, |
+1 to @cbednarski's thoughts. We might also need to think about the identity of data volumes. Data volumes are slightly different from other compute resources like cpu/memory/disk, which are scalar in nature and can be loaned to any process that needs them. Volumes, by contrast, are usually used by the same type of process that created them in the first place, and users might need to refer to a volume by name when specifying a task definition. For example:
In this example we are asking Nomad to place the task on a machine where the volume named |
Maybe it's possible to lean on https://github.com/emccode/rexray for non-local storage such as Amazon EBS. It doesn't manage persistence on local disks though, so that portion would still need to be implemented. |
I am one of the maintainers of https://github.com/libopenstorage/openstorage. The goal of this project is to provide persistent, cluster-aware storage to Linux containers (Docker in particular). It supports both data volumes and the Graph driver interface, so your images and data are persisted in a multi-node, scheduler-aware manner. I hope this project can help with what Nomad would like to achieve. The open storage daemon (OSD) itself runs on every node as a Docker container. The daemons discover other OSD nodes in the cluster via a KV DB. A container run via the Docker remote API can leverage volumes and graph support from OSD. OSD in turn can support multiple persistent backends. Ideally this would work for Nomad without doing much. The specs are also available at openstorage.org |
@gourao That sounds really exciting! Are there any plans to support things beyond docker: qemu, rkt, raw exec etc? |
Yes @F21, that's the plan. There are a few folks looking at rkt support, and as the OCI spec becomes more concrete, this will hopefully be a solved problem. |
+1 While testing nomad with simple containers i did not realize that there was no option in the job syntax for bind mounts which i used when dealing with docker directly. :( I like @diptanu's proposal. As @melo mentioned nomad is already doing something like this Line 103 in 628f395
Most other tools to manage Docker containers allow users to specify volumes on container creation (Marathon and Shipyard, for example... Kubernetes too, I think?). I'm a novice in Go, so I have not tried anything myself so far. :) |
+1 Is there any timeframe for this? Volumes not being supported is a huge deal breaker for us using Nomad. |
+1 |
Yea, for my initial use it is acceptable to enable raw_exec and work around this issue, but that is only because this is not yet truly production use. I too could not put nomad in production without the most basic docker volume mount to the host being supported by the docker driver. |
+1 need docker volumes too for production. |
I'm interested in this not just for Docker, but also for qemu and an in-house Xen implementation. That is, it would be nice if the solution was generic enough to be useful for all task drivers. |
+1 no way to use in production without docker volumes |
👍 |
So I have been running into this issue myself, as it's a pretty fundamental idea to use volumes in conjunction with Docker. I understand there is a much larger architecture and design discussion to have around how to manage storage using Nomad in general. However, when I was thinking about the issue, I came to the idea of specifying arbitrary commands to pass on down to Docker. Something like this:
This would be entirely un-monitored via Nomad, and placing the container so that its volumes worked would be up to the end user, i.e. they would specify the necessary constraints on the job. No idea if this is even possible, but figured I would voice the idea at the very least. |
👍 for --volumes flag. another great use case: running a cadvisor container as a system service on all nodes that can pipe stats to oh, say, influxdb. In this sense, it has less to do w/ persistent storage than providing volume mounts to the container to monitor the underlying host. Per the cadvisor docs on getting it running:
|
Any way I could use this seems like an ultimate solution for my needs |
or probably I should use something simpler, like https://github.com/leg100/docker-ebs-attach |
@let4be: Not currently. Nomad has no support for persistent volumes yet. |
@dadgar The most important and very simple feature I'd like to see is a simple bind mount into the containers (Docker's -v option, or something similar for rkt and whatever else there is). Other features, like integrations with Docker's "storage" containers, won't get near our persistent data, since those introduce quite a bit of complexity (and dependencies on the container service to make sure data is migrated whenever we update the container service, be it Docker, rkt or anything similar). We run services like zookeeper, kafka, mesos, cassandra, haproxy, docker-registry, nginx and similar inside the containers we manage with our service, but we'd like to manage those services/containers fully through Nomad instead, which means "system" jobs for most deploys. Since services have quite varying requirements, we define roles for hosts with different specs; the requirements of services on the infrastructure level vary so much that it's not really useful to try to launch stuff fully dynamically on random nodes. In our case we don't need or even want any magic for finding or managing storage; we want to tell Nomad which servers to run which task on (through system tasks in this case). All nodes with role/class "cassandra" run the cassandra container/service, and all those nodes have decently specced storage that we guarantee will be available at the same place on the host. This is also a requirement to be able to monitor disk space and disk utilization for each class/role of service properly. Regarding security: |
I've made a generic workaround to handle docker volumes using the raw_exec driver, available on github here: https://github.com/csawyerYumaed/nomad-docker |
If you want to use Docker bind mounts in Nomad but still want to use the docker driver, you should totally check out this new as-good-as-ready-for-production tool I just made: https://github.com/carlanton/nomad-docker-wrapper |
We are going to start working on volume plugins in the next Nomad release. In the interim (in the upcoming 0.5 release), we will enable users to pass the volume configuration option in the docker driver configuration. Operators will have to explicitly opt in to allowing users to pass the volume/volume driver related configuration options in their jobs by enabling it in the Nomad client config. Users should keep in mind that Nomad won't be responsible for cleaning up things behind the scenes with respect to network-based file systems until support for Nomad's own volume plugins comes out. |
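The operator opt-in described above amounts to a gate that rejects volume settings in a job unless the client allows them. The sketch below illustrates that idea only. The function name, the `volumes_enabled` option key, and the dictionary shapes are hypothetical placeholders, not Nomad's actual client options or driver code.

```python
def validate_docker_config(task_config: dict, client_options: dict) -> None:
    """Raise if a task asks for volumes on a client that has not opted in.

    `volumes_enabled` is a placeholder key, not a documented Nomad setting.
    """
    wants_volumes = bool(task_config.get("volumes")) or bool(
        task_config.get("volume_driver")
    )
    # Treat anything other than an explicit "true" as "not enabled".
    enabled = str(client_options.get("volumes_enabled", "false")).lower() == "true"
    if wants_volumes and not enabled:
        raise PermissionError("volume configuration is disabled on this client")
```

Defaulting to "disabled" when the key is absent matches the explicit opt-in wording of the comment; whether a released version ships with that default is not something this thread confirms.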
That's great news and a sensible intermediate step |
@diptanu is there any chance to bring that to rkt as well ? |
Is there a schedule attached to that release by chance? Hard to sell people on Nomad without mounts. |
@w-p We are trying to do the RC release this week, and main release next week. |
Thanks for getting the RC out. |
Is there any support now for doing stuff like MySQL with persistent data volumes? |
@diptanu, do you have any milestone or ETA for volume drivers? |
I believe they are now supported. You can pass, in the docker config section of the job spec, an array of strings with the same format you would use in the docker run -v command. |
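To make the "same format as docker run -v" point concrete, here is a small sketch of how such strings break down into source, target, and mode. It assumes Unix-style paths and the common `[source:]target[:mode]` shape; the `BindMount` type and `parse_volume` function are illustrative names, not part of Nomad or Docker.

```python
from typing import NamedTuple, Optional


class BindMount(NamedTuple):
    source: Optional[str]  # host path or volume name; None for an anonymous volume
    target: str            # absolute path inside the container
    mode: str              # for example "rw" or "ro"


def parse_volume(spec: str) -> BindMount:
    """Split a `-v`-style string into its parts (Unix paths assumed)."""
    parts = spec.split(":")
    if len(parts) == 1:
        source, target, mode = None, parts[0], "rw"
    elif len(parts) == 2:
        source, target, mode = parts[0], parts[1], "rw"
    elif len(parts) == 3:
        source, target, mode = parts
    else:
        raise ValueError(f"too many ':' separators in {spec!r}")
    if not target.startswith("/"):
        raise ValueError(f"container path must be absolute in {spec!r}")
    return BindMount(source, target, mode)
```

For example, `parse_volume("/data/mysql:/var/lib/mysql")` yields a read-write mount of the host directory `/data/mysql` at `/var/lib/mysql`, which is the kind of entry the array in the job spec would hold.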
If the crux of this issue is the Docker volume driver, I think you guys addressed it with the recent PR*. If it's about extending the resource model, I'd suggest that'll take quite some time and maybe become its own design. I defer to others, as I'm just learning about Nomad myself. Thanks! |
@dadgar is this one on track for 0.6.0? |
@c4milo No, this isn't being tackled in 0.6.0 |
Since Nomad 0.7.0, what is the recommended best practice for running a database Docker container that requires a persistent data volume? |
@maticmeznar, I cannot speak for "recommended best practice", and there is more than one way to achieve it, but I can share the approach that we are using at the moment. For instance, REX-Ray and similar solutions look attractive for using cloud provider storage (like AWS S3, Google Cloud Storage, etc.), but we haven't tried them. What we use at the moment is a separate distributed, replicated storage cluster (GlusterFS currently; there are alternatives): we mount the GlusterFS volume(s) on each node in the Nomad cluster and map an appropriate folder from the mounted volume into the Docker container.
Again, there are multiple ways to go about data persistence with managed Docker containers; I hope our perspective may be helpful to somebody. |
The absence of a proper solution to volume management with Nomad is literally the only reason I cannot recommend it to our clients and/or use it instead of Kubernetes. Its Vault and Consul integration, ease of use, minimal installation overhead and workload support are intriguing, but none of it matters because it cannot be trusted with persistent data 😞 I wish this was higher up the product backlog. |
Well, I don't think that's 100% true. Nomad definitely has support for
persistent data, but maybe not in the way that you are expecting. Many of
us have used a variety of methods to ensure our needs are met here, and the
experience was not terrible. I would recommend them to other people.
Kubernetes is not the same, and it's not reasonable to do a direct
comparison of features (you would need to compare the "ecosystems" more
than the individual components).
|
If you look at this thread's origin --- "Nomad should have some way for tasks to acquire persistent storage on nodes." --- it doesn't say that Nomad itself should procure/acquire the persistent storage, only that the task should have a way. One way is through "container storage on demand". Assuming use of the Nomad 'docker' driver, if the volume-driver plugin can present relevant meta-data at run-time, then it's possible for the storage to be provisioned on-demand when the task starts. Here's what this might look like:
In this case, a 10GB volume named "myvol" gets created, with synchronous replication on 3 nodes and is mapped into the container at "/mnt/myapp". The task acquires the persistent storage. This capability is available today through the Portworx volume-driver plugin, as documented here: https://docs.portworx.com/scheduler/nomad/install.html (*) disclaimer: I work at Portworx. |
Hello, I've seen a lot of discussion about persistent storage with Docker containers which I've been using effectively. However I'm also keenly interested in persistent storage for qemu VMs scheduled through nomad. I may have overlooked something but I don't see this as an option. Is there any expectation of adding this? Or is there any path with existing configuration to achieving some form of persistent storage? |
@dvusboy Could you share your wrapper code please? |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad should have some way for tasks to acquire persistent storage on nodes. In a lot of cases, we might want to run our own hdfs or ceph cluster on nomad.
That means things like HDFS datanodes need to be able to reserve persistent storage on the node they are launched on. If the whole cluster goes down, once it's brought back up, the appropriate tasks should be launched on their original nodes (where possible), so that they can regain access to the data they have previously written.