Durable local storage #598

Closed
pixie79 opened this issue Jul 24, 2014 · 46 comments
Labels
  • area/api: Indicates an issue on api area.
  • kind/design: Categorizes issue or PR as related to design.
  • priority/awaiting-more-evidence: Lowest priority. Possibly useful, but not yet enough support to actually get it done.
  • sig/storage: Categorizes an issue or PR as relevant to SIG Storage.

Comments

@pixie79

pixie79 commented Jul 24, 2014

Is there a way to pin a pod to a minion?

For example, we have some data stored on the host disk that is persistent between reboots; as such, I need to tell the replication controller that this container should be pinned to, for example, minion1.

@erictune
Member

If something happens to minion1, then your pod can't run. Kubernetes tries
to abstract away dependency on specific minions.

Is it practical for you to do one of the following:

  • install the data on all hosts?
  • make a docker image which includes this data so that it can be installed on any minion?
  • serve it off NFS from a machine which is not a minion, and then access it as a docker remote volume?

I'm guessing not, or you wouldn't have asked, but it would be helpful to understand your use case more.

@lavalamp
Member

What Eric said. We may be forced to add such constraints in the future, but we're going to try hard not to. :)

We (@Sarsate) are working on additional volume types to make this easy.

@pixie79
Author

pixie79 commented Jul 24, 2014

Unfortunately it is not that simple, as the application in the Docker container will be updating the data all the time, and we cannot use NFS because it adds far too much overhead for the access latencies we need.

Ideally, in the future this data would be stored on an SSD volume mounted on the minion. For now I am happy with it being on the host, but it does need to be pinned.

@lavalamp
Member

Yeah, SSD access is one of the things that will probably force us to add some sort of constraint to keep your pod co-located with its SSD.

@lavalamp
Member

Paging @bgrant0607.

@bgrant0607 changed the title from "Pin a pod" to "Durable local storage" on Jul 24, 2014
@bgrant0607
Member

I renamed this issue to narrow it to the specific use case.

Support for durable local storage is an issue that has been raised by several partners in discussions, and is evident in every example application we've looked at (Guestbook, Acme Air, Drupal). This is a requirement for running a database, other storage system (e.g., HDFS, Zookeeper, etcd), SSD-based cache, etc.

Support for more types of volumes ( #97 ) is maybe necessary but definitely not sufficient. We also need to represent the storage devices as allocatable resources ( #168 ).

As I mentioned in #146 , pods are currently relatively disposable and don't have durable identities. So, I think the main design question is this: do we conflate the identity of the storage with the identity of the pod and try to increase the durability of the pod, or do we represent durable local volumes as objects with an identity and lifetime independent of the pods? The latter would permit/require creation of a new pod that could be attached to pre-existing storage.

The latter is somewhat attractive, but would obstruct local restarts, which is desirable for high availability and bootstrapping, and wouldn't interact well with replicationController, due to the need to create/manage an additional object and also to match individual pods and volumes, which would reduce the fungibility of the pods.

So, I'm going to suggest we go with pod durability.

Rather than a single pin/persistence bit, I suggest we go with forgiveness: a list of (event type, optional duration, optional rate) tuples describing the disruption events (e.g., host unreachability) the pod will tolerate. We could support an any event type and infinite duration for pods that want to be pinned regardless of what happens.

This approach would generalize nicely for cases where, for example, applications wanted to endure reboots but give up in the case of extended outages or in the case that the disk goes bad. We're also going to want to use a similar spec for availability requirements / failure tolerances of sets of pods.
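
To make the shape concrete, here is a minimal Go sketch of what such a forgiveness list might look like. The type and field names (Forgiveness, DisruptionEventType, and the event constants) are hypothetical illustrations of the (event type, optional duration, optional rate) tuples described above, not a proposed API.

```go
package forgiveness

import "time"

// DisruptionEventType names a class of disruption a pod may tolerate.
// The specific values here are illustrative only.
type DisruptionEventType string

const (
	HostUnreachable DisruptionEventType = "HostUnreachable"
	HostReboot      DisruptionEventType = "HostReboot"
	DiskFailure     DisruptionEventType = "DiskFailure"
	AnyEvent        DisruptionEventType = "Any"
)

// Forgiveness is one (event type, optional duration, optional rate) entry.
type Forgiveness struct {
	Event DisruptionEventType
	// MaxDuration bounds how long a single event is tolerated before the
	// pod is given up on; nil means tolerate the event indefinitely.
	MaxDuration *time.Duration
	// MaxPerWeek optionally bounds how often the event may recur.
	MaxPerWeek *int
}

// Pinned is the "keep me here regardless of what happens" case: any event,
// infinite duration.
var Pinned = []Forgiveness{{Event: AnyEvent}}
```

A pod spec would then carry a list of these entries; an empty list would keep today's behavior.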

Ideally, the pod could be restarted/recreated by Kubelet directly. This would likely require checkpointing #489 , but initially we'd at least need to be able to:

  • not transition the pod to a stopped state and/or delete it, or at least be able to recreate it with the same identity
  • recover pre-existing storage from a well known place

Regarding the former, we probably need to introduce some indication of outages into the pod status -- probably not the primary state enum, but in a separate readiness field.

Regarding the latter, there are cases where it is convenient to place the storage in the host in a user-specified location, to facilitate debugging, data recovery, etc. without needing to look up long host-specific system-generated identifiers, though that's probably not a requirement for v0.

It might be nice for a durable pod to have a way to request deletion of itself without making an API call. Some people have suggested that run-until-success (i.e., exit 0) is not a sufficiently reliable way to convey this. Perhaps we could use an empty volume on exit as the signal. Certainly that would mean there wasn't any valuable data to worry about, and it would be easy for an application to drop an empty file there if it just wanted to stay put.
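
A tiny sketch of how the empty-volume signal could be checked after a clean exit; the path and function are made up for illustration, and nothing here claims to be how Kubelet would actually implement it.

```go
package main

import (
	"fmt"
	"os"
)

// volumeIsEmpty reports whether the pod's durable volume directory has no
// entries; per the suggestion above, an empty volume after exit 0 would
// signal that there is no valuable data and the pod may be released.
func volumeIsEmpty(dir string) (bool, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return false, err
	}
	return len(entries) == 0, nil
}

func main() {
	// Hypothetical volume path, used only for the example.
	empty, err := volumeIsEmpty("/var/lib/kubelet/pods/example/volumes/data")
	if err != nil {
		fmt.Println("could not inspect volume:", err)
		return
	}
	if empty {
		fmt.Println("volume empty after exit: safe to delete the pod")
	} else {
		fmt.Println("volume non-empty: keep the pod and its storage pinned")
	}
}
```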

Support for raw SSD should be filed as a separate issue, if desired.

@thockin @johnwilkes

@thockin
Member

thockin commented Jul 25, 2014

Can we start with clear statements of requirement? What we have with local
volumes is already pretty durable, as long as the pod stays put. That may
be a side effect of the implementation, but maybe we should keep it.

Alternatively, maybe the answer is "don't use local storage". Just like
GCE has a PD associated with a VM, we could have something similar with
pods.

But that is getting ahead - I don't feel like I really understand the
required aspects of this.

@smarterclayton
Contributor

The durable pod described above matches our experience with a broad range of real world use cases - many organizations are willing to support reasonable durability of containers in bulk, as long as the operational characteristics are understood ahead of time. They eventually want to move applications to more stateless models, but accepting outages and focusing on mean-time-to-recovery is a model they already tolerate. Furthermore, this allows operators to focus on durability in bulk (at a host level), with a corresponding reduction in effort over their previous single use systems.

We'd be willing to describe a clear requirement for a way to indicate that certain pods should tolerate disruption, with a best-effort attempt to preserve local volumes and the container until such time as the operator declares a host "lost".

The suggestion to indicate that a pod is done by clearing its storage is elegant, although in practice the trigger is either user intervention or the container idling out of use.

@smarterclayton
Contributor

User-specified data locations are also not significant for us in the near term.

@johnwilkes
Contributor

+1 to the forgiveness model.

Let's make sure that it's possible to list the same reason (especially
"any") multiple times - we'd like to make it possible to forgive a few long
outages, and many shorter ones.
john

@bgrant0607
Member

@thockin @lavalamp What happens today in the following scenarios:

  • Kubelet dies
  • Docker daemon dies
  • Host reboots
  • Host is down/unreachable for 15 minutes

@lavalamp
Member

For a regular pod without a replication controller: absolutely nothing.

For the replication controller case, except for dockerd death, they result in a new pod being spun up somewhere. And then, if the old pod shows up again, one of the pods will get killed.

Not sure exactly what a dockerd death would cause.

@smarterclayton
Contributor

Dockerd death leaves orphaned processes that the daemon doesn't know are running (last I checked).

EDIT: corrected; right now child processes stay running until the daemon starts again, at which point the daemon loops over all containers and kills them, and then will not restart them unless the daemon's AutoRestart is true (daemon/daemon.go#175)

@thockin
Member

thockin commented Jul 26, 2014

On Thu, Jul 24, 2014 at 10:36 PM, bgrant0607 notifications@github.com wrote:

@thockin @lavalamp What happens today in the following scenarios:

Kubelet dies

Nothing happens

Docker daemon dies

All containers die, kubelet probably craps itself trying to talk to
docker daemon. What should happen is that containers should stay
alive, but kubelet can't talk to dockerd.

Host reboots

Unless someone moves the pod from that host (in etcd), the pod comes back with the host.

Host is down/unreachable for 15 minutes

Nothing unless someone moves the pod in etcd.

What do we want to happen?

@thockin
Member

thockin commented Jul 26, 2014

I think host-pinning and forgiveness/stickiness is going to be unavoidable
for some use cases. There's a difference between pinning and stickiness,
though. Pinning implies that you (the user) know what host you want,
whereas forgiveness says "I'll take any host, but once I get there don't
move me for X or Y happenings"

@KyleAMathews
Contributor

Related to this (has someone created an issue on this yet?) is the need for
hooks for supporting the migration of stateful containers. Moving stateless
containers (e.g. to balance load) is easy — you start a new one on a
different host and kill the old one. But with stateful containers/pods it
almost always takes custom scripting, e.g. a simple rsync for files, or bringing up a DB container as a slave before promoting it to be the new master, or whatever else is needed.

@bgrant0607
Member

@thockin Pinning to a specific host could be achieved either using constraints or the forthcoming direct scheduling API in addition to forgiveness.
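
For contrast, a small Go sketch of the two mechanisms side by side, pinning versus forgiveness; the struct and field names are invented for illustration and do not correspond to the actual pod API.

```go
package sketch

// Forgiveness mirrors the earlier sketch: which disruption the pod tolerates
// and for how long (empty duration meaning "indefinitely").
type Forgiveness struct {
	Event       string
	MaxDuration string // e.g. "15m"
}

// PlacementSpec contrasts the two ideas discussed above. Names are hypothetical.
type PlacementSpec struct {
	// Host pins the pod to a user-chosen minion up front (constraint /
	// direct scheduling).
	Host string
	// Forgiveness lets the scheduler pick the host, but keeps the pod there
	// through the listed disruptions once placed ("stickiness").
	Forgiveness []Forgiveness
}

// Example: pinned outright to minion1, and also willing to ride out reboots
// and up to 15 minutes of unreachability before being given up on.
var example = PlacementSpec{
	Host: "minion1",
	Forgiveness: []Forgiveness{
		{Event: "HostReboot"},
		{Event: "HostUnreachable", MaxDuration: "15m"},
	},
}
```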

@erictune
Member

I usually agree with @bgrant0607 but I'm going to explore a contrary position on this issue.

  • Takes attention away from Stateless Process model.
    • The model of building systems of stateless processes (see e.g. http://12factor.net/processes) is a powerful one. Making pods possibly persistent gives the appearance of controverting that model.
  • Punts on issue of recreating data.
    • If an object is going to be managed by a system like k8s, then it should be possible for the system to recreate the object without user intervention. Once a k8s cluster and pods and controllers are started, it should be able to continue working approximately forever, despite a low rate of hardware failures and repairs. Therefore you have to be able to recreate any particular local chunk of data.
    • Until we have a way to achieve non-zero availability forever, it is too soon to start optimizing for better availability (which is what the forgiveness proposal seems to be doing).
    • Once we have a way to specify how to (re)create a data object, we may find that there is enough information in the specification itself to infer what the "forgiveness" should be.
      • For example, if the requirements for a data item are 1TB, then we should by default forgive (defer rescheduling) outages that are shorter than the mean time to create a 1TB data item (see the sketch after this list).
  • Forgiveness too sophisticated for most users.
    • Forgiveness is well defined for reasoning about availability from first principles. However, my experience is that even users with mature development processes don't reason about availability from first principles; they take an iterative approach, only adjusting their systems in response to problems they actually experience.
    • We should not add a complex feature that may not be necessary. It will certainly complicate writing additional schedulers and make the pod specifications more intimidating.
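
A back-of-the-envelope version of the size-based default mentioned in the 1TB bullet above, as a Go sketch; the ~50 MiB/s recreation rate is an arbitrary assumption chosen only to make the arithmetic concrete.

```go
package main

import (
	"fmt"
	"time"
)

// defaultForgiveness estimates how long to defer rescheduling: roughly the
// time it would take to recreate the pod's data elsewhere. The rule and the
// numbers are illustrative, not project policy.
func defaultForgiveness(dataBytes, recreateBytesPerSec int64) time.Duration {
	return time.Duration(dataBytes/recreateBytesPerSec) * time.Second
}

func main() {
	const tib = int64(1) << 40         // 1 TiB of local data
	const rate = int64(50) * (1 << 20) // assume ~50 MiB/s to rebuild it elsewhere
	d := defaultForgiveness(tib, rate)
	fmt.Println(d) // ≈ 5h49m: outages shorter than this would be forgiven by default
}
```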

@smarterclayton
Contributor

Some thoughts:

  • Is the goal of a system like kubernetes to run only 12 factor applications?
  • Are the benefits of introducing stateful software to a flexible container infrastructure sufficient to outweigh the minor accommodations that they require?
  • Is creating a replication controller today implicitly correlated to creating stateless software?

An assumption I had been working from is that the replication controller is the abstraction that provides the illusion of recreating an object without user intervention. That is, the scheduler does not reschedule containers - instead, the reconciliation loop of the replication controller forms dynamic tension with a scheduler by deleting containers that no longer fit an appropriate definition of health. Thus the scheduler's responsibility is reduced - it only attempts to place new pods, but never reschedules. In this model, the scheduler does not need to know about forgiveness; rather, the replication controller does. And the replication controller is the one that needs to make the decision about health.
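
Sketched as code, the model described above might look like the loop below; the types and helper functions are stand-ins invented for illustration, not the real controller or scheduler interfaces.

```go
package sketch

import "time"

// Pod is a minimal stand-in for whatever object the controller inspects.
type Pod struct {
	Name              string
	Unhealthy         bool
	UnhealthySince    time.Time
	ForgivenessWindow time.Duration // how long disruption is tolerated
}

// reconcile is one pass of the model above: the replication controller, not
// the scheduler, decides whether a pod's forgiveness is exhausted and deletes
// it; replacements are simply created and left to normal scheduling.
func reconcile(desired int, pods []Pod, now time.Time,
	deletePod func(Pod), createPod func()) {

	live := 0
	for _, p := range pods {
		if p.Unhealthy && now.Sub(p.UnhealthySince) > p.ForgivenessWindow {
			deletePod(p) // forgiveness exhausted; give up on this pod
			continue
		}
		live++ // healthy, or still within its forgiveness window
	}
	for i := live; i < desired; i++ {
		createPod() // placement of new pods is the scheduler's only job
	}
}
```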

If that's not the case, then where does that responsibility lie? Would the scheduler be responsible for deleting pods off one location, placing them on another, and determining whether that transition is appropriate? If so, that seems like a growth in responsibility of the scheduler - every scheduler would then need to deal with the complexity of knowing "is this transition appropriate" and hardcoding the list of transitions.

The former model seems more flexible - for instance, replication controller types can be created of arbitrary complexity - including forgiveness - with each replication controller needing to deal with the consequences of when delete is appropriate. And ultimately, determining when something should be deleted is often specific to the use case (a task vs. a service vs. a build vs. a stateful service, etc.).

@erictune
Member

I wasn't suggesting that we should not support stateful apps, just that we should not support stateful apps with only the Pod object.

Brian said earlier:
do we conflate the identity of the storage with the identity of the pod and try to increase the durability of the pod, or do we represent durable local volumes as objects with an identity and lifetime independent of the pods?

I was arguing for the second option; that we have a different object type to represent that persistent data.

Those data objects would need their own replication control: data could be more widely replicated than Pods, and could be in different states.

@smarterclayton
Contributor

I can at least talk to running moderately dense container hosts with a single shared durable storage volume per host (that each container was allocated storage on). The storage is network attached, and snapshot-able for backup purposes. For most outages, the volume was detached and reattached to a new host with the same identity as the old host, with the containers not rescheduled. This has been a reasonable solution for most operations teams running OpenShift - trading off some availability for the reduced complexity of managing a single volume during recovery. And in those types of outages, the most current and accurate data for those containers is on the volume, so replication is unlikely to be faster than restoring the volume.

This is just one particular scenario, but it's a sort of local minimum of availability for stateful containers at reasonable density and familiarity to ops teams (at lower densities, individual attached volumes are probably better). And in this case, forgiveness does seem to model the tradeoff better - waiting longer before deciding the state is gone.

However, to your point, if that particular volume is never coming back, having a well-set-up model for distributing state and tracking independent volumes reduces your vulnerability to total loss. Also, the planned-reallocation case works better under your model - if I decide to evacuate a host for maintenance I may very well want to rebalance to other hosts, and that requires a certain volume with a certain set of data in place on that other host.

@thockin
Member

thockin commented Aug 23, 2014

Revisiting older topics that I think are important: This tapered off with no clear resolution. I still don't feel like I understand the behavioral requirements. We've discussed a lot of considerations of a couple of implementations, but have not discussed exactly what we are trying to achieve.

Are we trying to enable data objects to have a lifetime that is decoupled from any one pod?

Are we trying to allow pods to have $large "local" data (i.e. filesystem, not DB or other storage service)?

other?

@smarterclayton
Contributor

I think data objects decoupled from pods are modeled with sufficient granularity in volumes today. Being able to define some level of pod stability that does not cause significant scheduling difficulties has value for places where large local data exists. Perhaps this belongs as a scheduler problem, where an integrator can determine that a volume type has a corresponding impact on scheduling decisions.

@thockin
Member

thockin commented Aug 26, 2014

Volumes have a lifetime coupled perfectly to their pod. If we are arguing that there's a need to have durable data that outlives any pod, we have not really started that design.

@orospakr

I think a major use case for this is Pod software upgrades. Right now, upgrading software deployed inside Containers in a Pod is, afaict, a destructive operation.

@bgrant0607 added this to the v1.0 milestone Aug 28, 2014
@goltermann removed this from the v0.8 milestone Feb 6, 2015
@pixie79 removed this from the v0.8 milestone Feb 6, 2015
@pwFoo

pwFoo commented Apr 21, 2015

I don't know how Kubernetes or Docker works internally, but durable local storage should work with a data-only container if native Docker is used?
So maybe data containers and an implementation of volumes-from would work to keep local data persistent? Because I will be moving to a Docker host, I need local persistent storage.

But it seems there is no solution in the near future?

@thockin
Member

thockin commented Apr 22, 2015

I am not against some variant of data containers, but I don't really know
how people are using data containers.

Is the goal just to get a stable/recoverable host dir to write into? Is
the goal to preload said dir with the contents of a docker container?
Something else?

@pwFoo

pwFoo commented Apr 22, 2015

Data containers are persistent and reboot-safe. After reading some issues, hostDir seems to have some permission problems. Data containers could be a reboot-safe volume solution, and simpler to move to another minion than hostDir data if needed (?).
So data containers could be a better way to handle volumes?

@markturansky
Contributor

@thockin could you explain for me the difference between a data container and a persistent volume? I'd like to understand this issue better.

@smarterclayton
Contributor

I discussed this in another issue, but I'd prefer a GC'd directory identified by a unique value that anyone who knows the value could reuse. Given proper uid support, that content is protected by Unix rules and would satisfy the "I need a dir that is best-effort reused per host across multiple pods" case, e.g. as a build cache or scratch dir. But it would need time-based GC after the last reference is released.
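
A rough Go sketch of that idea, a per-host directory keyed by a caller-supplied value, reference-counted, and garbage-collected some time after the last reference is released. All names and the TTL behavior are assumptions for illustration; nothing like this exists in the project.

```go
package scratchcache

import (
	"os"
	"path/filepath"
	"sync"
	"time"
)

// Cache hands out best-effort per-host directories keyed by a shared value
// (e.g. a build-cache key). Directories survive across pods on the same host
// and are removed only after a TTL with no references. Entirely illustrative.
type Cache struct {
	mu    sync.Mutex
	root  string
	ttl   time.Duration
	refs  map[string]int
	freed map[string]time.Time // key -> when its refcount hit zero
}

func New(root string, ttl time.Duration) *Cache {
	return &Cache{root: root, ttl: ttl, refs: map[string]int{}, freed: map[string]time.Time{}}
}

// Acquire returns (creating if needed) the directory for key and takes a reference.
func (c *Cache) Acquire(key string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	dir := filepath.Join(c.root, key)
	if err := os.MkdirAll(dir, 0o700); err != nil { // rely on Unix perms to protect content
		return "", err
	}
	c.refs[key]++
	delete(c.freed, key)
	return dir, nil
}

// Release drops a reference; the directory becomes eligible for GC after the TTL.
func (c *Cache) Release(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.refs[key] > 0 {
		c.refs[key]--
	}
	if c.refs[key] == 0 {
		c.freed[key] = time.Now()
	}
}

// GC removes directories whose last reference was released more than ttl ago.
func (c *Cache) GC() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for key, t := range c.freed {
		if time.Since(t) > c.ttl {
			os.RemoveAll(filepath.Join(c.root, key))
			delete(c.freed, key)
			delete(c.refs, key)
		}
	}
}
```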

@eparis
Contributor

eparis commented Apr 28, 2015

To understand data containers, we need to remember that container != image.

In the docker model, you can make changes to your container's filesystem, stop the container/reboot the host, and then you can restart the container and it will still have those FS changes.

In the kube model, we tend to believe that containers are always launched cleanly from an image and storage/filesystem changes should be done 'outside' of the container.

"Data containers" are much more in the docker thoughts. You can create a container and make some changes to the filesystem in that container. Docker can then mount the filesystem from one container into another container. And the changes are in the 'data container.'

It's like what we do with volumes, but they do it with containers (and, like many things Docker, it is only really elegant on a single host).

An example of a 'data container' could be for configuration. You could create a container filled with your rsyslog configuration and another container which actually has rsyslog. You launch the rsyslog container, mounting the /etc/ files from the configuration container into the container with the daemon. Now you can update the rsyslog container independently of the config, and the config independently of the binary.

Another example would be a container to save stateful data. Create a container which just has /var/lib/etcd/. Now mount that container's /var/lib/etcd/ into your etcd container. You can update/change the etcd container without worrying about the data. You can also 'save' the data container as an image and docker push/docker pull to get the data onto another host, in case you wanted to migrate the data.

I haven't read this issue, so the following statements are likely worth exactly 0. But in general, I am not a fan of Docker's mutability of containers. I like that Kubernetes has a clear and rational expectation that running the same command two times will give the same results. If we choose to use containers under the covers for some type of data storage, that is fine, but I really hope we don't expose that to the user. We should expose some functionality to the system user, not some underlying detail...

@pikeas

pikeas commented May 14, 2015

It's been close to a year since this issue was opened, is Kubernetes really no closer to providing durable local storage and nominal/stateful services (#260)?

@pwFoo

pwFoo commented Jun 2, 2015

Why not create a data volume which creates a data-only container in the background?
It would be added to the container/pod via the volumes-from Docker option and would be persistent/durable.

@thockin
Member

thockin commented Jul 9, 2015

Folding this together with #7562.

@thockin closed this as completed Jul 9, 2015
vishh pushed a commit to vishh/kubernetes that referenced this issue Apr 6, 2016
Switch to gliderlabs/alpine Docker image.
monopole added a commit to monopole/kubernetes that referenced this issue May 30, 2017
The kubectl decoupling project (kubernetes#598) requires many BUILD edits.

Even relatively simple PR's involve many OWNER files, e.g. kubernetes#46317 involves five.

We plan to script-generate some PRs, and those may involve _hundreds_ of BUILD files.

This project will take many PRs, and collecting all approvals for each
will be very time consuming.
k8s-github-robot pushed a commit that referenced this issue Jun 3, 2017
Automatic merge from submit-queue

Add jregan to OWNERS for kubectl isolation work.

The kubectl decoupling project (#598) requires many BUILD edits.

Even relatively simple PR's involve many OWNER files, e.g. #46317 involves five.

We plan to script-generate some PRs, and those may involve _hundreds_ of BUILD files.

This project will take many PRs, and collecting all approvals for each will be very time consuming.

**Release note**:
```release-note
NONE
```
mrIncompetent pushed a commit to kubermatic/kubernetes that referenced this issue Jun 6, 2017
The kubectl decoupling project (kubernetes#598) requires many BUILD edits.

Even relatively simple PR's involve many OWNER files, e.g. kubernetes#46317 involves five.

We plan to script-generate some PRs, and those may involve _hundreds_ of BUILD files.

This project will take many PRs, and collecting all approvals for each
will be very time consuming.
deads2k pushed a commit to deads2k/kubernetes that referenced this issue Mar 9, 2021
UPSTREAM: <carry>: allow kubelet to self-authorize metrics scraping