
Image volumes and container volumes #831

Open
thockin opened this issue Aug 8, 2014 · 140 comments


@thockin (Member) commented Aug 8, 2014

This would map closely to Docker's native volumes support, and allow people to build and version pre-baked data as containers. Maybe read-only? Haven't thought that far...

@brendandburns (Contributor) commented Aug 8, 2014

I guess so? But why not just use a git repo?


@thockin (Member, Author) commented Aug 8, 2014

More plugins more better? I wanted to put it out there, since we do sort
of diverge from Docker's native volumes support. Clearly not urgent :)


@bgrant0607 (Member) commented Oct 1, 2014

More potential uses of this:

  • Deployment of scripts/programs for lifecycle hooks: One of the main points of the lifecycle hooks (#140) is to decouple applications from the execution environment (Kubernetes in this case). If the hook scripts/programs must be deployed as part of the application container, that compromises this objective.
  • Dynamic package composition more generally: This would be more similar to our internal package model, where we can independently manage the base filesystem, language runtime, application, utility programs for debugging, etc.
  • Configuration deployment
  • Input data deployment

A git repo could be used for some of these cases, but for others it would be less than ideal.

@erictune (Member) commented Oct 1, 2014

If the base image of a Dockerfile (e.g. FROM fedora) is a Linux distro,
then isn't it going to be annoying to have a bunch of Linux Standard Base
type files in what is really supposed to be a data-only package?

On the other hand, if it is created using tar -c . | docker import - myimage, then what is the advantage of a docker image over a tar file?
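For concreteness, a hedged sketch of the two construction paths being compared; the image name mydata:v1 and the data/ directory are illustrative, and the docker commands are commented out since they need a daemon:

```shell
# Path 1: import a bare tarball as an image -- no distro base layers at all:
#   tar -c . | docker import - mydata:v1

# Path 2: a minimal Dockerfile FROM scratch, so the image carries only the
# data but still gets docker's build/tag/push versioning:
cat > Dockerfile.data <<'EOF'
FROM scratch
ADD data/ /data/
EOF
#   docker build -f Dockerfile.data -t mydata:v1 .

grep -q '^FROM scratch' Dockerfile.data && echo "data-only Dockerfile written"
```

With Path 2 the image has no Linux Standard Base files at all, which sidesteps the annoyance above; the remaining question is whether the registry workflow justifies using an image rather than a plain tarball.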


@thockin (Member, Author) commented Oct 1, 2014

How about the VOLUME directives of a container? Any container can declare
itself to be exposing any number of volumes. Maybe the functionality to
expose is not the whole container, but just the volumes from that container?

On the other hand, people can create data containers "FROM scratch", and
who are we to say it's annoying?


@dchen1107 (Member) commented Oct 1, 2014

@thockin I like the initial idea of having a new volume type to support Docker's data volume containers, which matches the common package, shared package, etc. concepts we have internally. I can also see the potential use cases listed by @bgrant0607. But please don't go down the road Docker took: declaring a container as a data volume would introduce another level of dependency complexity between containers within a pod, or even between pods if a pod holds only a data volume container. I think your initial idea of a volume type that refers to a docker volume container or another read-only volume is the better approach in the long run.

@thockin (Member, Author) commented Oct 1, 2014

The interesting thing about docker volumes is that a container does not
have to be RUNNING for the volumes to exist. It's a weird model, but I
think it could work.

I don't think we know what people really want in this space yet, though.


@erictune (Member) commented Oct 6, 2014

It appears the same net effect as this issue can be achieved without introducing a new volume type, using two containers in a pod and some shell wrapped around the underlying container command lines (see #1589 (comment)).

How to decide on container-as-a-volume vs. command-line-based sequencing?

  • more portability between Kubernetes and non-Kubernetes docker use cases with container-as-volume.
  • easier for user to discover container-as-volume concept and identify that it is the right solution?
  • having fewer and less general mechanisms for setting up "packages" may lend itself to more tightly integrated build/deploy systems. But, maybe that is not a goal for Kubernetes.
  • either solution can integrate with data durability, I think.
  • liveness checking is more complex with command-line-based sequencing, since the pod to be liveness checked goes through a waiting phase and then a running phase.
@stp-ip (Member) commented Nov 21, 2014

There are a lot of ways to go about it, and a new volume type is in my opinion not needed. I tried to standardize the way we structure data and make different volume providers possible. These range from host volumes, data volume containers, and side containers with additional logic, to Volume as a Service, which is where k8s could integrate greatly. The start is already available via git as a volume. I think the native volumes in Docker are enough; they just lack a standard. The more detailed ideas are available at moby/moby#9277.

@bgrant0607 (Member) commented Nov 21, 2014

I think the question is whether the data volume container should be represented as a container or as a volume. I prefer to think of them as volumes and find passive containers to be non-intuitive for users, problematic for management systems to deal with, and the source of weird corner-case behaviors.

@stp-ip (Member) commented Nov 21, 2014

@bgrant0607 Still, they are supported in Docker, and therefore we should acknowledge they exist. I would love to see more integrated methods in k8s itself, which just expose a specific type of volume; I was hinting at that in my proposal via the VaaS approach. But I would dislike this approach reducing compatibility.

@mindscratch (Contributor) commented Dec 18, 2014

+1 for supporting a container as a volume. I have a scenario where a container has a bunch of data baked into it for use by other containers; it helps keep the data "local" to the work being done.

@rehevkor5 commented Jan 9, 2015

Whatever you decide, I hope you will make it clear in the documentation, to save people the time of searching around to find this information. Currently, the documentation for both Compute Engine and Container Engine:

  1. makes no mention that VOLUME/--volumes-from/container-as-volume is not supported
  2. makes no mention of possible work-arounds
  3. makes no mention of the possibility of retrieving things from a git repo

It's important to note that using a git repo isn't the same. It requires the git repo to be securely accessible from Google Cloud (or wherever Kubernetes is being used). Further, it's unclear how non-public repositories would be accessible, unless a username and password are hard-coded into the Kubernetes GitRepo#repository JSON/YAML string. It also requires that the desired artifact(s) be checked in to source control. And it decouples the Docker image from the artifact (which may or may not be desirable).

I will be working around this issue by moving the data that's in my container volume into a Dockerfile that layers on top of the container that wanted to use the volume, with ADD. The problem you're running into is that the community at large is encouraging the "container as volume" approach in websites and blog posts, and as a result people will continue to have difficulty. For example, the docker website itself says, "If you have some persistent data that you want to share between containers, or want to use from non-persistent containers, it's best to create a named Data Volume Container, and then to mount the data from it." (emphasis mine).

Also, @erictune a container-only volume can be (and probably should be) written as "FROM scratch". I'd argue that if the user doesn't do it that way, that's their choice.
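The ADD-based workaround described above could be sketched like this; the image names myapp:latest and myapp-with-data:latest and the volume-data/ path are hypothetical, and the build command is commented out since it needs a daemon:

```shell
# Instead of mounting a data volume container, bake the data into a new image
# layered on top of the container that wanted the volume:
cat > Dockerfile.workaround <<'EOF'
FROM myapp:latest
ADD volume-data/ /data/
EOF
#   docker build -f Dockerfile.workaround -t myapp-with-data:latest .
```

The trade-off is that every change to the data now means rebuilding and redeploying the combined image, rather than swapping a volume independently.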

@AshleyAitken commented Jan 11, 2015

+1 @rehevkor5

I am disappointed to hear that k8s doesn't support data volume containers.

I am not sure how I am supposed to abstract r/w data away from the host now. I was under the impression that k8s was about abstracting away the host, but a host volume introduces all sorts of host-specifics, like having to share the same username/group for access to the data.

@rehevkor5 I thought the same thing about data volume containers at first (should be written as FROM scratch) until I read this (which may or may not be correct): http://container42.com/2014/11/18/data-only-container-madness/ Your workaround seems to do just this?

@thockin (Member, Author) commented Jan 11, 2015

There are a few things going on here, most importantly (I think) some
confusion.

Kubernetes supports the notion of a writable "empty volume" that is shared
across containers within a pod, without resorting to host directories -
this is what an emptyDir volume is.

Now the question comes down to "but I don't want an empty directory, I
want <something>". The question I think we need to sort out is what
the <something> is. We currently support pull-from-git, which is just
a single instance of a larger pattern - "go fetch some pre-cooked data once
before starting my pod".

We could support myriad such "fetch once" volume kinds: git, cvs, svn,
docker containers (more below), URLs, in-line base64-encoded tarfiles,
stdout of another program, etc. I do NOT think we want to support those
all as independent volume plugins - they can almost all be done by an
unprivileged container without any help from kubelet. More, you quickly
arrive at the follow-up features like "...and re-pull from git every 10
minutes" - things that stop being "fetch once" and start being active
management, but do not require privileges. We make great use of such
things internally.

IMO: all of these things that can be run as a side-car container in your
pod (writing to a shared emptyDir) SHOULD BE. Git should stop being a
first-class volume, and should instead be a container of some sort. This
brings a slew of new design questions: Is it just a container like all the
other app containers? How do I ensure it runs BEFORE my app containers?
What if it experiences a failure? I don't have answers to all of these yet.

Now, let's think about the case of docker data volume containers. What is
the semantic that people are really asking for? Is it:

  • "run" a data container and then expose the entire chroot of that run?
  • "run" a data container and then expose all of the VOLUME (from
    Dockerfile) dirs as kubernetes volumes?
  • "run" a data container and then do the equivalent of --volumes-from into
    kubernetes containers?

These are all subtly different semantically, especially in the face of a
data container that has multiple VOLUME statements. Some operating modes
also make it hard to verify input until after a pod has been accepted,
scheduled, and attempted on a kubelet (we try to validate as much as we can
up front).

ACTION ITEM: I'd very much like for people who use docker data containers
to describe what behavior they would expect here.

Back to side-car containers as volumes. I could imagine something like:

Pod {
  spec: {
    volumes: [
      { Name: "git-data",
        Source: FromContainer {
          Name: "awkward",  // what goes here?
          Image: "kubernetes/git-volume",
          EnvVars: [
            { REPO: "http://github.com/yourname/something" },
            { RESYNC: "true" },
            { RESYNC_INTERVAL: 60 }
          ]
        }
      }
    ],
    containers: [ ... use git-data ... ]
  }
}

Sadly, there's not much of the container-ness that you can really hide, so
you end up re-using the Container schema, which is at least somewhat
redundant and awkward.

Alternately, I could imagine just making them extra containers:

Pod {
  spec: {
    volumes: [
      { Name: "git-data", Source: EmptyDir {} }
    ],
    containers: [
      {
        Name: "git-puller",
        Image: "kubernetes/git-volume",
        EnvVars: [
          { DATADIR: "/vol" },
          { REPO: "http://github.com/yourname/something" },
          { RESYNC: "true" },
          { RESYNC_INTERVAL: 60 }
        ],
        VolumeMounts: [ { Name: "git-data", MountPath: "/vol" } ]
      },
      { ... container that reads git-data ... }
    ]
  }
}

It's still pretty verbose. Can we do better?

In this sort of model, almost any sort of logic can be bundled up as a
container and published, and anyone can use it immediately.

Something like this is, I think, the way to go. Details to emerge, input
wanted.


@thockin (Member, Author) commented Jan 11, 2015

To follow up to myself - all of this assumes that any data mutations you
make have a lifetime equivalent to the pod. If the pod dies for any reason
(the machine goes down, it gets deleted in the API, some non-recoverable
failure in kubelet, etc) the data dies with it.

Durable data is a MUCH larger topic :)


@mindscratch (Contributor) commented Jan 11, 2015

I'll try to describe a use case for a data container. I'll have a pod with
3 containers; I'll name them "ingest", "process" and "data".

The ingest container is responsible for getting messages in some fashion
and telling the "process" container to do work.

The process container does work, but it requires access to data provided by
the "data" container; outside of Kubernetes this is done using Docker's
"volumes-from". This "data" can be hundreds of megabytes, but most often is
10-15 gigabytes.

The data container has a process responsible for pulling the data that will
be needed by the process container. While the process container is doing
work, it's possible that a new set of data becomes available. The data
container can fetch the data and use something like symlinks to swap it in,
so the next time the process container begins a new process it's using the
newly available data.

Hopefully that makes some sense.

Thanks
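The symlink swap described above is usually done as link-then-rename so readers never see a half-updated path; a minimal sketch under that assumption (all paths are illustrative, and mv -T is the GNU flag for treating the destination as a plain file):

```shell
set -e
cd "$(mktemp -d)"                # stand-in for the shared volume
mkdir -p datasets/v1 datasets/v2
ln -s datasets/v1 current        # processors always read via ./current
# ... the data container finishes downloading datasets/v2 ...
ln -s datasets/v2 current.tmp    # build the new link off to the side
mv -T current.tmp current        # rename over the old link in one step
echo "current -> $(readlink current)"   # prints: current -> datasets/v2
```

Because rename(2) is atomic, a process container opening ./current sees either the old or the new data set, never a mixture.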


@thockin (Member, Author) commented Jan 11, 2015

To me, this does not make a strong argument. Everything you describe is
possible if your data container just writes to a shared emptyDir volume.

Now, there IS a gotcha with the initial load of data, but that has to be
handled in any similar model. Either the data is immutable, in which case
the data container can load it once and go to sleep, or else it is changing
over time, in which case you have to wait for it to get current. In
the former case, the initial data is ALL that matters. Is that an
interesting use? In the latter, does the initial data matter, or only
"current" data?

The only other argument is to be exactly docker compatible semantically,
but frankly the volumes-from behavior is so semantically rigid, it may not
be worth being compatible with.
On Jan 11, 2015 9:26 AM, "Craig Wickesser" notifications@github.com wrote:

I'll try to describe a use case for a data container. I'll have a pod with
3 containers I'll name them "ingest", "process" and "data".

The ingest container is responsible for getting messages in some fashion
and telling the "process" container to do work.

The process container does work, but it requires access to data provided
by
the "data" container, outside of kubernetes this is done using dockers
"volumes-from". This "data" can be 100's of megabytes, but most often is
10-15 gigabytes.

The data container has a process responsible for pulling the data that
will
be needed by the process container. While the process container is doing
work, it's possible that a new set of data becomes available. The data
container can fetch the data and use something like symlinks to swap it so
the next time the process container begins a new process it's using the
newly available data.

Hopefully that makes some sense.

Thanks

On Sun, Jan 11, 2015 at 12:18 AM, Tim Hockin notifications@github.com
wrote:

To follow up to myself - all of this assumes that any data mutations you
make have a lifetime equivalent to the pod. If the pod dies for any
reason
(the machine goes down, it gets deleted in the API, some non-recoverable
failure in kubelet, etc) the data dies with it.

Durable data is a MUCH larger topic :)

On Sat, Jan 10, 2015 at 9:12 PM, Tim Hockin thockin@google.com wrote:

There are a few things going on here, most importantly (I think) some
confusion.

Kubernetes supports the notion of a writable "empty volume" that is
shared

across containers within a pod, without resorting to host directories

this is what an emptyDir volume is.

Now the question comes down to "but I don't want an empty directory,
I
want ". The question I think we need to sort out is
what
the is. We currently support pull-from-git, which is
just
a single instance of a larger pattern - "go fetch some pre-cooked data
once
before starting my pod".

We could support myriad such "fetch once" volume kinds: git, cvs, svn, docker containers (more below), URLs, in-line base64-encoded tarfiles, stdout of another program, etc. I do NOT think we want to support those all as independent volume plugins - they can almost all be done by an unprivileged container without any help from kubelet. More, you quickly arrive at the followup features like "...and re-pull from git every 10 minutes" - things that stop being "fetch once" and start being active management, but do not require privileges. We make great use of such things internally.

IMO: all of these things that can be run as a side-car container in your pod (writing to a shared emptyDir) SHOULD BE. Git should stop being a first-class volume, and should instead be a container of some sort. This brings a slew of new design questions: Is it just a container like all the other app containers? How do I ensure it runs BEFORE my app containers? What if it experiences a failure? I don't have answers to all of these yet.

Now, let's think about the case of docker data volume containers. What is the semantic that people are really asking for? Is it:

  • "run" a data container and then expose the entire chroot of that run?
  • "run" a data container and then expose all of the VOLUME (from Dockerfile) dirs as kubernetes volumes?
  • "run" a data container and then do the equivalent of --volumes-from into kubernetes containers?

These are all subtly different semantically, especially in the face of a data container that has multiple VOLUME statements. Some operating modes also make it hard to verify input until after a pod has been accepted, scheduled, and attempted on a kubelet (we try to validate as much as we can up front).

ACTION ITEM: I'd very much like for people who use docker data containers to describe what behavior they would expect here.

Back to side-car containers as volumes. I could imagine something like:

Pod {
  spec: {
    volumes: [
      { Name: "git-data",
        Source: FromContainer {
          Name: "awkward", // what goes here?
          Image: "kubernetes/git-volume",
          EnvVars: [
            { REPO: "http://github.com/yourname/something" },
            { RESYNC: "true" },
            { RESYNC_INTERVAL: 60 }
          ]
        }
      }
    ],
    containers: [ ... use git-data ... ]
  }
}

Sadly, there's not much of the container-ness that you can really hide, so you end up re-using the Container schema, which is at least somewhat redundant and awkward.

Alternately, I could imagine just making them extra containers:

Pod {
  spec: {
    volumes: [
      { Name: "git-data", Source: EmptyDir {} }
    ],
    containers: [
      {
        Name: "git-puller",
        Image: "kubernetes/git-volume",
        EnvVars: [
          { DATADIR: "/vol" },
          { REPO: "http://github.com/yourname/something" },
          { RESYNC: "true" },
          { RESYNC_INTERVAL: 60 }
        ],
        VolumeMounts: [ { Name: "git-data", MountPath: "/vol" } ]
      },
      { ... container that reads git-data ... }
    ]
  }
}

It's still pretty verbose. Can we do better?

In this sort of model, almost any sort of logic can be bundled up as a container and published, and anyone can use it immediately.

Something like this is, I think, the way to go. Details to emerge, input wanted.
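A sketch of how the second variant would look as concrete pod YAML; the kubernetes/git-volume image and its env vars are hypothetical, exactly as in the sketch above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: git-data-example
spec:
  volumes:
    - name: git-data
      emptyDir: {}          # shared scratch space for the pod
  containers:
    - name: git-puller      # side-car; image and env vars are hypothetical
      image: kubernetes/git-volume
      env:
        - name: DATADIR
          value: /vol
        - name: REPO
          value: http://github.com/yourname/something
        - name: RESYNC
          value: "true"
        - name: RESYNC_INTERVAL
          value: "60"
      volumeMounts:
        - name: git-data
          mountPath: /vol   # puller writes the checkout here
    - name: app             # hypothetical consumer of the data
      image: your-app-image
      volumeMounts:
        - name: git-data
          mountPath: /data
          readOnly: true    # consumer only reads what the side-car fetched
```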

On Sat, Jan 10, 2015 at 5:42 PM, Ashley Aitken notifications@github.com wrote:

+1 @rehevkor5 https://github.com/rehevkor5

I am disappointed to hear that k8s doesn't support data volume
containers.

I am not sure how I am supposed to abstract r/w data away from the host now. I was under the impression that k8s was about abstracting away the host, but a host volume introduces all sorts of host-specifics, like having to share the same username/group for access to the data.

@rehevkor5 https://github.com/rehevkor5 I thought the same thing
about
data volume containers at first (should be written as FROM scratch)
until I
read this (which may or may not be correct):
http://container42.com/2014/11/18/data-only-container-madness/


@mindscratch


Copy link
Contributor

commented Jan 11, 2015

Using an emptydir that is available between containers sounds sufficient.
The initial data could actually be baked into the docker image, then the
process in the "data" container could make sure it updates it when
necessary.

After understanding "emptydir" better, I agree, the use case I provided
would work with what Kubernetes supports today.

Thanks.

On Sun, Jan 11, 2015 at 1:36 PM, Tim Hockin notifications@github.com
wrote:

To me, this does not make a strong argument. Everything you describe is
possible if your data container just writes to a shared emptyDir volume.

Now, there IS a gotcha with the initial load of data, but that has to be
handled in any similar model. Either the data is immutable, in which case the data container can load it once and go to sleep, or else it is changing over time, in which case you have to wait for it to get current. In the former case, the initial data is ALL that matters. Is that an interesting use? In the latter, does the initial data matter, or only "current" data?

The only other argument is to be exactly docker compatible semantically,
but frankly the volumes-from behavior is so semantically rigid, it may not
be worth being compatible with.

@dims


Copy link
Member

commented Aug 18, 2017

@bgrant0607 yes of course, updated the sample container and added some traces - https://github.com/dims/docker-flexvol#container-image-with-pre-defined-volume

@alph486


Copy link

commented Aug 18, 2017

Very cool @dims

(edit: names :) )

@dims


Copy link
Member

commented Aug 19, 2017

@bgrant0607 @thockin @alph486 - Added support to make a copy of entire container image available to the pod per offline discussion

@bgrant0607


Copy link
Member

commented Aug 21, 2017

@dims Example use case: #16010. And FaaS, and app/dependencies/runtime separation more generally. And app data too large for ConfigMap.

@rhuss


Copy link

commented Aug 22, 2017

Also very useful for the "Immutable Configuration" pattern (e.g. see https://github.com/k8spatterns/examples/blob/master/configuration/ImmutableConfiguration/README.adoc how it is currently implemented via init containers).

@dims


Copy link
Member

commented Aug 22, 2017

If we don't want to rely on docker create/export, the other options to explore are

  1. posita/dimgx
  2. containers/image directly or using skopeo CLI
  3. Explore image piece of containerd (either as library or via grpc)

Though no. 2 does not seem to support squashing the image

@renewooller


Copy link

commented Sep 17, 2017

Here is a use case - any thoughts or comments appreciated.

We're making a Continuous Deployment/Integration tool which can run as an application in k8s. It needs to be able to talk to a docker daemon to execute builds and push them to an external repository.

Due to the security concerns of 'dood' (Docker-outside-of-Docker), we're using 'did' (Docker-in-Docker). I could theoretically extend the 'did' image and install the software required to run the application on the one image; however, this would mean I need to maintain the extended image.

What would be preferable is to have two images, one for 'did', one for the application, but be able to run docker commands in the 'did' container from the application container.

I could do this using volumes-from and mounting /var/run/docker.sock into the application container from the 'did' container.

I haven't figured out the workaround yet - I will be experimenting with mounting the same emptyDirs; however, it would certainly be easier if I could specify something like volumes-from in a k8s spec.

Update: one workaround I thought of, inspired by Applatix, was to run the application from within the 'did' container and mount the socket and relevant binaries from the 'did' host into the application. After some experiments with running docker and mounting, I found that the binaries would not work (i.e. "docker: not found"). I've just re-read jpetazzo's post and noticed the warning about this not being reliable anymore, so that aspect of the approach is probably not viable - I'll be looking into the docker API instead, but still mounting the socket.

@kjvalencik


Copy link

commented Sep 17, 2017

@renewooller. This feature wouldn't solve your problem because it would only provide files and not dependencies.

Since you have multiple entry points, what you explained is exactly the correct decision. Mount an emptyDir to share the docker socket across containers in the same pod.
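A minimal sketch of that emptyDir-shared-socket layout, assuming the stock docker:dind image (the application image name is hypothetical, and the dind container needs to run privileged):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cd-tool
spec:
  volumes:
    - name: docker-sock       # emptyDir shared by both containers
      emptyDir: {}
  containers:
    - name: dind              # Docker-in-Docker daemon
      image: docker:dind
      securityContext:
        privileged: true      # dind requires a privileged container
      volumeMounts:
        - name: docker-sock
          mountPath: /var/run # daemon creates docker.sock here
    - name: app               # hypothetical application image
      image: your-cd-tool
      volumeMounts:
        - name: docker-sock
          mountPath: /var/run # app's docker client finds the socket here
```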

@kfox1111


Copy link

commented Oct 4, 2017

I really like the idea of a docker container as a volume from the standpoint that it lets you reuse all the architecture for distributing containers: scalable docker registries, image caching, etc. It also lets you separate concerns. You could do a scalable static website by having a volume container with your web content in it and a main container running nginx. Each pod would then have two containers as a scalable unit, each updatable separately.

@dims


Copy link
Member

commented Oct 4, 2017

@kfox1111


Copy link

commented Oct 4, 2017

no, I missed that. thanks for the pointer. it's very interesting. :)

@ieugen

This comment has been minimized.

Copy link

commented Nov 20, 2017

Hi,

I'm adding my use case as an example. docker-flexvol seems to be what I need. Using emptyDir with copy should also work, but it's nasty.

We have a few single page front-end applications: compiled JS+CSS and other resources.
I would like to serve these apps using a single nginx server, replicated. No need to use more.
I would like to consume the apps and benefit from things like easy rollback to a previous version. This use case is not nice to solve by copying files; it can also take time and use 2x the space.
I would also like to use the exact image that upstream provides (with matching hashes and gpg signatures and such).

The idea is to create a pod with a single nginx container and mount the files from the docker containers as volumes.

I imagine that I am able to deploy a new version for one of my applications by changing the volume somehow and then trigger a pod restart or something.

So if I have an image called my-container-image in my private repo:

- name: test
  flexVolume:
    driver: "dims.io/docker-flexvol"
    options:
      image: "my-container-image:v1"
      name: "/data-store"

and update the pod to contain:

- name: test
  flexVolume:
    driver: "dims.io/docker-flexvol"
    options:
      image: "my-container-image:v2"
      name: "/data-store"

I expect Kubernetes will do a rolling update to the new version of my app with my-container-image:v2

@dims: Is this use case supported by your plugin? How safe is the plugin and could we see it upstream to be installed via kubeadm? I think these are questions that should be on the project issues list.

@dims


Copy link
Member

commented Nov 20, 2017

@ieugen i believe yes, you should be able to do that. It's a pretty simple shell script so feel free to try it and let me know if you see issues.

As for upstream kubeadm etc. if someone wants to take the initiative, i can help.

Thanks,
Dims

@kfox1111


Copy link

commented Nov 20, 2017

Would be interested in getting this into a helm chart somehow... I saw containerized mount utils merged, but they say it doesn't work with flexVolumes. Maybe something like:
https://github.com/openstack/kolla-kubernetes/blob/master/helm/microservice/ceph-rbd-daemonset/templates/ceph-rbd-daemonset.yaml

with shared mount namespaces?

@kfox1111


Copy link

commented Nov 20, 2017

alternately, if we could get a statically linked jq, we might be able to just slide in the two files directly onto the host....

@kfox1111


Copy link

commented Nov 20, 2017

or, I guess we could split the difference and just run jq in a container... docker run -i --rm jq....

@sg3s


Copy link

commented Jan 3, 2018

It's been a month since I started studying how to use kubernetes (properly). It didn't take me long to find pretty much all use cases mentioned in this issue on my own.

In all cases it comes down to wanting to expose static files to more than one process/container.

For us that means:

  • Expose dir from container A (php runtime with static frontend) to container B (generic nginx) to serve only the static (css/js) files.
  • Expose dir from container A (static js frontend) to container B (generic nginx) to serve those files.

As I understand, the facilities that would allow this functionality are currently only properly supported by containerd(?) in the form of volumes, but it would be very helpful to have...

Some cases could probably be solved by obscuring the copy action needed and ensuring it is successful.

Any other (simple) solution that would allow packaging static files as a single artifact, and then using it inside of a pod/container without copying it (with postStart commands) each time, would probably also work for most use cases mentioned in this issue. The thing is, with pipelines to build containers and registries to hold/version them all pretty much figured out, containers are an extremely handy vessel for this (and in my opinion not outside their scope; it helps with the single responsibility principle).

Anyway, just my 2 cents.
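For what it's worth, one concrete shape of the "obscured copy" workaround mentioned above, with hypothetical image names and paths: an init container built from the asset image copies the files into an emptyDir, which nginx then serves read-only:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: static-site
spec:
  volumes:
    - name: assets
      emptyDir: {}
  initContainers:
    - name: copy-assets
      image: my-frontend:v1    # hypothetical image holding the built JS/CSS
      command: ["cp", "-r", "/app/dist/.", "/assets/"]  # hypothetical source path
      volumeMounts:
        - name: assets
          mountPath: /assets
  containers:
    - name: nginx
      image: nginx:stable
      volumeMounts:
        - name: assets
          mountPath: /usr/share/nginx/html
          readOnly: true       # nginx only serves the copied files
```

Updating my-frontend:v1 to :v2 in the pod template then rolls the pod, at the cost of the copy time and duplicated space called out above.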

@gjcarneiro


Copy link

commented Jan 3, 2018

Expose dir from container A (php runtime with static frontend) to container B (generic nginx) to serve only the static (css/js) files.
Expose dir from container A (static js frontend) to container B (generic nginx) to serve those files.

I have found that recent Docker multi-stage builds pretty much allow you to do this at docker build level.

@kfox1111


Copy link

commented Jan 3, 2018

Multistage builds are not the same thing. Multistage builds let you do stuff like build, throw away the build environment, and copy the built artefacts to the final container. In the end, you are left with one container, which in this situation has, for example, nginx and your static files.

In the spirit of k8s composability, though, the desire is to have one container for nginx that can be independently updated from a second container storing your static files; they are combined together at runtime via k8s pod semantics. This is what the issue is about.

@gjcarneiro


Copy link

commented Jan 3, 2018

Yes, I agree docker images as volumes is nicer. I'm just leaving a clue, to whoever is reading this bug, how you can work around the missing feature in the meantime.

@ericxiong


Copy link

commented Apr 10, 2018

+1

@kfox1111


Copy link

commented Mar 21, 2019

The recent ephemeral csi volume support along with https://github.com/kubernetes-csi/csi-driver-image-populator should make this possible. :)
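For reference, an inline CSI volume using that driver would be declared roughly like this; the driver name and volumeAttributes follow the csi-driver-image-populator README, but the project is experimental, so treat these details as assumptions (the content image name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: image-volume-demo
spec:
  containers:
    - name: app
      image: nginx:stable
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      csi:
        driver: image.csi.k8s.io        # assumed driver name from the project README
        volumeAttributes:
          image: my-content-image:v1    # hypothetical image whose filesystem becomes the volume
```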

@bgrant0607


Copy link
Member

commented Mar 21, 2019

@kfox1111 Cool, thanks!

BTW, Google's internal composable "package" mechanism is described in this talk:
https://www.usenix.org/sites/default/files/conference/protected-files/lisa_2014_talk.pdf
which is mentioned in the SRE book:
https://landing.google.com/sre/sre-book/chapters/release-engineering/

@huangyoukun


Copy link

commented Jun 19, 2019

+1
