
Pass pod labels to containers and to lifecycle hooks #560

Closed
lavalamp opened this issue Jul 22, 2014 · 38 comments
Labels
area/downward-api kind/design Categorizes issue or PR as related to design. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@lavalamp
Member

For context, see: https://groups.google.com/d/msg/google-containers/frOLMyNl5U4/W5_DQUL933IJ

The idea, specifically, is to have a <service_name>_LABELS env var in addition to the service's port variables.

Filing as an issue to track discussion. I'm unsure if it's generally useful enough to justify the extra complexity.
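For illustration, the proposed variable would sit alongside the service env vars Kubernetes already injects; e.g. for a service named redis-master (the values and the comma-separated label encoding here are made up, not a settled format):

```shell
# Existing injected service env vars:
REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
# Proposed addition (hypothetical label encoding):
REDIS_MASTER_LABELS="name=redis,role=master"
```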

@bgrant0607 bgrant0607 added area/downward-api kind/design Categorizes issue or PR as related to design. labels Sep 25, 2014
@bgrant0607 bgrant0607 added this to the v1.0 milestone Oct 4, 2014
@bgrant0607 bgrant0607 changed the title Consider passing service labels to pods Pass pod labels to containers and to lifecycle hooks Oct 10, 2014
@bgrant0607
Member

As discussed in #386, environment variables aren't ideal. But we should pass labels down.

@smarterclayton
Contributor

Labels will reach the node when #1662 lands

@smarterclayton
Contributor

A somewhat opinionated proposal for how the downward API could be exposed as env vars, for both labels and service values, is described here:

#1768 (comment)

  1. Define either a new boolean on each env (resolve) or a new section (generated-env) on container
  2. On Kubelet start, for every generated-env, do simple bash variable substitution using the env from the image, the env from the container, and previously generated env
  3. Expose specific "downward API" env to generated env that can be used in the value part of the env (takes precedence over image env, but not container env or other generated env)

Essentially, do a limited form of env var substitution as pre-processing pass on the Pod prior to invoking docker start. Walk the list once to get all the user provided env, then go through and allow env substitution and let the substitution values pull from a) kubernetes "downward env API" source (service vars, labels, pod info) and b) env vars. Optionally, we could include c) env in the image. Then we would pass that entire env to the container.

This would mean we would never inject env into the user's container by default (so no hidden magic variables except service hosts for now), but users would be able to selectively include variables by name. The kube downward env API vars could be appropriately namespaced.
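As a concrete sketch of the opt-in boolean variant (the "resolve" field and the KUBERNETES_POD_LABELS variable name are hypothetical, shown only to illustrate the shape of the proposal):

```json
{
  "env": [
    { "name": "SERVICE_NAME", "value": "redis-master" },
    {
      "name": "LOG_PREFIX",
      "value": "${KUBERNETES_POD_LABELS}",
      "resolve": true
    }
  ]
}
```

Only the variable marked with "resolve": true would go through the substitution pass; everything else is passed through verbatim.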

@thockin
Member

thockin commented Jan 31, 2015

And then users who want the actual string "$" in their env vars need to escape them? Puke.

I could buy a new block of env vars that are explicitly handled this way, but not the default block.


@smarterclayton
Contributor

There would be a flag that says you want this. I'm not insane, just dangerous. Something like "resolve": true or "expand": true.


@smarterclayton
Contributor

I clarified in the other one this was explicit opt in on a per var basis, but forgot to copy that here.


@thockin
Member

thockin commented Jan 31, 2015

I could probably live with that. As for the migration issue: exposing some of this stuff is actually detrimental to migration. Tread carefully.


@smarterclayton
Contributor

True. I was thinking as a default of exposing only host, port, and maybe a few attributes off the pod. We could then nuke the docker style env vars.

At least here we have a way to do a migration, versus magic env vars that show up and get used without you being able to tell.


@pmorie
Member

pmorie commented Feb 3, 2015

@smarterclayton Clarifying for the record that by flag in your above comment you mean attribute on api.EnvVar. Confirm?

@smarterclayton
Contributor

Yes, although a separate block is also reasonable. Is a separate block easier to reason about, or confusing to new users?


@pmorie
Member

pmorie commented Feb 5, 2015

@smarterclayton I think a separate block that is designated as being resolved is probably easier to reason about, but that's my own subjective opinion. @sdminonne, any opinions about that?

@bgrant0607
Member

FWIW, this should probably be discussed in #386.

As @thockin cautioned and as mentioned in #3949, I wouldn't expose a pod's host.

Note that any property encoded in an env. var. would require a container restart when its value changed.

I'd prefer a separate block, similar to the distinction between labels and annotations.

@sdminonne
Contributor

I would prefer to avoid a new block, for a few reasons:

  1. In our use cases we really need labels to be injected into the pod (for example, to prefix logs that will be sent to log-server pods). We'll use labels to constrain scheduling algorithms and would like to avoid having one block for the scheduler and another for env var injection.
  2. A new block is another group of info in an already crowded JSON/YAML file.

Looking at #386 (@smarterclayton, I wasn't aware of it when we started discussing this), the /etc/labels file proposed by @bgrant0607 sounds like a reasonable solution. Container applications are free to use or ignore this information, as well as dynamic updates to it. The key=label syntax is fine.
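For concreteness, under this proposal a pod carrying the (example) labels name=frontend and env=prod would see an /etc/labels file like:

```
name=frontend
env=prod
```

Applications that don't care simply never read the file.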

@thockin
Member

thockin commented Feb 6, 2015

I have a slight preference for a new block, but the /etc/labels file or something like it sounds OK too. The nice thing about a new block is that we can extend it to define new things like "also expose labels as a file at this path". Extending Env is more restricted.


@smarterclayton
Contributor

Good point - it would be nice to make that flexible. We could also introduce a volume type that can pull labels or volumes. I guess the question is - what is the boundary for volume types? File generation? File generation from a template? Retrieving keys from etcd and then transforming them? Or is that a job for sidecar containers or pre start lifecycle hooks?


@thockin
Member

thockin commented Feb 6, 2015

Right now volumes are run-once. If you need something active, as in a git syncer, it's not a volume. We might want to extend the volume abstraction, but later.


@goltermann goltermann removed this from the v1.0 milestone Feb 6, 2015
@lavalamp lavalamp removed this from the v1.0 milestone Feb 6, 2015
@pmorie
Member

pmorie commented Feb 15, 2015

After some thought and discussion with @sdminonne, @EricMountain-1A, and @smarterclayton, I had some thoughts on this:

  1. We know different users will want this information both as volumes and environment variables
  2. There are other sources of data users want in this manner (secrets, service metadata)
  3. The VolumeSource / kubelet volume plugin abstraction is a good one (my opinion) for exposing different information into pods as volumes
  4. Should there be a similar source abstraction for environments? I don't think simply generating env vars will be sufficient for all cases (for example, you wouldn't want to expose secret data as env vars using the current mechanism since the data can be captured by a container commit)

Thinking through the possibility of a source abstraction for environments in the context of this issue:

  1. Add a new DynamicContext block to a pod; this block will allow you to adapt different sources of information (labels, service metadata, etc)
  2. Add a new EnvironmentSource abstraction to a container; the existing method of specifying values for environment becomes something like ManualSource
  3. Add a new DynamicContextSource for environment variables; this will allow you to adapt pieces of DynamicContext into environment variables
  4. Add a new DynamicContextVolumeSource for volumes; this will allow you to adapt pieces of the dynamic context into volumes / files

Other sources of information can be adapted with new Source implementations.

Typing this out, I'm not sure if I like it.

Any thoughts, @smarterclayton @sdminonne @EricMountain-1A @thockin @bgrant0607 @lavalamp ?
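To make the proposal above concrete, here is a sketch of how it might look on the wire; every field name is tentative and taken only from the list above, not from any shipped API:

```json
{
  "dynamicContext": {
    "sources": ["labels", "serviceMetadata"]
  },
  "containers": [{
    "name": "app",
    "environmentSources": [
      { "manualSource": { "env": [{ "name": "MODE", "value": "prod" }] } },
      { "dynamicContextSource": { "item": "labels", "prefix": "POD_LABEL_" } }
    ]
  }]
}
```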

@sdminonne
Contributor

@pmorie thanks for the explanation.
So we may end up with a set of env variables injected into the pod, plus a system Docker container that exposes info about pods via a shared filesystem?

@pmorie
Member

pmorie commented Feb 17, 2015

@sdminonne correct. The sidecar container would need auth to access the master, either via a secret consumed in the volume or something like the LOAS Daemon.

@thockin
Member

thockin commented Feb 17, 2015

Why would it need auth? It all seems strictly read-only...

As for feature set, less is more unless there's no other choice. I'd love to say that the downward API is just about customizing env vars, and the sidecar container can be more flexible. Can we get away with that?

And yes, I think that a generic "sync kubernetes info to files" container would be a great piece of the overall system.


@smarterclayton
Contributor

> As for feature set, less is more unless there's no other choice. I'd love to say that the downward API is just about customizing env vars, and the sidecar container can be more flexible. Can we get away with that?

I think so for the near term. We can take real use cases to handle the container case, and adapt them in later if necessary.

@pmorie
Member

pmorie commented Feb 18, 2015

@sdminonne @thockin @smarterclayton @bgrant0607

Continuing to flesh out the sidecar container outlined above:

{
  "id": "kubernetes-info-test-pod",
  "kind": "Pod",
  "apiVersion":"v1beta2",
  "labels": {
    "name": "kubernetes-info-test-pod"
  },
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "kubernetes-info-test-pod",
      "containers": [{
        "name": "client-container",
        "image": "busybox",
        "command": [ "cat", "/etc/kubernetes-info/labels" ],
        "volumeMounts": [{
          "name": "kubernetes-info-volume",
          "mountPath": "/etc/kubernetes-info",
          "readOnly": true
        }]
      },
      {
        "name": "kubernetes-info-sidecar",
        "image": "kubernetes/kubernetes-info",
        "command": [ "write-kubernetes-info", "/etc/kubernetes-info"],
        "volumeMounts": [{
          "name": "kubernetes-info-volume",
          "mountPath": "/etc/kubernetes-info"
        }]
      }],
      "volumes": [{
        "name": "kubernetes-info-volume"
      }]
    }
  }
}

@sdminonne
Contributor

@pmorie thanks for this.
@smarterclayton @thockin let's take the test case I already mentioned: propagating labels to prefix logs for aggregation (something similar to #4255). In this case, at container start we would write the file in /etc/kubernetes-info and update it on each label update.
Would the updates be triggered by the presence of a container named kubernetes-info-sidecar?

@pmorie
Member

pmorie commented Feb 19, 2015

Had a good chat w/ @sdminonne and @EricMountain-1A this morning. Dario is going to prototype a new docker image for the sidecar container that will accept parameters for:

  1. The pod name and namespace
  2. A target directory to project information into
  3. Flags for which info to project (initially it will just support labels)

Example:

project-pod-info --name=<pod name> --namespace=<pod namespace> --output-path=/pod-info --labels

The sidecar will poll the API server for information about the Pod (we had talked about watching, but I had forgotten that there is no watch API for pods) and refresh the information in the shared volume.

@pmorie
Member

pmorie commented Feb 22, 2015

@sdminonne @EricMountain-1A see @thockin's comments re: atomic update here: #4673 (comment)

@bgrant0607
Member

Dumping info into files was also discussed in #386. I certainly prefer that over environment variables.

However, I think that doesn't address @smarterclayton's concerns about accidental coupling, opt-in for specific information, and adapting to application expectations rather than forcing them to adapt to Kubernetes conventions (other than deployment-time adapters).

@sdminonne
Contributor

Just adding #5093 as a possible implementation.

@bgrant0607
Member

Done enough, via downward API
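For readers arriving later: this shipped as the downward API, which covers both the env-var and the file-based approaches debated above. A minimal example in today's v1 API (pod name, label values, and image are illustrative):

```json
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": {
    "name": "labels-demo",
    "labels": { "app": "frontend" }
  },
  "spec": {
    "containers": [{
      "name": "main",
      "image": "busybox",
      "command": ["sh", "-c", "echo $APP_LABEL; cat /etc/podinfo/labels"],
      "env": [{
        "name": "APP_LABEL",
        "valueFrom": { "fieldRef": { "fieldPath": "metadata.labels['app']" } }
      }],
      "volumeMounts": [{
        "name": "podinfo",
        "mountPath": "/etc/podinfo",
        "readOnly": true
      }]
    }],
    "volumes": [{
      "name": "podinfo",
      "downwardAPI": {
        "items": [{ "path": "labels", "fieldRef": { "fieldPath": "metadata.labels" } }]
      }
    }]
  }
}
```

The volume form gets updated in place when labels change, while the env-var form (as noted above) is fixed for the life of the container.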

seans3 pushed a commit to seans3/kubernetes that referenced this issue Apr 10, 2019