Allows environment variables to be lazily expanded on container run #31525

simonferquel · 2017-03-03T17:43:54Z

Context:
On Windows, environment variables can be either volatile (scoped to a
process and its children, like on Linux), or persisted (stored in the
registry, shared across sessions, sometimes even across users).

The current ENV Dockerfile command does not consider that a particular
variable can have a live value stored in the container registry, simply
because persisted variables do not exist on *nix platforms.

This is a problem, very well illustrated by a simple file:

FROM microsoft/nanoserver
ADD .\\bin c:\\myapp
ENV PATH ${PATH};c:\\myapp
RUN echo %PATH%

In this example, the builder does not consider that PATH can have a
living value in the stored base image and overrides its value with
;c:\\myapp. this breaks basically the whole Windows
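
To make the failure concrete, here is a minimal Go sketch of build-time substitution. It is not the builder's actual code: `expandAtBuild` is a hypothetical helper, and the only assumption is that the builder resolves `${NAME}` against the variables it already knows about, which do not include the values persisted in the base image's registry.

```go
package main

import (
	"fmt"
	"os"
)

// expandAtBuild is a hypothetical stand-in for build-time substitution:
// ${NAME} references are resolved only against variables the builder knows.
func expandAtBuild(value string, known map[string]string) string {
	return os.Expand(value, func(name string) string {
		return known[name] // unknown names expand to the empty string
	})
}

func main() {
	// The builder has no PATH entry, because the real value lives in the
	// registry of the base image, not in the image configuration.
	known := map[string]string{}
	fmt.Println(expandAtBuild(`${PATH};c:\myapp`, known))
}
```

With an empty `known` map the result is `;c:\myapp`, which is the value that ends up replacing the container's real PATH.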

I solved this by introducing a flag on the ENV command that says
the variable should be expanded at runtime:

ENV --lazy-expand PATH ${PATH};c:\\myapp

The lazy-expandiness of a variable is stored in the image configuration,
and when running a container for an image with such parameters, the
processStart info is modified to expand these variables in powershell
before invoking the previously defined processStart.

Signed-off-by: Simon Ferquel <simon.ferquel@docker.com>

tonistiigi · 2017-03-03T18:07:29Z

daemon/oci_windows.go

+	toPrepend += "cmd /S /C "
+	cmdLine := "'" + strings.Join(s.Process.Args, " ") + "'"
+	s.Process.Args = make([]string, 0)
+	s.Process.Args = append(s.Process.Args, "powershell", "-Command", toPrepend, "cmd", "/s", "/c", cmdLine)


Do all windows containers have powershell?
Also, append([]string{}, ...), or just skip the append.
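
The second suggestion can be sketched as follows. This only illustrates the slice construction. `wrapArgs` is a hypothetical helper name, the argument layout is copied from the hunk above, and the sketch makes no claim about whether the quoting is correct for PowerShell or cmd.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapArgs is a hypothetical helper. It builds the same argument layout as
// the hunk above with one slice literal instead of make followed by append.
func wrapArgs(toPrepend string, args []string) []string {
	cmdLine := "'" + strings.Join(args, " ") + "'"
	return []string{"powershell", "-Command", toPrepend, "cmd", "/s", "/c", cmdLine}
}

func main() {
	// "<expansion script>" is a placeholder, not the script the PR generates.
	fmt.Printf("%q\n", wrapArgs("<expansion script>", []string{"ping", "localhost"}))
}
```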

Both nanoserver and server core have powershell, yes. I'll make the change and amend

justincormack · 2017-03-09T21:33:33Z

cc @jhowardmsft what's your opinion of this vs #29048?

lowenna · 2017-03-10T02:19:50Z

The assertion that powershell is always present is a false one - it will be optional in future base images, so it shouldn't be mandated with that knowledge. I circumvented that in #29048 through cmd for extracting vars - see the big block comment (although I realise mine was a getter - this is a setter).

Otherwise, purely from a quick glance at the code (not tested at all), it does pretty much the same thing, but in a significantly more complex manner.

From an end-user perspective, I believe what (in 29048) is GETENV SOMEVAR becomes ENV --lazy-expand SOMEVAR. Is this any clearer than an explicit GETENV statement? I'm not convinced. This option was considered at length in the 90-min maintainers meeting some time back now, and ruled out. I need to go back to my notes to recall why. Perhaps @tianon, @duglin or @thaJeztah can recall.

I have strong concerns about the prepending of the arguments https://github.com/docker/docker/pull/31525/files#diff-010226ebaaeeb61428008a57958bcf75R44 through Powershell, then hard-coding cmd /s /c <the actual command>. That is wrong. They should be constructed in the environment block, not passed as an extra command. I don't think the current implementation will work correctly with combinations of CMD and ENTRYPOINT.

Before going with either this or 29048, I think there are some design choices still to be made. Personally I still prefer the GETENV or INSERT-SOME-OTHER-COMMAND-INSTEAD var option.

tonistiigi · 2017-03-10T03:21:32Z

@jhowardmsft Would it be possible to keep this behavior but avoid overriding command for starting a container?

simonferquel · 2017-03-10T15:21:13Z

If we could pass an option to HCS asking it to expand env variables, it would be really helpful (because I do agree that prepending the PowerShell script is really messy).
Also, with this approach, we could decide to make a variable lazily expanded implicitly whenever it references an undefined env var on Windows. It would change the behavior of variable substitution, but it would make env path $path;/myapp just work.

lowenna · 2017-03-10T17:57:04Z

@simonferquel

If we could pass an option to hcs asking it to expand env variables,

Short answer no.

Longer answer:
What you're effectively asking for is for the original GETENV PR to be implemented in HCS (or GCS for Hyper-V containers). Ignoring all the other complexities, HCS and GCS are effectively just glorified process launchers which call CreateProcess https://msdn.microsoft.com/en-us/library/windows/desktop/ms682425(v=vs.85).aspx. The lpEnvironment parameter is where the environment block gets passed.

There's no way from outside a Windows container to get the environment for a process from the registry (and optionally the shell). (Well there is, but it's incredibly complex, would require mounting the registry hives, parsing it offline, and doing shell processing offline. Somehow....). So let's realistically rule that out as impractical. The only way to get the effective environment from a container is by starting a process in it. Hence what both this PR and the GETENV PR do to solve it.
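
As a rough illustration of "starting a process in it", the Go sketch below runs `cmd /S /C set` and parses the `NAME=value` lines it prints. This is not the code from #29048 or from this PR. `parseSetOutput` and `effectiveEnv` are hypothetical names, and `effectiveEnv` only returns something useful when it runs inside a Windows container.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseSetOutput turns the NAME=value lines printed by `cmd /c set` into a
// map. Lines without a name before "=" are skipped.
func parseSetOutput(out string) map[string]string {
	env := make(map[string]string)
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimRight(line, "\r")
		i := strings.IndexByte(line, '=')
		if i <= 0 {
			continue
		}
		env[line[:i]] = line[i+1:]
	}
	return env
}

// effectiveEnv asks a shell started in the current environment to dump its
// variables. It is meant to run inside the container (Windows only).
func effectiveEnv() (map[string]string, error) {
	out, err := exec.Command("cmd", "/S", "/C", "set").Output()
	if err != nil {
		return nil, err
	}
	return parseSetOutput(string(out)), nil
}

func main() {
	env, err := effectiveEnv()
	if err != nil {
		fmt.Println("could not read environment:", err)
		return
	}
	fmt.Println(len(env), "variables found")
}
```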

@tonistiigi

Would it be possible to keep this behavior but avoid overriding command for starting a container?

For the reasons above, I don't really think so.

Summary: I'm still of the belief that the GETENV approach is by far the cleanest solution to join the two realities (the container's view and the builder's view) together in an easy-to-use way for end-users.

stevvooe · 2017-03-20T22:10:06Z

I would be in favor of a solution that avoids modifying the image format. Assigning this to the declaration stage of the environment variable seems a little awkward when we could just change how it is referenced and interpreted.

It might make more sense to do something like this:

RUN --lazy=PATH echo %PATH%

(Alternatively, GETENV might be better, but I am not working from that base, for now)

Combined with an escaped path in the definition above, it will ensure that the behavior is restricted to the statement it affects, rather than other, unrelated statements. In general, we should differentiate features that affect the current build from those that modify the resulting image format.

I also find this quite odd because an environment variable is typically expanded at the time it is referenced. For the definition of the env var, the expansion is done at the definition point, making the behavior contextually clear and well-controlled. Deferring this behavior to add lazy expansion introduces an interpreter that might never exit. Even in this example, the definition is a recursion, and has the potential to never terminate.

tonistiigi · 2017-03-20T23:14:08Z

I would be in favor of a solution that avoids modifying the image format.

I assume this could be done without extra fields to image because lazy expand only makes sense if a value has %% and if it is not enabled these places would be already replaced with empty/previous values. So it could be determined by just looking at the values in current config.
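
A minimal Go sketch of that detection, assuming the stored config keeps unexpanded references in `%NAME%` form (the PR itself writes `${NAME}` in the Dockerfile, so this form is an assumption). `needsLazyExpand` is a hypothetical helper, not part of this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// percentRef matches a %NAME% style reference: two percent signs enclosing
// at least one character that is not '%', '=' or whitespace.
var percentRef = regexp.MustCompile(`%[^%=\s]+%`)

// needsLazyExpand reports whether the value part of a NAME=value config
// entry still contains a %NAME% reference.
func needsLazyExpand(entry string) bool {
	i := strings.IndexByte(entry, '=')
	if i < 0 {
		return false
	}
	return percentRef.MatchString(entry[i+1:])
}

func main() {
	for _, e := range []string{`PATH=%PATH%;c:\myapp`, `FOO=bar`} {
		fmt.Println(e, needsLazyExpand(e))
	}
}
```

A check like this would treat any literal value that happens to contain a `%...%` pair as lazy, which is one trade-off of dropping the explicit field.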

Considering that the actual problem here is that the image format currently does not have a place that defines ENV on Windows, I don't see the extra field itself as a blocker (I don't have a solution for cleanly setting these values on container startup, though). ArgsEscaped is a good example of Windows exceptions leaking into config already.

Looking to the future, the builder should be a DAG of filesystem modifications (either moving files or running commands on them), and to create images a filesystem should be associated with image metadata. Currently, the builder implementation doesn't make a clear distinction, which causes bugs (wrong reference counting, ebusy, side effects to daemon state) and performance issues (slow commit, container creation for metadata changes), and blocks innovation on composition features (concurrency, better cache). GETENV doesn't fit into this model and would need to be some kind of a special case. This syntax doesn't have that problem.

simonferquel · 2017-03-27T08:21:40Z

IIRC, what is very costly when running a container on Windows is setting up the container and its network, not actually running the process.
What if we ran a second process after each run that extracts all persisted env vars and puts them in the image configuration (we would also have to either update base images or run this process on the FROM command)? What do you think about it, @jhowardmsft?

tonistiigi · 2017-04-06T19:12:02Z

On Linux we run an init process inside the container that does extra zombie reaping and signal forwarding. If there is a concern about executing a shell, I assume we could just write a small binary that calls the required system call directly and execs/forks to the main process after that.

lowenna · 2017-04-06T19:16:47Z

@tonistiigi Where does that binary come from? How is it guaranteed to be in every container image (including the MS distributed base ones)? How would it be signed? How would it do stdio handle handoff on Windows when HCS is tracking the original process for exit and "fork/exec" (not that there is such a thing in reality in the Win32 API set)?

I really don't think this is a good idea that would work on Windows.

tonistiigi · 2017-04-06T19:19:17Z

@jhowardmsft In the case of linux, the docker-init binary is shipped together with Docker.

lowenna · 2017-04-06T19:42:05Z

Right, but that still doesn't solve the problem above... Just introduces yet another unnecessary overhead

simonferquel · 2017-04-07T10:30:15Z

With that we would get the opportunity to extract env vars at each "run" with a low overhead (we don't have to create the container and assign it a network twice). That seems like a really nice way to reconcile the env var states.

cpuguy83 · 2018-01-17T02:09:52Z

👉

olljanat · 2018-12-21T19:05:01Z

@simonferquel this one needs to be rebased

thaJeztah · 2019-07-13T21:10:41Z

Let's close this one for now; if we still want this, this probably has to be reimplemented for buildkit

GordonTheTurtle added the status/0-triage label Mar 3, 2017

simonferquel mentioned this pull request Mar 3, 2017

Windows: Builder GETENV #29048

Closed

tonistiigi reviewed Mar 3, 2017

simonferquel force-pushed the lazy-expand-variables branch 2 times, most recently from 6966d37 to dcdd77b Compare March 3, 2017 18:17

vdemeester added status/1-design-review and removed status/0-triage labels Mar 3, 2017

simonferquel force-pushed the lazy-expand-variables branch 2 times, most recently from 78ae9a9 to a884653 Compare March 6, 2017 13:58

simonferquel force-pushed the lazy-expand-variables branch from a884653 to c1a8a69 Compare March 6, 2017 16:22

thaJeztah added this to backlog in maintainers-session Mar 9, 2017

GordonTheTurtle assigned stevvooe Mar 18, 2017

thaJeztah moved this from backlog to Active in maintainers-session Apr 6, 2017

thaJeztah moved this from Active to Revisit in maintainers-session Jun 1, 2017

AkihiroSuda mentioned this pull request Jul 18, 2017

windows: propagate env across ops? moby/buildkit#74

Closed

thaJeztah mentioned this pull request Sep 26, 2017

Set to ENV the result of command #29110

Open

derek bot added the status/failing-ci Indicates that the PR in its current state fails the test suite label Dec 22, 2018

thaJeztah added the area/builder label Jul 13, 2019

thaJeztah closed this Jul 13, 2019

TBBle mentioned this pull request Oct 16, 2022

Don't set a default PATH for Windows moby/buildkit#3158

Open
