Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The /tmp/src directory is never removed. #84

Closed
GrahamDumpleton opened this issue Jan 6, 2016 · 24 comments · Fixed by #132
Closed

The /tmp/src directory is never removed. #84

GrahamDumpleton opened this issue Jan 6, 2016 · 24 comments · Fixed by #132
Labels
P3 Priority 3

Comments

@GrahamDumpleton
Copy link
Contributor

Minor issue.

After the 'assemble' script does:

echo "---> Copying application source ..."
cp -Rf /tmp/src/. ./

Nothing ever appears to remove '/tmp/src'.

This means you have two copies of the application. If the source code and data files for the application were very large, that is an unnecessary waste of space.

Is that intentional for any specific reason?

@mfojtik
Copy link
Contributor

mfojtik commented Jan 6, 2016

@GrahamDumpleton I think we mount that directory to the container, so it should not be present in the output image. Will verify that.

@GrahamDumpleton
Copy link
Contributor Author

Maybe it is because I am running 's2i' on the command line and not within OpenShift.

Is how it is executed going to be different?

@GrahamDumpleton
Copy link
Contributor Author

$ s2i --force-pull=false build /tmp/example openshift/python-27-centos7 my-image
---> Copying application source ...
---> Installing dependencies ...
Downloading/unpacking antigravity (from -r requirements.txt (line 2))
Downloading antigravity-0.1.zip
Running setup.py (path:/tmp/pip-build-6PkdIN/antigravity/setup.py) egg_info for package antigravity

Installing collected packages: antigravity
Running setup.py install for antigravity

Successfully installed antigravity
Cleaning up...

$ docker run -u 10000 -it -e APP_MODULE=wsgi --rm -p 8080:8080 my-image ls -lasR /tmp/src
/tmp/src:
total 16
4 drwxr-xr-x 2 default root 4096 Jan  6 09:52 .
4 drwxrwxrwt 8 root    root 4096 Jan  6 09:52 ..
4 -rw-r--r-- 1 default root   22 Jan  6 04:01 requirements.txt
4 -rw-r--r-- 1 default root 3563 Jan  6 01:08 wsgi.py

@mfojtik
Copy link
Contributor

mfojtik commented Jan 6, 2016

@GrahamDumpleton ok, that smells like a bug. We should not keep the source in the /tmp directory. Either S2I or the assemble script should remove it.

@GrahamDumpleton
Copy link
Contributor Author

And just to confirm files are copied in, rather than mounted, where copying in is what I thought happened, here are verbose logs.

$ s2i --loglevel=5 --force-pull=false build /tmp/example op

I0106 20:56:46.223596 48617 main.go:61] Running S2I version "v1.0.4"
I0106 20:56:46.372237 48617 docker.go:224] Image openshift/python-27-centos7:latest available locally

Builder Name:       Python 2.7
Builder Image:      openshift/python-27-centos7
Builder Image Version:  fb12296
Builder Base Version:   71a650e
Source:         /tmp/example
Output Image Tag:   my-image
Incremental Build:  disabled
Remove Old Build:   disabled
Builder Pull Policy:    if-not-present
Quiet:          disabled
Layered Build:      disabled
Docker Endpoint:    tcp://192.168.99.100:2376

I0106 20:56:46.664528 48617 docker.go:224] Image openshift/python-27-centos7:latest available locally
I0106 20:56:46.790968 48617 scm.go:17] DownloadForSource /tmp/example
I0106 20:56:46.791096 48617 git.go:198] useCopy proto spec false local git repo false has git bin true
I0106 20:56:46.791110 48617 git.go:204] makePathAbsolute /tmp/example
I0106 20:56:46.791115 48617 scm.go:20] return from ParseFile file exists true proto specified false use copy true
I0106 20:56:46.791119 48617 scm.go:27] new source from parse file /tmp/example
I0106 20:56:46.919638 48617 sti.go:140] Preparing to build my-image
I0106 20:56:46.920750 48617 download.go:27] Copying sources from "/tmp/example" to "/var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277/upload/src"
I0106 20:56:46.920789 48617 fs.go:97] cp -a /tmp/example/ /var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277/upload/src
I0106 20:56:46.957774 48617 docker.go:224] Image openshift/python-27-centos7:latest available locally
I0106 20:56:46.957797 48617 docker.go:344] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0106 20:56:46.957821 48617 download.go:57] Using image internal scripts from: image:///usr/libexec/s2i/assemble
I0106 20:56:46.957831 48617 download.go:57] Using image internal scripts from: image:///usr/libexec/s2i/run
I0106 20:56:46.959506 48617 docker.go:224] Image openshift/python-27-centos7:latest available locally
I0106 20:56:46.959523 48617 docker.go:344] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0106 20:56:46.959538 48617 download.go:57] Using image internal scripts from: image:///usr/libexec/s2i/save-artifacts
I0106 20:56:46.959547 48617 sti.go:221] Using assemble from image:///usr/libexec/s2i
I0106 20:56:46.959553 48617 sti.go:221] Using run from image:///usr/libexec/s2i
I0106 20:56:46.959558 48617 sti.go:221] Using save-artifacts from image:///usr/libexec/s2i
I0106 20:56:46.959598 48617 ignore.go:58] .s2iignore file does not exist
I0106 20:56:46.959608 48617 sti.go:148] Clean build will be performed
I0106 20:56:46.959614 48617 sti.go:151] Performing source build from file:///tmp/example
I0106 20:56:46.959618 48617 sti.go:164] Running "assemble" in "my-image"
I0106 20:56:46.959627 48617 sti.go:412] Using image name openshift/python-27-centos7
I0106 20:56:46.959639 48617 sti.go:416] No .sti/environment provided (no environment file found in application sources)
I0106 20:56:46.961732 48617 tar.go:177] Adding to tar: /var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277/upload/src/requirements.txt as src/requirements.txt
I0106 20:56:46.963590 48617 docker.go:344] Image contains io.openshift.s2i.scripts-url set to 'image:///usr/libexec/s2i'
I0106 20:56:46.963605 48617 docker.go:399] Base directory for STI scripts is '/usr/libexec/s2i'. Untarring destination is '/tmp'.
I0106 20:56:46.963615 48617 docker.go:529] Creating container using config: {Hostname: Domainname: User: Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:true AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:true StdinOnce:true Env:[] Cmd:[/bin/sh -c tar -C /tmp -xf - && /usr/libexec/s2i/assemble] DNS:[] Image:openshift/python-27-centos7:latest Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Mounts:[] Labels:map[]}
I0106 20:56:46.986281 48617 docker.go:543] Attaching to container
I0106 20:56:47.009378 48617 docker.go:549] Starting container
I0106 20:56:47.010403 48617 tar.go:177] Adding to tar: /var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277/upload/src/wsgi.py as src/wsgi.py
I0106 20:56:47.067859 48617 docker.go:479] Waiting for container
I0106 20:56:47.072938 48617 sti.go:492] ---> Copying application source ...
I0106 20:56:47.076852 48617 sti.go:492] ---> Installing dependencies ...
I0106 20:56:47.266589 48617 sti.go:492] Downloading/unpacking antigravity (from -r requirements.txt (line 2))
I0106 20:56:47.510395 48617 sti.go:492]   Downloading antigravity-0.1.zip
I0106 20:56:47.518420 48617 sti.go:492]   Running setup.py (path:/tmp/pip-build-XnV64H/antigravity/setup.py) egg_info for package antigravity
I0106 20:56:47.599559 48617 sti.go:492]
I0106 20:56:47.611548 48617 sti.go:492] Installing collected packages: antigravity
I0106 20:56:47.612169 48617 sti.go:492]   Running setup.py install for antigravity
I0106 20:56:47.694971 48617 sti.go:492]
I0106 20:56:47.715335 48617 sti.go:492] Successfully installed antigravity
I0106 20:56:47.715688 48617 sti.go:492] Cleaning up...
I0106 20:56:47.852496 48617 docker.go:481] Container wait returns with 0 and <nil>
I0106 20:56:47.852528 48617 docker.go:488] Container exited
I0106 20:56:47.852534 48617 docker.go:571] Invoking postExecution function
I0106 20:56:47.852577 48617 sti.go:270] No .sti/environment provided (no environment file found in application sources)
I0106 20:56:47.856595 48617 docker.go:606] Committing container with config: {Hostname: Domainname: User:1001 Memory:0 MemorySwap:0 CPUShares:0 CPUSet: AttachStdin:false AttachStdout:false AttachStderr:false PortSpecs:[] ExposedPorts:map[] Tty:false OpenStdin:false StdinOnce:false Env:[] Cmd:[/usr/libexec/s2i/run] DNS:[] Image: Volumes:map[] VolumesFrom: WorkingDir: MacAddress: Entrypoint:[] NetworkDisabled:false SecurityOpts:[] OnBuild:[] Mounts:[] Labels:map[io.openshift.s2i.scripts-url:image:///usr/libexec/s2i License:GPLv2 io.openshift.builder-base-version:71a650e io.openshift.s2i.build.image:openshift/python-27-centos7 io.k8s.display-name:my-image Vendor:CentOS io.k8s.description:Platform for building and running Python 2.7 applications io.openshift.builder-version:fb12296 io.openshift.tags:builder,python,python27,rh-python27 io.s2i.scripts-url:image:///usr/libexec/s2i io.openshift.s2i.build.commit.author: <> io.openshift.s2i.build.source-location:/tmp/example io.openshift.expose-services:8080:http]}
I0106 20:56:47.904609 48617 sti.go:315] Successfully built my-image
I0106 20:56:47.910388 48617 cleanup.go:23] Removing temporary directory /var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277
I0106 20:56:47.910419 48617 fs.go:117] Removing directory '/var/folders/0p/4vcv19pj5d72m_bx0h40sw340000gp/T/sti218781277'

@mfojtik
Copy link
Contributor

mfojtik commented Jan 6, 2016

@rhcarvalho can you please take a look at this?

@rhcarvalho
Copy link
Contributor

@bparees FYI, this is one of the things we were talking about yesterday on IRC, before I see @GrahamDumpleton complaining about it :)

This is the same for all other images. In the discussion yesterday I was raising the point why don't we copy the source to the final location == /opt/app-root/src == $PWD == WORKDIR instead of using /tmp/src and having all assemble scripts have to copy source around.

@rhcarvalho
Copy link
Contributor

We use a docker COPY instruction to copy the source code into the image.

@mfojtik what do you think about changing the defaultDestination, or, more specifically, change the COPY instruction to copy the source to the WORKDIR?

@mfojtik
Copy link
Contributor

mfojtik commented Jan 7, 2016

@rhcarvalho that will require to change all assemble scripts in images we already released, right?

@rhcarvalho
Copy link
Contributor

@mfojtik yes, we'd need to simplify the assemble scripts, remove that line.
We could do it in a backwards-compatible way so that old scripts and simplified scripts could both work.

The quick fix for this issue is to change all assemble scripts to also rm -rf /tmp/src and possibly /tmp/scripts, if not /tmp/*. But then instead of making the assemble scripts more palatable, we'd just be making them more complex.

@mfojtik
Copy link
Contributor

mfojtik commented Jan 7, 2016

@rhcarvalho why not remove that folder in S2I directly? We can add that removal to a wrapper for the assemble script, so that folder is always removed when the assemble finishes. Similar to the wrapper I'm doing for removing the secrets. One less thing to worry about in assemble script then.

@rhcarvalho
Copy link
Contributor

And why not the "wrapper" put the source in the right place as well? :)

I would like to tell people that their assemble scripts need not to be any more complex than:

pip install -r requirements.txt --use-mirrors
nosetests
python setup.py install
python setup.py test

And we'd autogenerate those and bake it into their image so nobody needs to write a single line of code in the easy case, and can customize it with 2-lines if needed.

@rhcarvalho
Copy link
Contributor

Customize it by starting from the 2-liner we'd have generated in the first place...

@mfojtik
Copy link
Contributor

mfojtik commented Jan 7, 2016

@rhcarvalho the wrapper has to know what is the "right place", right? It is already known for all images we own as OpenShift, but you don't know what the right place is for upstream images.

@rhcarvalho
Copy link
Contributor

For me the right place is always the WORKDIR. That's the case for ONBUILD Docker images and all other circumstances I've seem.

@mfojtik
Copy link
Contributor

mfojtik commented Jan 7, 2016

@rhcarvalho some images I've seen on Dockerhub does not even set WORKDIR, what would you do in that case? Copy the sources to / ?

@rhcarvalho
Copy link
Contributor

yes, because when you docker exec or oc exec into your container, that's where you'll end up, and your code will be there for you.
Not that / is the best place to have your code at... but we also know a great deal of DockerHub images simply won't work for OpenShift out-of-the-box, so maybe looking at all of them as a whole is not representative.

I had in mind the golang image that we use in https://github.com/openshift/golang-ex, it puts your code in the WORKDIR that happens to be in the GOPATH.
https://github.com/docker-library/golang/blob/3427e88341de17a4d8921b859180a2649e1ab96e/1.4/onbuild/Dockerfile

@GrahamDumpleton
Copy link
Contributor Author

I can't see that you have a choice but to leave the current behaviour as the default. You will break the builder of anyone out there who has /tmp/src hardwired in assemble script. IOW, isn't just the official S2I builders you have to worry about, people will have copied what RH did when looking at existing builders.

You also can't just assume you can instead copy to /opt/app-root/src as people may not have used that directory for the application code and copied /tmp/src elsewhere. In my own S2I builder for Python I used /app as the history of the base image that I added S2I support to, was that was the location used and I couldn't change it.

What I suggest going forward to make it more flexible is to optionally allow a S2I builder to set a label like:

LABEL io.openshift.s2i.scripts-url=image:///usr/libexec/s2i

So something like:

LABEL io.openshift.s2i.src-path=image:///tmp/src

Needs a better label name, but the idea is that you could specify in a S2I builder where the source was unpacked to. The s2i command would then unpack stuff to that location.

This could be set in sti-base to be the current value used or you could leave it out of sti-base and just override in places like sti-python if want to override.

This would then allow the assemble script in conjunction with that to implement a few scenarios.

Use:

LABEL io.openshift.s2i.src-path=image:///opt/app-root/src

if it wanted it copied direct into place, be it the recommended place or elsewhere.

Use:

LABEL io.openshift.s2i.src-path=image:///tmp/mysrc

if wanted to use a different location, where it then either copied the whole lot into a single place as often done, or even where it copied different parts of the directory into different locations.

In this case it would be the assemble script responsibility to remove temporary area.

To make sure not duplicating information in different places, the Dockerfile as well as set the label, should also set:

# Where s2i will place source code when unpacked
ENV STI_SRC_PATH=/opt/apt-root/src

That way the assemble script can use environment variable rather than hard coding. Then promote as best practice that the environment variable is used.

@bparees
Copy link
Collaborator

bparees commented Jan 8, 2016

btw another historic reason we use /tmp/src is that's not just the application source code that gets put in /tmp. We also put the run/assemble scripts there if they came from an external location, because it all comes in from one single tar file.

So the motivation was "we're going to inject this tar of stuff, some of which is source, some of which will be other stuff, so we'll just extract it in /tmp and then let the assemble script worry about which bits it cares about for what"

if we start injecting directly to a different location, the external scripts are going to show up in that directory also unless we create a more sophisticate tar structure.

In anycase the "quickfix" to this is simply to change the cp to a mv, for assemble scripts where that makes sense.

@bparees bparees added the P3 Priority 3 label Feb 3, 2016
@GrahamDumpleton
Copy link
Contributor Author

Just got hit by this problem with /tmp/src not being removed in real world use case.

It not being removed means you cannot use layered S2I builds with the default Python S2I builder, without needing to provide your own .s2i/bin/assemble which cleans up the directory.

That is, use the S2I builder to create a new image using oc new-build, where that image is then itself used an S2I builder.

This layered build might for example be used to create a new S2I builder pre-populated with Python packages or applications to effectively create a new base image, but using S2I. If as part of that first phase, files were removed from /opt/app-root/src so they will not interfere with subsequent S2I run, you still get problems because /tmp/src wasn't removed and so on second S2I run new source files get merged with old ones and both copied into /opt/app-root/src.

@bparees
Copy link
Collaborator

bparees commented Jun 8, 2016

I think i stand by my above comment that the easy fix here is to change our default assemble scripts to "mv" instead of "cp" the source. i'd be happy to accept a PR of that form.

changing the basic s2i flow to remove the /tmp/src content is unlikely because for compiled languages, we want to (at least by default) include the source that compiled the artifacts in the image.

@GrahamDumpleton
Copy link
Contributor Author

Depending on how you intend doing it, mv instead of cp then prevents you from easily doing layered S2I builders where you do want to have some stuff selectively merged with what is already in the src directory. You might also even feasibly want src to be pre-populated with some files in the base image and a builder to add them on top. IOW, you can't naively just mv the whole src directory such that it replaces any existing src directory in its entirety. Any mv would need to be done on a file per file basis creating extra directories as necessary where a corresponding directory doesn't exist.

In respect of the compiled languages, are you saying that the source code is compiled out of the tmp directory?

@bparees
Copy link
Collaborator

bparees commented Jun 8, 2016

Depending on how you intend doing it, mv instead of cp then prevents you from easily doing layered S2I builders where you do want to have some stuff selectively merged with what is already in the src directory.

I was just thinking we'd change:
https://github.com/openshift/s2i-python/blob/master/2.7/s2i/bin/assemble#L14

to:
mv /tmp/src/* ./

(with some dotglob magic to make sure we get dot-files)

that will run into problems w/ merging directories if there is overlap, so if you want to get really fancy i suppose we could use rsync, but at some point i feel like if you're really trying to do what you're doing (provide a new s2i builder image that's built on top of the python one, but includes some framework source code) then maybe you really should be providing a custom assemble script to handle merging the user's source code into what you are providing in your new s2i builder image.

In respect of the compiled languages, are you saying that the source code is compiled out of the tmp directory?

well i was thinking it was, but i'm clearly wrong, at least for wildfly:
https://github.com/openshift-s2i/s2i-wildfly/blob/master/10.0/s2i/bin/assemble#L165-L171

@bparees
Copy link
Collaborator

bparees commented Jul 12, 2016

as an update, it looks like we are going to adopt the mv approach in the php image:
https://github.com/sclorg/s2i-php-container/pull/114/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 Priority 3
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants