Shared library loading from /tmp is broken when deleting the loaded library file to prevent outside access. #1911
Comments
@srid Do you have an idea of what the fix should be?
I think it has something to do with the way lxc mounts /tmp. I'll investigate.
Possibly related to #4301
Yeah, looks like exactly the same issue; running under devicemapper fixes it (and I guess so would running /tmp as tmpfs or a volume). I will close that ticket in favour of this one. The only possibly useful comment I made on that ticket: Looking at http://blog.dotcloud.com/kernel-secrets-from-the-paas-garage-part-34-a it seems mmap used to be a problem, but that was a while ago.
regarding the handling of wrapped dynamic libraries. The basic flow of operation is to copy such libraries into a temp file, hand them to the OS loader for processing, and then to delete them immediately, to prevent them from being accessible to other executables. On platforms where that is not possible the library is left in place and things are arranged to delete it on regular process exit.
An example of the latter are older revisions of HPUX, which report that the file is busy when trying to delete it. Younger revisions of HPUX have changed to allow the deletion, but are also buggy: the OS loader mangles its data structures so that a second library loaded in this manner fails.
More recently it was found that Linux, which is usually ok with deleting the file and gets everything right, shows the same trouble as modern HPUX when the "docker" containerization system is involved, or more specifically the AUFS in use there. Deleting the loaded library file mangles data structures and breaks loading of the following libraries. For a demonstration which does not involve Tcl at all see the ticket moby/moby#1911 in the docker tracker. This of course breaks the use of wrapped executables within docker containers.
This commit introduces the function TclSkipUnlink(), which centralizes the handling of such exceptions to unlinking the library after unload, and provides code handling the known cases. IOW HPUX is generally forced to not unlink, and ditto when we detect that the copied library file resides within an AUFS. The latter must however be explicitly activated by setting the define -DTCL_TEMPLOAD_NO_UNLINK during build. We still need proper configure tests to set it on the relevant platforms (i.e. Linux).
The AUFS detection and handling can be overridden by the environment variable TCL_TEMPLOAD_NO_UNLINK, which can force the behaviour either way (skip or not), in case the user knows best or wishes to test if the problem with AUFS has been fixed.
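The policy described in that commit message could be sketched roughly as follows. This is an illustration only, not Tcl's code: the function name `skip_unlink` is hypothetical, and the interpretation of the environment variable's value (an integer, nonzero meaning "skip the unlink") is an assumption, since the message only says it can force the behaviour either way. The build-time `-DTCL_TEMPLOAD_NO_UNLINK` gate is left out for brevity. The AUFS check uses `statfs()` and compares the filesystem type against the AUFS superblock magic (the bytes 'a', 'u', 'f', 's').

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/vfs.h>            /* statfs() */
#endif

/* AUFS superblock magic, i.e. the bytes 'a','u','f','s'. Defined here
 * because older system headers may not provide it. */
#ifndef AUFS_SUPER_MAGIC
#define AUFS_SUPER_MAGIC 0x61756673UL
#endif

/* Hypothetical stand-in for the policy described above.
 * Returns 1 if the temp copy at `path` should be left in place
 * (deleted at exit instead), 0 if it may be unlinked right after load. */
int skip_unlink(const char *path)
{
    assert(path != NULL);

#if defined(__hpux)
    return 1;                   /* HPUX: never unlink a loaded library */
#else
    /* The environment override wins over detection.
     * Assumption: the value is an integer, nonzero = skip the unlink. */
    const char *env = getenv("TCL_TEMPLOAD_NO_UNLINK");
    if (env != NULL && env[0] != '\0') {
        return atoi(env) != 0;
    }

#ifdef __linux__
    /* Skip the unlink when the file lives on AUFS. */
    struct statfs fs;
    if (statfs(path, &fs) == 0 &&
        (unsigned long)fs.f_type == AUFS_SUPER_MAGIC) {
        return 1;
    }
#endif
    return 0;
#endif
}
```

A loader would call this once per temp copy, right after the library has been handed to the OS loader, and register the file for deletion at exit whenever it returns 1.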
@srid can you reattach the files (or put them in a repo somewhere since that's all I'll do if you reattach)? (Edit: I'm wanting to check if this is still an issue)
Looks like it works ok in ubuntu 14.04, but not 12.04: Both machines:
This one works:
This one is broken:
However, Ubuntu 12.04 end of life is not until 2017, so 'upgrade' is not really a good solution...
Can't reproduce any more, even with an old version of docker. I assume something was fixed in an Ubuntu package.
…aster Lock goroutine to OS thread while changing NS
[ bug description was provided by colleague @andreas-kupries ]
Various methods for creating a single-file executable for scripting
applications allow the wrapping of shared libraries into their executable. When
using these shared libraries the underlying application code will copy the
library out of the internal virtual filesystem to /tmp to make it visible to
libdl for actual loading. For proper hygiene these temporary files are deleted
from /tmp immediately after libdl has loaded them. The process (OS and libdl)
is still able to access the file through its fd and/or mmap handle. (On
systems which do not allow that, like older HPUX, the temp files are marked to
be deleted on process exit.)
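The claim that a deleted file stays usable through its mmap handle can be checked in isolation. The sketch below is not part of the attached demo and does not involve libdl; the helper name `map_then_unlink` and the temp file name pattern are made up for illustration. It writes some bytes to a temp file under /tmp, maps the file, unlinks it, and returns the mapping, which on a filesystem that behaves correctly should remain readable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to a fresh temp file in /tmp,
 * map it read-only, then unlink the file immediately.
 * If `path_out` is non-NULL the temp file name is copied into it.
 * Returns the mapping (release with munmap), or NULL on failure. */
void *map_then_unlink(const void *data, size_t len,
                      char *path_out, size_t path_cap)
{
    assert(data != NULL && len > 0);

    char path[] = "/tmp/shlib-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) {
        return NULL;
    }
    if (path_out != NULL && path_cap > 0) {
        snprintf(path_out, path_cap, "%s", path);
    }

    void *map = MAP_FAILED;
    if (write(fd, data, len) == (ssize_t)len) {
        map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    }

    close(fd);      /* the mapping, if any, keeps the file contents alive */
    unlink(path);   /* the name disappears from /tmp right away */

    return map == MAP_FAILED ? NULL : map;
}
```

Running this inside and outside a container gives a quick check of whether plain mmap-after-unlink works on the filesystem backing /tmp, separately from anything libdl does.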
Regardless, when doing this in a docker container only the first shared library
is loaded properly in this way, and a second shared library is not. In the
attached example, which minimally demonstrates the effect, a symbol looked up in
the second library is improperly resolved to a pointer in the first library.
In the original bug the effect actually was a failure to find a function symbol
definitely present in the second loaded library, as per 'nm's output.
The problem goes away when either not going through /tmp to load the library,
or when not deleting it immediately after loading, but deferring this to
process-exit.
Outside of a docker container the issue does not happen; the system is able to
load any number of shared libraries via /tmp, deleting each after load, without
problem.
https://github.com/ActiveState/docker-issue-1911
Sources (.c and shell scripts) for a demo of the shlib issue.
Four C files: 2 variants of a main application, and two minimal "packages".
Each package exports the function "fun".
The two main variants differ only in a single (un)commented line.
Outside of a docker container both variants are ok, printing A and B, for the
two packages they load. Inside of a container the bad variant does not error,
but will print A twice. I.e. the symbol for 'fun' which should have been from
the "shb" library was taken from "sha".
Session output:
Outside:
Inside docker:
Oh, and changing the if(1) in main-bad to if(0), preventing the demo from going
through /tmp fixes the issue as well.
So, final: This looks to be in the intersection of libdl, lxc containers, and
the filesystem responsible for /tmp inside a docker container. Removing the
loaded shlib breaks some data structures to the point where a second, following
shlib also coming from /tmp is mis-processed (wrong function pointer delivered,
or, in the original example, a symbol not found).