

YARN-1964 Launching containers from docker #7

Closed

Conversation

ashahab-altiscale

This adds a new ContainerExecutor called DockerContainerExecutor.
This executor launches a YARN container inside a Docker container, providing
a full filesystem namespace and software isolation for the container.

@ashahab-altiscale ashahab-altiscale force-pushed the aws-yarn-1964-trunk-env-in-dce branch 2 times, most recently from d8d0c26 to 3d0f789 Compare October 20, 2014 20:37
@ashahab-altiscale ashahab-altiscale force-pushed the aws-yarn-1964-trunk-env-in-dce branch 3 times, most recently from 9fc389a to 90a74f7 Compare November 8, 2014 04:02
This adds a new ContainerExecutor called DockerContainerExecutor.
This executor launches a YARN container inside a Docker container, providing
a full filesystem namespace and software isolation for the container.
This removes the option to pass arbitrary options to the docker run command for DockerContainerExecutor.
Now the YARN administrator sets up the Docker executor, and the user provides the Docker image.
The Docker image must contain all resources and environment variables needed to run the user's job.
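A minimal sketch of the administrator-side setup this describes, as yarn-site.xml properties; the property names below are taken from the 2.6-era DockerContainerExecutor documentation and should be verified against your Hadoop release:

```xml
<!-- Hypothetical sketch: switch the NodeManager to the Docker executor
     and point it at the docker client binary it should invoke. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor</value>
</property>
<property>
  <!-- path to the docker client the executor shells out to -->
  <name>yarn.nodemanager.docker-container-executor.exec-name</name>
  <value>/usr/bin/docker</value>
</property>
```

The user then only names the Docker image when submitting the job; per the description above, everything the job needs must already be baked into that image.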
@resouer

resouer commented Dec 10, 2014

I'm really interested in this feature, but why hasn't it been merged yet?
And yet I can see this patch has been added to 2.6.0:
http://hadoop.apache.org/releases.html#18+November%2C+2014%3A+Release+2.6.0+available

I'm really confused ....

@ashahab-altiscale
Author

It has been merged to trunk and branch-2.6.


@resouer

resouer commented Dec 10, 2014

So is it possible for me to launch Docker containers in YARN to run Spark jobs now? Is there a guide for me to do so?

@ashahab-altiscale
Author

No, but feel free to contribute that guide to Hadoop.


@resouer

resouer commented Dec 11, 2014

Thanks! I'd like to.

BTW, what's the current status of this great YARN-Docker work?

  1. Can I use it now?
  2. Do I need to use a customized Docker?
  3. What can I do if I want to contribute to make it better?

@modeyang

Hi @ashahab-altiscale,
is there any way or plan to apply Docker containers to MapReduce workers within Hadoop?

mekasone pushed a commit to mekasone/hadoop that referenced this pull request Feb 19, 2017
@OneCricketeer

Should probably be closed?

Superseded by https://issues.apache.org/jira/browse/YARN-5388

@aajisaka
Member

This issue has been fixed. Closing.

@aajisaka aajisaka closed this Jan 24, 2019
chancez pushed a commit to chancez/hadoop that referenced this pull request Jun 16, 2019
Dockerfile*: Remove hadoop home symlink, just put everything into /opt/hadoop
qinghui-xu pushed a commit to qinghui-xu/hadoop that referenced this pull request Feb 4, 2022
This method was not present in HDP2 and has been added in HDP3. But instead of providing res.remainingPath to the res.targetFileSystem, it provides the complete path given as argument to the method.

Detail of the issue:
   - when ViewFsFileSystem acts on a given path, it tries to 'resolve' the path to get the underlying FileSystem that can access it. For instance, for a path like /tmp/a/b or viewfs://root/tmp/a/b, due to our configuration, the resolved fs is a ChRootedFileSystem that targets the URI hdfs://root-preprod-pa4/tmp (which itself relies on a distributed filesystem that targets the pair of namenodes for the root namespace).
   - such a ChRootedFileSystem expects to be given either absolute paths (no scheme nor authority) or paths targeting the same scheme://authority, and it will then add the path prefix to the path (as chroot would do). For instance, if the ChRootedFileSystem URI is hdfs://root-preprod-pa4/tmp, it will transform:
       - /a/b/c to /tmp/a/b/c and will use the DistributedFileSystem for hdfs://root-preprod-pa4 to perform the remote call
       - hdfs://root-preprod-pa4/a/b/c to /tmp/a/b/c (and same as above)
       - /tmp/a/b/c or hdfs://root-preprod-pa4/tmp/a/b/c to /tmp/tmp/a/b/c (note the double /tmp, that was just for fun...)

In the end, if get status is called on a viewfs file system that will use some underlying ChRootedFileSystem, using an absolute path, it will not fail, but will prepend a prefix, most of the time producing an undesired path (the /tmp/tmp example above); and if it is called using a scheme+authority (viewfs://root/tmp/a/b), it will fail, because that is not the scheme expected by the ChRootedFileSystem.
As a matter of fact, the resolve method does the job of finding the underlying filesystem, removes the path prefix in the case of ChRootedFileSystem, and puts the result in the remainingPath variable.
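The prefixing behavior described above can be sketched in a few lines. This is an illustrative standalone class, not the actual Hadoop ChRootedFileSystem code; the method name chrootResolve is hypothetical, and it only models how a chroot prefix gets applied to a path that may or may not still carry a scheme and authority:

```java
// Sketch of ChRootedFileSystem-style path handling (illustrative only).
public class ChRootSketch {
    // Prepend the chroot prefix to the path part of the argument,
    // first stripping any scheme://authority component.
    static String chrootResolve(String chroot, String path) {
        int schemeIdx = path.indexOf("://");
        if (schemeIdx >= 0) {
            // keep only what follows the authority, e.g.
            // hdfs://root-preprod-pa4/a/b/c -> /a/b/c
            int slash = path.indexOf('/', schemeIdx + 3);
            path = slash >= 0 ? path.substring(slash) : "/";
        }
        return chroot.equals("/") ? path : chroot + path;
    }

    public static void main(String[] args) {
        // correct usage: pass only the remaining path after the mount point
        System.out.println(chrootResolve("/tmp", "/a/b/c"));     // /tmp/a/b/c
        // the bug described above: passing the full resolved path
        // re-applies the prefix, yielding the double /tmp
        System.out.println(chrootResolve("/tmp", "/tmp/a/b/c")); // /tmp/tmp/a/b/c
    }
}
```

This is why the fix passes res.remainingPath (the path with the mount prefix already stripped by resolve) to res.targetFileSystem instead of the complete original path.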

Co-authored-by: William Montaz <w.montaz@criteo.com>