HADOOP-17880. Build 2.10.x with docker #3349
Conversation
@GauthamBanasandra can you take a look at this PR?
@ZhendongBai could you please mention the HADOOP JIRA in the title of this PR?
@GauthamBanasandra thanks a lot for your review. I have just created a Hadoop JIRA issue with a description and mentioned it in the PR description. Please review again.
dev-support/docker/Dockerfile
Outdated
@@ -18,234 +17,80 @@
# Dockerfile for installing the necessary dependencies for building Hadoop.
# See BUILDING.txt.

-FROM ubuntu:xenial
+FROM centos:7
We've been using Ubuntu as the default build environment. May I know why you would want to use Centos 7 instead of Ubuntu Xenial?
@GauthamBanasandra on Ubuntu, openjdk-7-jdk has no installation candidate. The error is shown below:
.....
ERROR [ 8/27] RUN apt-get -q update && apt-get -q install -y --no-install-recommends openjdk-7-jdk && apt-get clean && rm -rf /var/lib/apt/lists/*
#11 94.95 Get:20 http://archive.ubuntu.com/ubuntu xenial-backports/universe amd64 Packages [12.7 kB]
#11 95.07 Fetched 19.4 MB in 1min 34s (205 kB/s)
#11 95.07 Reading package lists...
#11 95.99 Reading package lists...
#11 96.88 Building dependency tree...
#11 97.02 Reading state information...
#11 97.04 Package openjdk-7-jdk is not available, but is referred to by another package.
#11 97.04 This may mean that the package is missing, has been obsoleted, or
#11 97.04 is only available from another source
#11 97.04
#11 97.04 E: Package 'openjdk-7-jdk' has no installation candidate
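One way to confirm this kind of failure before running a full image build is to ask apt whether the package has a candidate version at all. The sketch below is only an illustration: the function name `has_install_candidate` is hypothetical, and it assumes the usual `apt-cache policy` layout with a `Candidate:` line.

```shell
#!/usr/bin/env bash
# Sketch: decide from `apt-cache policy <package>` output (read on stdin)
# whether the package has an installation candidate.
# The function name is hypothetical; it is not part of the Hadoop build scripts.
has_install_candidate() {
  local candidate
  # Take the value that follows the "Candidate:" label.
  candidate="$(awk '$1 == "Candidate:" {print $2; exit}')"
  # apt prints "(none)" when no repository provides the package.
  [ -n "$candidate" ] && [ "$candidate" != "(none)" ]
}
```

Inside the base image this could be used as `apt-cache policy openjdk-7-jdk | has_install_candidate || echo "no candidate"`.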
That's because Java 7 isn't available in the Ubuntu Xenial toolchain. Have you tried installing Java 7 by downloading its tarball? Maybe this will help - https://linuxconfig.org/oracle-java-jdk-7-on-ubuntu-linux-source-or-rpm-installation
I would advise against changing the base image to Centos 7 if this is the only reason. Here are my suggestions -
- If you still want to use Centos 7, I would suggest that you create a separate Dockerfile for it (like Dockerfile_centos_7) and leave the original Dockerfile as it is.
- Or, just install Java 7 from the tarball in Dockerfile.
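The tarball option could look roughly like the sketch below. It is a sketch under stated assumptions, not the approach this PR ended up taking: the function name, the tarball path, and the install prefix are all hypothetical, and it assumes the archive's first entry is a single top-level directory.

```shell
#!/usr/bin/env bash
# Sketch: unpack a locally available JDK 7 tarball and print the resulting
# JAVA_HOME. Tarball path and install prefix are hypothetical arguments.
install_jdk_from_tarball() {
  local tarball="$1" prefix="$2" top_dir
  mkdir -p "$prefix" || return 1
  # Assumption: the first archive entry names the single top-level directory.
  top_dir="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)"
  [ -n "$top_dir" ] || return 1
  tar -xzf "$tarball" -C "$prefix" || return 1
  # The caller can export this path as JAVA_HOME.
  echo "$prefix/$top_dir"
}
```

In a Dockerfile, the tarball would be copied into the image first and the printed path set as `JAVA_HOME`.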
@GauthamBanasandra OK, thanks a lot. Besides the missing JDK 7 package, Python pylint fails to install and the Python dependencies have problems. So I have decided to stop fixing the Ubuntu issues and will create a separate Dockerfile for CentOS later.
@ZhendongBai you would need to mention the Hadoop JIRA in the title of the PR, as mentioned in the description.
@GauthamBanasandra OK, I added the JIRA at the start of the PR title.
@GauthamBanasandra I updated the PR: I added a new Dockerfile named Dockerfile_centos7 and kept the old Dockerfile unchanged, for the following reasons:
OK, sounds good @ZhendongBai. Since there's no CI set up for this branch, could you please build this branch locally,
@GauthamBanasandra OK, I reran the docker build and mvn commands, after
@ZhendongBai could you please use the following command to build and upload the build log?
@@ -0,0 +1,96 @@
# Licensed to the Apache Software Foundation (ASF) under one
Could you please rename this file to Dockerfile_centos_7? Just so that we're consistent with the filename in trunk.
@GauthamBanasandra I renamed Dockerfile_centos7 to Dockerfile_centos_7 to stay consistent with the filename in trunk, and ran: mvn clean package -Dhttps.protocols=TLSv1.2 -DskipTests -Pnative,dist -Drequire.fuse -Drequire.openssl -Drequire.snappy -Drequire.valgrind -Drequire.zstd -Drequire.test.libhadoop -Pyarn-ui -Dtar -Dmaven.javadoc.skip=true > build.log 2>&1
The log is here: build.log. Because some Javadoc comments are invalid and the Javadoc check failed, I added -Dmaven.javadoc.skip=true to the build command. Please review again, thanks.
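Since the build output is redirected into build.log, a quick check of the log before uploading it can save a review round trip. The following is a minimal sketch, assuming Maven's standard "BUILD SUCCESS" and "BUILD FAILURE" summary lines; the function name `summarize_build_log` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: classify a Maven build log such as the build.log produced by the
# command above. The function name is hypothetical.
summarize_build_log() {
  local log="$1"
  if grep -q "BUILD SUCCESS" "$log"; then
    echo "SUCCESS"
  elif grep -q "BUILD FAILURE" "$log"; then
    echo "FAILURE"
    # Print up to five [ERROR] lines so the failure is easier to report.
    grep -m 5 "^\[ERROR\]" "$log" || true
  else
    # Neither summary line was found, e.g. the build was interrupted.
    echo "UNKNOWN"
  fi
}
```

After the mvn command finishes, `summarize_build_log build.log` prints one status word, followed by the first error lines on failure.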
…os7 to Dockerfile_centos_7
BUILDING.txt
Outdated
as the proposed solution:
https://github.com/boot2docker/boot2docker/issues/64
An alternative solution to this problem is to install Linux native inside a virtual machine
and run your IDE and Docker etc inside that VM.
You would need to restore these lines as Hadoop 2.10.x is now buildable using Dockerfile on Virtualbox (as of this commit).
@GauthamBanasandra but once this PR is merged, we will at some point be able to build Hadoop 2.10.0 directly with Docker. I don't know when BUILDING.txt should change or how to describe that.
I think we should retain the ability to build using Virtualbox and your PR would just add the ability to build using Docker, instead of deprecating the ability to build using Virtualbox. With that said, you wouldn't need to modify the docs here.
ok, I will revert the file.
@GauthamBanasandra please review again, thanks a lot.
Looks good to me. One final change that I would like to request from you @ZhendongBai - please change the title to something appropriate, something like "HADOOP-17880. Build 2.10.x with docker", in both GitHub and JIRA. I'll approve this PR after you make this change.
@GauthamBanasandra I have already changed the title of both the PR and the JIRA. This is my first contribution to Hadoop, and it has taken up a lot of your time; thanks a lot.
@ZhendongBai You should submit the patch against apache:branch-2.10 instead of apache:branch-2.10.0. branch-2.10.0 is an obsolete branch for the already released 2.10.0.
RUN pkg-resolver/install-zstandard.sh centos:7
RUN pkg-resolver/install-yasm.sh centos:7
RUN pkg-resolver/install-protobuf.sh centos:7
RUN pkg-resolver/install-boost.sh centos:7
Do we need boost for branch-2.10? If we can omit this, building time and footprint of the image can be reduced.
I upgraded Boost when I wrote this PR #2051. This was on Hadoop 3.x. Since this PR isn't backported to Hadoop 2.x, Boost isn't needed.
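Dropping the Boost step amounts to deleting the single `install-boost.sh` line from the Dockerfile. As an illustration only, the hypothetical helper below does that with `grep -v`; editing the file by hand is equally valid, and the change still needs a local rebuild to verify.

```shell
#!/usr/bin/env bash
# Sketch: remove the Boost install step from a Dockerfile in place.
# The helper name is hypothetical; the matched script path comes from the diff above.
drop_boost_step() {
  local dockerfile="$1" tmp status
  tmp="$(mktemp)" || return 1
  grep -v 'pkg-resolver/install-boost.sh' "$dockerfile" > "$tmp"
  status=$?
  # grep exits 1 when every line was filtered out; only >1 is a real error.
  if [ "$status" -gt 1 ]; then
    rm -f "$tmp"
    return 1
  fi
  # Overwrite via cat so the Dockerfile keeps its original permissions.
  cat "$tmp" > "$dockerfile"
  rm -f "$tmp"
}
```

For example, `drop_boost_step dev-support/docker/Dockerfile_centos_7` would leave the zstandard, yasm, and protobuf steps untouched.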
@ZhendongBai if you make any change to the Dockerfile_centos_7, please do a local build and upload the build log as a comment so that we can verify. We need this step since we don't have a pre-commit CI for Hadoop 2.x branch.
@iwasakims @GauthamBanasandra thanks, I will try to remove Boost, do a local build, and upload the build log.
@iwasakims thanks, I will close this PR and open a new PR based on branch-2.10.
@iwasakims @GauthamBanasandra I have opened a new PR based on branch-2.10, #3535, and closed this PR. Please review again, thanks.
Description of PR
This PR adds support for building Hadoop 2.10.x with Docker.
How was this patch tested?
Tested on a Mac (x86_64).