Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE-48][FEATURE] Add Uniffle Dockerfile #132

Merged
merged 1 commit into from
Aug 10, 2022

Conversation

wangao1236
Copy link
Collaborator

@wangao1236 wangao1236 commented Aug 5, 2022

What changes were proposed in this pull request?

To solve issue #48 ,I add the Dockerfile for Uniffle.

Why are the changes needed?

Support K8S Operator

Does this PR introduce any user-facing change?

Yes, we will add the document later

How was this patch tested?

Manual test

@wangao1236
Copy link
Collaborator Author

/assign @jerqi

@wangao1236 wangao1236 changed the title Add dockerfile of rss [ISSUE-48] [Feature] Add dockerfile of rss Aug 5, 2022
@jerqi jerqi changed the title [ISSUE-48] [Feature] Add dockerfile of rss [ISSUE-48][FEATURE] Add docker file Aug 5, 2022
@jerqi jerqi changed the title [ISSUE-48][FEATURE] Add docker file [ISSUE-48][FEATURE] Add Dockerfile Aug 5, 2022
@jerqi jerqi changed the title [ISSUE-48][FEATURE] Add Dockerfile [ISSUE-48][FEATURE] Add Uniffle Dockerfile Aug 5, 2022
deploy/kubernetes/docker/Dockerfile Outdated Show resolved Hide resolved
deploy/kubernetes/docker/build.sh Outdated Show resolved Hide resolved
deploy/kubernetes/docker/hadoopconfig/.gitkeep Outdated Show resolved Hide resolved
deploy/kubernetes/docker/start.sh Show resolved Hide resolved
deploy/kubernetes/docker/build.sh Show resolved Hide resolved
deploy/kubernetes/docker/build.sh Outdated Show resolved Hide resolved
@jerqi
Copy link
Contributor

jerqi commented Aug 5, 2022

cc @zuston @czy006 Could you help me review this patch?

.gitignore Outdated Show resolved Hide resolved
@jerqi
Copy link
Contributor

jerqi commented Aug 5, 2022

Could you remove the file https://github.com/apache/incubator-uniffle/blob/master/deploy/kubernetes/docker/.gitkeep? Because the directory docker has files, don't need a keep file.

frankliee
frankliee previously approved these changes Aug 5, 2022
@@ -0,0 +1,29 @@
FROM openjdk:8-jdk-slim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to use other general-purpose images, such as alpine or centos.
Many companies may have their own JDK choices, so we should treat JDK as 3rd party lib hadoop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use dependency_cache to save our build time

for example:
FROM openjdk:8-jdk-slim as dependency_cache

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine.
Can we use 8-jre-slim here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

It's smaller.

Copy link
Contributor

@jerqi jerqi Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use dependency_cache to save our build time

for example: FROM openjdk:8-jdk-slim as dependency_cache

It's great. Could you raise a follow-up pr which uses dependency_cache?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems 8-jdk-slim image is based on Debian stable, which is fine. Can we use 8-jre-slim here?

What's your reason?

It's smaller.

It may need more tests. If 8-jdk-slim is fine, we can choose it first. When our e2e test is ready, wen can use e2e test to verify the 8-jre-slim.

.gitignore Outdated
@@ -21,4 +21,5 @@ metastore_db/
derby.log
dependency-reduced-pom.xml
rss-*.tgz

hadoop-*.tar.gz
/deploy/kubernetes/hadoopconfig/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use absolute path here? if hadoopconfig is useful, why do we add it to .gitignore file?

Copy link
Member

@zuston zuston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Left some minor comments

IMAGE=$REGISTRY/rss-server:$RSS_VERSION

echo "building image: $IMAGE"
docker build -t "$IMAGE" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using LABEL to add more metadata, like build-time/author/branch/commit-id?


if [ "$SERVICE_NAME" == "coordinator" ];then

bash ${basedir}/bin/start-coordinator.sh &
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly exit when executing startup script failed.

bash ${basedir}/bin/start-coordinator.sh || exit 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give me your reason?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start.sh will monitor the process existence, if the coordinator startup failed, we should exit directly, instead of waiting the non-existence PID and then exit 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We kill server or coordinator, the process will not exit immediately sometimes. The process will occupy the port, we should wait for the port for a while.

RUN tar -zxvf /data/rssadmin/hadoop-${HADOOP_VERSION}.tar.gz -C /data/rssadmin
RUN mv /data/rssadmin/hadoop-${HADOOP_VERSION} /data/rssadmin/hadoop
RUN rm -rf /data/rssadmin/hadoop-${HADOOP_VERSION}.tar.gz
COPY hadoopconfig/ /data/rssadmin/hadoop/etc/hadoop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What files in hadoopconfig folder? core-site/hdfs-site.xml?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

ARG HADOOP_VERSION
ARG RSS_VERSION

RUN apt-get update && apt-get install -y zlib1g zlib1g-dev lzop lsof netcat dnsutils less procps iputils-ping \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider kerberos? If yes, we need to setup kerberos client.

Maybe we shouldn't consider this in early version.

Copy link
Contributor

@jerqi jerqi Aug 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't consider the kerberos dependency here. It's ok for us if there are no big conflicts, it will be convenient to modify Dockerfile after we merge #53

Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @zuston @frankliee @kaijchen @czy006 Do you have any other suggestion?

@jerqi
Copy link
Contributor

jerqi commented Aug 10, 2022

@zuston @kaijchen @frankliee @czy006 Thanks for your review, if you have any other suggestion, you can raise a follow-up pr. Merged.

@jerqi jerqi merged commit 18663ed into apache:master Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants