Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build Arm supporting docker image #4703

Closed
wants to merge 1 commit into from

Conversation

Lewuathe
Copy link
Member

@Lewuathe Lewuathe commented Aug 5, 2020

Since Presto now supports Arm architecture officially, it would be great if we could also try the Arm platform with docker out-of-the-box. This change introduces a new docker image suffixed with -arm64v8 supporting Arm platform. I confirmed the image works well with the Graviton2 processor in AWS.

It's related to #2262.

Although I assumed build-remote.sh would distribute the image in the DockerHub, it might not be correct because I could not find the usage of the script in the repository. Does just building an image with built-remote.sh correctly push it to DockerHub?

@cla-bot cla-bot bot added the cla-signed label Aug 5, 2020
docker/Dockerfile-arm64v8 Outdated Show resolved Hide resolved
docker/README.md Outdated Show resolved Hide resolved
docker/build-local.sh Outdated Show resolved Hide resolved
@electrum
Copy link
Member

electrum commented Aug 6, 2020

Thanks for contributing this. It's certainly useful for ARM servers, and will be needed with the upcoming ARM-based Macs. In my very brief research, it looks like we can build a single image with support for multiple architectures, rather than separately tagged images: https://medium.com/nttlabs/buildx-multiarch-2c6c2df00ca2

Do you have any experience with that? If it works, that seems like a better approach.

build-local.sh is for building from a server tarball that you built in your local source tree. build-remote.sh is for building from a version in a Maven repository (typically this is a release version, but could be a snapshot). The Presto maintainers publish the image to Docker hub after a Presto release with this: https://github.com/prestosql/release-scripts/blob/master/presto-docker.sh

@Lewuathe
Copy link
Member Author

Lewuathe commented Aug 6, 2020

@electrum

it looks like we can build a single image with support for multiple architectures, rather than separately tagged images: https://medium.com/nttlabs/buildx-multiarch-2c6c2df00ca2

All I tried was to use the OpenJDK image supporting Armv8 as the base image. To use the different base images, we need to use the different Dockerfile which prevents us from building a single image supporting multiple architectures with a single command.

But I found the CentOS image supports multiple architectures. Instead of using OpenJDK image, we may be able to build a single image with CentOS base by manually installing Zulu supporting Arm. I'll try.

I think I can work on that after merging #4642.

@electrum
Copy link
Member

@Lewuathe the switch to Zulu JDK is merged now

@Lewuathe Lewuathe force-pushed the distribute-arm-image branch 2 times, most recently from f2150fd to 0f34881 Compare August 17, 2020 07:32
@Lewuathe
Copy link
Member Author

Lewuathe commented Aug 17, 2020

Unfortunately, we could not build multi-arch images with one line command due to several reasons.

--load option is not supported

We could not use BuildKit to build multiple architectures with one command with --load option.

docker/buildx#59

According to the above issue, --load option does not allow us to build multi-arch images. Without --load option, the build image will not be visible in the list of images shown by docker images.

Zulu OpenJDK for ARM64 is not available

Additionally, Zulu OpenJDK for ARM64 is not distributed in RPM format. That indicates we cannot use the same Dockerfile for amd64 and arm64/v8. The PR has another Dockerfile for ARM64.

CentOS 7 image does not support arm64/v8

Looking at the list of available images, CentOS 7 for arm64/v8 is not distributed.

Therefore, we need to prepare the separated Dockerfile and build it with an additional command. I confirmed the image is working with Graviton 2 processor (m6g.medium)

Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great progress!

It seems like we're still not building a multi architecture image, since the ARM image has a separate tag. Based on the buildx docs, it seems like we need to invoke buildx with multiple platforms at once:

docker buildx build --platform=linux/amd64,linux/arm64/v8

For this to work, I think we need to use a single Dockerfile -- I couldn't find a way to use a separate one per platform. It seems you can use TARGETPLATFORM to do conditional logic based on the target platform.

It's fine to use CentOS 8 for both architectures. We simply haven't had a need to upgrade yet, but it would be easier and better to upgrade so that both architectures are consistent.

.github/workflows/ci.yml Outdated Show resolved Hide resolved
docker/Dockerfile-arm64v8 Outdated Show resolved Hide resolved

rm -r ${WORK_DIR}

# Source common testing functions
. container-test.sh

test_container "presto:${PRESTO_VERSION}"

# Push multi-arch images supporting amd64 and arm64 architecture.
DOCKER_CLI_EXPERIMENTAL=enabled docker buildx build ${WORK_DIR} --platform linux/amd64,linux/arm64/v8 -f Dockerfile --build-arg "PRESTO_VERSION=${PRESTO_VERSION}" -t "presto:${PRESTO_VERSION}" --push
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although --load is not supported for the multi-arch image, we can use --push to publish the images at once. Is it okay to push images here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The published images will look like this.

Screen Shot 2020-09-10 at 16 26 02

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @Lewuathe What about using aarch64 instead of arm64/v8? I found presto use aarch64 in source code here and looks like aarch64 is more popular.

Copy link
Member Author

@Lewuathe Lewuathe Sep 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, there is no aarch64 parameter in the list supported by Docker builder. We need to specify arm64/v8 as a TARGETPLATFORM parameter. But arm64/v8 and aarch64 are basically the same thing, IIUC.

$ docker buildx ls
NAME/NODE            DRIVER/ENDPOINT             STATUS  PLATFORMS
default              docker
  default            default                     running linux/amd64, linux/arm64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this a separate script like push-image.sh, so that we can build images separately from pushing them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what --push does. It says it pushes it to a repository, but which one? We don't specify prestosql anywhere as the Docker Hub repository. Is there a way to build the ARM64 image without pushing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should prepare the separate script push-image.sh so that we can push both images by --push option to prestosql orgniazation. But I do not have a way to check pushing to prestosql organization properly. How do you usually push Docker images to prestosql?

@electrum
Copy link
Member

Why do we need --load? I don't understand what that does (reading the manual didn't help).

@Lewuathe
Copy link
Member Author

@electrum --load and --push specifies the location where the resulting image appears. Without --load, the image won't be visible with docker images command, which means we cannot test the image with test_container script.

@Lewuathe
Copy link
Member Author

Lewuathe commented Nov 4, 2020

@electrum Could you take a look into this when you get a chance?

Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in following up. It might help for us to chat on Slack about this.

docker/README.md Outdated Show resolved Hide resolved

rm -r ${WORK_DIR}

# Source common testing functions
. container-test.sh

test_container "presto:${PRESTO_VERSION}"

# Push multi-arch images supporting amd64 and arm64 architecture.
DOCKER_CLI_EXPERIMENTAL=enabled docker buildx build ${WORK_DIR} --platform linux/amd64,linux/arm64/v8 -f Dockerfile --build-arg "PRESTO_VERSION=${PRESTO_VERSION}" -t "presto:${PRESTO_VERSION}" --push
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this a separate script like push-image.sh, so that we can build images separately from pushing them.


rm -r ${WORK_DIR}

# Source common testing functions
. container-test.sh

test_container "presto:${PRESTO_VERSION}"

# Push multi-arch images supporting amd64 and arm64 architecture.
DOCKER_CLI_EXPERIMENTAL=enabled docker buildx build ${WORK_DIR} --platform linux/amd64,linux/arm64/v8 -f Dockerfile --build-arg "PRESTO_VERSION=${PRESTO_VERSION}" -t "presto:${PRESTO_VERSION}" --push
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what --push does. It says it pushes it to a repository, but which one? We don't specify prestosql anywhere as the Docker Hub repository. Is there a way to build the ARM64 image without pushing?

@Lewuathe Lewuathe force-pushed the distribute-arm-image branch 2 times, most recently from 371591e to de5f902 Compare November 27, 2020 02:17
Since Presto now supports Arm architecture officially, it would be great
if we could also try Arm platform with docker out-of-the-box. This
change introduces new docker image suffixed with `-arm64v8` supporting
Arm platform.
@Lewuathe
Copy link
Member Author

I updated the scripts accordingly and confirmed the image works properly in t4g instance type with Ubuntu Bionic.

@GaruGaru
Copy link
Member

@electrum what are the needed steps to see this merged ?
This would be a nice feature since most cloud providers are providing arm64 vms with nice performance/price ratio, I guess that it would also enable apple M1 users to run trino locally using the official docker image

@martin-g
Copy link

There are some conflicting files. The first step is to resolve the conflicts.

@svarosky90
Copy link

svarosky90 commented Mar 11, 2021

@Lewuathe Are you still working on this ?
Is there any possibility to work on this pull request? Maybe open a new one related to #4703 and try to speed up the process?

@Lewuathe
Copy link
Member Author

@svarosky90 It may be better to open another pull request after I resolve the conflict. But I'm not sure what steps will be necessary to be merged because I do not get any feedback from the last update.

Anyway, I'll reopen the pull request.

@Lewuathe
Copy link
Member Author

@electrum The new image seems to use the image zulu-openjdk-centos.
https://hub.docker.com/r/azul/zulu-openjdk-centos/tags?page=1&ordering=last_updated

But it does not support ARM architecture. Do you think it's okay to create another Dockerfile for ARM or replacing the zulu image with manual Zulu installation as we did before?

@svarosky90
Copy link

Any update on this? It would be very useful running trino on arm based machines.

@Lewuathe
Copy link
Member Author

Lewuathe commented Jun 6, 2021

@svarosky90 If we can create a separate image for Arm, it must be easy. But if we need to support multi-arch image, there are several things to be clarified (e.g. #4703 (comment)).

@electrum Is it okay to build another image from a separate Dockerfile?

@electrum
Copy link
Member

@Lewuathe Apologies for not following up. I contacted Azul about supporting arm64 in their Docker images. Until that happens, I think we could go with the simple approach of using the CentOS image as the base and installing Zulu using the RPM. Then we should be able to use the normal Docker multi-arch build stuff. This prevents sharing the same base image across multiple Trino releases, but I think that's ok. Thoughts?

@electrum
Copy link
Member

It looks like we need to use the tarball to install Zulu, since the RPM is only for amd64 :(

@Lewuathe
Copy link
Member Author

@electrum Thank you for the feedback!

I contacted Azul about supporting arm64 in their Docker images.

Sounds nice. I'm looking forward to that.

I think we could go with the simple approach of using the CentOS image as the base and installing Zulu using the RPM. Then we should be able to use the normal Docker multi-arch build stuff.

It also looks good. That's pretty similar to the original approach we've tried previously. Thanks. I'll try to update accordingly.

@electrum
Copy link
Member

I've been playing around with this, mostly to learn more about how this works. If you don't mind, I'll put up a new PR with some modifications to your original changes.

@Lewuathe
Copy link
Member Author

If you don't mind, I'll put up a new PR with some modifications to your original changes.

That sounds nice. It's totally okay, I want to take a look at the PR then.

@electrum
Copy link
Member

@Lewuathe I created a new PR with the changes: #8397

I now finally understand how this actually works and the problems you ran into. Apologies that it took so long. My conclusion is that Docker multi-architecture is a mess.

I ended up going back to your original approach of separate Dockerfiles, since that is simple, supports separate base images, and allows us to test the individual images before pushing them.

I'm using use OpenJDK from CentOS because Zulu crashes in Docker for Mac -- which I also reported to Azul.

It took me a long time to figure out that multi-arch manifests are a feature of the registry, at a higher level than images, and don't exist for local images. This is why --push is needed for docker buildx build, and also why docker manifest create requires that you first push the constituent images to the registry. None of the documentation or tutorials that I found explained this. I was also surprised that while you can push tags with docker push, there is no command to delete them.

@electrum
Copy link
Member

Replaced by #8397

@electrum electrum closed this Jun 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

6 participants