add docker build of container #583

Merged
Kobzol merged 1 commit into It4innovations:main from add/docker-build
May 2, 2023

Conversation

vsoch
Contributor

@vsoch vsoch commented May 1, 2023

It would be nice to have a quick demo container or Dockerfile to build one for quickly testing out the software without needing to install anything to my system. This addition will add the Dockerfile, instructions in the README for build and usage, along with an automated build to deploy a container alongside the repository.

I'm testing out hyperqueue to see how it could integrate with Flux Framework or Kubernetes, and step 1 (I think for me!) is getting it into a nice container where I can easily develop. Another addition could be a VSCode developer environment, if you are interested in that! It would be a similar strategy, but moving the Dockerfile to be in a .devcontainer folder and then adjusting the build command a bit.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

Hi! We haven't really felt the need for Docker so far, because HQ has mostly trivial build (C + Rust toolchain) and runtime (C standard library) dependencies, and also because Docker typically cannot be used on HPC clusters.

People who do not want to compile their own version of HQ can download the prebuilt binaries from GitHub. Are the prebuilt binaries working on your system or do you need e.g. a different prebuilt architecture?

Having said that, providing a simple Dockerfile doesn't seem like a bad thing, so I would accept a PR for that.

Regarding the CI workflow for publishing Docker images, that's a harder sell. As mentioned, HQ is mostly designed for HPC environments, where Docker images cannot be used directly. Do you have some use-case where you would have a complicated Docker container system (e.g. in Kubernetes or docker-compose) that would want to use a HyperQueue image downloaded from a container registry, and integrate it with some other services? If yes, could you describe it a bit more?

@vsoch
Contributor Author

vsoch commented May 2, 2023

> Hi! We haven't really felt the need for Docker so far, because HQ has mostly trivial build (C + Rust toolchain) and runtime (C standard library) dependencies, and also because Docker typically cannot be used on HPC clusters.

The use case isn't HPC clusters, but in cloud and on Kubernetes. E.g., I develop the Flux Operator: https://github.com/flux-framework/flux-operator

> Are the prebuilt binaries working on your system or do you need e.g. a different prebuilt architecture?

I typically develop in containerized environments. I'm more the developer use case than the "I want the final binary" use case!

> Regarding the CI workflow for publishing Docker images, that's a harder sell. As mentioned, HQ is mostly designed for HPC environments, where Docker images cannot be used directly. Do you have some use-case where you would have a complicated Docker container system (e.g. in Kubernetes or docker-compose) that would want to use a HyperQueue image downloaded from a container registry, and integrate it with some other services? If yes, could you describe it a bit more?

Yes - both docker-compose and Kubernetes, actually. docker-compose is one step before Kubernetes - a nice way to develop locally before bringing to a Kubernetes cluster. We have the same for the flux operator too :) https://github.com/rse-ops/flux-compose.

I guess I find it a bit strange to turn away other developers that are interested/excited to innovate with your project, but it's not my project so not my choice.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

We definitely don't intend to turn you away! :) It's just that everything added to the repository has a maintenance cost, so we have to evaluate whether it is worth it or not.

I would start and focus this PR on the Dockerfile. Your version is mostly OK, however, it will not allow quick local rebuilds (this is nontrivial with Rust). If we say that this Dockerfile should serve for people that actually want to develop HyperQueue, then we should modify it. If we say that it doesn't matter, then it should be fine.

@vsoch
Contributor Author

vsoch commented May 2, 2023

Sure - would be happy to do that! My (selfish) use case was to have it ready for a development environment or an operator container, but using the prebuilt one (that I wouldn't have to wait for). But I think it's a great idea to optimize it for quick development too!

I'm fairly green at Rust - can you share some tips on best practices for Docker builds? I'm assuming we want to add some set of Cargo.toml files first to build the deps and then the code (do you have an example for your setup?)

And I totally understand - hopefully adding a build that automates itself doesn't add too much developer turmoil but does empower folks like myself to extend in cool ways!

@spirali
Collaborator

spirali commented May 2, 2023

Thank you for your contribution. We welcome PRs, including better docs about HQ usage in the cloud or a better "getting started" section.

My main problem with this PR is that it promotes "docker & building HQ from scratch" as the main way to demonstrate HQ, while you can just download it and run it.

I would be super happy if this PR kicks off a form of developer's guide or developer's docker. But I would like to have main README oriented only on users, not developers of HQ. Developer's docs should be somewhere else.

@vsoch
Contributor Author

vsoch commented May 2, 2023

> I would be super happy if this PR kicks off a form of developer's guide or developer's docker. But I would like to have main README oriented only on users, not developers of HQ. Developer's docs should be somewhere else.

Where would you like developer docs?

@spirali
Collaborator

spirali commented May 2, 2023

I think that we can start it as a subsection in the user guide. But I would leave the final decision to @Kobzol, as he established the current version of the documentation.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

I would start with the Docker image itself, if you don't mind. I have some experience with making fast-to-rebuild Docker images, so I'll send an example. We'll also want to add some .dockerignore I suppose.

I have one question: for your use-case, do you only need the hq binary being available in the Docker image, or do you also need the Python API?

@vsoch
Contributor Author

vsoch commented May 2, 2023

If I'm going to be running some workflow that uses the Python API I'd anticipate yes? But I apologize I haven't done this yet so I might not yet have the expertise to best answer this. This also makes sense for the developer use case - they could test that API easily.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

Ok, let's stick with the hq binary for now. It's unclear to me at the moment how the Python API would be used in the Dockerfile, and it's not publicly supported yet anyway.

This Dockerfile should do the trick:

```dockerfile
# Base stage: Rust toolchain with cargo-chef preinstalled
FROM lukemathwalker/cargo-chef:latest-rust-1 AS chef

ENV CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse
WORKDIR /app

# Planner stage: compute the dependency "recipe" from the sources
FROM chef AS planner

COPY . .
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
WORKDIR /build
COPY --from=planner /app/recipe.json recipe.json

# Build dependencies and cache them in a Docker layer
RUN cargo chef cook --release --recipe-path recipe.json

# Build HyperQueue itself
COPY . .
RUN cargo build --release

# Runtime stage: only the compiled hq binary on a plain Ubuntu base
FROM ubuntu:22.04 AS runtime

WORKDIR /
COPY --from=builder /build/target/release/hq hq

ENTRYPOINT ["./hq"]
```

It's a bit involved, but it allows fast rebuilds and has a direct hq entrypoint, so users can then run something like docker run hq server start. It also produces a relatively small image (< 100 MiB).

Could you remove the workflow change from this PR please, update the Dockerfile and also please add a .dockerignore file with at least the target/ directory? I would be OK with merging that :)
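
For reference, the requested .dockerignore could be as small as the sketch below. This is an illustration and not the file from the merged commit: only the `target/` entry comes from the request above, and the heredoc is just one convenient way to create the file from the repository root. Excluding `target/` matters because both `COPY . .` steps send the whole build context to Docker, so a large local build directory would slow every build and invalidate cached layers.

```shell
# Write a minimal .dockerignore next to the Dockerfile (run from the repository root).
cat > .dockerignore <<'EOF'
# Local Cargo build output: large, and rebuilt inside the image anyway.
target/
EOF
```

Other entries (editor folders, local logs, and so on) may be worth adding, but which ones are safe to exclude depends on what the build actually reads, so they are left out here.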

@vsoch
Contributor Author

vsoch commented May 2, 2023

All set! I updated the README to match the changes to the Dockerfile. I assume the decision to not have it on the path was explicit, so I just updated the example to target the executable directly.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

It wasn't exactly an explicit decision, I just didn't think of it 😅 It could be useful for people running hq client commands inside the image interactively, if someone needs that.

The changes to the README are still a bit too visible for our tastes. The README should provide the most important information and the "fast path" of documentation for HQ users (not necessarily HQ developers), and probably the vast majority of them will not need to use Docker, therefore this rather long section about Docker probably shouldn't be in the beginning of it. It's also a bit of a "first impression" issue - one of the big benefits of HQ is that it practically doesn't have any external dependencies and thus it is very easy to deploy. If we started our README with a mention of Docker, which is not very HPC friendly and evokes the feeling of "I have many dependencies and I need to be containerized", it might discourage some users.

For that reason, I think that for now we can just skip any mention of Docker in the documentation. It's not the primary way of distributing or using HQ at the moment, so it doesn't need to be in the documentation "fast path". At the same time, people that will actually want to use HQ with Docker should quickly notice the root Dockerfile, and in that case they will also know what to do (docker build + docker run), so telling them this information explicitly is probably not needed.

So if you don't have a strong counter-opinion, please remove the README changes and keep only the Dockerfile and dockerignore files. In a follow-up PR, we can add the workflow for publishing HQ into some container registry and then mention the container ghcr.io link somewhere.

it would be nice to have a quick demo container
or Dockerfile to build one for quickly testing out
the software without needing to install anything
to my system. This addition will add the Dockerfile,
instructions in the README for build and usage,
along with an automated build to deploy a container
alongside the repository.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
@vsoch
Contributor Author

vsoch commented May 2, 2023

> So if you don't have a strong counter-opinion, please remove the README changes and keep only the Dockerfile and dockerignore files. In a follow-up PR, we can add the workflow for publishing HQ into some container registry and then mention the container ghcr.io link somewhere.

Sure, happy to do that (and done)! I'll keep the content locally so that (provided my computer doesn't blow up) we have access to it later. What I'll also do for my own development needs is just have an automated build that clones your repository and then builds the image (and I can write the instructions there for now).

And let me know if you are interested in the automated build at some point - at least for a Dockerfile that I provide, I would feel uneasy not having the build regularly tested.

@Kobzol
Collaborator

Kobzol commented May 2, 2023

Thanks! If we already have the Dockerfile, I think that the automated build (done e.g. once per week, or even after every stable/nightly release) + push to some GitHub registry wouldn't hurt. I'll check the workflow later and think about where we could put the link about it.
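
A weekly build check along these lines might look like the following GitHub Actions sketch. The file path, workflow name, cron time, and image tag are all assumptions made for illustration and are not taken from the repository. The sketch only verifies that the root Dockerfile still builds; it does not push anything to a registry.

```yaml
# Hypothetical file: .github/workflows/docker-build.yml
name: docker-build-check

on:
  schedule:
    - cron: "0 4 * * 1"   # Mondays at 04:00 UTC
  workflow_dispatch:       # allow manual runs as well

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build the image from the root Dockerfile
        run: docker build -t hq:ci .
```

Publishing to a registry would need an extra login and push step plus a decision on tags, which the thread leaves for a follow-up PR.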

@Kobzol Kobzol enabled auto-merge (rebase) May 2, 2023 22:02
@vsoch
Contributor Author

vsoch commented May 2, 2023

Okay! Let me know if/when you want me to do a second PR as a follow-up to add that workflow, and how you'd like that. Even if you don't want to push to a registry, checking the build would be a good idea I think!

@Kobzol Kobzol merged commit d023e8d into It4innovations:main May 2, 2023
6 checks passed
@vsoch vsoch deleted the add/docker-build branch May 2, 2023 22:13
@vsoch
Contributor Author

vsoch commented May 2, 2023

Awesome - thank you @Kobzol! To give you a sense of timing, I should have time in the next two weeks to start my experiments in Kubernetes, so I greatly appreciate this! And of course ping me in the meantime about what we discussed above.
