fix: DRY + speed up docker build #2016

akx · 2024-05-06T16:03:52Z

Description

This PR:

Refactors the Dockerfile to use an in-container setup script instead of commands being mashed into RUN statements.
DRYs out the repeated package lists, PyTorch package index URL determination etc.
Switches to using uv for installing Torch as well (it's faster!)
Adds a check that the functionality to pre-load whisper and embedding models actually works

This should have no effect for the end user other than a possibly slightly smaller image.
For developers, this is easier to maintainer.
For CI and builders, this is faster.

Testing & review

I checked that an image built with and without USE_OLLAMA works as before.

I didn't check the CUDA configuration, since I have no CUDA-enabled Docker hardware at hand right now.

Changelog Entry

Changed

Optimized Docker build a bit.

Yanyutin753 · 2024-05-06T16:39:42Z

Hey, bro, I think it's in the dev branch ，not the main branch

justinh-rahb · 2024-05-06T16:42:23Z

@akx Love to see some work on making Docker builds faster but I will note that we've had some past issues getting uv to work reliably for all build platform combinations, and for PyTorch at all. Please ensure you've thoroughly tested this 🙏

tjbck · 2024-05-06T21:53:09Z

Thanks for the PR, LGTM but more testing wanted here!

akx · 2024-05-07T07:06:14Z

Hey, bro, I think it's in the dev branch ，not the main branch

I couldn't find any instructions on which branch to target, but I noticed a lot of work happening on both dev and main.

akx · 2024-05-07T08:32:34Z

@justinh-rahb:

we've had some past issues getting uv to work reliably for all build platform combinations

What are the Docker platform combinations to target here? linux/amd64, linux/arm64? Others?

I can write e.g. a Makefile to build all images. :)

, and for PyTorch at all.

uv is moving pretty fast right now (disclaimer: I'm a (very) minor contributor) and I see its version hasn't (and isn't, even with this PR) pinned, so chances are it now does work – very much seems to be able to be imported, anyway:

$ docker build -t owu-builtin-ollama . --build-arg="USE_OLLAMA=true" && docker run -it owu-builtin-ollama python -c "import torch; print(torch.__version__)"
 => => writing image sha256:56268c22bee4e9edfc507cd7358945e4acd7f4f32e1c358dcb0f03067acdc1a7                                                                                                                                                  0.0s
 => => naming to docker.io/library/owu-builtin-ollama                                                                                                                                                                                         0.0s
2.3.0

(same results repeated for no builtin ollama)

Is there a particular e.g. CUDAness scenario in which it hasn't worked here?
Is there something I could manually test that'd certifiably use Torch within the image?

@tjbck

LGTM but more testing wanted here!

As above: anything you'd particularly like me to test (within the constraints here, i.e. I'm working on Apple Silicon, no CUDA here)?

justinh-rahb · 2024-05-07T11:20:39Z

I'll go through our convos the last time I messed around with uv installing torch for which platform I had issues with.

tjbck · 2024-05-07T18:42:56Z

We would need testing with all 6 of our variants!

justinh-rahb · 2024-05-07T18:48:39Z

And test runs of Github action workflows

justinh-rahb · 2024-06-03T15:16:37Z

@akx I think there's been some signifcant changes as of late to our build workflow. If this PR is still applicable I encourage you to keep working on it, or if we've resolved some of the original issues that prompted you to create it perhaps it can now be closed.

akx · 2024-06-03T15:23:54Z

@justinh-rahb The repetition that this PR was fixing is still there in the Dockerfile as far as I can see. I'll maybe revisit this once someone takes a look at my other PRs (#2041, #2233) – it's discouraging to keep rebasing them to no review or interest.

justinh-rahb · 2024-06-03T15:36:57Z

The repetition that this PR was fixing is still there in the Dockerfile as far as I can see. I'll maybe revisit this once someone takes a look at my other PRs (#2041, #2233) – it's discouraging to keep rebasing them to no review or interest.

I understand your frustration but please also remember that we're also volunteers that have regular jobs too, and the project has been moving very quickly. I am personally trying to get the backlog looked at right now, thus why I've been checking in with the open PRs and communicating with our contributors in the backchannels to get things moving along.

akx · 2024-06-03T17:03:46Z

I understand your frustration but please also remember that we're also volunteers that have regular jobs too, and the project has been moving very quickly.

Absolutely, I'm in the same position. Sorry if I sounded a bit unkind there :)

tjbck changed the base branch from main to dev May 6, 2024 21:52

akx mentioned this pull request May 7, 2024

doc: note PRs should target dev #2038

Merged

fix: DRY + speed up docker build

fc20b40

akx force-pushed the docker-setup-cleanup branch from 14bdf6d to fc20b40 Compare May 8, 2024 09:49

akx marked this pull request as draft May 8, 2024 10:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: DRY + speed up docker build #2016

fix: DRY + speed up docker build #2016

akx commented May 6, 2024 •

edited

Yanyutin753 commented May 6, 2024 •

edited

justinh-rahb commented May 6, 2024

tjbck commented May 6, 2024

akx commented May 7, 2024 •

edited

akx commented May 7, 2024 •

edited

justinh-rahb commented May 7, 2024

tjbck commented May 7, 2024

justinh-rahb commented May 7, 2024

justinh-rahb commented Jun 3, 2024

akx commented Jun 3, 2024

justinh-rahb commented Jun 3, 2024

akx commented Jun 3, 2024

fix: DRY + speed up docker build #2016

Are you sure you want to change the base?

fix: DRY + speed up docker build #2016

Conversation

akx commented May 6, 2024 • edited

Description

Testing & review

Changelog Entry

Changed

Yanyutin753 commented May 6, 2024 • edited

justinh-rahb commented May 6, 2024

tjbck commented May 6, 2024

akx commented May 7, 2024 • edited

akx commented May 7, 2024 • edited

justinh-rahb commented May 7, 2024

tjbck commented May 7, 2024

justinh-rahb commented May 7, 2024

justinh-rahb commented Jun 3, 2024

akx commented Jun 3, 2024

justinh-rahb commented Jun 3, 2024

akx commented Jun 3, 2024

akx commented May 6, 2024 •

edited

Yanyutin753 commented May 6, 2024 •

edited

akx commented May 7, 2024 •

edited

akx commented May 7, 2024 •

edited