
ci: 🎽 build the images before running the e2e tests #716

Merged · 20 commits · Jan 27, 2023
Conversation

severo (Collaborator) commented Jan 27, 2023

See #712 (comment). It removes the need to edit the chart/docker-images.yaml file, because it seems that the images are built without getting access to the cache from Docker Hub.
severo (Collaborator, Author) commented Jan 27, 2023

In the e2e action, I call docker compose to build the images from the code and launch the services:

https://github.com/huggingface/datasets-server/blob/720609d41ce1687df78fbac0f0ac6d58d71a0c8f/.github/workflows/_e2e_tests.yml#L31-L47

I'm using cache-from= in the docker-compose file to try to avoid rebuilding layers that already exist in the last built and published image:

https://github.com/huggingface/datasets-server/blob/720609d41ce1687df78fbac0f0ac6d58d71a0c8f/tools/docker-compose-datasets-server.yml#L25-L26

The remote cache is the image with the buildcache tag, e.g., see https://hub.docker.com/layers/huggingface/datasets-server-services-admin/buildcache/images/sha256-9807c6e74b6aff0fefccf7a9f1212ac3bbdb1b74e23693f619f7210999269ade?context=explore.
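For context, the relevant part of the compose file looks roughly like this (a sketch; the image name matches the repository, but the paths and other fields are illustrative):

```yaml
services:
  admin:
    build:
      context: ..
      dockerfile: services/admin/Dockerfile
      # BuildKit reads this image as an external layer-cache source
      cache_from:
        - huggingface/datasets-server-services-admin:buildcache
    image: huggingface/datasets-server-services-admin:dev
```

Note that `cache_from` only has an effect when the build runs under BuildKit, and the referenced image must carry cache metadata: for an inline cache that means it was built with `BUILDKIT_INLINE_CACHE=1`, otherwise BuildKit can pull the image but cannot reuse its layers.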

The cache is normally created on the last merge to the main branch, by the action that builds and pushes the Docker images:

https://github.com/huggingface/datasets-server/blob/720609d41ce1687df78fbac0f0ac6d58d71a0c8f/.github/workflows/_build_push_docker_hub.yml#L44-L55

We can see in the logs that the cache has been requested and read without an error:

https://github.com/huggingface/datasets-server/actions/runs/4026203407/jobs/6920413701#step:5:3636

Still, the cache seems not to be used at all, and all the layers are rebuilt. For example, step 2 (apt install) is run again:

https://github.com/huggingface/datasets-server/actions/runs/4026203407/jobs/6920413701#step:5:3636

What could be the reason?

severo (Collaborator, Author) commented Jan 27, 2023

I don't understand why the cache (gha) is not used when we build several images. It worked when we only built one image (admin). I guess that the cache size limit (10 GB) has been reached, leading to the cached images being evicted one after the other.

So:

  • we cannot use docker.io for the cache for security reasons when the PR comes from a fork
  • we cannot use GitHub cache because of the 10GB limit
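For reference, the gha cache backend mentioned above would be configured roughly like this (a sketch). GitHub's Actions cache is limited to 10 GB per repository, and older entries are evicted when the limit is exceeded, which matches the behavior observed here:

```yaml
- uses: docker/build-push-action@v3
  with:
    context: .
    # GitHub Actions cache backend: convenient, but all images share
    # the repository-wide 10 GB Actions cache quota
    cache-from: type=gha
    cache-to: type=gha,mode=max
```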

Maybe the only solution, for now, is to build without a cache at all. The action takes 16 minutes (see https://github.com/huggingface/datasets-server/actions/runs/4025495859/jobs/6918785905).

severo changed the title from "ci: 🎽 run the e2e tests on build images" to "ci: 🎽 build the images before running the e2e tests" on Jan 27, 2023
severo (Collaborator, Author) commented Jan 27, 2023

Merge as is: it only makes the e2e tests build the images every time, without using a cache. Part of #712.

severo merged commit 85e7aac into main on Jan 27, 2023