
CI pipeline complete #235

Merged
merged 8 commits into dev on Aug 23, 2023

Conversation

@Diewertje11 (Collaborator) commented Aug 21, 2023

This resolves issue #231. In the old workflow, the Docker image was built with the requirements from master, but the tests were run in a clone of the branch the push/PR was made from, with a different version of ffmpeg than the container had. As a result, we never tested whether the code actually runs in the container.

The logic of this workflow is:

  • Run the pipeline on pushes to "dev" and "master", and on PRs targeting "dev" or "master"
  • If any of the files related to the container image have changed, rebuild it
  • On a PR, the new image gets the tag of the current branch; on a push, it gets the tag of the branch we push to (dev or master)
  • To fetch the correct image for the tests: if we're in a PR and the container files changed, or we're on dev or master, fetch the current branch's image; otherwise, fetch the PR target's image
  • Run the tests unconditionally
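The logic above could be sketched roughly as follows (the job and step names, the tag expression, and the change-detection step are illustrative, not the actual workflow):

```yaml
on:
  push:
    branches: [dev, master]
  pull_request:
    branches: [dev, master]

jobs:
  pipeline:
    runs-on: ubuntu-latest
    steps:
      # rebuild only on a push, or when container-related files changed
      # (the change-detection step itself is not shown here)
      - name: Build and push container image
        if: github.event_name == 'push' || steps.filter.outputs.container == 'true'
        run: echo "build, tag with ${{ github.head_ref || github.ref_name }}, push"
      # the tests always run
      - name: Run notebook tests
        run: echo "run tests inside the fetched image"
```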

As a result, the dev and master Docker images only get updated on a push, regardless of whether the tests pass or fail during that push. (That they pass should be checked beforehand in a PR.) The dev image only gets updated if the requirements or the Dockerfile changed in the push. The master image always gets updated, since we want the latest version of the code available for the users on SNIC. (See issue #234.)

Further improvements to the pipeline and the docker files are mentioned in issue #233.

pabloyoyoista and others added 8 commits August 22, 2023 13:37
This commit contains a workflow inspired by the deploy-to-ghcr.yml workflow
that we had first and is now deleted, except that it now only contains the
part that builds the container. Testing the notebooks will be done in a
separate workflow.

But the following changes are made:
* Remove the cache-docker-layers step. It is not allowed to save privileged
  data; when running the post action, it results in:
  Warning: EACCES: permission denied, scandir '/var/lib/docker'
* Recursively check out submodules in the checkout-repository step. We
  need this since yolov5_tracker also has submodules, and when the option
  was set to true, it only checked out one layer of submodules; we need
  them all.
* The tag of the container is the branch name from which we do the PR or
  the push. This is so that we can push a container to the registry to
  run the tests in.
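The recursive-submodule checkout described above would look something like this (the action version is illustrative):

```yaml
- name: Checkout repository
  uses: actions/checkout@v3
  with:
    # 'true' only fetches one level; yolov5_tracker has nested submodules
    submodules: recursive
```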
Most notably, get rid of tabs and of the additional --- in the
comments, and make the installation steps run all at once
This commit has a few changes to keep the size of the container to a
minimum:
* Make sure to always clean up after installing, and don't cache
* Use "no-install-recommends" for apt, so that we only get what we
  need and nothing else
* Use apt-get remove to uninstall things we no longer need
* Revisit what does and does not need to be installed; besides
  reducing the size, this also makes the process clearer

It also adds:
* Install libgl1 (OpenGL), which is needed for opencv
* Add build-essential and python3-dev to fix the docker build by
  allowing wheels to be built
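A minimal sketch of the apt/pip pattern these points describe (the package list here is an assumption for illustration, not the project's actual Dockerfile):

```dockerfile
# install build deps without recommends, build wheels, then remove the
# build deps again and clean all caches in the same layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       build-essential python3-dev libgl1 \
    && pip install --no-cache-dir -r requirements.txt \
    && apt-get remove -y build-essential python3-dev \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*
```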
This has two well-desired consequences and an annoyance:
* There's now one single copy of the repository in the container
  image, instead of one in /usr/src/app/kso and another one in $HOME.
* We now install the dependencies for the commit we're building, and
  not from master, which might not match.
* As a consequence of building from the current commit,
  jupyter_contrib_extensions is no longer installed by default; it
  needs to be installed manually, and the calls to it modified.
We don't know why we enable the extensions, but they fail due to
missing modules. It turns out something obscurely changed between
master and dev, and we tracked it down to a change in some dependency
in kso_utils. Until the reason for that has been figured out, let's
just add it manually. There is an issue created for this; see issue
#225 in kso.
The following changes are made:
* Keep git installed, since we need it to check out the repository for
  notebook testing; otherwise the repository cannot be checked out in the
  next .yml
* Add the libglib2.0-0 dependency to make cv2 work

To make the compiled ffmpeg work (the ffmpeg that was used in the previous
CI pipeline to test the notebooks was not the one from the container, so
the one from the container might not have worked before for the
extract_clips CPU code):

* Remove the disabling of ffprobe; this is needed to load video from
  HTML links, which we need
* Enable ffmpeg to use libx264; for this, gpl also needs to be enabled
* Add openssl, needed for https downloads in ffmpeg
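Assuming ffmpeg is compiled from source in the image, the flags described above would correspond to a configure call along these lines (other options omitted; this is a sketch, not the actual build line):

```shell
./configure \
    --enable-gpl \
    --enable-libx264 \
    --enable-openssl
# note: --disable-ffprobe is no longer passed, and --enable-gpl is a
# prerequisite for --enable-libx264
```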
Initially the idea was to create 3 workflows: build-container,
run-tests, update-container. But due to limitations in GitHub
workflows, everything is merged together into 1 workflow. The
limitations were:
* The tests get triggered both via the first workflow (workflow_run) and
  by a change to the code, causing them to run twice. There is no actual
  paths-ignore for this.
* Triggering the update-container through two chained workflow runs (via
  parts 1 and 2) means we no longer have the branch or target-branch
  information. This makes it hard to update the correct container.

The current approach has 1 workflow that builds the container and runs the
tests. It runs all the time, but in PRs we do not build the container
unnecessarily.

This is not the most optimized way to execute things, since we first build,
push, and then run the tests, meaning that by the time the tests execute, the
container has already been pushed. But it is nonetheless the most
straightforward way to execute the whole process. Dealing with the container
is very annoying due to its size and complexity, and GitHub Actions are
extremely rigid, with many limitations regarding when workflows run and how
to deal with containers in general. So as a first approach on which to
further iterate, this should certainly work.

Codeveloped-by: Pablo Correa Gomez <pablo.correa.gomez@combine.se>
The docker image contains a copy of the code that is used to run things
on SNIC for the users who do not do development. Since we want this to
be up to date with the latest commits on master, we always build a new
image when we push to master, even if nothing changed in the
requirements or the Dockerfile.

For the future, it could be considered to work more with releases, so
that we do not need to push new images. See issue #234.
@Diewertje11 Diewertje11 marked this pull request as ready for review August 22, 2023 12:15
@jannesgg (Collaborator) left a comment:
Happy to merge these changes.

@jannesgg jannesgg linked an issue Aug 23, 2023 that may be closed by this pull request
@jannesgg jannesgg merged commit 3665cc9 into dev Aug 23, 2023
7 checks passed
@jannesgg jannesgg deleted the ci-pipeline-complete branch August 23, 2023 09:09
Successfully merging this pull request may close these issues.

Fix testing GA workflow integration
3 participants