Provide officially distributed docker images for SIO2 #126

Open
LiquidPL opened this issue Jan 3, 2023 · 7 comments · May be fixed by #134

Comments

@LiquidPL
Contributor

LiquidPL commented Jan 3, 2023

For some context: I run an instance of SIO2 (https://wilk.radom.pl), and have traditionally been doing that using my own Dockerfiles (see: docker-oioioi, docker-sio2). However, due to a lack of time, and generally not being a high school student anymore, these repos have largely been left unmaintained, and the version of SIO being used is really old.

I wanted to ask if it would be possible to provide automatically built images of oioioi (possibly also filetracker and sioworkers) on Docker Hub or GitHub Container Registry? I can help with bringing any necessary Dockerfiles up to a decent state, and with overall maintenance and reducing image size (the image of oioioi that I built from master today is almost 5GB in size, though that's probably in large part because of texlive).

@A-dead-pixel
Contributor

A-dead-pixel commented Jan 4, 2023

I'm not an expert, but I think the image pushing part could be done with a GitHub Action.

As for bringing down the image size, I achieved 3.38GB with the changes listed below:

  • change the command for downloading sandboxes to fetch only the ones actually used and clear the cache afterwards. The sandboxes you need may differ from mine, especially the gcc and sio2jail ones.
RUN ./manage.py supervisor > /dev/null --daemonize --nolaunch=uwsgi \
        --nolaunch=rankingsd --nolaunch=mailnotifyd && \
    ./manage.py download_sandboxes -q -y -c /sio2/sandboxes compiler-fpc.2_6_2 \
        compiler-gcc.10_2_1 exec-sandbox vcpu_exec-sandbox proot-sandbox \
        proot-sandbox_amd64 null-sandbox sio2jail_exec-sandbox-1.4.2 && \
    ./manage.py supervisor stop all && \
    rm -rf /sio2/deployment/cache
  • add && pip3 cache purge to each pip command, but this has very little effect

Apart from this, I think the image should be split for workers and web. The only issue with that would be the need to remove oioioi.sioworkers.backends.LocalBackend, but imo this should be done anyway, as none of the documented and non-deprecated deployment methods utilize it. Some tests may need a change, as it is also used there. That would enable testing sioworkers in a way they're actually used.

Additionally, I think sox and flite (the Debian package names are the same) should be included in the Dockerfile so as to get rid of the warnings. They aren't big packages.

@LiquidPL
Contributor Author

LiquidPL commented Jan 4, 2023

Looking at this, I'm not sure if sandboxes/compilers should even be downloaded as a part of the image build:

  1. They are persisted in the image. Even if they are later updated manually on a live instance, the newly downloaded versions will be stored wherever filetracker stores its data, while the ones in the image remain unused, taking up space for no reason.
  2. The way it is right now, there is a filetracker database inside the image, and I'm not sure how it will work when new data is written to it. I think there might be some unexpected things going on when the image-side db/sandboxes are updated while the copy in filetracker has/expects the old files. This would require input from someone who knows how that stuff works.
  3. This doesn't account for people running filetracker outside of the core oioioi container, in which case this space goes completely unused.

Personally I'd recommend not putting sandboxes/compilers in the image, and instead telling users to run ./manage.py download_sandboxes on first install (unless it's needed for the production deployments on sio2.mimuw or szkopuł).

For general image size reduction, we should completely remove both pip and apt caches (as in, rm -r ...), as I'm not sure how much the purge commands actually remove.

Alternatively, we could use Docker's multi-stage builds, in which case the build/compile would happen in one stage, and then in the final stage we'd only install the production dependencies and copy the built app from the first stage. This would ensure we've got no leftover caches and the like.

@A-dead-pixel
Contributor

The removal of sandboxes is a great idea!
Thanks to that, I usually won't have to wait >1 minute for an image rebuild during development.
The only time this matters is, as you mentioned, at a clean install, where one needs to create a superuser anyway, so we could make something like first_start.sh which does those 2 things (or just mention it in the README).

Regarding the caches, I see 15MB in python bytecode (.pyc) and 3.7MB of leftovers in .cache/pip.

Apt seems to leave 34MB in /var/lib/apt/lists, which can safely be removed (you will need to run apt update if you want to install something inside the container).

If you want to see for yourself, you can use for example gdu - a TUI disk usage analyzer.
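As a rough sketch of the cleanup discussed above, the leftovers could be removed with a small shell function like the one below. The function name and its two parameters are hypothetical and exist only so it can be tried on a scratch directory first; the paths are the ones mentioned in this comment. It assumes nothing beyond a POSIX shell with find and rm.

```shell
#!/bin/sh
# Sketch only: purge_build_caches and its parameters are hypothetical names.
#   $1  filesystem root to clean (e.g. a scratch copy, or "/" inside an image build)
#   $2  home directory that holds .cache/pip
purge_build_caches() {
    root="${1:?usage: purge_build_caches ROOT HOME_DIR}"
    home_dir="${2:?usage: purge_build_caches ROOT HOME_DIR}"

    # apt package lists; running `apt update` recreates them when needed
    rm -rf "$root"/var/lib/apt/lists/*

    # pip's download and wheel cache
    rm -rf "$home_dir/.cache/pip"

    # Python 3 bytecode directories; Python regenerates them on import
    find "$root" -type d -name '__pycache__' -prune -exec rm -rf {} +

    # any stray .pyc files stored next to their sources
    find "$root" -type f -name '*.pyc' -exec rm -f {} +
}
```

In a Dockerfile this would have to run in the same RUN instruction that created the files, since deleting them in a later layer does not shrink the image.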

@A-dead-pixel
Contributor

A-dead-pixel commented Jan 4, 2023

Though for periodic short-term deployments without access to the internet (most of the ones I run), the sandboxes contained in the image are desirable, unless one wants to put the files on a local webserver and modify the url.

If we stick to this approach, some viable solutions would be:

  • including a stage for this and not using it by default
  • using an ARG
  • using a second Dockerfile

What do you think?
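To make the three options easier to compare, here is a sketch of what each would look like from the build command line. The stage name, ARG name, Dockerfile name and image tag are all hypothetical; only the docker build flags (--target, --build-arg, -f) are real. The helper just prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: every name below (with-sandboxes, WITH_SANDBOXES,
# Dockerfile.sandboxes, oioioi:with-sandboxes) is a made-up placeholder.
sandbox_build_cmd() {
    case "$1" in
        # extra stage that is not the last one, so a plain build skips it
        stage)      echo "docker build --target with-sandboxes -t oioioi:with-sandboxes ." ;;
        # single Dockerfile whose RUN step checks a build argument
        arg)        echo "docker build --build-arg WITH_SANDBOXES=1 -t oioioi:with-sandboxes ." ;;
        # separate Dockerfile kept next to the default one
        dockerfile) echo "docker build -f Dockerfile.sandboxes -t oioioi:with-sandboxes ." ;;
        *)          echo "unknown option: $1" >&2; return 1 ;;
    esac
}
```

For the stage variant to stay off by default, the sandbox stage cannot be the last stage in the Dockerfile, because a build without --target produces the last stage.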

@LiquidPL
Contributor Author

LiquidPL commented Jan 4, 2023

./manage.py download_sandboxes has an option to load sandboxes from a local directory instead of downloading them, so one could bind-mount a directory containing them into a running oioioi container and then install them that way.

Alternatively, we could build a separate Docker image that has them embedded, and offer two versions - one with sandboxes, one without.

@A-dead-pixel
Contributor

I think the bind-mount solution and a mention in the README (possibly with a commented line in docker-compose.yml) will suffice.
Then, the init shell script should use /sio2/sandboxes just like the command in the Dockerfile.
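A minimal sketch of such an init script is shown below, assuming it runs inside an oioioi container whose services are already up. The download_sandboxes flags are copied from the Dockerfile command quoted earlier in this thread, and createsuperuser is Django's standard management command. The function name, the three variables and the marker file path are hypothetical, and what -c does when the directory is empty is not established in this thread.

```shell
#!/bin/sh
# Sketch only: first_start, MANAGE, SANDBOX_DIR and MARKER are hypothetical names.
MANAGE="${MANAGE:-./manage.py}"
SANDBOX_DIR="${SANDBOX_DIR:-/sio2/sandboxes}"               # may be a bind mount
MARKER="${MARKER:-/sio2/deployment/.first_start_done}"      # made-up marker path

first_start() {
    # Skip everything if an earlier run already completed.
    if [ -e "$MARKER" ]; then
        echo "first start already done, skipping"
        return 0
    fi

    # Same flags as the Dockerfile command earlier in the thread;
    # -c points at the sandbox directory.
    "$MANAGE" download_sandboxes -q -y -c "$SANDBOX_DIR" || return 1

    # Django's interactive prompt for creating the admin account.
    "$MANAGE" createsuperuser || return 1

    # Record success so that a container restart does not repeat the steps.
    touch "$MARKER"
}
```

Calling first_start from the container's entrypoint, or by hand after the first docker compose up, would cover both steps; a second call does nothing once the marker file exists.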

@LiquidPL
Contributor Author

LiquidPL commented Jan 8, 2023

I've spent some time on preparing a new Dockerfile, while also cleaning up some stuff (removing unnecessary things, cleaning up the development images, etc.). I can now build Docker images that will work both in production, and with all the development and testing tooling available in the repo (easy_toolbox.py, static/cypress tests).

The clean production-ready image takes up 2.76 GB, while the development image takes up 4.59 GB (however, this includes all sandboxes, as I haven't gotten around to removing them from the development images yet, though it might be a good idea to keep them in this case).

I'll clean this up a little further, and will try getting out a PR tomorrow or on Monday. I've also prepared a sioworkers Dockerfile, which I will PR separately.
