[WIP] Switch from ubuntu base docker image to debian #879

Closed
wants to merge 10 commits into from
Conversation

Langhalsdino
Contributor

This pull request switches the Docker images from being Ubuntu-based to Debian-based.
More about the reasons in #7.

Why did I choose python:3.6.9-jessie for cvat-backend?
Sadly, only jessie has precompiled ffmpeg and gstreamer libraries, so we cannot use stretch or buster. Furthermore, to reduce the compile time, I decided to go with the python 3.6.9 base image, which already includes pip3, python3, ...

Why did I choose node:10.16.3-buster-slim for cvat-ui?
Mostly because buster is the newest stable Debian release and the node image has node 10.16.3 preinstalled (the previously used node version).

I did not test all of the features, such as CUDA, Autosegmentation, ... and would appreciate help with testing them. Further adjustments are probably needed. Feel free to contribute or report bugs :)

@Langhalsdino
Contributor Author

Langhalsdino commented Nov 28, 2019

Out of curiosity, did you recently add a PR quality review for the Dockerfiles? The current Dockerfiles do not comply with the rules listed in the tooling. There is a lot of refactoring needed :/

We are currently just using workarounds to ignore rules such as:
Pin versions in apt get install. Instead of `apt-get install <package>` use `apt-get install <package>=<version>`
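
To illustrate the rule quoted above, here is a minimal Python sketch that flags packages in an `apt-get install` command that have no `=<version>` suffix. It is only an illustration of the idea, not how the PR quality tooling works internally; the function name `unpinned_packages` and the package version in the usage example are hypothetical, and the sketch only handles a single-line command.

```python
import re
from typing import List


def unpinned_packages(run_line: str) -> List[str]:
    """Return the packages in an `apt-get install` command that lack `=<version>`."""
    match = re.search(r"apt-get\s+install\s+(.*)", run_line)
    if not match:
        return []
    # Only look at the install command itself, not at commands chained after it.
    install_args = re.split(r"&&|;|\|", match.group(1))[0].split()
    return [
        arg
        for arg in install_args
        # Skip options (-y, --no-install-recommends) and line-continuation backslashes.
        if not arg.startswith("-") and arg != "\\" and "=" not in arg
    ]


# Hypothetical example line: curl is pinned, ffmpeg is not.
line = "RUN apt-get update && apt-get install -y --no-install-recommends curl=7.64.0-4 ffmpeg"
print(unpinned_packages(line))
```

For the example line, only `ffmpeg` is reported, because `curl` carries an explicit version.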

@nmanovic
Contributor

@Langhalsdino , the PR cannot be built by Travis. Also, it is not enough to just swap one base image for another. You need to provide some evidence that the change works as expected. It is a complex process. I suggest starting with some simple changes to speed up the build. Some ideas:

  • Split the cvat image into a couple of containers (don't try to build many features into one container). For example, supervisord is bad practice in the common case. We should follow the idea: one container = one process.
  • Create a base cvat image which shouldn't change too often.
  • In some cases a multistage docker build can probably help to reduce the size of the images (https://docs.docker.com/develop/develop-images/multistage-build/).
  • Build components in a separate container (it is not easy and will require many changes).

Also, when you try to "improve" the build, you need to provide some KPIs: initial build time, incremental build time, size of images, etc.

P.S. If you are ready to help, I will be glad to guide you. If you don't have a lot of time, I recommend closing the PR.

@nmanovic nmanovic changed the title Switch from ubuntu base docker image to debian [WIP] Switch from ubuntu base docker image to debian Nov 29, 2019
@Langhalsdino
Contributor Author

Langhalsdino commented Nov 29, 2019

Ahh sorry, I made a small change to the Dockerfile in the last commit and did not test it.
The build CI is successful, even though it does not comply with the PR quality review.

I get what you are describing about best practices and what we should ultimately strive for.
My argument is more about doing small fast changes in order to solve your concern regarding #7 , that you are not able to redistribute the ubuntu docker image. As mentioned in the thread, listing prebuilt containers on Docker Hub would be a major step towards lowering the entry barrier and getting cvat into the hands of more people. Afterwards we can work together to remove clutter from the prebuilt images, split them, ... .

I did not only change the base image; a couple of adjustments were needed to get it working. I tested the following configuration successfully:

TF_ANNOTATION: "no"
AUTO_SEGMENTATION: "no"
DJANGO_CONFIGURATION: "production"
OPENVINO_TOOLKIT: "no"

Do you need screenshots as proof? The tests are succeeding, too :)

Regarding the "faster" build times:
Building the frontend (time including pulling the base image / build-only time / total image size):
Current development branch: 3:18.09 / 3:16.89 / 899MB
Updated version: 2:36.12 / 2:13.94 / 525MB
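
As a worked example of the numbers above, the following Python sketch converts the timings to seconds and computes the relative reductions. It assumes the timings are in `minutes:seconds` format, which the comment does not state explicitly; the helper names are made up for illustration.

```python
def to_seconds(stamp: str) -> float:
    """Convert a 'minutes:seconds' string such as '3:18.09' to seconds (assumed format)."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def reduction_percent(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# (development branch, updated version), taken from the figures quoted above.
measurements = {
    "build incl. base image pull [s]": (to_seconds("3:18.09"), to_seconds("2:36.12")),
    "build only [s]": (to_seconds("3:16.89"), to_seconds("2:13.94")),
    "image size [MB]": (899.0, 525.0),
}

for name, (before, after) in measurements.items():
    print(f"{name}: {before:.2f} -> {after:.2f} "
          f"({reduction_percent(before, after):.1f}% reduction)")
```

Under that assumption, the reductions come out at roughly 21.2% (build including pull), 32.0% (build only), and 41.6% (image size).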

Overall, here is my proposal. Let's have a three-stage process:

  1. Solve the legal issues and set up the current docker images so that they can be published to the Docker Hub registry and more people have easier access to the project.
  2. You guide me through the process of improving the docker images (multistage build), since there is probably a lot that I can learn. I would be happy to work with you on the task.
  3. Regarding "Add more environment variables to configure CVAT container" (#445): I created a few k8s templates that enabled me to deploy everything to my k8s cluster. Once we have a public docker registry that lists the docker images, I would like to add my templates as a chart to the cvat project and reference the public docker registry :)

Btw, I would like to run the PR quality check locally, so that I do not need to push every time. Is there a way to perform the test locally?

@nmanovic
Contributor

@Langhalsdino ,

Sadly, only jessie has precompiled ffmpeg and gstreamer libraries, so we cannot use stretch or buster.

Which version of ffmpeg does it use? Does it behave the same as the current version of ffmpeg? I have to say that the change can be really critical.

@nmanovic
Contributor

My argument is more about doing small fast changes in order to solve your concern regarding #7 , that you are not able to redistribute the ubuntu docker image.

@Langhalsdino , I will initiate a new discussion with our legal and open source department for sure, but it will not be an easy or quick process. I will come back with some notes.

@Langhalsdino
Contributor Author

@Langhalsdino ,

Sadly, only jessie has precompiled ffmpeg and gstreamer libraries, so we cannot use stretch or buster.

Which version of ffmpeg does it use? Does it behave the same as the current version of ffmpeg? I have to say that the change can be really critical.

The main challenge is that the current Dockerfile does not pin a fixed version either, but I am using the same gstreamer0.10-ffmpeg version. Gstreamer is probably the one that caused the most headaches, and it was the reason why I needed to use jessie and not buster.

If I am not mistaken, a fresh build should currently grab ffmpeg_2.8.6-1 on ubuntu xenial, while jessie is currently getting ffmpeg_2.6.9. Looking at the change logs, I did not see a red flag, but honestly I do not know the exact internals that you are using, so we might need to check it.
Can you point me to the right place in the code base that I should check out?

@Langhalsdino
Contributor Author

My argument is more about doing small fast changes in order to solve your concern regarding #7 , that you are not able to redistribute the ubuntu docker image.

@Langhalsdino , I will initiate a new discussion with our legal and open source department for sure, but it will not be an easy or quick process. I will come back with some notes.

Sure, let's start the discussion :)
Just to be on the same page, why do we need to have the discussion when the code is released under the MIT license?

@Langhalsdino
Contributor Author

Langhalsdino commented Nov 29, 2019

The biggest challenge will probably be CUDA support 🙈 But we can probably publish a base image that has features like CUDA, ... disabled.

@nmanovic
Contributor

@Langhalsdino ,

Just to be on the same page, why do we need to have the discussion when the code is released under the MIT license?

Unfortunately, it doesn't matter under which license the project was published. We should not violate the licenses or patents of other companies. Internally we have strict rules and guidelines.

First of all, thanks for the PR. It seems I have new facts now and I can take the right steps to publish a docker image for the tool on DockerHub (using https://hub.docker.com/u/openvino as an example). I don't think that you can help in the process because it is very specific. Thus I'm going to close the PR. But I promise to come back with a PR to solve the issue in the near future :).

At the same time, if you can help us with the issues which I described in the PR, it will be awesome. Once again, thanks for your help and for bringing the problem to the table again.

@nmanovic nmanovic closed this Nov 29, 2019
@Langhalsdino
Contributor Author

Langhalsdino commented Dec 7, 2019

Sure, I would love to help.
But it seems like you would like to do most of this at Intel behind closed doors, so I am not sure if I am even able to help you :)

We may have started off on the wrong foot, so how can I help you and contribute? :)

@nmanovic
Contributor

nmanovic commented Dec 8, 2019

@Langhalsdino , indeed, legal questions are tough for all of us and should be handled on our side to avoid any issues. I now have recommendations that even the ubuntu base image should work if we do things right. But before publishing an image on Docker Hub, I have to review all the packages we use to create the image with the appropriate departments. I would happily delegate this kind of activity, but I cannot.

At the same time, we have plenty of technical issues submitted by different users. If you can help with any of them, it will be awesome. If you come up with a good idea and a prototype for improving our docker build, you are welcome. But if you suggest submitting something to Docker Hub, it is not a technical problem.

@Langhalsdino
Contributor Author

Langhalsdino commented Dec 8, 2019

At the same time, we have plenty of technical issues submitted by different users. If you can help with any of them, it will be awesome. If you come up with a good idea and a prototype for improving our docker build, you are welcome. But if you suggest submitting something to Docker Hub, it is not a technical problem.

Sure, I get the legal troubles. As mentioned above, packages such as ffmpeg with all the provided codecs in particular probably need a lot of in-depth analysis before you can publish them on Docker Hub. And most likely CUDA support will not make it to Docker Hub.

Regarding the "faster" build times:
Building the frontend (time including pulling the base image / build-only time / total image size):
Current development branch: 3:18.09 / 3:16.89 / 899MB
Updated version: 2:36.12 / 2:13.94 / 525MB

Even so, the changes suggested by this PR include faster build times and smaller docker images. Maybe we can start with this and move on to improving the quality of the Dockerfiles :)
Should I open a new issue or directly a PR?

2 participants