Darts lite/deployment version #1891

Closed
lowickert opened this issue Jul 11, 2023 · 4 comments · Fixed by #1878
Labels
devops CI/CD, packaging, code maintenance, ...

Comments

@lowickert

Is your feature request related to a current problem? Please describe.
I'm currently deploying an application built with darts in a Docker container. While building the image I noticed that darts depends on a lot of libraries that might be important during development but bloat the deployment container. For example, I don't need Jupyter in an application running models on a server.

Describe proposed solution
It would be great to have a basic light / deployment version of darts covering only the core functionality (trained models, ad, timeseries, etc.). Ideally, other functionality could be installed separately; I think the dask library does something similar. If I overlooked something and this already exists or was already discussed, sorry.

Describe potential alternatives
A "quick" fix could be a prepackaged official darts Docker image, so that the packages darts needs do not have to be installed manually in a Dockerfile but can be pulled as one image from Docker Hub.

lowickert added the triage (Issue waiting for triaging) label Jul 11, 2023
@dennisbader
Collaborator

dennisbader commented Jul 11, 2023

We currently have a PR that introduces a lighter Dockerfile (although it only addresses the Jupyter dependency). See #1878.

madtoinou added the devops (CI/CD, packaging, code maintenance, ...) label and removed the triage (Issue waiting for triaging) label Jul 12, 2023
@alexcolpitts96
Contributor

@lowickert I have been using darts in a research and production environment for roughly the last 18 months (I think?). A large image is a pain, but it is also hard to strip out dependencies without breaking darts.

The biggest culprits are pytorch, catboost, and xgboost since they are all relatively large packages. It might be interesting to investigate splitting the requirements files into more sub-files. Torch is already split out of the core requirements (requirements/torch.txt), but having further segmentation might help with flexibility and build size.
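To picture what further segmentation could look like, here is a minimal sketch that maps one requirements file per optional group to setuptools-style extras. Everything in it is an assumption for illustration: the `boosting.txt` file, the group names, and the helpers `read_requirements` and `build_extras` are hypothetical, and this is not darts' actual packaging code.

```python
from pathlib import Path
import tempfile


def read_requirements(path):
    """Return the requirement specifiers in a file, skipping blanks and comments."""
    lines = Path(path).read_text().splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith("#")]


def build_extras(req_dir, groups):
    """Map each optional group name to the contents of <req_dir>/<group>.txt."""
    extras = {name: read_requirements(Path(req_dir) / f"{name}.txt") for name in groups}
    # "all" is the union of every optional group, de-duplicated and sorted.
    extras["all"] = sorted({req for reqs in extras.values() for req in reqs})
    return extras


# Demo with throwaway files; the file layout and contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "torch.txt").write_text("# deep learning models\ntorch\n")
    Path(tmp, "boosting.txt").write_text("catboost\nxgboost\n\n")
    extras = build_extras(tmp, ["torch", "boosting"])

print(extras)
```

A dict shaped like this could be passed as `extras_require` to `setuptools.setup()`, so that a deployment installs only the groups it needs.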

If you are trying to keep your image size smaller (or at least what you have to build), try using a base image like NVIDIA base images here. Most of the big requirements are already covered.

What is your deployment configuration? Are you spinning up a REST API and handling inference that way, or are you trying to spin up a new container for each prediction? If it is the latter, you will run into a lot of performance issues with bigger images, while the former will be performant even if the image is large.
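The difference between the two setups comes down to how often the model-loading cost is paid. The following is a minimal, standard-library-only sketch of that idea; `load_model` is a hypothetical stand-in for importing the heavy dependencies and loading a trained model, and the "forecast" is just a mean. It does not model image pull or container start-up time, which a fresh container would pay on top.

```python
from functools import lru_cache

LOAD_COUNT = 0  # counts how often the expensive load runs


def load_model():
    """Hypothetical stand-in for importing heavy libraries and loading a trained model."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda history: sum(history) / len(history)  # toy "forecast": the mean


@lru_cache(maxsize=1)
def get_model():
    """Long-running service: load on the first request, then reuse the cached model."""
    return load_model()


def predict_in_service(history):
    return get_model()(history)


def predict_in_fresh_container(history):
    # A new container per prediction has no warm process, so it reloads every time.
    return load_model()(history)


for _ in range(3):
    predict_in_service([1.0, 2.0, 3.0])
service_loads = LOAD_COUNT

LOAD_COUNT = 0
for _ in range(3):
    predict_in_fresh_container([1.0, 2.0, 3.0])
fresh_loads = LOAD_COUNT

print(service_loads, fresh_loads)
```

In the service variant the load happens once no matter how many requests arrive, which is why a large image hurts much less there.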

@lowickert
Author

Yes, in the end I did it as you described. We have the model deployed behind a FastAPI REST API, in Docker containers built on top of a PyTorch image. That worked quite well.

@madtoinou madtoinou linked a pull request Aug 7, 2023 that will close this issue
@madtoinou
Collaborator

Thanks @alexcolpitts96 for the insights about your environment setup.

@lowickert, the requirements were updated in darts 0.25.0 (released today): CatBoost, LightGBM and Prophet became optional dependencies. I hope that will help a little with the size of the image.

Keeping this open; hopefully the Docker image without the Jupyter dependency won't break the workflow of too many users.
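With CatBoost, LightGBM and Prophet optional, a slim image may simply not contain them. A small sketch, assuming the usual top-level import names `catboost`, `lightgbm` and `prophet`, of how a deployment could check at start-up which of them are present; the helper `installed_optional_packages` is hypothetical and not part of darts:

```python
from importlib.util import find_spec

# Import names of the packages listed above as optional (assumed names).
OPTIONAL_PACKAGES = ("catboost", "lightgbm", "prophet")


def installed_optional_packages(names=OPTIONAL_PACKAGES):
    """Return a dict mapping each top-level package name to whether it is importable."""
    return {name: find_spec(name) is not None for name in names}


status = installed_optional_packages()
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'not installed'}")
```

`find_spec` only locates a top-level package without importing it, so the check stays cheap even when the packages are large.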
