[Requirements] Support extras #620
Merged
* add complete-api (and api out of complete)
* test also complete and complete-api
Fixes #551

This PR leverages setuptools' `extras_require` to enable a user to install only the dependencies they really need. This will give:

The extras we will support are:
- `api` - already existed - needed dependencies to run the API
- `s3` - `boto3~=1.9` - needed dependencies to use s3 as the storage layer
- `azure-blob-storage` - `azure-storage-blob~=12.0` - needed dependencies to use azure blob storage as the storage layer
- `complete` - all of the above excluding `api`
- `complete-api` - all of the above including `api`
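
As a rough illustration of the layout listed above, here is a minimal sketch of how such an `extras_require` mapping could be assembled. This is not the code from this PR: the `merge` helper and the `example-api-dependency` entry are hypothetical placeholders (the real `api` requirements are not listed in this description). Only the `boto3` and `azure-storage-blob` pins come from the list above.

```python
# Hypothetical sketch of the extras layout described in this PR.
# Only the s3 and azure-blob-storage pins are taken from the PR description.
extras_require = {
    # Placeholder: the real API requirements are not listed in the description.
    "api": ["example-api-dependency"],
    "s3": ["boto3~=1.9"],
    "azure-blob-storage": ["azure-storage-blob~=12.0"],
}


def merge(names):
    """Return the sorted, de-duplicated union of the given extras."""
    return sorted({req for name in names for req in extras_require[name]})


storage_extras = ["s3", "azure-blob-storage"]
# "complete" excludes api, "complete-api" includes it.
extras_require["complete"] = merge(storage_extras)
extras_require["complete-api"] = merge(storage_extras + ["api"])

# In setup.py this dict would be passed as setup(..., extras_require=extras_require).
```

With a layout like this, running `pip install ".[s3]"` from a local checkout would install the base requirements plus `boto3`.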
For examples on how to install a package with extras using `pip`, see this

I wanted to evaluate how much this change affects the size of the downloaded dependencies. I couldn't find an automatic tool online to do that, so I ran `pip install --no-cache-dir .`, which prints output such as:

and parsed its output. I'm uploading the code I used here so we can re-use it in the future: package_sizes.tar.gz

I also used https://github.com/naiquevin/pipdeptree to easily understand the dependency tree.
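
For readers without the attached archive, the measurement approach can be sketched as below. This is not the script from `package_sizes.tar.gz`. It assumes pip prints lines of the form `Downloading <file> (<size> <unit>)`, which varies between pip versions. The names `DOWNLOAD_RE`, `download_sizes_mb` and the sample output are made up for illustration.

```python
import re

# Matches lines such as "Downloading foo-1.0-py3-none-any.whl (1.5 MB)".
# The exact format differs between pip versions, so this pattern is an assumption.
DOWNLOAD_RE = re.compile(
    r"Downloading\s+(?P<target>\S+)\s+\((?P<size>[\d.]+)\s*(?P<unit>[kKMG]?B)\)"
)
UNIT_TO_MB = {"B": 1 / 1_000_000, "KB": 1 / 1000, "MB": 1.0, "GB": 1000.0}


def download_sizes_mb(pip_output):
    """Map each downloaded file name to its reported size in MB."""
    sizes = {}
    for match in DOWNLOAD_RE.finditer(pip_output):
        # Older pip versions print a full URL; keep only the file name.
        filename = match["target"].rsplit("/", 1)[-1]
        sizes[filename] = float(match["size"]) * UNIT_TO_MB[match["unit"].upper()]
    return sizes


# Made-up sample in the assumed format, not real output from this project.
sample_output = """\
Collecting foo
  Downloading foo-1.0-py3-none-any.whl (1.5 MB)
Collecting bar
  Downloading https://example.invalid/packages/bar-2.0.tar.gz (500 kB)
"""
sizes = download_sizes_mb(sample_output)
total_mb = sum(sizes.values())
```

Sorting `sizes` by value gives a quick view of which packages dominate the total.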
The old dependencies size was 83.39 MB, the new size is 81 MB (2.8% decrease).

The packages we don't install anymore are:
Notes:
- `boto3` moved to the `s3` extra, but it is still installed by default since it's a sub-requirement of `nuclio-jupyter` (`boto3` and its sub-requirements are 7.4MB). We may be able to remove it, I'm checking it.
- `kfp` - since its code is not very well organized, making it optional would mean adding an `import` inside a lot of functions, so I decided to skip it for now.
- `v3io` - `v3io-frames~=0.8.5`, `v3io~=0.5.0` - needed dependencies to use v3io as the storage layer. The total size it could save is 5.75MB, but since it's widely used by most MLRun users I decided to keep it as part of the base.
- `dask` - `dask~=2.12` - needed dependencies to run dask functions. The total size it could save is 848 KB; given the low size and wide usage, I decided to keep it as well.

Some analysis:
- `numpy` and `pyarrow` are sub-requirements of `pandas`, so practically `pandas` by itself is 39MB, almost half of the size (though most data scientists will already have `pandas` in their venv).
- `notebook` is a sub-requirement of `nuclio-jupyter`. We may be able to remove it from there, I'm checking it.
- `botocore` is a sub-requirement of `boto3`, details above.

Also:
- Removed `google-auth<2.0dev,>=1.19.1` from requirements. It was added in Requirements fixes #373 to fix a conflict which doesn't happen anymore.
- Added `chardet>=3.0.2, <4.0` to requirements to fix a conflict.