Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce requirements.txt to minimal set needed to run client #57

Merged
merged 6 commits into from
Oct 6, 2021
Merged

Conversation

jpswinski
Copy link
Member

I removed everything in the requirements.txt file that wasn't needed by the client, leaving only:
requests
numpy
geopandas
shapely

I then created a fresh conda environment and used python setup.py install to setup the environment, and I was able to successfully run the pytest unit tests.

Copy link
Contributor

@tsutterley tsutterley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 thanks @jpswinski!

@scottyhq
Copy link
Contributor

scottyhq commented Oct 6, 2021

I think this can be merged as-is, but i do have some thoughts / questions for follow-up work:

  1. current tests pass, but it would be great to add a codecoverage report to know that all the functions are actually tested https://about.codecov.io/tool/pytest/

  2. i think this all works currently because the https://github.com/ICESat2-SlideRule/sliderule-python/blob/897f751674c088129d45f4e05818402ea4365dc2/ci/environment-3.8.yml#L6 brings in a lot of dependencies including scipy, h5py, etc. @jpswinski could you try changing that to geopandas-base? See more discussion here Split in geopandas-base and geopandas outputs conda-forge/geopandas-feedstock#90 . I'm guessing if you do that, tests will fail due to the situation described below:

could be changed to (borrowed from how xarray does things):

try:
    import scipy.io
    has_scipy = True
except ModuleNotFoundError:
    has_scipy = False

Then down in the one or two functions requiring scipy you could start with some logic to raise an error that scipy is missing and needs to be installed https://github.com/ICESat2-SlideRule/sliderule-python/blob/3b406492fee0a0abc928b4aa2d485d930e5a94b7/sliderule/io.py#L198

@scottyhq
Copy link
Contributor

scottyhq commented Oct 6, 2021

This sequence of commands will illustrate the issue:

gh pr checkout 57
conda create -n test python
conda activate test
pip install -e . 
python 
import sliderule.io

Maybe it's not such a big deal, because a user familiar with python will understand from the error message they need to pip install scipy or conda install scipy

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/scott/GitHub/sliderule-python/sliderule/io.py", line 32, in <module>
    import scipy.io
ModuleNotFoundError: No module named 'scipy'

@tsutterley
Copy link
Contributor

i think the io and ipysliderule modules are optional imports (we may need to rework __init__.py). io and ipysliderule have some functions used in notebooks but aren't necessary to run the client.

i only used scipy.io for netCDF read/write as it was already in the environment.yml for some other tasks. Same with pytables for HDF5 read/write as comes as part of the larger geopandas install.

@jpswinski
Copy link
Member Author

@scottyhq I guess one question is what modules we want supported out-of-the-box with the requirements.txt file. The requirements.txt file on this PR will support sliderule.py, icesat2.py, and ipxapi.py, but it doesn't support io.py and ipysliderule.py.

I think we could add scipy and h5py to the requirements.txt file and support io.py. But ipysliderule.py has the ipywidgets and ipyleaflet modules, along with tkinter that I think we don't want to include in requirements.txt because then when we run python setup.py install those requirements fail and it was causing issues with things like binder.

On the other hand, if we have modules in our client that require these dependencies, then maybe we need to have everything listed. The problem then is that, at least for me, it is failing to resolve all of the requirements using pip, and so I am forced to used conda-forge.

@scottyhq
Copy link
Contributor

scottyhq commented Oct 6, 2021

one question is what modules we want supported out-of-the-box with the requirements.txt file.

Yes, it's a balance between maintainer ease and user ease, and also bloating of installation size! I'd advocate for the bare minimum in requirements.txt so that pip install sliderule downloads the smallest number of dependency packages to import sliderule and run some key functions. Then the try: except: blocks with a helpful user-facing messages to handle the less-common functions or submodules like the ipysliderule.py that have large dependencies.

Something to keep in mind for further down the road is that you can also configure optional imports with pip. See for example https://github.com/pydata/xarray/blob/5499949b5937277dcd599a182201d0e2fc5e818e/setup.cfg#L78-L84

That kind of package configuration would allow for various installation options: pip install sliderule[icesat2], pip install sliderule[interactive] etc...

@jpswinski
Copy link
Member Author

@tsutterley @scottyhq I just pushed an update for adding warnings to the io.py and ipysliderule.py modules on the packages that aren't included in the requirements.txt. Please take a look and let me know if anything needs to change (or feel free to checkout the branch and make the changes directly).

Comment on lines +5 to +7
- python
- pip
- pytest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will soon grab python 3.10 now that it is out. Also, the other ci environments get geopandas from conda-forge (geospatial packages installed from conda-forge tend to be more reliable compared to installed from pip). But wanted to also test the case where all dependencies (including geopandas) come from pip install sliderule

sliderule/io.py Outdated
Comment on lines 37 to 41
try:
import scipy.io
import h5py
except ModuleNotFoundError as e:
sys.stderr.write("Warning: missing packages, some functions will throw an exception if called. (%s)\n" % (str(e)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably best to wrap each import separately. If both scipy and h5py are not in the environment, the exception is raised for the first error: (No module named 'scipy')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just made the change in both modules and pushed the commit.

Copy link
Contributor

@scottyhq scottyhq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for doing this @jpswinski !

@jpswinski jpswinski merged commit ca2add2 into main Oct 6, 2021
@jpswinski jpswinski deleted the reqt branch February 9, 2022 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants