
Making ramp_workflow pip installable #50

Merged · 18 commits · Sep 20, 2017

Conversation

aboucaud (Contributor)

This PR intends to clean up the dependency tree of ramp_workflow.

The idea is to trim off all dependencies related to a particular ramp-kit.

Each ramp-kit will therefore need a specific requirements.txt file listing its dependencies, which will then be installed using pip or conda.
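
For illustration, such a kit requirements.txt might look like the following (package names and versions are placeholders, not the actual dependencies of any particular kit):

numpy
pandas
scikit-learn>=0.18

A kit's CI can then install it with pip install -r requirements.txt, or feed the same file to conda.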

@jorisvandenbossche (Member) left a comment:

Ah, I was also working on this (I thought I had mentioned that), but had not yet done the requirements parsing of the starting kits, so this is better.

- source install_requirements.sh
- pip install codecov
- pip install -q flake8
- pip install -r testing-requirements.txt
Member:

Since we are using conda envs, let's install with conda here?
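
For context, conda can consume a plain requirements file directly; a sketch of what that could look like for the line under review (the file name is the one from the diff above):

conda install --yes --file testing-requirements.txt

Note that conda's requirements format is close to, but not fully compatible with, pip's — which is exactly what the exchange below runs into.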

aboucaud (Contributor Author):

What is the command? I don't use conda...

aboucaud (Contributor Author):

OK, bad idea: if a single library is not installable with conda, the whole thing crashes.

rth (Collaborator):

Maybe use an environment.yml file then (example here) that can be used to update an existing conda environment and can specify both conda and pip packages?

Also, as a side note, I find https://github.com/astropy/ci-helpers very useful for setting up continuous integration.

Member:

Which one is not installable? All the packages in that list are installable with conda for me.

Member:

Ah, but that is probably because I have the conda-forge channel added. OK, then leave it as is.

Regarding jupyter, what is it used for? Is it only for nbconvert? (Then we could also install only that.)

aboucaud (Contributor Author):

@rth How do you use an environment.yml to specify pip packages?

IMO ci-helpers is quite a lot of machinery for pure Python packages. However, we may use it in the future, especially for the very simple way it lets you define build matrices.

Contributor:

Yes, jupyter is only for nbconvert; I only call it in the backend to generate the page on ramp.studio. So it should be tested in the Travis build of the kit, but not in rampwf.

@rth (Collaborator) commented Sep 16, 2017:

@aboucaud Well, one could do conda env create (or conda env update) -f environment.yml, where environment.yml looks like, e.g.:

name: test-env
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.5
  - numpy=1.13.1
  - scipy=0.19.1
  - conda_package_1
  - conda_package_2
  - pip:
    - pip_package_1
    - ./  # install the current package

but in the end it's not that different from just running conda install and pip install separately.
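
For example, the Travis setup could then create or refresh its environment in a single step (assuming the environment is named testenv, as in the build logs further down):

conda env update -n testenv -f environment.yml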

aboucaud (Contributor Author):

Seems convenient indeed. I'll keep that in mind.

'Operating System :: MacOS'],
install_requires=[
'numpy>=1.13',
'scipy>=0.19',
Member:

scipy is not needed, I think?

aboucaud (Contributor Author):

Indeed, but scikit-learn expects it, and it is not installed as a dependency.

Collaborator:

Maybe better to read from requirements.txt here? If you install scikit-learn with conda, scipy will get installed... Also, joblib is imported but is not in the requirements...
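
A minimal sketch of what reading install_requires from requirements.txt could look like in setup.py (illustrative only; as the following comments show, the PR ends up keeping the two files separate):

from setuptools import setup, find_packages

# Illustrative sketch: derive install_requires from requirements.txt,
# skipping blank lines and comments.
with open('requirements.txt') as f:
    requirements = [line.strip() for line in f
                    if line.strip() and not line.startswith('#')]

setup(
    name='ramp-workflow',      # package name assumed for the sketch
    packages=find_packages(),
    install_requires=requirements,
)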

aboucaud (Contributor Author):

I agree it's a duplication, but I don't know what is best regarding PyPI.

Concerning joblib, I moved the imports to get rid of the dependency.

Member:

Let's leave them separate; the small duplication is not that bad. There is a slight ideological difference between requirements.txt (deploy specific versions) and install_requires (specify the minimum dependencies the deployed environment should adhere to). See https://packaging.python.org/discussions/install-requires-vs-requirements/
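
To make that distinction concrete (the pinned versions below are illustrative, not the project's actual pins):

# In setup.py — the loosest constraints the package is known to work with:
install_requires=['numpy>=1.13', 'scipy>=0.19', 'scikit-learn>=0.18']

# In requirements.txt — exact pins for a reproducible deployment:
numpy==1.13.1
scipy==0.19.1
scikit-learn==0.19.0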

setup.py Outdated
install_requires=[
'numpy>=1.13',
'scipy>=0.19',
'scikit-learn>=0.18'],
Member:

Although pandas is not a strict requirement (it is currently not directly imported), the code still relies on it heavily (it assumes pandas data structures are used), so I would add it here.

aboucaud (Contributor Author):

Fair enough.

aboucaud (Contributor Author):

What should the required version be?

Member:

Hmm, not sure. Let's maybe take 0.19 for now?
(Actually, we should in principle also test against all minimal dependencies, but that can be for another PR.)

@jorisvandenbossche (Member):

I also think it might be better to split the requirements parsing into another PR, so the actual setup.py updates can already be merged (or, for now, still install some of the extra deps in travis.yml until all starting kits have gotten a requirements file).

@aboucaud (Contributor Author):

I pushed a requirements.txt to every starting kit tested so far.

@aboucaud (Contributor Author):

Only the Python 2 build fails, because of the following import error:

from tensorflow.core.framework.graph_pb2 import *
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    import sys
    _b=sys.version_info[0]<3 and (lambda x:x) or (lambda x:x.encode('latin1'))
>   from google.protobuf import descriptor as _descriptor
E   ImportError: No module named google.protobuf

/home/travis/miniconda/envs/testenv/lib/python2.7/site-packages/tensorflow/core/framework/graph_pb2.py:6: ImportError

This is specific to tensorflow and could maybe be solved easily by installing the protobuf lib, but since the former builds seemed to pass and you did not have that dependency, I find it weird.

Any idea @kegl ?

@kegl (Contributor) commented Sep 15, 2017:

Have never seen it. Pinging @mehdidc :)

@codecov bot commented Sep 15, 2017:

Codecov Report

Merging #50 into master will increase coverage by 0.01%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
+ Coverage   92.13%   92.14%   +0.01%     
==========================================
  Files          51       51              
  Lines        1195     1197       +2     
==========================================
+ Hits         1101     1103       +2     
  Misses         94       94
Impacted Files Coverage Δ
rampwf/utils/testing.py 100% <ø> (ø) ⬆️
rampwf/kits/ramp_kit.py 95.23% <100%> (+0.79%) ⬆️
rampwf/workflows/image_classifier.py 75.24% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b57cbf2...bddbf52.

@jorisvandenbossche (Member):

> This is specific to tensorflow and could maybe be solved easily by installing the protobuf lib, but since the former builds seemed to pass and you did not have that dependency, I find it weird.

The strange thing is that protobuf is actually installed in the failing Travis build: https://travis-ci.org/paris-saclay-cds/ramp-workflow/jobs/276098845#L1314.

@aboucaud (Contributor Author) commented Sep 16, 2017:

Hmm, I find several complaints online regarding protobuf and tensorflow (like this one), but only one-off fixes, no actual recipe to make it work every time.

The two libraries' versions seem to be tightly linked, so what I can think of is:

  • protobuf released a new version 14h ago (see here), so it could just be bad luck that a build or a wheel was updated for py3.x but not for py2.7 on PyPI => the build may pass or fail erratically
  • we should (as was said earlier) specify versions for these particular packages, especially the ones with incompatible APIs (like keras or tf). @mehdidc & @kegl, what versions do you usually run MNIST with?

@mehdidc (Contributor) commented Sep 16, 2017:

@aboucaud In pollenating insects 3, we use tensorflow 1.3.0 and protobuf 3.4.0.
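
Concretely, pinning those in a kit's requirements.txt would read:

tensorflow==1.3.0
protobuf==3.4.0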

@aboucaud (Contributor Author):

The MNIST ramp-kit build and the ramp-workflow build are identical, but one fails and the other passes...

Now I'm really confused... 😩

@jorisvandenbossche (Member):

Maybe try to list protobuf explicitly in the requirements for MNIST? (I don't know why that would matter, but among the issues/questions about this problem online I saw a lot of 'solutions', like simply reinstalling, for which I also don't know why they would matter...)

@aboucaud (Contributor Author):

At this point I really don't understand the difference between the two builds (MNIST alone, or called by the ramp-workflow tests) on Python 2.7.
Pinging @glemaitre @agramfort for new insights.

@jorisvandenbossche (Member):

Comparing the two last builds, there is a difference in the protobuf version (3.3 vs 3.4).

@jorisvandenbossche (Member):

Also, to really see whether the same setup works on MNIST, I think we need to adapt the travis.yml file on MNIST to use the requirements.txt file.
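
A minimal sketch of what that adapted install step could look like (illustrative; the actual MNIST .travis.yml may differ):

install:
  - pip install -r requirements.txt   # kit-specific dependencies
  - pip install git+https://github.com/paris-saclay-cds/ramp-workflow   # rampwf itself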

@aboucaud (Contributor Author) commented Sep 19, 2017:

> Comparing the two last builds, there is a difference in the protobuf version (3.3 vs 3.4).

I tried this change this morning to see if that specific version was the problem. Apparently not.

> Also, to really see whether the same setup works on MNIST, I think we need to adapt the travis.yml file on MNIST to use the requirements.txt file.

I agree that's what we should do, but it also means taking care of installing the proper requirements for ramp-workflow as long as this PR is not merged.

@jorisvandenbossche (Member):

> Also, to really see whether the same setup works on MNIST, I think we need to adapt the travis.yml file on MNIST to use the requirements.txt file.

Trying that in ramp-kits/MNIST#3

@jorisvandenbossche (Member):

@aboucaud You can remove that last commit again. I just tried to make the setup even more comparable to how the env is set up in the MNIST repo, but this attempt also fails with the same problem.
