
[MRG] Address comments for sklearn-contrib integration #238

Merged

Conversation

wdevazelhes
Member

@wdevazelhes wdevazelhes commented Jul 26, 2019

Hi, we've made a request for inclusion in scikit-learn-contrib; this PR addresses the comments from the issue:
scikit-learn-contrib/scikit-learn-contrib#40

TODO:

  • Fix flake8 errors (there remain some errors due to unused imports in metric_learn/__init__.py, but I guess these are needed, right?). Also, inverse_covariance.quic is imported but unused, but this is expected since the import is only there to define the variable HAS_SKGGM. I don't know if there's another way to bypass this. Here is the flake8 log after the fixes:
./test/metric_learn_test.py:17:3: F401 'inverse_covariance.quic' imported but unused
./metric_learn/__init__.py:3:1: F401 '.constraints.Constraints' imported but unused
./metric_learn/__init__.py:4:1: F401 '.covariance.Covariance' imported but unused
./metric_learn/__init__.py:5:1: F401 '.itml.ITML' imported but unused
./metric_learn/__init__.py:5:1: F401 '.itml.ITML_Supervised' imported but unused
./metric_learn/__init__.py:6:1: F401 '.lmnn.LMNN' imported but unused
./metric_learn/__init__.py:7:1: F401 '.lsml.LSML' imported but unused
./metric_learn/__init__.py:7:1: F401 '.lsml.LSML_Supervised' imported but unused
./metric_learn/__init__.py:8:1: F401 '.sdml.SDML_Supervised' imported but unused
./metric_learn/__init__.py:8:1: F401 '.sdml.SDML' imported but unused
./metric_learn/__init__.py:9:1: F401 '.nca.NCA' imported but unused
./metric_learn/__init__.py:10:1: F401 '.lfda.LFDA' imported but unused
./metric_learn/__init__.py:11:1: F401 '.rca.RCA' imported but unused
./metric_learn/__init__.py:11:1: F401 '.rca.RCA_Supervised' imported but unused
./metric_learn/__init__.py:12:1: F401 '.mlkr.MLKR' imported but unused
./metric_learn/__init__.py:13:1: F401 '.mmc.MMC_Supervised' imported but unused
./metric_learn/__init__.py:13:1: F401 '.mmc.MMC' imported but unused
./metric_learn/__init__.py:15:1: F401 '._version.__version__' imported but unused

Note that I ignored some errors: E111 (indentation is not a multiple of four) and E114 (indentation is not a multiple of four (comment)).

  • Add Python 3.7 to the CI tests

William de Vazelhes added 2 commits July 26, 2019 09:20
Mostly:

- blank spaces (missing or to be removed)
- lines too long
- unused variables or imports
- bad indentation
@@ -1,4 +1,3 @@
import pytest
Member Author

I checked, and the tests still run even if I remove this line.

.travis.yml Outdated
else
pip install scikit-learn==0.20.3;
fi
- pip install wheel cython numpy scipy codecov pytest-cov scikit-learn
Member Author

I think that for Python versions not supported by scikit-learn 0.21 and up, pip will now automatically install v0.20.3. Let's see if it works.

@wdevazelhes
Member Author

I think the coverage check failed because splitting some overly long lines added lines of code without increasing coverage. The coverage is still high, so we should be able to merge.

@wdevazelhes wdevazelhes changed the title [WIP] Address comments for sklearn-contrib integration [MRG] Address comments for sklearn-contrib integration Jul 26, 2019
@bellet
Member

bellet commented Jul 27, 2019

About the imports in __init__.py: using __all__, as suggested in the thread below, should solve the problem:
https://stackoverflow.com/questions/31079047/python-pep8-class-in-init-imported-but-not-used
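
For illustration, a minimal sketch of what metric_learn/__init__.py could look like with this approach (abridged to a few of the names from the flake8 log above; a sketch, not the actual file):

from .constraints import Constraints
from .covariance import Covariance
from .itml import ITML, ITML_Supervised
from .lmnn import LMNN
from ._version import __version__

# Declaring the public API in __all__ marks the imports above as
# intentional re-exports, so flake8 no longer reports F401 for them.
__all__ = ['Constraints', 'Covariance', 'ITML', 'ITML_Supervised',
           'LMNN', '__version__']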

@bellet
Member

bellet commented Jul 27, 2019

For the skggm import, a hacky workaround could be to add an assert(quic) ;-)
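
A minimal sketch of that workaround, assuming the HAS_SKGGM detection pattern described in the PR description (the exact try/except shape is an assumption):

try:
    from inverse_covariance import quic
    # Referencing the imported name counts as a use, so flake8's F401
    # ("imported but unused") is no longer triggered.
    assert quic
    HAS_SKGGM = True
except ImportError:
    HAS_SKGGM = False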

Member

@bellet bellet left a comment

Otherwise, nice job on the PEP8 corrections.

@wdevazelhes
Member Author

Thanks @bellet, I just addressed your comments

@wdevazelhes
Member Author

I also added an automatic flake8 check that runs on the whole codebase after each PR (last commit). Now that the code is fully compliant, we can do that instead of only checking the diff.
Let's see if it works.

@bellet
Member

bellet commented Jul 28, 2019

I think there is a syntax problem with the if statement in bash for string comparison. You are missing the double square brackets [[

@wdevazelhes
Member Author

I think there is a syntax problem with the if statement in bash for string comparison. You are missing the double square brackets [[

That's right, it should work now

@wdevazelhes
Member Author

I just realized I was using an old version of flake8, so in fact there are some flake8 errors that are not solved yet, which explains the new failure.
I'll fix them too.

@wdevazelhes
Member Author

I just updated the list of ignored errors, removing W504, because it creates impossible situations when activated at the same time as W503 (see for instance https://gitlab.com/pycqa/flake8/issues/466; note that scikit-learn has also disabled it: https://github.com/scikit-learn/scikit-learn/blob/4cf4e6e519189c8969f5820ebbca388cf87bf0cb/setup.cfg#L29).

@wdevazelhes
Member Author

It appears to work now

.travis.yml Outdated
pip install git+https://github.com/skggm/skggm.git@a0ed406586c4364ea3297a658f415e13b5cbdaf8;
fi
script:
# we run flake8 to ensure syntax (no need to run it on every version)
- if [[ $TRAVIS_PYTHON_VERSION == "3.7" ]];
Member

For style, maybe put "then" on the same line as "if".

Member Author

That's right, thanks. Done.

.travis.yml Outdated
pip install git+https://github.com/skggm/skggm.git@a0ed406586c4364ea3297a658f415e13b5cbdaf8;
fi
script:
# we run flake8 to ensure syntax (no need to run it on every version)
- if [[ $TRAVIS_PYTHON_VERSION == "3.7" ]];
then flake8 --ignore=E111,E114,W504;
Member

I think we should also ignore the other errors that flake8 ignores by default, just like sklearn does, to avoid unimportant failures popping up in subsequent PRs.

Member Author

I agree. I used the --extend-ignore syntax (https://flake8.pycqa.org/en/3.7.8/user/options.html?highlight=extend-ignore#cmdoption-flake8-extend-ignore), so that flake8 keeps ignoring its default errors (including W504) and additionally ignores E111 and E114.

@bellet
Member

bellet commented Jul 29, 2019

I also added an automatic flake8 check that runs on the whole codebase after each PR (last commit). Now that the code is fully compliant, we can do that instead of only checking the diff.
Let's see if it works.

About this: I am not sure what is commonly done, but checking only the diff seems to me like a better solution in the long run: if at some point we have a very good reason to merge some PR which is not compliant, we can do so and continue to check compliance of newly added code. This is what is done in sklearn (script is here: https://github.com/scikit-learn/scikit-learn/blob/master/build_tools/circle/flake8_diff.sh).

Not sure how difficult it is to make Travis check only the diff, but I think it may be preferable?

@wdevazelhes
Member Author

About this: I am not sure what is commonly done, but checking only the diff seems to me like a better solution in the long run: if at some point we have a very good reason to merge some PR which is not compliant, we can do so and continue to check compliance of newly added code. This is what is done in sklearn (script is here: https://github.com/scikit-learn/scikit-learn/blob/master/build_tools/circle/flake8_diff.sh).

Not sure how difficult it is to make Travis check only the diff, but I think it may be preferable?

Yes, I agree it would be a good idea, and in theory it shouldn't be so difficult (locally that's what I do, for instance with the command git diff HEAD..master | flake8 --diff). But in practice doing it online with Travis seems more difficult (given the big script in scikit-learn)... I think it's because on Travis the tests are run on the PR already merged with master, so one has to look in the remote for the last commit before the merge, or something like that. I'll give it a try. I think the simplest approach is to copy scikit-learn's script and remove the parts that are useless for us (like the "exclude" part), but it will be hard for me to figure out whether some parts are useless or not, so we might just keep them? What do you think?

@wdevazelhes
Member Author

As discussed with @bellet and @nvauquie, I modified .travis.yml to run different jobs with different configurations, and it seems to work (see the result of 383a903). We've decided to go back to running a full PEP8 check on every file, since a diff check could miss some errors (e.g., deleting a line that used some function but forgetting to delete the corresponding import). I left the script build_tools/travis/flake_diff.sh in case we want to test only the diff later.
I removed the PEP8 errors introduced for testing, so we should be good to merge.

@wdevazelhes
Member Author

It seems that there is a problem with the code coverage now... I had forgotten one of the after_success fields, though; I fixed it, let's see if it works after that.

@wdevazelhes
Member Author

It seems to work now, so I guess we can merge?

@bellet
Member

bellet commented Jul 31, 2019

Nice!
I think there is still a small issue: the code coverage will run even if the tests fail.
To fix this you could probably use stages (the first one for tests and the second one for flake8)? See the example in
https://docs.travis-ci.com/user/build-stages
(the stage name could be "Syntax checking" and the job name "PEP8" or "flake8")

@bellet
Member

bellet commented Jul 31, 2019

I guess that's fine: there is indeed a single report generated that takes each build into account (https://codecov.io/github/metric-learn/metric-learn/commit/d244bc59ad55ad8257eb82e21b466930c636ebc4), but the results are not shown here.
On the other hand, the PEP8 check passes.

Maybe you can quickly try to introduce an error which is not in all builds to see what happens? (The easiest way is probably to add an error in one of the skggm tests.)
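
For instance, a hypothetical snippet like the one below (the test name is invented, and the HAS_SKGGM detection mirrors the pattern discussed earlier in this PR) would fail only in builds where skggm is installed:

import pytest

try:
    from inverse_covariance import quic
    assert quic  # mark the import as used for flake8
    HAS_SKGGM = True
except ImportError:
    HAS_SKGGM = False

@pytest.mark.skipif(not HAS_SKGGM, reason='skggm is not installed')
def test_deliberate_failure():
    # Fails only in the builds that install skggm, which shows how
    # codecov and the CI matrix report a partial failure.
    assert False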

.travis.yml Outdated
before_install:
- sudo apt-get install liblapack-dev
- pip install --upgrade pip pytest
- pip install wheel cython numpy scipy codecov pytest-cov scikit-learn flake8
Member

We can remove flake8 from the pytest jobs.

Member Author

That's right, thanks

.travis.yml Outdated
script:
- pytest test --cov;
- bash <(curl -s https://codecov.io/bash)
- name: "Syntax checking"
Member

Maybe name this "Syntax checking with flake8" to explicitly mention the tool used to check the syntax (for new contributors).

Member Author

I agree, will do

@terrytangyuan
Member

I've never run a flake8 check only on the diff before. What's the motivation for that? If we are worried about having to fix a lot of flake8 errors, I've had good experience with automatic code style fixers like black.

@bellet
Member

bellet commented Aug 1, 2019

I've never run a flake8 check only on the diff before. What's the motivation for that? If we are worried about having to fix a lot of flake8 errors, I've had good experience with automatic code style fixers like black.

We considered it in case we ever need to let some flake8 errors through at some point (this is also what sklearn does). But for the reasons explained above, we agree that, at least for now, a flake8 check on the whole project is better.

Thanks for the pointer to black, it seems nice :-)

@bellet
Member

bellet commented Aug 1, 2019

Looks like things are working as expected? Let's remove the error and we should be good to merge?

test_flake.py Outdated
@@ -0,0 +1,3 @@
import nimp
Member

We don't need this file, do we?

Member Author

That's right, thanks. I forgot I had created it in metric-learn, and added it by mistake...

@terrytangyuan
Member

terrytangyuan commented Aug 1, 2019

Using Travis stages is fine too, but keep in mind that any subsequent stage won't start until its previous stage passes. This might reduce Travis build time but will increase development time, since developers won't see all the errors beyond the lint check at once.

@bellet
Member

bellet commented Aug 1, 2019

Using Travis stages is fine too, but keep in mind that any subsequent stage won't start until its previous stage passes. This might reduce Travis build time but will increase development time, since developers won't see all the errors beyond the lint check at once.

Yep. In the end we are not using stages and are running everything in parallel (see the last CI results).

@wdevazelhes
Member Author

After looking again at the failed build that made me remove the after_success flag, I realized there was a timeout error in the Python 3.7 build, which may be why the coverage didn't print. So I'll just try to relaunch it, and if it works, I guess that's even better/cleaner, no?

@wdevazelhes
Member Author

Hurray! :D It worked this time, with no timeout error, so I guess it's OK to merge?

Member

@bellet bellet left a comment

LGTM! Merging

@bellet bellet merged commit fa339f8 into scikit-learn-contrib:master Aug 1, 2019
@bellet
Member

bellet commented Aug 1, 2019

You can now update the sklearn-contrib integration issue :-)

@wdevazelhes
Member Author

You can now update the sklearn-contrib integration issue :-)

Done :-)

@bellet bellet mentioned this pull request Sep 3, 2019