
[MRG] Address comments for sklearn-contrib integration #238

Merged

Conversation

wdevazelhes
Member

@wdevazelhes wdevazelhes commented Jul 26, 2019

Hi, we've made a request for inclusion in scikit-learn-contrib; this PR addresses the comments from the issue:
scikit-learn-contrib/scikit-learn-contrib#40

TODO:

  • Fix flake8 errors (there remain some errors due to unused imports in metric_learn/__init__.py, but I guess these are needed, right?). Also, inverse_covariance.quic is imported but unused, but this is expected since the import is only there to define the variable HAS_SKGGM. I don't know if there's another way to bypass this. Here is the flake8 log after the fixes:
./test/metric_learn_test.py:17:3: F401 'inverse_covariance.quic' imported but unused
./metric_learn/__init__.py:3:1: F401 '.constraints.Constraints' imported but unused
./metric_learn/__init__.py:4:1: F401 '.covariance.Covariance' imported but unused
./metric_learn/__init__.py:5:1: F401 '.itml.ITML' imported but unused
./metric_learn/__init__.py:5:1: F401 '.itml.ITML_Supervised' imported but unused
./metric_learn/__init__.py:6:1: F401 '.lmnn.LMNN' imported but unused
./metric_learn/__init__.py:7:1: F401 '.lsml.LSML' imported but unused
./metric_learn/__init__.py:7:1: F401 '.lsml.LSML_Supervised' imported but unused
./metric_learn/__init__.py:8:1: F401 '.sdml.SDML_Supervised' imported but unused
./metric_learn/__init__.py:8:1: F401 '.sdml.SDML' imported but unused
./metric_learn/__init__.py:9:1: F401 '.nca.NCA' imported but unused
./metric_learn/__init__.py:10:1: F401 '.lfda.LFDA' imported but unused
./metric_learn/__init__.py:11:1: F401 '.rca.RCA' imported but unused
./metric_learn/__init__.py:11:1: F401 '.rca.RCA_Supervised' imported but unused
./metric_learn/__init__.py:12:1: F401 '.mlkr.MLKR' imported but unused
./metric_learn/__init__.py:13:1: F401 '.mmc.MMC_Supervised' imported but unused
./metric_learn/__init__.py:13:1: F401 '.mmc.MMC' imported but unused
./metric_learn/__init__.py:15:1: F401 '._version.__version__' imported but unused

Note that I ignored some errors: E111 (indentation is not a multiple of four) and E114 (indentation is not a multiple of four (comment)).

  • Add Python 3.7 to the CI tests

William de Vazelhes added 2 commits July 26, 2019 09:20
Mostly:

- blank spaces (missing or to be removed)
- lines too long
- unused variables or imports
- bad indentation
@@ -1,4 +1,3 @@
import pytest
Member Author

I checked, and the tests still run even if I remove this line.

.travis.yml Outdated
else
pip install scikit-learn==0.20.3;
fi
- pip install wheel cython numpy scipy codecov pytest-cov scikit-learn
Member Author

I think that for Python versions not supported by scikit-learn 0.21 and up, pip will now automatically install v0.20.3. Let's see if it works.

@wdevazelhes
Member Author

I think the coverage check failed because splitting some overly long lines added lines of code without increasing coverage. The coverage is still high, so we should be able to merge.

@wdevazelhes wdevazelhes changed the title [WIP] Address comments for sklearn-contrib integration [MRG] Address comments for sklearn-contrib integration Jul 26, 2019
@bellet
Member

bellet commented Jul 27, 2019

About the imports in __init__.py: using __all__, as suggested in the thread below, should solve the problem:
https://stackoverflow.com/questions/31079047/python-pep8-class-in-init-imported-but-not-used
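
For illustration, a minimal sketch of what metric_learn/__init__.py could look like with this approach (abridged to a few of the names from the flake8 log above; a sketch, not the actual file):

from .constraints import Constraints
from .covariance import Covariance
from .itml import ITML, ITML_Supervised
from .lmnn import LMNN
from ._version import __version__

# Declaring the public API in __all__ marks the imports above as
# intentional re-exports, so flake8 no longer reports F401 for them.
__all__ = ['Constraints', 'Covariance', 'ITML', 'ITML_Supervised',
           'LMNN', '__version__']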

@bellet
Member

bellet commented Jul 27, 2019

For the skggm import, a hacky workaround could be to add an assert(quic) ;-)
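
A minimal sketch of that workaround, assuming the HAS_SKGGM detection pattern described in the PR description (the exact try/except shape is an assumption):

try:
    from inverse_covariance import quic
    # Referencing the imported name counts as a use, so flake8's F401
    # ("imported but unused") is no longer triggered.
    assert quic
    HAS_SKGGM = True
except ImportError:
    HAS_SKGGM = False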

Member

@bellet bellet left a comment

Otherwise, nice job on the PEP8 corrections.

@wdevazelhes
Member Author

Thanks @bellet, I just addressed your comments

@wdevazelhes
Member Author

I also added an automatic flake8 check that runs on the whole codebase after each PR (last commit). Now that the code is fully compliant, we can do that instead of only checking the diff.
Let's see if it works.

@bellet
Member

bellet commented Jul 28, 2019

I think there is a syntax problem with the if statement in bash for string comparison. You are missing the double square brackets [[

@wdevazelhes
Member Author

I think there is a syntax problem with the if statement in bash for string comparison. You are missing the double square brackets [[

That's right, it should work now

@wdevazelhes
Member Author

I just realized I was using an old version of flake8, so in fact there are some flake8 errors that are not solved yet, which explains the new failure.
I'll fix them too.

@wdevazelhes
Member Author

I just updated the list of ignored errors, removing W504, because it creates impossible situations when activated at the same time as W503 (see for instance https://gitlab.com/pycqa/flake8/issues/466; note that scikit-learn has also disabled it: https://github.com/scikit-learn/scikit-learn/blob/4cf4e6e519189c8969f5820ebbca388cf87bf0cb/setup.cfg#L29).

@wdevazelhes
Member Author

It appears to work now

.travis.yml Outdated
pip install git+https://github.com/skggm/skggm.git@a0ed406586c4364ea3297a658f415e13b5cbdaf8;
fi
script:
# we run flake8 to ensure syntax (no need to run it on every version)
- if [[ $TRAVIS_PYTHON_VERSION == "3.7" ]];
Member

For style, maybe put "then" on the same line as "if".

Member Author

That's right, thanks. Done.

.travis.yml Outdated
pip install git+https://github.com/skggm/skggm.git@a0ed406586c4364ea3297a658f415e13b5cbdaf8;
fi
script:
# we run flake8 to ensure syntax (no need to run it on every version)
- if [[ $TRAVIS_PYTHON_VERSION == "3.7" ]];
then flake8 --ignore=E111,E114,W504;
Member

I think we should also ignore the other errors that flake8 ignores by default, just like sklearn does, to avoid unimportant failures popping up in subsequent PRs.

Member Author

I agree. I used the --extend-ignore syntax (https://flake8.pycqa.org/en/3.7.8/user/options.html?highlight=extend-ignore#cmdoption-flake8-extend-ignore), so that flake8 keeps ignoring its default errors (including W504) and additionally ignores E111 and E114.

@bellet
Member

bellet commented Jul 29, 2019

I also added an automatic flake8 check that runs on the whole codebase after each PR (last commit). Now that the code is fully compliant, we can do that instead of only checking the diff.
Let's see if it works.

About this: I am not sure what is commonly done, but checking only the diff seems to me like a better solution in the long run: if at some point we have a very good reason to merge some PR which is not compliant, we can do so and continue to check compliance of newly added code. This is what is done in sklearn (script is here: https://github.com/scikit-learn/scikit-learn/blob/master/build_tools/circle/flake8_diff.sh).

Not sure how difficult it is to make Travis check only the diff, but I think it may be preferable?

@wdevazelhes
Member Author

About this: I am not sure what is commonly done, but checking only the diff seems to me like a better solution in the long run: if at some point we have a very good reason to merge some PR which is not compliant, we can do so and continue to check compliance of newly added code. This is what is done in sklearn (script is here: https://github.com/scikit-learn/scikit-learn/blob/master/build_tools/circle/flake8_diff.sh).

Not sure how difficult it is to make Travis check only the diff, but I think it may be preferable?

Yes, I agree it would be a good idea, and in theory it shouldn't be so difficult (locally that's what I do, for instance with the command git diff HEAD..master | flake8 --diff). But in practice doing it online with Travis seems more difficult (given the big script in scikit-learn)... I think it's because on Travis the tests are run on the PR already merged with master, so one has to look in the remote for the last commit before the merge, or something like that. I'll give it a try. I think the simplest approach is to copy scikit-learn's script and remove the parts that are useless for us (like the "exclude" part), but it will be hard for me to figure out whether some parts are useless or not, so we might just keep them? What do you think?

@wdevazelhes
Member Author

As discussed with @bellet and @nvauquie, I modified .travis.yml to run different jobs with different configurations, and it seems to work (see the result of 383a903). We've decided to go back to running a full PEP8 check on every file, since a diff check could miss some errors (e.g., deleting a line that used some function but forgetting to delete the corresponding import). I left the script build_tools/travis/flake_diff.sh in case we want to test only the diff later.
I removed the PEP8 errors introduced for testing, so we should be good to merge.

@wdevazelhes
Member Author

It seems that there is a problem with the code coverage now... I had forgotten one of the after_success fields, though; I fixed it, let's see if it works after that.

@wdevazelhes
Member Author

It seems to work now, so I guess we can merge?

@bellet
Member

bellet commented Jul 31, 2019

Nice!
I think there is still a small issue: the code coverage will run even if the tests fail.
To fix this you could probably use stages (the first one for tests and the second one for flake8)? See the example in
https://docs.travis-ci.com/user/build-stages
(the stage name could be "Syntax checking" and the job name "PEP8" or "flake8")

@bellet
Member

bellet commented Jul 31, 2019

I guess that's fine: there is indeed a single report generated that takes each build into account (https://codecov.io/github/metric-learn/metric-learn/commit/d244bc59ad55ad8257eb82e21b466930c636ebc4), but the results are not shown here.
On the other hand, the PEP8 check passes.

Maybe you can quickly try to introduce an error which is not in all builds to see what happens? (The easiest way is probably to add an error in one of the skggm tests.)
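
For instance, a hypothetical snippet like the one below (the test name is invented, and the HAS_SKGGM detection mirrors the pattern discussed earlier in this PR) would fail only in builds where skggm is installed:

import pytest

try:
    from inverse_covariance import quic
    assert quic  # mark the import as used for flake8
    HAS_SKGGM = True
except ImportError:
    HAS_SKGGM = False

@pytest.mark.skipif(not HAS_SKGGM, reason='skggm is not installed')
def test_deliberate_failure():
    # Fails only in the builds that install skggm, which shows how
    # codecov and the CI matrix report a partial failure.
    assert False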

.travis.yml Outdated
before_install:
- sudo apt-get install liblapack-dev
- pip install --upgrade pip pytest
- pip install wheel cython numpy scipy codecov pytest-cov scikit-learn flake8
Member

We can remove flake8 from the pytest jobs.

Member Author

That's right, thanks

.travis.yml Outdated
script:
- pytest test --cov;
- bash <(curl -s https://codecov.io/bash)
- name: "Syntax checking"
Member

Maybe name this "Syntax checking with flake8" to explicitly mention the tool used to check the syntax (for new contributors).

Member Author

I agree, will do

@terrytangyuan
Member

I've never run a flake8 check only on the diff before. What's the motivation for that? If we are worried about having to fix a lot of flake8 errors, I've had good experience with automatic code style fixers like black.

@bellet
Member

bellet commented Aug 1, 2019

I've never run a flake8 check only on the diff before. What's the motivation for that? If we are worried about having to fix a lot of flake8 errors, I've had good experience with automatic code style fixers like black.

We considered it in case we ever need to let some flake8 errors through at some point (this is also what sklearn does). But for the reasons explained above, we agree that, at least for now, a flake8 check on the whole project is better.

Thanks for the pointer to black, it seems nice :-)

@bellet
Member

bellet commented Aug 1, 2019

Looks like things are working as expected? Let's remove the error and we should be good to merge?

test_flake.py Outdated
@@ -0,0 +1,3 @@
import nimp
Member

We don't need this file, do we?

Member Author

That's right, thanks. I forgot I had created it in metric-learn, and added it by mistake...

@terrytangyuan
Member

terrytangyuan commented Aug 1, 2019

Using Travis stages is fine too, but keep in mind that any subsequent stage won't start until its previous stage passes. This might reduce Travis build time but will increase development time, since developers won't see all the errors beyond the lint check at once.

@bellet
Member

bellet commented Aug 1, 2019

Using Travis stages is fine too, but keep in mind that any subsequent stage won't start until its previous stage passes. This might reduce Travis build time but will increase development time, since developers won't see all the errors beyond the lint check at once.

Yep. In the end we are not using stages and are running everything in parallel (see the last CI results).

@wdevazelhes
Member Author

After looking again at the failed build that made me remove the after_success flag, I realized there was a timeout error in the Python 3.7 build, which may be why the coverage didn't print. So I'll just try to relaunch it, and if it works, I guess that's even better/cleaner, no?

@wdevazelhes
Member Author

Hurray! :D It worked this time, with no timeout error, so I guess it's OK to merge?

Member

@bellet bellet left a comment

LGTM! Merging

@bellet bellet merged commit fa339f8 into scikit-learn-contrib:master Aug 1, 2019
@bellet
Member

bellet commented Aug 1, 2019

You can now update the sklearn-contrib integration issue :-)

@wdevazelhes
Member Author

You can now update the sklearn-contrib integration issue :-)

Done :-)

@bellet bellet mentioned this pull request Sep 3, 2019