[MRG] Flow to test for sklearn compatibility #116

GillesVandewiele · 2019-07-02T11:25:52Z

Hello,

This PR adds an automated test that checks whether every tslearn estimator passes the checks required by sklearn, so that the estimators can be used with sklearn utilities such as GridSearchCV, Pipeline, ... The code currently lives in tslearn/testing_utils.py, but should be moved to tslearn/testing when available.
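To make the flow concrete, here is a minimal, stdlib-only sketch of the idea: collect every estimator class, run a compliance check on each, and report the ones that fail. Everything in it is a hypothetical stand-in invented for illustration (`BaseEstimator`, `GoodEstimator`, `BadEstimator`, `get_estimators`, `check_params_roundtrip`); in the PR itself, the check applied to the tslearn estimators is sklearn's `check_estimator`, which is far more thorough.

```python
import inspect


class BaseEstimator:
    """Hypothetical stand-in for sklearn.base.BaseEstimator."""

    def get_params(self):
        # Parameters are looked up by the names of the __init__ arguments.
        names = list(inspect.signature(self.__init__).parameters)
        return {name: getattr(self, name) for name in names}


class GoodEstimator(BaseEstimator):
    def __init__(self, n_clusters=3):
        self.n_clusters = n_clusters  # stored unchanged, under the same name


class BadEstimator(BaseEstimator):
    def __init__(self, n_clusters=3):
        self.k = n_clusters  # stored under another name: get_params breaks


def get_estimators(namespace):
    """Return sorted (name, class) pairs for BaseEstimator subclasses."""
    return [
        (name, obj)
        for name, obj in sorted(namespace.items())
        if inspect.isclass(obj)
        and issubclass(obj, BaseEstimator)
        and obj is not BaseEstimator
    ]


def check_params_roundtrip(cls):
    """Stand-in check: rebuilding from get_params() gives equal params."""
    params = cls().get_params()
    assert cls(**params).get_params() == params


def check_all_estimators(namespace):
    """Run the check on every estimator and return the names that fail."""
    failures = []
    for name, cls in get_estimators(namespace):
        try:
            check_params_roundtrip(cls)
        except (AssertionError, AttributeError):
            failures.append(name)
    return failures


failed = check_all_estimators(globals())
print(failed)  # ['BadEstimator']
```

Returning the list of failures, rather than printing a message per estimator, makes the result easy to assert on inside a test function.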

I also included an example demonstrating how GlobalGAKMeans can now be used with an sklearn pipeline, in tslearn/docs/examples/plot_gakkmeans_sklearn.

All feedback is more than welcome!

Kind regards,
Gilles

pep8speaks · 2019-07-02T11:25:56Z

Hello @GillesVandewiele! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file tslearn/shapelets.py:

Line 600:80: E501 line too long (84 > 79 characters)
Line 679:80: E501 line too long (111 > 79 characters)

Comment last updated at 2019-08-18 09:19:11 UTC

rtavenar · 2019-07-02T12:00:47Z

Hi @GillesVandewiele

Thanks, this helps a lot. If you are OK with that, it would be better to fix all estimators in this PR before merging, so that we do not forget to change the others later. Also, if it works well, your testing_utils.py will be treated as a standard test, which means that all tests should pass before merging, and hence that all estimators should be fixed first.

GillesVandewiele · 2019-07-02T12:17:11Z

Hi @rtavenar,

Sure, no problem at all. Do you happen to know whether I can run the script that Travis runs locally (and how I would do that)?

One important thing to keep in mind: we should not break any current tslearn functionality that has no unit test in place by refactoring too aggressively.

rtavenar · 2019-07-02T12:27:32Z

All functionality in tslearn is supposed to be unit tested at the moment. I suspect that is not entirely the case, but we are doing our best.

The command Travis CI is running once everything is installed is:
https://github.com/rtavenar/tslearn/blob/master/.travis.yml#L57
and you can set ${KERAS_IGNORE} to an empty string

rtavenar

Hi @GillesVandewiele

Thanks for this nice piece of work. This is a very good thing to ensure that our estimators are valid sklearn ones. I suggested a few changes, let me know what you think.

I also plan to invite @johannfaouzi as an additional reviewer for this PR, once I find how to do so :)

Also, of course, tests should pass on Travis before merging

tslearn/clustering.py

tslearn/docs/examples/plot_gakkmeans_sklearn.py

tslearn/docs/examples/plot_knnts_sklearn.py

tslearn/preprocessing.py

tslearn/utils.py

tslearn/testing_utils.py

rtavenar · 2019-07-02T19:50:16Z

tslearn/testing_utils.py

+        print('{} is sklearn compliant.'.format(estimator[0]))
+
+
+check_all_estimators()


No need for that call if it is in a test suite

johannfaouzi

Great PR! It's nice to hear from you this close after the conference ;)

I briefly looked at the PR and left some comments, I'll try to have a deeper look at it soon.

tslearn/utils.py

johannfaouzi · 2019-07-03T09:10:09Z

tslearn/testing_utils.py

+        print('{} is sklearn compliant.'.format(estimator[0]))
+
+
+check_all_estimators()


Yes, the tests are run automatically with a bash script. All test functions should have names starting with test_ and should raise an error if the test fails (using assert or a function from np.testing or sklearn.utils.testing, for instance).

There are many folders with tests in scikit-learn; you can have a look at sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py, for instance.
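As a small illustration of this convention, the sketch below defines two functions whose names start with test_ and which raise AssertionError on failure, so a runner such as pytest would collect and run them automatically. `scale_to_range` is a hypothetical helper invented for the example, not a tslearn function, and plain assert is used instead of np.testing to keep the example dependency-free.

```python
def scale_to_range(values, low=0.0, high=1.0):
    """Hypothetical helper: linearly rescale values to [low, high]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant input: map every value to the lower bound.
        return [low for _ in values]
    return [low + (v - v_min) * (high - low) / span for v in values]


def test_scale_to_range_bounds():
    # Fails with AssertionError if the output is not as expected.
    assert scale_to_range([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_scale_to_range_constant_input():
    assert scale_to_range([3.0, 3.0], low=-1.0) == [-1.0, -1.0]
```

Note that the module contains no top-level call to the tests; the test runner is responsible for discovering and calling them.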

tslearn/preprocessing.py

johannfaouzi · 2019-07-03T09:19:48Z

tslearn/testing_utils.py

+        print('{} is sklearn compliant.'.format(estimator[0]))
+
+
+check_all_estimators()


You can also catch errors and warnings with pytest (if you want to test that an error or a warning is raised):

https://docs.pytest.org/en/latest/assert.html

https://docs.pytest.org/en/latest/warnings.html
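A minimal sketch of both patterns, assuming pytest is installed. `set_n_neighbors` is a hypothetical validator written only for this example; `pytest.raises` and `pytest.warns` are the context managers described on the two pages above.

```python
import warnings

import pytest


def set_n_neighbors(n_neighbors):
    """Hypothetical validator used only for this example."""
    if n_neighbors < 1:
        raise ValueError("n_neighbors must be at least 1")
    if n_neighbors > 100:
        warnings.warn("large n_neighbors may be slow", UserWarning)
    return n_neighbors


def test_invalid_value_raises():
    # Passes only if a ValueError is raised inside the block.
    with pytest.raises(ValueError):
        set_n_neighbors(0)


def test_large_value_warns():
    # Passes only if a UserWarning is emitted inside the block.
    with pytest.warns(UserWarning):
        set_n_neighbors(500)
```

Both context managers fail the test when the expected error or warning does not occur, which is what makes them suitable for checking that invalid input is rejected.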

tslearn/clustering.py

tslearn/neighbors.py

tslearn/svm.py

tslearn/tests/sklearn_patches.py

rtavenar · 2019-07-26T14:50:04Z

This is great!

You really did an amazing job on that PR!

I have added some comments. You can take them into account, wait for feedback from @johannfaouzi and then we will be good.

Ah, and you can also add your name in third position in the list of authors for the tslearn reference (in README.md and in the bibtex variable defined in tslearn/__init__.py). If you could add Johann there as well, that would be great :)

Best,
Romain

GillesVandewiele · 2019-07-26T14:51:29Z

Thank you @rtavenar! I did underestimate the amount of work it was going to take, but I'm happy that it is done :)

I will implement your comments tomorrow and add @johannfaouzi and myself to the author list, thank you for that!

GillesVandewiele · 2019-07-28T14:00:57Z

I implemented your comments and managed to reduce the Travis build times to under 20 minutes (Python 2.7 takes roughly 5 minutes longer than the other versions).

johannfaouzi

Thank you very much @GillesVandewiele for this big PR. It represents a huge amount of work. Sorry for the delay in my review.

I left a lot of comments, but most of them are redundant. My comments are not necessarily right, so take your time; the discussion is obviously open, and your remarks are more than welcome.

My remarks are sometimes about Romain's code, but since you made some changes in the same class/function, maybe you can work on them.

tslearn/docs/examples/plot_knnts_sklearn.py

tslearn/clustering.py

tslearn/shapelets.py

tslearn/svm.py

GillesVandewiele · 2019-07-31T10:41:12Z

Thank you so much @johannfaouzi.

I will integrate your comments, or reply to them, as soon as I have some spare time :). I should be able to do this by the end of this week. If I'm right, Romain is currently on holiday anyway, so there's no rush.

rtavenar · 2019-08-12T08:26:01Z

@GillesVandewiele

Since I have been away for some time, I am not sure I followed all your updates. It seems most comments have been taken into account (they are now marked as outdated). Are there still questions pending, or are we ready to merge? @johannfaouzi what do you think?

GillesVandewiele · 2019-08-12T08:27:14Z

Hi Romain, hope you enjoyed your holidays! There are a few issues left open on which I would like to hear your opinion. I'll place a reaction on those.

Normally, I should be able to integrate these last comments this weekend, after which it should be ready to merge :).

GillesVandewiele · 2019-08-16T07:38:01Z

Alright, I integrated the final comments. Thanks for the feedback @rtavenar and @johannfaouzi!

Travis is building, and there are no pep8 issues left. I think this is now ready to get merged 🙌 :)

johannfaouzi · 2019-08-18T09:13:11Z

LGTM. Thanks for the huge PR!

GillesVandewiele · 2019-08-18T09:19:36Z

Thanks @johannfaouzi! I did notice that I still used verbose_level in some places in shapelets.py, so I quickly replaced those usages as well :)

rtavenar · 2019-08-18T12:24:52Z

Thanks a lot @GillesVandewiele for this huge PR!

rtavenar and others added 5 commits June 28, 2019 15:52

Attempt to fix default image for sphinx-gallery

4418bd5

retrieve a list of all BaseEstimators and apply check_estimator

a960321

small bug in get_estimators + GAKKmeans compliance with sklearn

bb6f337

Make sure GAKKmeans can work with 3D data, and example of pipeline

7ded700

added the example file to demonstrate pipeline

99b89f4

GillesVandewiele added 5 commits July 2, 2019 13:29

pep8 issues

4f3ed8f

pep8 issues 2

99dc221

pep8 issues 2

a35e24e

Revert "pep8 issues 2"

42c20fb

This reverts commit 99dc221.

remove pandas import

808406a

rtavenar changed the title ~~Flow to test for sklearn compatibility + fixing 1 estimator as an example~~ [WIP] Flow to test for sklearn compatibility Jul 2, 2019

GillesVandewiele and others added 4 commits July 2, 2019 15:32

some small GAK-KMeans fixes and KNN-TimeSeries is now sklearn-compliant

f40a8ba

pep8 issues and allow a string ("dtw") to be passed as metric

17c350a

fixed bug in fit and transform method of MinMaxScaler and added example that shows parameter tuning

641dbd8

nicer output of test script & pep8 issue

6681816

rtavenar self-requested a review July 2, 2019 16:52

rtavenar self-assigned this Jul 2, 2019

added binder links + rtd theme as default for local build

cb4459d

rtavenar previously requested changes Jul 2, 2019

rtavenar assigned rtavenar and unassigned rtavenar Jul 2, 2019

johannfaouzi reviewed Jul 3, 2019

GillesVandewiele commented Jul 3, 2019

rtavenar added this to In progress in Towards v.0.2 Jul 3, 2019

rtavenar removed this from In progress in Towards v.0.2 Jul 3, 2019

addressing some of the comments

ea062b8

rtavenar reviewed Jul 26, 2019

rtavenar reviewed Jul 26, 2019

rtavenar reviewed Jul 26, 2019

GillesVandewiele added 4 commits July 28, 2019 13:12

lower max_iter during test_estimator and doc updates

d089d9f

fixing doctest & faster travis build times

00dd47e

fixing doctest & faster travis build times

a313be0

fixing doctest & faster travis build times

d2220da

GillesVandewiele added 2 commits July 28, 2019 16:03

Update author list

9217344

Update author list

843d84b

johannfaouzi reviewed Jul 31, 2019

integrating comments #1

0d386b3

GillesVandewiele added 5 commits August 16, 2019 08:28

integrating comments (part 2)

cd834f6

rename variable feature_range to value_range

1319965

set verbose in init method

a412e2f

replace now deprecated code in test and plot scripts

d3f68d0

replace now deprecated code in doctests

ed7eb8b

johannfaouzi approved these changes Aug 18, 2019

remove all usages of verbose_level in shapelets

398d974

rtavenar merged commit 6655070 into tslearn-team:dev Aug 18, 2019

rtavenar changed the title ~~[WIP] Flow to test for sklearn compatibility~~ [MRG] Flow to test for sklearn compatibility Aug 18, 2019
