Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Expose interpolation thresholds as public fitted attribute of IsotonicRegression #16289

Merged
merged 32 commits into from Jun 25, 2020

Conversation

kishimoto-banana
Copy link
Contributor

@kishimoto-banana kishimoto-banana commented Jan 29, 2020

I want to reconstruct fitted stepwise interpolation function of isotonic regression in other programming languages (Scala, Go, ...).

Therefore, I want to get interpolation points (input to function f_).

However, interpolation points (_necessary_X_, _necessary_y_) are internal variable of IsotonicRegression class according to PEP8 (leading underscores).

This PR changes interpolation points to attributes of IsotonicRegression class.

@agramfort
Copy link
Member

why exposing more public attribute helps you in other languages?

@kishimoto-banana
Copy link
Contributor Author

I want to save interpolation points (_necessary_X_, _necessary_y_) to files and read those files in other langages in order to reconsruct fitted stepwise interpolation function.

But because _necessary_X_ and _necessary_y_ are internal variable of IsotonicRegression class, getting them seems strange.

@agramfort
Copy link
Member

agramfort commented Jan 30, 2020 via email

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that in general we should not expose every thing as public attribute just for the purpose of making it easier for people to write model exporters, I am not opposed to exposing those attributes for this particular case, for model inspection purpose. There are only two attributes, this class has no other fitted attributes and those 2 arrays could be meaningfully plotted to graphically explain what the model does.

Actually this example (examples/plot_isotonic_regression.py) would really benefit to be updated to add a side plot of the X_threshods_ vs y_thresholds_ for education purpose.

But if we expose new public attributes we need more tests for the presence of those attributes and their values (e.g. that both arrays have the same length, that there values are increasing, in the range of the training data and that the length is not larger than the training set). This could be a good opportunity to check that the de-duplication code works as expected on some toy training sets with just a few well chosen samples that would illustrate some extreme cases.

sklearn/isotonic.py Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
@agramfort
Copy link
Member

agramfort commented Jan 30, 2020 via email

@kishimoto-banana
Copy link
Contributor Author

Thank you for following.
I will rename and add explanation for X_thresholds_ and y_thresholds_.

sklearn/isotonic.py Outdated Show resolved Hide resolved
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kishimoto-banana are you also interested in updating the example as described in #16289 (review)? or do you want someone else to take over this PR to finish that part?

sklearn/isotonic.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
@ogrisel ogrisel changed the title Change interpolation point from internal variable to attribute of IsotonicRegression Expose interpolation thresholds as public fitted attribute of IsotonicRegression Jan 31, 2020
sklearn/isotonic.py Outdated Show resolved Hide resolved
kishimoto-banana and others added 3 commits February 1, 2020 00:18
Co-Authored-By: Olivier Grisel <olivier.grisel@ensta.org>
@kishimoto-banana
Copy link
Contributor Author

@ogrisel Thank you for your comments.
I hope to take over updating the example to someone else.

sklearn/isotonic.py Outdated Show resolved Hide resolved
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

@ogrisel
Copy link
Member

ogrisel commented Feb 2, 2020

I cannot access the codecov report at the moment. I get HTTP error 500 whenever I click on the reports.

@ogrisel
Copy link
Member

ogrisel commented Feb 2, 2020

codecov was broken when uploading the coverage data of some of the jobs:

https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=13126&view=logs&j=78a0bf4f-79e5-5387-94ec-13e67d216d6e&t=80793e4d-ec39-5212-d625-56647bcbed6e&l=43

Hence the decrease in coverage.

@ogrisel ogrisel changed the title Expose interpolation thresholds as public fitted attribute of IsotonicRegression [MRG] Expose interpolation thresholds as public fitted attribute of IsotonicRegression Feb 2, 2020
@ogrisel
Copy link
Member

ogrisel commented Feb 7, 2020

Alright, code coverage is now happy.

@ogrisel
Copy link
Member

ogrisel commented Feb 7, 2020

Here is the plot of the updated example:

image

@ogrisel
Copy link
Member

ogrisel commented Mar 12, 2020

Thanks @thomasjpfan for fixing the PR. Would you like to review it?

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kishimoto-banana , looks good.

In the prediction function plot it'd be good to also plot the extrapolation lines, i.e. beyond X.min() and X.max(). If it makes the code less clear, a simple sentence could be enough:

Beyond the limits of the training data (i.e. before X.min() and after X.max()), the prediction corresponds to pred(X.min()) and pred(X.max()) respectively.

@ogrisel I'm not quite sure about keeping backward compat here?

examples/plot_isotonic_regression.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
@ogrisel
Copy link
Member

ogrisel commented Jun 23, 2020

I believe I have addressed all the pending review comments @NicolasHug @thomasjpfan.

@ogrisel
Copy link
Member

ogrisel commented Jun 23, 2020

Here is the new example figure with clipping based extrapolation:

image

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks both, some minor comments but LGTM anyway

examples/miscellaneous/plot_isotonic_regression.py Outdated Show resolved Hide resolved
examples/miscellaneous/plot_isotonic_regression.py Outdated Show resolved Hide resolved
doc/whats_new/v0.24.rst Outdated Show resolved Hide resolved
examples/miscellaneous/plot_isotonic_regression.py Outdated Show resolved Hide resolved
examples/miscellaneous/plot_isotonic_regression.py Outdated Show resolved Hide resolved
sklearn/isotonic.py Outdated Show resolved Hide resolved
ogrisel and others added 2 commits June 25, 2020 19:39
@ogrisel ogrisel merged commit acb8ac5 into scikit-learn:master Jun 25, 2020
7 checks passed
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Jul 17, 2020
…sotonicRegression (scikit-learn#16289)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…sotonicRegression (scikit-learn#16289)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants