add LinearRegression class #134

Merged
merged 4 commits into from Mar 20, 2022

Conversation

mathause

@mathause mathause commented Feb 23, 2022

  • Closes #xxx
  • Tests added
  • Passes isort . && black . && flake8
  • Fully documented, including CHANGELOG.rst

This adds a LinearRegression class. To fit a linear regression you would call:

lr = mesmer.core.linear_regression.LinearRegression()
lr.fit({"tas": tas}, target, "time")
lr.params

This sets lr.params - i.e. the estimated intercept and slope for tas.
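As a rough illustration of what such a fit estimates (not the code in this PR), the intercept and slope for each grid cell can be obtained with ordinary least squares. The sketch below uses plain NumPy instead of the LinearRegression class; the helper name `fit_intercept_and_slope` and the sample values are made up for illustration.

```python
import numpy as np


def fit_intercept_and_slope(predictor, target):
    """Ordinary least squares of target (cell, time) on predictor (time,).

    Hypothetical helper for illustration, not part of mesmer.
    """
    predictor = np.asarray(predictor, dtype=float)
    target = np.asarray(target, dtype=float)

    # design matrix with a column of ones for the intercept
    design = np.column_stack([np.ones_like(predictor), predictor])

    # lstsq solves for all cells at once when the targets are stacked as columns
    coeffs, *_ = np.linalg.lstsq(design, target.T, rcond=None)

    # first row holds the intercepts, second row the slopes (one per cell)
    intercept, slope = coeffs
    return intercept, slope


tas = np.array([0.0, 1.0, 2.0, 3.0])
# two grid cells that follow exact linear relationships
target = np.stack([5.0 + 0.1 * tas, -1.0 + 2.0 * tas])

intercept, slope = fit_intercept_and_slope(tas, target)
```

Both returned arrays have one entry per cell, which mirrors the "intercept" and "tas" variables stored along the "cell" dimension in lr.params.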

The next example for one grid point ("cell") actually works: you can manually set lr.params and then call lr.predict and lr.residuals to get the predicted values and the residuals:

import mesmer.core.linear_regression
import xarray as xr

# set some params (i.e. intercept and slope)
lr = mesmer.core.linear_regression.LinearRegression()

params = xr.Dataset(data_vars={"intercept": ("cell", [5]), "tas": ("cell", [0.1])})
lr.params = params

tas_predictor = xr.DataArray([0, 1, 2], dims="time")
target = xr.DataArray([[5, 8, 0]], dims=("cell", "time"))


predicted = lr.predict({"tas": tas_predictor})
residuals = lr.residuals({"tas": tas_predictor}, target)
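
The numbers in this example can be checked by hand, assuming predict evaluates intercept + coefficient * predictor and residuals returns target minus prediction. The description does not spell out the sign convention, so treat it as an assumption. A NumPy sketch of that arithmetic, independent of the class:

```python
import numpy as np

intercept = np.array([5.0])                  # one value per cell
tas_coef = np.array([0.1])                   # slope for the "tas" predictor
tas_predictor = np.array([0.0, 1.0, 2.0])    # dims: time
target = np.array([[5.0, 8.0, 0.0]])         # dims: (cell, time)

# broadcast (cell, 1) against (time,) to get (cell, time)
predicted = intercept[:, None] + tas_coef[:, None] * tas_predictor

# assumed convention: residual = target - prediction
residuals = target - predicted
```

Under these assumptions the prediction is 5.0, 5.1, 5.2 and the residuals are 0.0, 2.9, -5.2 for the single cell.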

@codecov-commenter

codecov-commenter commented Feb 23, 2022

Codecov Report

Merging #134 (d772b60) into master (a682256) will increase coverage by 0.45%.
The diff coverage is 89.85%.

@@            Coverage Diff             @@
##           master     #134      +/-   ##
==========================================
+ Coverage   78.97%   79.43%   +0.45%     
==========================================
  Files          29       29              
  Lines        1346     1405      +59     
==========================================
+ Hits         1063     1116      +53     
- Misses        283      289       +6     
Flag Coverage Δ
unittests 79.43% <89.85%> (+0.45%) ⬆️

Impacted Files Coverage Δ
mesmer/core/linear_regression.py 90.00% <83.33%> (-10.00%) ⬇️
mesmer/core/utils.py 100.00% <100.00%> (+5.88%) ⬆️
mesmer/calibrate_mesmer/train_gv.py 85.33% <0.00%> (+0.40%) ⬆️
mesmer/create_emulations/create_emus_gv.py 73.46% <0.00%> (+1.73%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@mathause

This is basically an xarray wrapper for sklearn.linear_model.LinearRegression.

@mathause

@yquilcaille If you are interested you can have a look here. You can concentrate on the linear_regression.py file - the others are not super important.

@mathause mathause requested a review from znicholls March 11, 2022 15:10

@znicholls znicholls left a comment

Looks sick, such a nice and elegant solution to how we store parameters, make the fitting and prediction stay close to each other etc. Not sure if there's changelog etc. to be done but I think this is a great pattern for us to follow


def LinearRegression_fit_wrapper(*args, **kwargs):
    # wrapper for LinearRegression().fit() because it has no return value - should it?

I um and ah about this. I think it's probably a good idea to just get people used to the pattern of calling fit and then getting the params or whatever in the next line. That's what's used in sklearn and following that is probably a good idea. We could add a function like this to the codebase of course, it's so simple that it probably doesn't hurt...

@mathause

Yes, I recently heard somewhere that a class method should either return something or change state but not both. So I will keep it as is.
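
The convention discussed here, where fit only changes state and the result is read from an attribute on the next line, can be illustrated with a toy class. `MeanEstimator` is a made-up example, not part of mesmer or sklearn.

```python
class MeanEstimator:
    """Toy estimator: fit() changes state, the params property reads it."""

    def __init__(self):
        self._params = None

    def fit(self, values):
        # command: updates internal state and deliberately returns nothing
        values = list(values)
        if not values:
            raise ValueError("need at least one value")
        self._params = {"mean": sum(values) / len(values)}

    @property
    def params(self):
        # query: returns state without modifying it
        if self._params is None:
            raise ValueError("call fit() first")
        return self._params


est = MeanEstimator()
result = est.fit([1.0, 2.0, 3.0])  # None, the fit result lives on the instance
fitted = est.params
```

Note that sklearn estimators return self from fit to allow chaining, so this sketch follows the stricter "return or mutate, not both" variant described in the comment above.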

@mathause

mathause commented Mar 20, 2022

Thanks for the feedback! Will add a changelog and merge. I am not yet sure if the AR process will work out just as nicely...

@mathause mathause merged commit 4cdd095 into MESMER-group:master Mar 20, 2022
@mathause mathause deleted the LinearRegressionClass branch March 20, 2022 18:45