[ENH] Add parallel `fit` and `predict_residuals` for calculation of `residuals_matrix` in `ConformalIntervals` #3414
Looks good!
Change requests:
- please add the new `n_jobs` parameter to the docstring. Simple copy-paste job from the other estimators with that parameter.
- should the default be 1?
Will do 👍
Then it's probably fine. I was just thinking from consistency, since elsewhere in |
Looks good.
Comments:
- are we sure `None` always means 1? Your docstring says "unless in a ... context", which is not clear (what is it then?), and may mean it's not?
https://joblib.readthedocs.io/en/latest/generated/joblib.Parallel.html <- Only not 1 if you overwrite with the |
elsewhere it's 1, do you think we can leave it as is, or should we make it consistent?
Happy to change to |
argh, we have an inconsistent default then. Let me know which way to prefer, and which of us you prefer to open the issue. If I don't hear from you, I'll merge this whichever way it is end of Monday (tomorrow)
More files use |
Great, thanks!
What does this implement/fix? Explain your changes.
The `ConformalIntervals` wrapper can be slow due to the large number of `fit` and `predict_residuals` calls required to create the `residuals_matrix`.
This PR adds `Parallel` to the routine, as the `fit` and `predict_residuals` calls are independent of one another. It follows the pattern used in forecasting.base._meta.py and forecasting.model_selection._tune.py, which adds a nested function for the parallelised routine.
What should a reviewer concentrate their feedback on?
Is this the right solution for this issue, or are there better ways to speed this up? It is particularly an issue for larger training sets, even when `sample_frac` is set to something small (there is a limit on how small this can be without losing information).
PR checklist
For all contributions
For new estimators
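For readers unfamiliar with the pattern mentioned in the PR description (a nested function that runs one independent `fit` plus `predict_residuals` per task), here is a minimal sketch. It is not the code from this PR: `MeanForecaster`, `compute_residuals_rows`, `cutoffs` and `n_workers` are hypothetical names made up for illustration, and the standard-library `ThreadPoolExecutor` stands in for the `Parallel` call so the snippet runs without extra dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from copy import deepcopy


class MeanForecaster:
    """Hypothetical toy forecaster: predicts the mean of the training data."""

    def fit(self, y_train):
        self.mean_ = sum(y_train) / len(y_train)
        return self

    def predict_residuals(self, y_test):
        # Residual = observed value minus the (constant) forecast.
        return [obs - self.mean_ for obs in y_test]


def compute_residuals_rows(forecaster, y, cutoffs, n_workers=1):
    """Fit on y[:cutoff] and compute residuals on y[cutoff:], once per cutoff."""

    def _residuals_row(cutoff):
        # Each task fits its own copy, so tasks share no fitted state
        # and can run in any order.
        fitted = deepcopy(forecaster).fit(y[:cutoff])
        return fitted.predict_residuals(y[cutoff:])

    # Executor.map yields results in the same order as `cutoffs`.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(_residuals_row, cutoffs))


y = [1.0, 3.0, 5.0, 7.0]
rows = compute_residuals_rows(MeanForecaster(), y, cutoffs=[1, 2, 3], n_workers=2)
```

Because every task copies the forecaster before fitting, the result does not depend on the number of workers; running with `n_workers=1` gives the same rows, which is a cheap way to sanity-check a parallelised routine like this.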