Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix predict with confidence intervals #253

Merged
merged 3 commits into from Nov 13, 2018
Merged

Fix predict with confidence intervals #253

merged 3 commits into from Nov 13, 2018

Conversation

nalimilan
Copy link
Member

Use keyword arguments, which are passed through by the DataFrameRegressionModel method.

While we're at it, I wonder whether we should change the API a bit. interval=:confint and interval=:predint could be changed to interval=:confidence and interval=:prediction, which are the names used by R (since "int" is redundant with interval).

Cc: @mkborregaard

@codecov-io
Copy link

codecov-io commented Sep 11, 2018

Codecov Report

Merging #253 into master will decrease coverage by 0.27%.
The diff coverage is 28.57%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #253      +/-   ##
=========================================
- Coverage   51.97%   51.7%   -0.28%     
=========================================
  Files           6       6              
  Lines         583     588       +5     
=========================================
+ Hits          303     304       +1     
- Misses        280     284       +4
Impacted Files Coverage Δ
src/lm.jl 49.46% <28.57%> (-1.68%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4966a05...c2a5bd2. Read the comment docs.

@mkborregaard
Copy link
Contributor

Wonderful. I also agree wholeheartedly with the name change.

@mkborregaard
Copy link
Contributor

Does this mean this will work with DataFrames now? There was a PR on DataFrames I never got merged...

@nalimilan
Copy link
Member Author

Does this mean this will work with DataFrames now? There was a PR on DataFrames I never got merged...

Do you have a link? It must have worked at some point, since it's in the examples, but I wonder since when it's been broken.

@mkborregaard
Copy link
Contributor

JuliaData/DataFrames.jl#1160

I don't think it's ever worked from DataFrames...

@mkborregaard
Copy link
Contributor

I'll rebase and write the test if that's still doable, but I'm guessing the interface has changed a lot?

@nalimilan
Copy link
Member Author

Thanks. Actually I now realize I hadn't tested the example based on DataFrames correctly. The test in runtests.jl only covers matrices. So this PR shouldn't be merged without fixing DataFrameRegressionModel first.

The code now lives in StatsModels.jl. If we make interval-related arguments keyword arguments rather than positional, I think we can avoid defining a separate function as in JuliaData/DataFrames.jl#1160. Instead, you can just check whether yp is a vector or a matrix, and slightly adapt the code to suit each possibility.

@mkborregaard
Copy link
Contributor

Or here in GLM?

@nalimilan
Copy link
Member Author

Yes, in StatsModels. GLM should be ready with this PR.

@mkborregaard
Copy link
Contributor

By the way @nalimilan I just rememembered that the way I implemented it like it is, and not with a keyword as in this PR, is because the keyword version breaks type stability of predict. With confidence intervals it returns a three-column Matrix, without it's a Vector. Making the interval an optional positional argument allows the compiler to figure out the return type from the method signature.

@nalimilan
Copy link
Member Author

AFAIK type-stability isn't a problem anymore with keyword arguments:

julia> f(; x=nothing) = x === nothing ? 1 : 1.0
f (generic function with 1 method)

julia> @code_warntype f()
Body::Int64
[...]

julia> @code_warntype f(x=nothing)
Body::Int64
[...]

julia> @code_warntype f(x=1)
Body::Float64
[...]

@mkborregaard
Copy link
Contributor

sweet 🎉

Use keyword arguments, which are passed through by the DataFrameRegressionModel method.
@nalimilan
Copy link
Member Author

I've pushed a commit to change the names. Is this good to go? IIRC a separate change is needed in StatsModels to support DataFrames, but that's already broken currently.

@mkborregaard
Copy link
Contributor

mkborregaard commented Nov 11, 2018

Yes, I'll have a look a StatsModels some time during the week

@mkborregaard
Copy link
Contributor

@nalimilan I can't see that the name has been changed? It still says :confint but I agree that :confidence makes more sense.

Here's the StatsModels PR:
JuliaStats/StatsModels.jl#77

I have a question - we currently return a three-row Matrix as [lower prediction upper]. Would it make more sense / be more useful to return a Tuple, e.g. (prediction, (lower, upper)) or maybe a NamedTuple? Allows things like pred, int = predict(mymodel, newdata, interval = :confidence); plot(pred, ribbon = int) etc.?

@mkborregaard
Copy link
Contributor

Oh, I see we already discussed the tuple thing once: #171 (comment)
No reason to flog a dead horse :-)

@nalimilan
Copy link
Member Author

Woops, I had just forgotten to push the new commits.

@mkborregaard
Copy link
Contributor

I've added prediction intervals in a PR against this branch

@nalimilan nalimilan merged commit d19c2f8 into master Nov 13, 2018
@nalimilan nalimilan deleted the nl/predict branch January 16, 2019 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants