Skip to content

Conversation

@pdeffebach
Copy link
Contributor

In reference to #351.

Whether or not lm should allow a rank deficient matrix is an option, and true or false is not an input value on its own. Therefore it makes sense that this be a keyword argument and not a positional argument.

The docs are also improved.

No changes to glm because I don't think any aspect of glm allows rank deficient input matrices, even though it probably should. But thats another PR.

@pdeffebach
Copy link
Contributor Author

Tests fail for unrelated reasons. I will try and figure out and fix with another PR.

This is ready for a review.

@pdeffebach
Copy link
Contributor Author

Tests pass now, the "trim trailing whitespace" option in Sublime text was breaking the printing tests.

Ready for a review.

Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, other than a few comments on the tests

test/runtests.jl Outdated
4.108734932819495, 4.995249696954908, 6.075962907632594, 0.0, 8.038151489191618,
8.848886704358202, 2.8697881579099085, 11.15107375630744, 11.8392578374927])

m2p_dep = fit(LinearModel, Xmissingcell, ymissingcell, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a test that this warns?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I guess maybe also the behavior when the kwarg and the positional conflict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

src/glmfit.jl Outdated
end

function GlmResp(y::AbstractVector{<:Real}, d::D, l::L, off::AbstractVector{<:Real},
function GlmResp(y::AbstractVector{<:Real}, d::D, l::L, off::AbstractVector{<:Real},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you revert all these whitespace changes so that the git blame is clean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

src/lm.jl Outdated
Comment on lines 129 to 132
if allowrankdeficient_dep != nothing
@warn "Positional argument `allowrankdeficient` is deprecated, use keyword" *
"argument instead. Proceeding with positional argument value: $allowrankdeficient_dep"
allowrankdeficient = allowrankdeficient_dep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if allowrankdeficient_dep != nothing
@warn "Positional argument `allowrankdeficient` is deprecated, use keyword" *
"argument instead. Proceeding with positional argument value: $allowrankdeficient_dep"
allowrankdeficient = allowrankdeficient_dep
if allowrankdeficient_dep != nothing
Base.depwarn("Positional argument `allowrankdeficient` is deprecated, use keyword" *
"argument instead. Proceeding with positional argument value: $allowrankdeficient_dep",
:fit)
allowrankdeficient = allowrankdeficient_dep

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't get this to work, I'll try again in a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it to work!

@codecov-io
Copy link

codecov-io commented Nov 7, 2020

Codecov Report

Merging #386 (ae7669e) into master (3234f5e) will increase coverage by 0.23%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #386      +/-   ##
==========================================
+ Coverage   79.56%   79.79%   +0.23%     
==========================================
  Files           7        7              
  Lines         685      688       +3     
==========================================
+ Hits          545      549       +4     
+ Misses        140      139       -1     
Impacted Files Coverage Δ
src/lm.jl 93.33% <100.00%> (-0.92%) ⬇️
src/glmfit.jl 74.89% <0.00%> (-2.47%) ⬇️
src/linpred.jl 70.70% <0.00%> (+8.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3234f5e...5aa77bd. Read the comment docs.

@pdeffebach
Copy link
Contributor Author

Thanks for the reviews. I responded to all comments.

Ready for another round!

@nalimilan
Copy link
Member

By making this a keyword argument, the name allowrankdeficient becomes part of the API, so I think it's worth considering whether it's the best possible name. I must say I don't find it very user-friendly, but I don't have very good suggestions either. Are there examples of similar options in other software?

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@pdeffebach
Copy link
Contributor Author

I also don't have any good suggestions. Stata and R do it by default, while it looks like python doesn't allow it at all.

Would it be too breaking to make it true by default?

@nalimilan
Copy link
Member

AFAICT it wouldn't be breaking at all since it would turn an error into something that succeeds.

@dmbates @andreasnoack Opinions?

@pdeffebach
Copy link
Contributor Author

Okay just pushed a commit to

  1. Rename allowrankdeficient to dropcollinear
  2. Make the default of collinear to true. This will cause fewer "what's going on?" scenarios for new users who are less familiar with the math behind OLS.

It also has an added performance benefit!

julia> x = rand(10000, 50);

julia> @btime GLM.cholpred(x, true);
  2.134 ms (17 allocations: 7.67 MiB)

julia> @btime GLM.cholpred(x, true);
  2.133 ms (17 allocations: 7.67 MiB)

julia> @btime GLM.cholpred($x, true);
  2.130 ms (17 allocations: 7.67 MiB)

julia> @btime GLM.cholpred($x, false);
  3.031 ms (15 allocations: 7.67 MiB)

julia> @btime GLM.cholpred($x, false);
  3.029 ms (15 allocations: 7.67 MiB)

Tests are updated. I haven't added anything to docs/ as I'm not sure where it will go. But the docstring is updated.

@pdeffebach
Copy link
Contributor Author

Ready for a review!

@pdeffebach
Copy link
Contributor Author

Bumping this! Just fixed some errant white space changes.

To recap: Changes allowrankdeficient to a keyword argument and the new name is dropcollinear

@kleinschmidt
Copy link
Member

LGTM. Nightly failures are due to the p-value printing thing with printfpocalypse (so unrelated)

Copy link
Member

@nalimilan nalimilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a wording suggestion.

Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Copy link
Contributor Author

@pdeffebach pdeffebach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants