Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make cov's corrected argument a keyword argument and cleanup docstrings for cov and cor #21709

Merged
merged 2 commits into from
Jun 27, 2017

Conversation

nalimilan
Copy link
Member

For consistency with var and std. Also remove methods which are no longer needed
now that deprecations have been removed. Add types to signatures in docstrings.

This only changes the public API for cov. The changes to cor are cosmetic.

@tkelman
Copy link
Contributor

tkelman commented May 5, 2017

worth an 0.7 news entry?

@nalimilan
Copy link
Member Author

Sure. I've added a 0.7. section.

Copy link
Member

@ararslan ararslan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kshyatt kshyatt added the domain:docs This change adds or pertains to documentation label May 5, 2017
@andreasnoack
Copy link
Member

Why not the other way around? Have you measured the overhead?

@ararslan
Copy link
Member

ararslan commented May 8, 2017

What do you mean by the other way around?

@andreasnoack
Copy link
Member

What do you mean by the other way around?

Make var and std use positional arguments instead of keyword arguments

@nalimilan
Copy link
Member Author

I thought the trend was rather to move to keyword arguments where appropriate (since we won't be able to change this after 1.0), and improve their performance if needed. Maybe @StefanKarpinski can confirm?

Clearly, the overhead will depend a lot on the size of the input vector/matrix. For a relatively small vector with 1000 entries, it's about 14%:

covzm2(x::AbstractVector; corrected::Bool=true) = Base.unscaled_covzm(x) / (Base._length(x) - Int(corrected))
covm2(x::AbstractVector, xmean; corrected::Bool=true) = covzm2(x .- xmean; corrected=corrected)
cov2(x::AbstractVector; corrected::Bool=true) = covm2(x, Base.mean(x); corrected=corrected) 
using BenchmarkTools
@benchmark cov(X)
@benchmark cov2(X)

BTW, if we care about performance, we should get rid of x .- xmean, as taking a copy sounds like a very inefficient approach to me.

@nalimilan
Copy link
Member Author

@StefanKarpinski What's your position WRT keyword arguments?

@StefanKarpinski
Copy link
Sponsor Member

StefanKarpinski commented May 20, 2017

I think that unnamed boolean arguments are not nearly as self-documenting as keywords. If you see cov(v, false) you have no idea what that false is about. If you see cov(v, corrected=false) you immediately know what's going on without look at the docs. This is worse for boolean flags because the value itself carries so little semantic information. When a positional argument is a vector or matrix or something, you get some sense of what it means based on how it's used and what the variable is named. What does false mean? Could be literally mean anything – all you know is that it's the opposite of whatever true means. The only case where you have a hint what it means from the calling context is if someone writes corrected=false; cov(v, corrected). I.e. almost never.

@StefanKarpinski
Copy link
Sponsor Member

@JeffBezanson can verify, but I think we have semiconcrete plans to make keywords more efficient in 1.0 and should therefore not be overly worried about the efficiency gap between positional and keyword argument forms and choose based on better APIs, not performance.

@nalimilan
Copy link
Member Author

OK, thanks. Should be good to go then.

@nalimilan
Copy link
Member Author

Barring objections I'll merge this tomorrow.

cov(x::AbstractVector, Y::AbstractMatrix) = cov(x, Y, 1, true)
cov(X::AbstractMatrix, y::AbstractVector) = cov(X, y, 1, true)
cov(X::AbstractMatrix, Y::AbstractMatrix) = cov(X, Y, 1, true)
cov(X::AbstractVecOrMat, Y::AbstractVecOrMat, vardim::Int=1; corrected::Bool=true) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this one inferred?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these work:

@inferred cov([1, 2], [1, 2])
@inferred cov([1 2], [1 2], 1)
@inferred cov([1, 2], [1; 2])
@inferred cov([1; 2], [1, 2])

Did I miss other cases to test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go then?

…ings

For consistency with var and std. Also remove methods which are no longer needed
now that deprecations have been removed. Add types to signatures in docstrings.
Remove methods which are no longer needed now that deprecations have
been removed. Add types to signatures in docstrings.
@mschauer
Copy link
Contributor

@deprecate cov(x::AbstractMatrix, vardim::Int, corrected::Bool) cov(x, corrected=corrected)
ignores the vardim argument

@nalimilan
Copy link
Member Author

@mschauer Thanks. That's the problem with untested code. :-) See #22595.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:docs This change adds or pertains to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants