cov(Vector,Vector,dims=1) fails #41680

Open
PaulSoderlind opened this issue Jul 22, 2021 · 1 comment
Labels
statistics The Statistics stdlib module

Comments

@PaulSoderlind
Contributor

cov(rand(5),rand(5),dims=1) fails, while both cov(rand(5),rand(5)) and cov(rand(5),rand(5,1),dims=1) work.
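A minimal reproduction of the three calls above, as a sketch. It assumes the Statistics stdlib as it stood when this issue was opened; the exception type noted in the comment is what an unsupported keyword normally produces, and later versions may behave differently.

```julia
using Statistics

x, y = rand(5), rand(5)

a = cov(x, y)                          # works: returns a Number
b = cov(x, reshape(y, :, 1); dims=1)   # works: returns a 1x1 Matrix

# The Vector-Vector method takes no `dims` keyword, so this call is expected
# to throw (normally a MethodError) on the versions discussed here.
try
    cov(x, y; dims=1)
catch err
    println(typeof(err))
end
```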

Why bother? Mainly for consistency with the other functions in Statistics. For instance, std(rand(5),dims=1) returns a 1-element vector. It is surely convenient for users to be able to interchange Nx1 matrices and N-vectors in such basic calculations.
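For comparison, a short sketch of how std already handles the dims keyword for an N-vector and for the corresponding Nx1 matrix:

```julia
using Statistics

v = rand(5)

s0 = std(v)                           # no dims: a Number
s1 = std(v; dims=1)                   # N-vector with dims=1: a 1-element Vector
s2 = std(reshape(v, :, 1); dims=1)    # Nx1 matrix with dims=1: a 1x1 Matrix
```

All three hold the same value, so the vector and the Nx1 matrix can be swapped without the call failing.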

The quick fix is to add a dummy dims argument to the current method (in Statistics.jl), so that its signature becomes cov(x::AbstractVector, y::AbstractVector; dims::Int=1, corrected::Bool=true). This would still return a Number. A better fix might be to return a one-element array, to be consistent with std and other functions.
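The two options can be sketched without touching Statistics itself. The names cov_quick and cov_wrapped are hypothetical wrappers introduced only for illustration; an actual fix would change the cov method in Statistics.jl instead. The 1x1 shape in the second wrapper is an assumption, chosen to match what cov returns for an Nx1 matrix argument.

```julia
using Statistics

# Hypothetical wrapper (quick fix): accept `dims` but ignore it, return a Number.
cov_quick(x::AbstractVector, y::AbstractVector; dims::Int=1, corrected::Bool=true) =
    cov(x, y; corrected=corrected)

# Hypothetical wrapper (alternative): wrap the scalar result in a 1x1 array.
cov_wrapped(x::AbstractVector, y::AbstractVector; dims::Int=1, corrected::Bool=true) =
    fill(cov(x, y; corrected=corrected), 1, 1)

x, y = rand(5), rand(5)
q = cov_quick(x, y; dims=1)      # a Number
w = cov_wrapped(x, y; dims=1)    # a 1x1 Matrix
```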

simeonschaub added the statistics label on Jul 24, 2021
@PaulSoderlind
Contributor Author

I have spent some time with the cov() and cor() functions in Statistics, and my impression is that cov() could use an overhaul. For instance, it is several times slower than cor(). However, the machinery behind it is fairly heavy.

In the short run, the missing cov(Vector,Vector,dims=1) method, and the corresponding one for cor(), can be added with the crude approach below. Its performance seems to be within 1% of the existing code, and my attempts to dispatch on dims instead (via covm()) did not improve on that.

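function cov(x::AbstractVector, y::AbstractVector; dims=:, corrected::Bool=true)
    # Scalar covariance, computed by the existing covm method
    result = covm(x, mean(x), y, mean(y); corrected=corrected)
    if dims === (:)
        return result    # no dims given: return a Number, as before
    elseif isa(dims, Int) && dims >= 1
        # dims == 1: 1x1 matrix, as for Nx1 inputs; any other positive dims: all NaN
        return dims == 1 ? fill(result, 1, 1) : fill(NaN, length(x), length(y))
    else
        throw(ArgumentError("dims must be : or a positive Int, got $dims"))
    end
end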

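function cor(x::AbstractVector, y::AbstractVector; dims=:)
    # Scalar correlation, computed by the existing corm method
    result = corm(x, mean(x), y, mean(y))
    if dims === (:)
        return result    # no dims given: return a Number, as before
    elseif isa(dims, Int) && dims >= 1
        # dims == 1: 1x1 matrix, as for Nx1 inputs; any other positive dims: all NaN
        return dims == 1 ? fill(result, 1, 1) : fill(NaN, length(x), length(y))
    else
        throw(ArgumentError("dims must be : or a positive Int, got $dims"))
    end
end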
