joshday/StandardizedMatrices.jl

StandardizedMatrices

Statisticians often work with standardized matrices, in which each column is centered to mean 0 and scaled to standard deviation 1. If x is a data matrix with observations in rows, we want to work with z = StatsBase.zscore(x, 1). This package defines a StandardizedMatrix type that treats a matrix as standardized without copying the data or changing it in place.
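
To make the definition concrete, here is a minimal sketch of explicit column standardization using only the Statistics standard library. It assumes the usual n-1 corrected standard deviation (the default of `std`); the small matrix is made up for illustration. This is the copy that a lazily standardized type is meant to avoid.

```julia
using Statistics

# Illustrative data: 5 observations (rows), 2 variables (columns)
x = [1.0 10.0; 2.0 20.0; 3.0 30.0; 4.0 40.0; 5.0 50.0]

μ = mean(x, dims=1)   # 1×2 row of column means
σ = std(x, dims=1)    # 1×2 row of column standard deviations (n-1 denominator)

# Explicit standardization: allocates a new matrix the same size as x
z = (x .- μ) ./ σ

# Each column of z has mean 0 and standard deviation 1, up to rounding
colmeans = mean(z, dims=1)
colstds = std(z, dims=1)
```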

A Motivating Example

Suppose our original matrix is sparse and we want to perform matrix-vector multiplication with a standardized version. Typically, standardizing a sparse matrix destroys the sparsity: subtracting a nonzero column mean turns every zero entry in that column into a nonzero one.

using StatsBase, BenchmarkTools, StandardizedMatrices, SparseArrays, Statistics

# generate some data
n, p = 100_000, 1000
x = sprandn(n, p, .01)
β = randn(p)

xdense = zscore(x, 1)       # this destroys the sparsity
z = StandardizedMatrix(x)   # this acts as standardized, but keeps sparse benefits

# interpolate the globals with $ so that only the product itself is timed
b1 = @benchmark $xdense * $β
b2 = @benchmark $z * $β
ratio(median(b1), median(b2))  # StandardizedMatrix is roughly 13 times faster
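
The speedup is possible because the standardized product can be rewritten in terms of the original sparse matrix: with column means μ and standard deviations σ, z * β equals x * (β ./ σ) minus the scalar dot(μ, β ./ σ). The sketch below checks that identity using only standard libraries. It illustrates the idea and is not necessarily how the package implements it; the names `w`, `y`, and `zdense` are introduced here for illustration.

```julia
using SparseArrays, LinearAlgebra

n, p = 1_000, 50
x = sprandn(n, p, 0.1)
β = randn(p)

# Column means and (n-1 corrected) standard deviations, computed without densifying x
μ = vec(sum(x, dims=1)) ./ n
sumsq = vec(sum(abs2, x, dims=1))
σ = sqrt.((sumsq .- n .* μ .^ 2) ./ (n - 1))

# Lazy product: one sparse matrix-vector multiply plus a scalar shift
w = β ./ σ
y = x * w .- dot(μ, w)

# Dense reference, built only to verify the identity
zdense = (Matrix(x) .- μ') ./ σ'
```

The lazy version touches only the stored nonzeros of x, while the dense reference has to visit all n * p entries.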

Methods implemented:

  • *()
  • mul!(Y, A::StandardizedMatrix, B)
  • mul!(Y, A::Adjoint{<:StandardizedMatrix}, B)
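
For the adjoint case, a similar identity holds: z' * v equals (x' * v .- μ .* sum(v)) ./ σ. The following is a hedged sketch of an in-place version built on `LinearAlgebra.mul!`. The function `standardized_adjoint_mul!` is a hypothetical helper written for this example, not part of the package's API, and the package's own method may be implemented differently.

```julia
using SparseArrays, LinearAlgebra

# Hypothetical helper: writes z' * v into `out` without forming the standardized matrix z
function standardized_adjoint_mul!(out, x, μ, σ, v)
    mul!(out, x', v)                    # out = x' * v, using the sparse structure of x
    out .= (out .- μ .* sum(v)) ./ σ    # apply centering and scaling per column of x
    return out
end

n, p = 500, 20
x = sprandn(n, p, 0.2)
v = randn(n)

# Column means and (n-1 corrected) standard deviations of x
μ = vec(sum(x, dims=1)) ./ n
σ = sqrt.((vec(sum(abs2, x, dims=1)) .- n .* μ .^ 2) ./ (n - 1))

out = zeros(p)
standardized_adjoint_mul!(out, x, μ, σ, v)

# Dense reference, built only to verify the result
zdense = (Matrix(x) .- μ') ./ σ'
```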