Conditional Random Sampling Sparse Matrices

Description

This R package implements a novel matrix sampling algorithm that performs conditional random sampling on the observed values of sparse matrices. It is useful for splitting sparse matrices into training and test sets before model fitting in cross-validation procedures, and for estimating the predictive accuracy of data imputation methods such as matrix factorization or singular value decomposition (SVD). Although designed for sparse matrices, CRASSMAT can also be applied to complete matrices and to matrices containing missing values.

Details

CRASSMAT takes a matrix A with entries Aij (row i, column j) and removes a single observed value from row i, on the condition that the number of observed values in that row is greater than the specified conditional (the minimum number of values that must remain in each row). This process repeats until the specified sampling threshold is met.
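The procedure described above can be sketched in base R. This is a minimal illustration, not the package's source code: the function name `sample_sketch` is hypothetical, unobserved entries are assumed to be coded as `NA`, and picking an eligible row uniformly at random before picking one of its observed columns is an assumption about details the description leaves open.

```r
## Illustrative sketch only: NOT the crassmat implementation.
## Assumption: unobserved entries are coded as NA.
sample_sketch <- function(A, sample_thres = 0.20, conditional = 1) {
  ## number of observed values to remove to reach the sampling threshold
  n_target <- floor(sample_thres * sum(!is.na(A)))
  n_removed <- 0
  while (n_removed < n_target) {
    row_counts <- rowSums(!is.na(A))              # observed values per row
    eligible <- which(row_counts > conditional)   # rows that may lose a value
    if (length(eligible) == 0) break              # threshold cannot be reached
    i <- eligible[sample.int(length(eligible), 1)]  # pick one eligible row
    cols <- which(!is.na(A[i, ]))                   # observed columns in row i
    j <- cols[sample.int(length(cols), 1)]          # pick one observed column
    A[i, j] <- NA                                   # sample the value out
    n_removed <- n_removed + 1
  }
  A
}

## toy sparse matrix with 6 observed values
set.seed(1)
A <- matrix(c( 5, NA,  3, NA,
              NA,  2, NA, NA,
               4,  1, NA,  2), nrow = 3, byrow = TRUE)
A_train <- sample_sketch(A, sample_thres = 0.20, conditional = 1)
```

In this toy case the second row holds a single observed value, so it is never eligible for sampling and keeps that value regardless of the random draw.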

Features

  1. Simple implementation for splitting sparse matrices, useful in cross-validation procedures
  2. Supports sparse matrices, matrices with missing values, and complete matrices
  3. Can be used in a variety of recommender system settings
  4. Provides a novel alternative to other matrix sampling methods
    (e.g., Wold 'speckled style' hold-outs, Gabriel 'block style' hold-outs)

Installation

## install CRAN release
install.packages('crassmat')

## install development version (requires the devtools package)
devtools::install_github('nickkunz/crassmat')

Usage

## load crassmat
library(crassmat)

## test set (A is a user-supplied matrix; the full data are kept for evaluation)
A_test <- A

## training set
A_train <- crassmat(data = A,            # matrix
                    sample_thres = 0.20, # remove 20% of observed values
                    conditional = 1)     # sample only from rows with > 1 observed values
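
Once a training matrix is available, the values that were sampled out can serve as a hold-out set for scoring an imputation method. The following base R sketch illustrates the idea on toy data without calling `crassmat`. It assumes removed and unobserved entries are coded as `NA` (the package's actual coding is not stated here), builds the training matrix by hand, and uses a row-mean fill purely as a stand-in for a real imputation model.

```r
## Illustrative only: toy matrices, NA marks unobserved entries (assumption).
A_test <- matrix(c( 5, NA,  3,
                    4,  2, NA,
                   NA,  1,  2), nrow = 3, byrow = TRUE)

## stand-in for a training matrix with two observed values held out
A_train <- A_test
A_train[1, 3] <- NA
A_train[2, 1] <- NA

## held-out entries: observed in the test set, removed from the training set
held_out <- !is.na(A_test) & is.na(A_train)

## naive baseline imputation: fill each missing entry with its row mean
row_means <- rowMeans(A_train, na.rm = TRUE)
A_hat <- A_train
fill <- which(is.na(A_hat), arr.ind = TRUE)
A_hat[fill] <- row_means[fill[, "row"]]

## root mean squared error on the held-out entries only
rmse <- sqrt(mean((A_hat[held_out] - A_test[held_out])^2))
```

Restricting the error to `held_out` keeps entries that were never observed from entering the score.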

License

© Nick Kunz, 2019. Licensed under the General Public License v3.0 (GPLv3).

Contributions

CRASSMAT is open to improvements and maintenance. Your help in making the package better for everyone is valued.

References

Kunz, N. (2019). Unsupervised Learning for Submarket Modeling: A Proxy for Neighborhood Change (Master’s Thesis). Columbia University, New York, NY. https://doi.org/10.7916/d8-rj87-yx32.

Kunz, N. (2019). CRASSMAT: Conditional Random Sampling Sparse Matrices. The Comprehensive R Archive Network (CRAN). https://cran.r-project.org/web/packages/crassmat/index.html.
