tpcaMonitoring

An R package containing all code to reproduce the simulation study of the paper "Online Detection of Sparse Changes in High-Dimensional Data Streams Using Tailored Projections". In addition, the package contains the files with the results we obtained by running the simulation study in the directories /results and /thresholds. It is also an aim that this package can be used by others who want to test TPCA for their monitoring tasks, both for bootstrapping thresholds and testing performance in terms of expected detection delays (EDD).

Overview

Functionality:

Functions for finding thresholds by bootstrping (parametric and non-parametric).
Functions for simulating run lengths to estimate EDD.
Functions to generate and retrieve the test data used in the simulation study.
The monitoring functions (the running detection statistics) are implemented by Rcpp.
Summary and plot functions.

Installation

You can install tpcaMonitoring from github with:

# install.packages("devtools")
devtools::install_github("Tveten/tpcaMonitoring")

To reproduce figures from our result files, download the folders /results and /thresholds from https://github.com/Tveten/tpcaMonitoring and put them in your working R directory before running the plot functions.

Exported and documented functions

For more information, see the documentation of the functions below inside R.

est_edd_all_methods
gen_train
get_training_sets
ggplot_edd_conf_int
mixture_arl
multiplot_edd_summary
run_simstudy
threshold_finder
tpca_arl

Example

library(tpcaMonitoring)
# To run the entire simulation study, use run_simstudy().
# It will take several days to run, but the output from it is also contained in
# this package so that one can still get plots of results and study the results in more detail.
# The code in run_simstudy() specifies in a simple way the simulation
# setup and the necessary steps in the study.

# Generate the test sets used:
n_sets <- 30
d <- 100
m <- 2 * d
training_sets <- get_training_sets(n_sets, d = d, m = m)

# Plot of the results for low correlation matrices:
multiplot_edd_summary(cov_mat_type = 'low')

# Plot of the results for low correlation matrices:
multiplot_edd_summary(cov_mat_type = 'high')

# Plot of the EDD results for a single covariance matrix/training set:
multiplot_edd_summary(cov_mat_nr = 5)

# Plot EDD results with confidence intervals:
ggplot_edd_conf_int(n = 100, alpha = 0.01, m = 200, d = 100,
                    change_type = 'sd', change_param = 1.5)

Name		Name	Last commit message	Last commit date
Latest commit History 80 Commits
R		R
man		man
results		results
src		src
tests		tests
thresholds		thresholds
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README-fig1-2-3-1.png		README-fig1-2-3-1.png
README-fig1-2-3-2.png		README-fig1-2-3-2.png
README-fig1-2-3-3.png		README-fig1-2-3-3.png
README-fig4-1.png		README-fig4-1.png
README.Rmd		README.Rmd
README.md		README.md
overall_log_n100alpha01m200d100txt		overall_log_n100alpha01m200d100txt
tpcaMonitoring.Rproj		tpcaMonitoring.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tpcaMonitoring

Overview

Installation

Exported and documented functions

Example

About

Releases

Packages

Languages

Tveten/tpcaMonitoring

Folders and files

Latest commit

History

Repository files navigation

tpcaMonitoring

Overview

Installation

Exported and documented functions

Example

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages