Skip to content
An R package for performing TPCA change detection.
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
R
man
results
src
tests
thresholds
.Rbuildignore
.gitignore
DESCRIPTION
NAMESPACE
README-fig1-2-3-1.png
README-fig1-2-3-2.png
README-fig1-2-3-3.png
README-fig4-1.png
README.Rmd
README.md
overall_log_n100alpha01m200d100txt
tpcaMonitoring.Rproj

README.md

tpcaMonitoring

An R package containing all code to reproduce the simulation study of the paper "Online Detection of Sparse Changes in High-Dimensional Data Streams Using Tailored Projections". In addition, the package contains the files with the results we obtained by running the simulation study in the directories /results and /thresholds. It is also an aim that this package can be used by others who want to test TPCA for their monitoring tasks, both for bootstrapping thresholds and testing performance in terms of expected detection delays (EDD).

Overview

Functionality:

  • Functions for finding thresholds by bootstrping (parametric and non-parametric).
  • Functions for simulating run lengths to estimate EDD.
  • Functions to generate and retrieve the test data used in the simulation study.
  • The monitoring functions (the running detection statistics) are implemented by Rcpp.
  • Summary and plot functions.

Installation

You can install tpcaMonitoring from github with:

# install.packages("devtools")
devtools::install_github("Tveten/tpcaMonitoring")

To reproduce figures from our result files, download the folders /results and /thresholds from https://github.com/Tveten/tpcaMonitoring and put them in your working R directory before running the plot functions.

Exported and documented functions

For more information, see the documentation of the functions below inside R.

  • est_edd_all_methods
  • gen_train
  • get_training_sets
  • ggplot_edd_conf_int
  • mixture_arl
  • multiplot_edd_summary
  • run_simstudy
  • threshold_finder
  • tpca_arl

Example

library(tpcaMonitoring)
# To run the entire simulation study, use run_simstudy().
# It will take several days to run, but the output from it is also contained in
# this package so that one can still get plots of results and study the results in more detail.
# The code in run_simstudy() specifies in a simple way the simulation
# setup and the necessary steps in the study.

# Generate the test sets used:
n_sets <- 30
d <- 100
m <- 2 * d
training_sets <- get_training_sets(n_sets, d = d, m = m)
# Plot of the results for low correlation matrices:
multiplot_edd_summary(cov_mat_type = 'low')

# Plot of the results for low correlation matrices:
multiplot_edd_summary(cov_mat_type = 'high')

# Plot of the EDD results for a single covariance matrix/training set:
multiplot_edd_summary(cov_mat_nr = 5)

# Plot EDD results with confidence intervals:
ggplot_edd_conf_int(n = 100, alpha = 0.01, m = 200, d = 100,
                    change_type = 'sd', change_param = 1.5)

You can’t perform that action at this time.