Additional cPCA Experiments

Quickstart

Install packages

uv sync

Run experiments

source scripts/experiments/<filename of .sh file>

Claims that We're Trying to Make

cPCA-preprocessed data yields better downstream model performance than PCA-preprocessed data
cPCA-preprocessed data yields better downstream model performance than data with no preprocessing
We identify which types of backgrounds are most effective for cPCA
We identify the optimal cPCA parameters (both alpha and the number of dimensions)
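
For readers unfamiliar with the two parameters named above, here is a minimal sketch of the standard cPCA formulation: take the top eigenvectors of the target covariance minus alpha times the background covariance. It is not taken from this repository's `src` code; the function name `cpca` and its arguments are hypothetical and exist only for illustration.

```python
import numpy as np


def cpca(target, background, alpha, n_components):
    """Project `target` onto its top contrastive directions (illustrative sketch).

    Hypothetical helper, not this repository's implementation.
    target, background: arrays of shape (n_samples, n_features).
    """
    # Center each dataset with its own mean.
    target_c = target - target.mean(axis=0)
    background_c = background - background.mean(axis=0)

    # Empirical covariance matrices, shape (n_features, n_features).
    cov_target = np.cov(target_c, rowvar=False)
    cov_background = np.cov(background_c, rowvar=False)

    # Contrastive matrix; alpha = 0 reduces to ordinary PCA on the target.
    contrast = cov_target - alpha * cov_background

    # eigh handles symmetric matrices and returns eigenvalues in ascending order,
    # so sort descending and keep the first n_components eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(contrast)
    top = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, top]

    return target_c @ components, components
```

Larger values of alpha penalize directions with high background variance more strongly, which is why alpha and the number of retained dimensions are the two parameters to tune.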

Motivation

High-dimensional datasets are expensive to run models on
cPCA provides better target label separation relative to PCA or no preprocessing

Limitations

cPCA-compressed data is difficult to explain

Set of Experiments

Evaluating Numerical Datasets

Metrics: F1, precision, and recall

Mouse gene expression dataset
Beans dataset
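
As a reminder of how the three metrics relate, the sketch below computes them from true and predicted labels for a single positive class. The function name `precision_recall_f1` is hypothetical, and the README does not say how scores are averaged across classes for the multi-class datasets, so the one-class-at-a-time treatment here is an assumption.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class (illustrative sketch)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For multi-class data, one option is to call this once per class and average the results; whether the experiments use macro, micro, or weighted averaging is not stated here.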

Evaluating Natural Language Datasets

Metrics: F1, precision, and recall

Sentiment analysis (SST)

Evaluating Image Datasets

Metrics: F1, precision, and recall

CIFAR-10 classification

Evaluating Effective Backgrounds

Ablation study of having unlabelled beans
Comparing different backgrounds
