Install packages
uv sync

Run experiments
source scripts/experiments/<filename of .sh file>

- cPCA-preprocessed data yields better model performance downstream, compared to PCA
- cPCA-preprocessed data yields better model performance downstream compared to no preprocessing
- We identify which types of backgrounds are most effective for cPCA
- We identify the cPCA parameters which are optimal (for both alpha and number of dimensions)
- High-dimensional datasets are expensive to run models on
- cPCA provides better target label separation relative to PCA or no preprocessing
- cPCA-compressed data is difficult to explain
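For readers unfamiliar with cPCA, here is a minimal sketch of the standard contrastive PCA formulation: take the top eigenvectors of the target covariance minus alpha times the background covariance. It is not the code in this repository. The function name `cpca_fit_transform`, its arguments, and the synthetic data are assumptions made for illustration, and it assumes NumPy is installed.

```python
import numpy as np


def cpca_fit_transform(target, background, alpha=1.0, n_components=2):
    """Project `target` onto its top contrastive directions (illustrative sketch).

    target:       (n_target, d) array, the data of interest
    background:   (n_background, d) array, data carrying only unwanted variation
    alpha:        contrast strength; alpha=0 reduces to ordinary PCA on target
    n_components: number of dimensions to keep
    """
    # Center the target so the projection is taken around its own mean.
    target_centered = target - target.mean(axis=0)

    # np.cov centers its input internally; rowvar=False treats columns as features.
    cov_target = np.cov(target, rowvar=False)
    cov_background = np.cov(background, rowvar=False)

    # Contrastive matrix: target variance minus alpha times background variance.
    contrast = cov_target - alpha * cov_background

    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(contrast)
    top = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, top]  # shape (d, n_components)

    return target_centered @ components, components


# Synthetic data (assumption): feature 0 has large variance in both sets,
# feature 1 has extra variance only in the target set.
rng = np.random.default_rng(0)
background = rng.normal(size=(500, 5)) * np.array([10.0, 1.0, 1.0, 1.0, 1.0])
target = rng.normal(size=(500, 5)) * np.array([10.0, 3.0, 1.0, 1.0, 1.0])

projected, components = cpca_fit_transform(target, background, alpha=2.0, n_components=1)
_, pca_components = cpca_fit_transform(target, background, alpha=0.0, n_components=1)
```

On this toy data, the direction found with `alpha=2.0` should load mainly on feature 1, while `alpha=0.0` (plain PCA) should load mainly on the shared high-variance feature 0. That difference is why both alpha and the number of dimensions are worth tuning.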
Metrics: F1, precision and recall
- Mouse gene expression dataset
- Beans dataset
Metrics: F1, precision and recall
- Sentiment analysis (sst)
Metrics: F1, precision and recall
- CIFAR-10 classification
- Ablation study of having unlabelled beans
- Comparing different backgrounds
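
The experiments report F1, precision and recall. As a reminder of how these relate, here is a small sketch for the binary case. The helper name `precision_recall_f1` and the example labels are hypothetical. How the multi-class datasets are averaged (macro, micro, or weighted) is not stated here, so this sketch does not cover it.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one positive class (illustrative)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```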