Statistical Inference for k-means Clustering after Domain Adaptation

This package (SCaDA) provides a statistical inference framework for k-means clustering after domain adaptation (DA). It builds on the selective inference (SI) framework and uses a divide-and-conquer strategy to compute the p-values of selected features efficiently. The method controls the false positive rate (FPR) while maximizing the true positive rate (TPR), which in turn reduces the false negative rate (FNR).
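To illustrate why FPR control needs a dedicated framework, the sketch below shows a naive test applied to clusters found on the same data. It is a toy simulation, not the SCaDA method: it is one-dimensional, involves no domain adaptation, and does not call the package. The helpers `two_means_1d` and `naive_fpr` are hypothetical names introduced only for this example. All data are drawn from a single N(0, 1) population, so any rejection is a false positive.

```python
import random
from statistics import NormalDist, fmean


def two_means_1d(x, iters=20):
    """Lloyd's algorithm with k = 2 on 1-D data; returns the two clusters."""
    c0, c1 = min(x), max(x)  # initialise the centres at the extremes
    for _ in range(iters):
        a = [v for v in x if abs(v - c0) <= abs(v - c1)]
        b = [v for v in x if abs(v - c0) > abs(v - c1)]
        c0, c1 = fmean(a), fmean(b)
    return a, b


def naive_fpr(n=30, trials=500, alpha=0.05, seed=0):
    """Rejection rate of a naive z-test that ignores how the clusters were chosen."""
    rng = random.Random(seed)
    norm = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Global null: every point comes from the same N(0, 1) population.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        a, b = two_means_1d(x)
        # Naive two-sample z-test with known unit variance.
        z = (fmean(b) - fmean(a)) / (1 / len(a) + 1 / len(b)) ** 0.5
        p = 2 * (1 - norm.cdf(abs(z)))
        rejections += p < alpha
    return rejections / trials


if __name__ == "__main__":
    print(f"naive rejection rate under the null: {naive_fpr():.3f}")
```

Because the clusters are chosen to be far apart, the naive rejection rate ends up far above the nominal level of 0.05. Conditioning the test on the clustering result, as SI does, is what corrects for this selection.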

Environment Setup

pip install -r requirements.txt

Usage

We provide Jupyter notebooks demonstrating how to use SCaDA.

Computing p-values for k-means clustering after DA: ex1_compute_pvalue.ipynb
Checking the validity of the p-values (uniformity of the pivot under the null hypothesis): ex2_validity_of_pvalue.ipynb
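As a rough idea of what a uniformity check involves, the following sketch computes the Kolmogorov-Smirnov distance between a set of p-values and the Uniform(0, 1) distribution. It is a minimal standard-library illustration and does not reproduce the notebook or use the SCaDA API. The p-values come from a plain z-test simulated under the null hypothesis, and `ks_statistic_uniform` and `null_pvalues` are hypothetical helper names.

```python
import random
from statistics import NormalDist


def ks_statistic_uniform(pvalues):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    p = sorted(pvalues)
    n = len(p)
    d_plus = max((i + 1) / n - v for i, v in enumerate(p))
    d_minus = max(v - i / n for i, v in enumerate(p))
    return max(d_plus, d_minus)


def null_pvalues(trials=2000, n=20, seed=0):
    """Two-sided z-test p-values for H0: mean = 0, simulated under H0 (sigma = 1)."""
    rng = random.Random(seed)
    norm = NormalDist()
    out = []
    for _ in range(trials):
        xbar = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = xbar * n ** 0.5
        out.append(2 * (1 - norm.cdf(abs(z))))
    return out


if __name__ == "__main__":
    pvals = null_pvalues()
    d = ks_statistic_uniform(pvals)
    # Approximate large-sample 5% critical value of the KS statistic.
    critical = 1.36 / len(pvals) ** 0.5
    print(f"KS distance = {d:.4f}, approx. 5% critical value = {critical:.4f}")
```

A valid p-value should give a KS distance below the critical value in most runs. To check another method, one would replace `null_pvalues` with p-values collected from repeated runs of that method on null data.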

PyPI package

SCaDA is available on PyPI and can be installed as follows:

pip install PySCaDA

locluclak/SCaDA
