Analyzing different ML model comparison metrics
- AUROC_AUPRC: Matthew's original examination of beta distributions and their relationship to AUROC and AUPRC.
- 1_synthetic_experiments: Giovanni's extension of Matthew's work.
- Overleaf draft here
- Clean up the expressions for AUROC and AUPRC that involve taking expectations over p_+ of various quantities (candidate identities are sketched below).
- Verify the Au(log)ROC expression and explore whether it is useful.
- Fairness analyses: if there are two subgroups whose p_+ and p_- are each uniform distributions over intersecting ranges, in what settings can we say something about the fairness implications of choosing AUROC versus AUPRC? (A simulation sketch for this setting follows below.)
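For the expression clean-up above, one candidate clean form is the standard pair of identities below (a sketch assuming continuous scores with no ties; π is the positive prevalence and F_+, F_- are the CDFs of p_+ and p_-; to be checked against the Overleaf draft's definitions):

$$\mathrm{AUROC} = \Pr(S_+ > S_-) = \mathbb{E}_{t \sim p_+}\left[F_-(t)\right]$$

$$\mathrm{AUPRC} = \mathbb{E}_{t \sim p_+}\left[\mathrm{Prec}(t)\right], \qquad \mathrm{Prec}(t) = \frac{\pi\left(1 - F_+(t)\right)}{\pi\left(1 - F_+(t)\right) + (1-\pi)\left(1 - F_-(t)\right)}$$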
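For the two-subgroup uniform setting above, a minimal Monte Carlo sketch (the ranges, prevalences, and sample size are illustrative assumptions, not values from the draft) that compares AUROC and AUPRC per subgroup:

```python
# Sketch: per-subgroup AUROC/AUPRC when p_+ and p_- are uniform over
# intersecting ranges. Ranges, prevalences, and sample size are illustrative.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

# Per subgroup: score ranges for negatives/positives and positive prevalence (assumptions).
subgroups = {
    "A": dict(neg=(0.0, 0.6), pos=(0.4, 1.0), prevalence=0.30),
    "B": dict(neg=(0.0, 0.7), pos=(0.3, 1.0), prevalence=0.05),
}

n = 100_000
for name, cfg in subgroups.items():
    y = rng.random(n) < cfg["prevalence"]
    scores = np.where(
        y,
        rng.uniform(*cfg["pos"], size=n),
        rng.uniform(*cfg["neg"], size=n),
    )
    print(
        f"subgroup {name}: "
        f"AUROC={roc_auc_score(y, scores):.3f}, "
        f"AUPRC={average_precision_score(y, scores):.3f}"
    )
```

Sweeping the amount of overlap between the positive and negative ranges per subgroup would then show in which regimes the two metrics rank the subgroups (or models) differently.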
- Specify model configurations
- Specify metrics
- AUROC
- AUPRC
- Brier Score
- Best-F1
- Best-Accuracy
- Best-Precision
- Best-Recall
- Best-Sensitivity
- Best-Specificity
- Total expected deployment cost under a uniform sampling of independent FP, FN, TP, and TN cost/benefit ratios.
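How this cost metric is operationalized is not pinned down above; a minimal sketch under explicit assumptions (error costs and correct-decision benefits drawn i.i.d. from U(0, 1), with the cost-minimizing threshold picked on the dev set for each draw) could look like:

```python
# Sketch of the expected-deployment-cost metric: for each draw of independent
# FP/FN/TP/TN cost/benefit ratios (uniform prior is an assumption), pick the
# cost-minimizing threshold on dev, then record the realized cost on test.
import numpy as np


def cost_at_threshold(y, s, t, c_fp, c_fn, c_tp, c_tn):
    y = np.asarray(y, dtype=bool)
    pred = np.asarray(s) >= t
    return (
        c_fp * np.mean(pred & ~y)
        + c_fn * np.mean(~pred & y)
        + c_tp * np.mean(pred & y)
        + c_tn * np.mean(~pred & ~y)
    )


def expected_deployment_cost(y_dev, s_dev, y_test, s_test, n_draws=1000, seed=0):
    rng = np.random.default_rng(seed)
    thresholds = np.unique(s_dev)
    total = 0.0
    for _ in range(n_draws):
        # Positive costs for errors, negative costs (benefits) for correct calls: an assumption.
        c_fp, c_fn = rng.uniform(0.0, 1.0, size=2)
        c_tp, c_tn = -rng.uniform(0.0, 1.0, size=2)
        dev_costs = [cost_at_threshold(y_dev, s_dev, t, c_fp, c_fn, c_tp, c_tn) for t in thresholds]
        t_star = thresholds[int(np.argmin(dev_costs))]
        total += cost_at_threshold(y_test, s_test, t_star, c_fp, c_fn, c_tp, c_tn)
    return total / n_draws
```

Note on the per-draw threshold choice: if the threshold were fixed across draws, averaging over symmetric uniform costs would reduce to a fixed linear function of the confusion matrix, so letting each cost draw pick its own dev threshold is what makes the metric informative.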
- Specify subgroup analysis
- N subgroups
- Subgroup prevalences
- Positive label prevalences per subgroup
- Specify metrics
- For each metric and setup above: theoretically computed metric values (independent of dev and test set sizes).
- Goal: empirically computed metrics, with variances, over varying test and dev set sizes (the dev set is only used to pick thresholds for threshold-dependent metrics).
- For each combination of subgroup prevalences S, per-subgroup positive label prevalences R, test dataset size N, dev dataset size M, and model selection under decision rule Q... (a skeleton of this loop is sketched below)
- Expected true deployment cost of that metric under cost/benefit ratios V for each subgroup.
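A skeleton of this empirical loop (a sketch: the placeholder Beta score model, the best-F1 rule standing in for decision rule Q, and the metric set are illustrative assumptions):

```python
# Skeleton of the empirical protocol: for one grid cell (S, R, N, M), repeatedly
# sample dev/test splits, pick thresholds on dev only, and estimate per-subgroup
# metric means and variances on test. Score model and metric set are placeholders.
import numpy as np
from sklearn.metrics import average_precision_score, f1_score, roc_auc_score


def sample_split(rng, size, subgroup_prevs, pos_prevs):
    """Draw subgroup ids, labels, and scores for one split (placeholder Beta score model)."""
    g = rng.choice(len(subgroup_prevs), size=size, p=subgroup_prevs)
    y = rng.random(size) < np.asarray(pos_prevs)[g]
    scores = np.where(y, rng.beta(3, 2, size), rng.beta(2, 3, size))
    return g, y, scores


def best_f1_threshold(y_dev, s_dev):
    """Threshold maximizing F1 on dev (simple sweep over observed dev scores)."""
    candidates = np.unique(s_dev)
    return candidates[np.argmax([f1_score(y_dev, s_dev >= t) for t in candidates])]


def run_cell(rng, S, R, N, M, n_repeats=20):
    """One (S, R, N, M) grid cell: per-subgroup metric means and variances over repeats."""
    rows = []
    for _ in range(n_repeats):
        _, y_dev, s_dev = sample_split(rng, M, S, R)
        g_te, y_te, s_te = sample_split(rng, N, S, R)
        t_star = best_f1_threshold(y_dev, s_dev)  # dev is used only to pick thresholds
        row = {}
        for k in range(len(S)):
            m = g_te == k  # assumes both classes appear in every subgroup slice
            row[f"auroc_g{k}"] = roc_auc_score(y_te[m], s_te[m])
            row[f"auprc_g{k}"] = average_precision_score(y_te[m], s_te[m])
            row[f"best_f1_g{k}"] = f1_score(y_te[m], s_te[m] >= t_star)
        rows.append(row)
    return {k: (np.mean([r[k] for r in rows]), np.var([r[k] for r in rows])) for k in rows[0]}
```

Looping run_cell over the (S, R, N, M) grid gives the empirical counterparts to the theoretical values above; the decision rule Q and the cost/benefit ratios V would slot in where the best-F1 rule and the F1-at-threshold metric are used here.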
- Need to do a mini literature review of studies claiming AUPRC is better for imbalanced datasets.
- See Figma sketch here.
- Basic integration of Matthew's code into the dashboard is done.
- To rediscuss
- Run SubPopBench, but integrate the extra metrics above and compute them per subgroup.
- CheXclusion