This is the code repository for the research project Model Variability, which investigates the variability of deep learning models and how variance-based metrics can be used to debug them.
This is a summary of how to run the various scripts. The process of running the experiments is as follows:
- Prepare the data
- Train the models
- Analyze and generate results
- Use the "azureml_py36_pytorch" anaconda env as the base (already available on the new node).
- You also need to mount the mlvariance teamdrive to "~/teamdrive/mlvariance".
- Install packages into "azureml_py36_pytorch" using:
- python3 -m pip install --disable-pip-version-check --extra-index-url https://azuremlsdktestpypi.azureedge.net/K8s-Compute/D58E86006C65 azureml_contrib_k8s
Data preparation scripts are located under the prepare_data folder.
prepare_artificial_compas.py and prepare_holdout_compas.py prepare the COMPAS data for two scenarios: artificially correlated features and held-out clustered sets.
The data paths are hard-coded in these scripts; change them to point to the correct locations if needed.
prepare_holdout_cifar10.py prepares CIFAR10 data for holdout scenarios.
The script takes two arguments:
- data_folder: the folder that will contain the prepared data. This should point to ~/teamdrive/mlvariance/data if run from the GPU dev machine.
- mode: the scenario to prepare:
  - holdout: hold out a portion of class 0
  - holdout-dup: hold out a portion of class 0 with duplication so the number of training examples is evenly distributed across all classes
  - augmentation: convert a portion of class 0 to grayscale
  - augmentation-all: convert a portion of all classes to grayscale
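The core data transforms behind these modes can be sketched as follows. This is a simplified illustration with a hypothetical `prepare_holdout` helper and a made-up `ratio` parameter; the actual script reads the CIFAR10 binaries and uses its own hard-coded settings.

```python
import numpy as np

def prepare_holdout(images, labels, mode, ratio=0.5, target_class=0):
    """Illustrative version of the four modes (hypothetical helper;
    the real script reads/writes CIFAR10 files with its own ratios)."""
    images, labels = np.array(images), np.array(labels)
    if mode in ("holdout", "holdout-dup"):
        # Drop `ratio` of the target class from the training set.
        idx = np.where(labels == target_class)[0]
        drop = idx[: int(len(idx) * ratio)]
        keep = np.setdiff1d(np.arange(len(labels)), drop)
        images, labels = images[keep], labels[keep]
        if mode == "holdout-dup":
            # Duplicate the remaining target-class samples so every
            # class ends up with the same number of training examples.
            idx = np.where(labels == target_class)[0]
            full = np.bincount(labels).max()
            dup = np.resize(idx, full - len(idx))
            images = np.concatenate([images, images[dup]])
            labels = np.concatenate([labels, labels[dup]])
    else:
        # augmentation / augmentation-all: convert a portion to grayscale
        # by averaging the RGB channels.
        if mode == "augmentation":
            idx = np.where(labels == target_class)[0]
        else:
            idx = np.arange(len(labels))
        idx = idx[: int(len(idx) * ratio)]
        gray = images[idx].mean(axis=-1, keepdims=True)
        images[idx] = np.repeat(gray, 3, axis=-1).astype(images.dtype)
    return images, labels
```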
prepare_holdout_cifar100-cifar10.py prepares CIFAR100 data for holdout scenarios.
The script takes two arguments:
- data_folder: the folder that will contain the prepared data. This should point to ~/teamdrive/mlvariance/data if run from the GPU dev machine.
- mode: the scenario to prepare:
  - holdout: hold out a portion of a subclass. The list of subclasses is hard-coded in the script.
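The subclass holdout itself reduces to filtering samples on their fine labels. A minimal sketch, assuming a hypothetical hard-coded holdout list (the real list in the script may differ):

```python
# Hypothetical fine-label ids to hold out; the script's actual list differs.
HOLDOUT_SUBCLASSES = [3, 42, 71]

def holdout_subclasses(images, labels, holdout=HOLDOUT_SUBCLASSES):
    """Keep only samples whose fine label is not in the holdout list."""
    keep = [i for i, y in enumerate(labels) if y not in holdout]
    return [images[i] for i in keep], [labels[i] for i in keep]
```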
local_export_images_cifar100.py: exports images for display in the HTML report
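The export step boils down to writing image files into an HTML page. A rough sketch with hypothetical helpers (the actual script's layout and file naming may differ):

```python
import base64

def image_tag(png_bytes, caption):
    """Embed a PNG as a base64 <img> tag with a caption (hypothetical helper)."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return (f'<figure><img src="data:image/png;base64,{b64}"/>'
            f'<figcaption>{caption}</figcaption></figure>')

def write_report(path, entries):
    """entries: list of (png_bytes, caption) pairs to render into one page."""
    body = "\n".join(image_tag(b, c) for b, c in entries)
    with open(path, "w") as f:
        f.write(f"<html><body>{body}</body></html>")
```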
There are various training scripts for the COMPAS, CIFAR10, and CIFAR100 datasets.
To train models for the COMPAS dataset, run one of the following scripts:
- main_compas.py: trains on the original COMPAS dataset
- main_artificial_compas.py: trains on the COMPAS dataset with artificially correlated features
- main_holdout_compas.py: trains on the COMPAS dataset with a held-out clustered set
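Each of these scripts trains a single model per run; the project's variance-based metrics come from repeating a run under different random seeds and comparing per-sample predictions across runs. A minimal sketch of that idea, with a hypothetical `prediction_variance` helper (the actual scripts define their own models and metrics):

```python
import random
import statistics

def prediction_variance(train_fn, data, seeds):
    """Train one model per seed and return the per-sample variance of the
    predicted scores across runs (hypothetical helper, not the repo's API)."""
    runs = []
    for seed in seeds:
        random.seed(seed)
        model = train_fn(data, seed)            # returns a predict callable
        runs.append([model(x) for x in data])
    # Population variance across runs, computed per sample.
    return [statistics.pvariance(scores) for scores in zip(*runs)]
```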
To train models for CIFAR10 and CIFAR100, the AML platform is used. To schedule the training jobs, use the script cluster_schedulel_all.py, a master script that queues jobs to the AML cluster. It takes a single argument, local_teamdrive_folder, which should point to ~/teamdrive/mlvariance. There must be a config.json file with the subscription info for the AML cluster; this information is listed in the Connection Details table at [https://dev.azure.com/msresearch/GCR/_wiki/wikis/GCR.wiki/3438/AML-K8s-(aka-ITP)-Overview].
- To train CIFAR10 models, uncomment the line runs_list = list_cifar10_runs(...). This function generates the jobs list, which consists of single runs of cluster_single_cifar10.py.
- To train CIFAR100 models, uncomment the line runs_list = list_cifar100_runs(...). This function generates the jobs list, which consists of single runs of cluster_single_cifar100.py.
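The master-script pattern can be sketched in plain Python: build a runs list of (script, arguments) pairs, then submit each one. `list_cifar10_runs_sketch` and `submit` below are hypothetical stand-ins for the job-list and AML-submission code in cluster_schedulel_all.py:

```python
def list_cifar10_runs_sketch(teamdrive, seeds, modes):
    """Build one job per (mode, seed) combination (hypothetical sketch)."""
    return [("cluster_single_cifar10.py",
             {"teamdrive": teamdrive, "seed": s, "mode": m})
            for m in modes for s in seeds]

def schedule_all(runs_list, submit):
    """Queue every job via the supplied submit callable (e.g. an AML client)."""
    for script, args in runs_list:
        submit(script, args)
```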
Several scripts under the analysis folder generate the analysis results.
The following scripts analyze the COMPAS results:
- _analyze.py: analyzes the COMPAS results without holdout
- _analyze_artificial.py: analyzes the COMPAS results with artificially correlated features
- _analyze_holdout.py: analyzes the COMPAS results with a held-out cluster set
Run the analyze_cifar10.py script to analyze the accuracy of the CIFAR10 models. This script takes one argument, result_folder, which should point to ~/teamdrive/mlvariance/result if run on the GPU dev machine.
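The per-run accuracies can then be aggregated into variance-based summary statistics. A sketch, assuming the accuracies have already been read from result_folder (the actual script defines its own file formats and metrics):

```python
import statistics

def summarize_accuracy(accuracies):
    """Aggregate per-run test accuracies into summary statistics
    (illustrative only; not the script's actual output format)."""
    return {
        "mean": statistics.mean(accuracies),
        "stdev": statistics.stdev(accuracies),  # spread across runs
        "min": min(accuracies),
        "max": max(accuracies),
    }
```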
There are several analysis scripts for CIFAR100.
- cluster_single_detail_analysis_cifar100.py: run this via cluster_schedulel_all.py by uncommenting runs_list = list_cifar100_holdout_analysis_runs(..., "cluster_single_detail_analysis_cifar100.py"). This generates a detailed analysis for each sample and also creates ranking results.
- cluster_single_saliency_cifar100.py: run this via cluster_schedulel_all.py by uncommenting runs_list = list_cifar100_holdout_generate_map_runs(...). This generates GradCAM images for all samples.
- cluster_single_detail_analysis_cifar100_html.py: run this via cluster_schedulel_all.py by uncommenting runs_list = list_cifar100_holdout_analysis_runs(..., "cluster_single_detail_analysis_cifar100_html.py"). This generates the detailed analysis as HTML pages for each sample, including the GradCAM images.
- analyze_cifar100.py: analyzes the accuracy of the CIFAR100 models.
- local_merge_rank_report.py: merges the results once all cluster_single_detail_analysis_cifar100.py jobs have finished.
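The merge step amounts to combining the per-run rankings, for example by averaging each sample's rank across runs. A sketch with a hypothetical dict-based format (the real script reads the files produced by cluster_single_detail_analysis_cifar100.py):

```python
from collections import defaultdict

def merge_rankings(rankings):
    """rankings: list of dicts mapping sample id -> rank in one run.
    Returns sample ids sorted by their rank averaged across runs."""
    totals = defaultdict(list)
    for run in rankings:
        for sample_id, rank in run.items():
            totals[sample_id].append(rank)
    merged = {sid: sum(r) / len(r) for sid, r in totals.items()}
    return sorted(merged, key=merged.get)
```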