<a href="https://colab.research.google.com/github/acorbin3/07kit/blob/master/mobilenetv2/2023_01_06/CIRCLe_with_isic2018_with_skin_transformer_mobilenetv2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Experiment notes

- date: 2023/01/06 4:13am
- base model: mobilenetv2 
- Added de-normalization of the input image before transforming it, then normalization of the transformed image before running it through the base model
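
The de-normalize, transform, re-normalize order described above can be sketched as follows. This is a minimal NumPy illustration, not the code in `models/circle.py`: the ImageNet mean/std values, the CHW layout, and the `transformer` callable are assumptions made for the example.

```python
import numpy as np

# Assumed ImageNet statistics; replace with whatever the dataloader's
# normalization step actually uses.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(3, 1, 1)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(3, 1, 1)


def normalize(x):
    """Map a CHW image in [0, 1] to the normalized range the base model expects."""
    return (x - MEAN) / STD


def denormalize(x):
    """Undo normalize(): map a normalized CHW image back to roughly [0, 1]."""
    return x * STD + MEAN


def transform_then_normalize(x_norm, transformer):
    """De-normalize, apply the (hypothetical) skin transformer, re-normalize."""
    raw = np.clip(denormalize(x_norm), 0.0, 1.0)  # transformer sees plain pixel values
    transformed = transformer(raw)
    return normalize(transformed)                 # base model sees normalized input
```

With an identity `transformer`, the output should match the input up to floating-point error, which is a quick sanity check that the two steps are inverses.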


Results: 



# Intro

This notebook modifies the implementation of CIRCLe from the paper [CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions](https://arxiv.org/pdf/2208.13528.pdf).

Their GitHub repo is: https://github.com/arezou-pakzad/CIRCLe

The paper uses the Fitzpatrick17k dataset, which can be obtained here: https://github.com/mattgroh/fitzpatrick17k

For this set of experiments we will use the ISIC 2017 dataset from: https://github.com/manideep2510/melanoma_segmentation.git

# TODO list

1. [X] Download 2018 dataset
1. [X] Analyze dataset to get Fitzpatrick info.
1. [X] Save off Fitzpatrick info data so we don't have to do it every time
1. [X] Load cached Fitzpatrick data
1. [X] Create masks using https://github.com/DebeshJha/2020-CBMS-DoubleU-Net because Task 3 for 2018 doesn't have masks. The trick was to get the higher-end GPU and RAM (12/29/2022)
1. [X] Create PyTorch dataloader for the ISIC 2018 dataset, including loading masks, images, diagnosis, and Fitzpatrick type for training (12/30/2022); needed to create a custom split function
1. [X] Create dataloaders for test and validation (12/30/2022)
1. [X] Added Jupyter notebook download code into the GitHub repo (1/1/2023)
1. [X] plug in dataloader into CIRCLe main file (1/1/2023)
1. [X] Figure out how to transform image and mask the same from the dataloader (1/2/2023)
1. [X] Use the new dataloader to train the model (1/2/2023)
1. [X] Use new transformer for CIRCLe model (1/3/2023)
1. [ ] test using different base models
1. [ ] test that adding dropout might help with overfitting
1. [X] Add more metrics such as precision and recall (1/4/2023)
1. [ ] add fairness metrics
1. [ ] add confusion matrix
1. [ ] add sensitivity and specificity
1. [ ] add metrics for each class
1. [ ] (optional) Go back and download and use larger datasets
1. [ ] (optional) Run Fitzpatrick on larger datasets (currently using the test set from ISIC 2018 Task 3)
1. [ ] The dataloaders need a stratified split by skin type, different from the current "training, validation, and test" split given by https://challenge.isic-archive.com/data/#2018. 12/30/2022: I think this is done, BUT we might consider a k-fold approach, which adds another layer of complexity to the dataloaders
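
For the open metric items above (confusion matrix, sensitivity and specificity, per-class metrics), one possible starting point is sketched below. It is plain Python with no dependency on the CIRCLe code; the function names are hypothetical and labels are assumed to be integer class indices in `0..num_classes-1`.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts samples with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_sensitivity_specificity(cm):
    """One-vs-rest (sensitivity, specificity) per class; NaN when undefined."""
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                 # true class k, predicted something else
        fp = sum(row[k] for row in cm) - tp  # predicted k, true class differs
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        results.append((sens, spec))
    return results
```

Running the same two functions on the subset of samples belonging to each Fitzpatrick skin type would give a simple per-group breakdown, which could serve as a first step toward the fairness metrics item.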

# Set up the environment

In [1]:
!python --version

Python 3.8.16


## Installs & imports

## Download latest code

In [2]:
!git clone https://github.com/acorbin3/CIRCLe.git

Cloning into 'CIRCLe'...
remote: Enumerating objects: 306, done.
remote: Counting objects: 100% (306/306), done.
remote: Compressing objects: 100% (206/206), done.
remote: Total 306 (delta 148), reused 247 (delta 96), pack-reused 0
Receiving objects: 100% (306/306), 1.78 MiB | 3.32 MiB/s, done.
Resolving deltas: 100% (148/148), done.


In [3]:
%cd ./CIRCLe

/content/CIRCLe


In [4]:
!git checkout -- models/circle.py

In [5]:
!git pull

Already up to date.


In [6]:
!pip3 install -r ./requirements.txt

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting numpy==1.23.2
  Downloading numpy-1.23.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Collecting pandas==1.4.4
  Downloading pandas-1.4.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.7 MB)
Collecting Pillow==9.2.0
  Downloading Pillow-9.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
Collecting scikit_learn==1.1.2
  Downloading scikit_learn-1.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.2 MB)

**This next block of code will be needed if you get this error:**

A100-SXM4-40GB with CUDA capability sm_80 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.

In [8]:
#!pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
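
Before deciding whether the reinstall above is needed, the mismatch reported in the error message can be checked directly. The sketch below is a simplified exact-match test that mirrors the wording of the error; real compatibility rules are more nuanced, so treat a `False` result as a hint rather than a verdict. The helper names are hypothetical; `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` are the PyTorch calls used to query the GPU and the installed build.

```python
def capability_to_arch(capability):
    """Turn a (major, minor) compute capability such as (8, 0) into 'sm_80'."""
    major, minor = capability
    return f"sm_{major}{minor}"


def is_arch_supported(capability, arch_list):
    """Simplified check: is the GPU's sm_XY listed among the build's architectures?"""
    return capability_to_arch(capability) in arch_list


try:
    import torch
except ImportError:  # lets the helpers above be used without PyTorch installed
    torch = None

if torch is not None and torch.cuda.is_available():
    cap = torch.cuda.get_device_capability()
    archs = torch.cuda.get_arch_list()
    if not is_arch_supported(cap, archs):
        print(f"{capability_to_arch(cap)} not in {archs}; a different PyTorch build may be needed")
```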

# IF ERROR, RESTART RUNTIME due to derm-ita lib
This is because derm-ita uses newer libraries than the Google Colab defaults (as of 12/24/2022).

# Train CIRCLe model 

In [7]:
%mkdir ./saved
%mkdir ./saved/model

In [8]:
!git pull

Already up to date.


In [10]:
!python main.py --use_reg_loss True --base mobilenetv3l --dataset isic2018

Flags:
	alpha: 0.1
	base: mobilenetv3l
	batch_size: 32
	data_dir: ../data/fitz17k/images/all/
	dataset: isic2018
	epochs: 100
	gan_path: saved/stargan/
	hidden_dim: 256
	lr: 0.001
	model: circle
	model_save_dir: saved/model/
	num_classes: 7
	seed: 1
	use_reg_loss: True
	weight_decay: 0.001
isic2018 images already downloaded
isic 2018 masks already downladed
Donloading isic 2018 ground truth classification data
Creating dataframe
	 Looking for cached dataframe
		 organize_data/isic_2018/saved_data_2022_12_27_isic_2018.csv
Creating dataframe. Complete!
Splitting up the dataset into train,test, validation datasets
fizpatrick_skin_type: 1 8001
	 train 6400
	 test 800
	 val 801
fizpatrick_skin_type: 2 1049
	 train 839
	 test 105
	 val 105
fizpatrick_skin_type: 3 513
	 train 410
	 test 51
	 val 52
fizpatrick_skin_type: 4 182
	 train 145
	 test 18
	 val 19
fizpatrick_skin_type: 5 107
	 train 85
	 test 11
	 val 11
fizpatrick_skin_type: 6 163
	 train 130
	 test 16
	 val 17
total_train: 8009 79.
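
The per-skin-type counts printed above are consistent with an 80/10/10 split applied separately to each Fitzpatrick type, with the remainder after the training share halved between test and validation. The sketch below reproduces those counts under that assumption. It is inferred from the log, not taken from the repository's split function, and the function names are hypothetical.

```python
import random
from collections import defaultdict


def split_sizes(n):
    """Assumed rule: floor(80%) for train, then halve the rest (val gets the odd one)."""
    train = n * 8 // 10
    test = (n - train) // 2
    val = n - train - test
    return train, test, val


def stratified_split(skin_types, seed=1):
    """Split sample indices into train/test/val separately for each skin type."""
    by_type = defaultdict(list)
    for idx, skin_type in enumerate(skin_types):
        by_type[skin_type].append(idx)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "test": [], "val": []}
    for skin_type in sorted(by_type):
        indices = by_type[skin_type]
        rng.shuffle(indices)
        n_train, n_test, _ = split_sizes(len(indices))
        splits["train"] += indices[:n_train]
        splits["test"] += indices[n_train:n_train + n_test]
        splits["val"] += indices[n_train + n_test:]
    return splits
```

Because every type is split on its own, even the small groups (types 4 to 6) contribute samples to all three subsets, which a single random split over the whole dataset would not guarantee.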

In [16]:
#%mkdir /content/drive/MyDrive/Corbin_Adam_PhD_Workspace/corbin_papers/dissertation_proposal/model_checkpoints

mkdir: cannot create directory ‘/content/drive/MyDrive/Corbin_Adam_PhD_Workspace/corbin_papers/dissertation_proposal/model_checkpoints’: No such file or directory


In [12]:
#%cp ./saved/model/epoch97_acc_0.762.ckpt /content/drive/MyDrive/Corbin_Adam_PhD_Workspace/corbin_papers/dissertation_proposal/model_checkpoints/CIRCLE/mobilenetv3l/

cp: cannot stat './saved/model/epoch97_acc_0.762.ckpt': No such file or directory
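
Both errors above come from hard-coded paths: the destination directory did not exist, and the checkpoint file name changes from run to run. A hedged alternative is sketched below. It assumes checkpoints follow the `epoch<N>_acc_<value>.ckpt` naming seen in the command above, picks the one with the highest accuracy in its name, and creates the destination directory before copying. The function names and the destination argument are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Matches names like "epoch97_acc_0.762.ckpt" (pattern assumed from the file name above).
CKPT_PATTERN = re.compile(r"epoch(\d+)_acc_([0-9.]+)\.ckpt$")


def best_checkpoint(model_dir):
    """Return the checkpoint path with the highest accuracy in its name, or None."""
    best = None
    for path in Path(model_dir).glob("*.ckpt"):
        match = CKPT_PATTERN.search(path.name)
        if match is None:
            continue
        acc = float(match.group(2))
        if best is None or acc > best[0]:
            best = (acc, path)
    return None if best is None else best[1]


def backup_best_checkpoint(model_dir, dest_dir):
    """Copy the best checkpoint into dest_dir, creating the directory if needed."""
    src = best_checkpoint(model_dir)
    if src is None:
        return None  # nothing to copy yet
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest))
```

A call such as `backup_best_checkpoint("./saved/model", drive_dir)`, with `drive_dir` pointing at a mounted Drive folder, would replace the two commented-out commands.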
