# Official training illustration of PuzzleTuning
* Use Google Colab Pro+ (high RAM + GPU) to run for 24 hours
* We use Python 3.7, PyTorch 1.9.0+cu111 and torchvision 0.10.0+cu111
* We use a T4 GPU on Colab for the data-flow illustration

The code, the training process and all records are open source:
* PuzzleTuning official GitHub page: https://github.com/sagizty/PuzzleTuning
* The CPIA dataset is publicly available at: https://github.com/zhanglab2021/CPIA_Dataset


## Check Colab GPU

In [None]:
# check GPU
!nvidia-smi

In [None]:
!date --date='+8 hour'  # CST time zone
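
The shell command above adds eight hours to the VM clock, which gives China Standard Time only if that clock is set to UTC. A minimal Python sketch that makes the UTC+8 offset explicit (the helper name `cst_now` is hypothetical and not part of the PuzzleTuning code):

```python
from datetime import datetime, timedelta, timezone

# China Standard Time is a fixed UTC+8 offset with no daylight saving
CST = timezone(timedelta(hours=8), name="CST")


def cst_now():
    """Return the current time as a timezone-aware datetime in UTC+8."""
    return datetime.now(tz=CST)


print(cst_now().strftime("%Y-%m-%d %H:%M:%S %Z"))
```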

## Mount Google Drive

This saves the output images to your Google Drive. If you don't want the output, you can remove this cell and the last part of the notebook.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## Build file-system environment

In [None]:
# create file-system environment
# mount the google drive first
# https://drive.google.com/drive/u/1/my-drive

# clear colab path
!rm -rf /data
!rm -rf /home/Pathology_Experiment

# create path
!mkdir /home/Pathology_Experiment
!mkdir /home/Pathology_Experiment/runs
!mkdir /home/Pathology_Experiment/code
!mkdir /home/Pathology_Experiment/saved_models
!mkdir /home/Pathology_Experiment/imaging_results

!mkdir /data
!mkdir /data/Pathology_Experiment
!mkdir /data/Pathology_Experiment/dataset

print('Folder Tree Creation completed!')

# get the latest code from the PuzzleTuning GitHub page
!git clone https://github.com/sagizty/PuzzleTuning.git /home/Pathology_Experiment/code
print('code transfer from github completed!')

# move the zipped datasets (PuzzleTuning demo set and warwick CLS) into the dataset folder
!mv /home/Pathology_Experiment/code/Archive/* /data/Pathology_Experiment/dataset/
# unzip
!unzip -q /data/Pathology_Experiment/dataset/PuzzleTuning_demoset.zip -d /data/Pathology_Experiment/dataset/
!unzip -q /data/Pathology_Experiment/dataset/warwick_CLS.zip -d /data/Pathology_Experiment/dataset/
# remove the zip archives after extraction
!rm -f /data/Pathology_Experiment/dataset/PuzzleTuning_demoset.zip
!rm -f /data/Pathology_Experiment/dataset/warwick_CLS.zip
print('data transfer completed!')
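
The folder tree created by the `mkdir` commands above can also be built from Python, which makes the cell safe to re-run. This is a minimal sketch, not part of the PuzzleTuning code: `build_tree` is a hypothetical helper, and the demo call uses a temporary root so nothing outside it is touched.

```python
import os
import tempfile


def build_tree(home_root, data_root):
    """Create the experiment folders and return them in creation order."""
    folders = [
        os.path.join(home_root, "runs"),
        os.path.join(home_root, "code"),
        os.path.join(home_root, "saved_models"),
        os.path.join(home_root, "imaging_results"),
        os.path.join(data_root, "dataset"),
    ]
    for folder in folders:
        # exist_ok=True tolerates re-runs; missing parents are created as needed
        os.makedirs(folder, exist_ok=True)
    return folders


# Demo under a temporary root standing in for /home and /data
demo_root = tempfile.mkdtemp()
created = build_tree(
    os.path.join(demo_root, "home", "Pathology_Experiment"),
    os.path.join(demo_root, "data", "Pathology_Experiment"),
)
print("\n".join(created))
```

On Colab the call would be `build_tree('/home/Pathology_Experiment', '/data/Pathology_Experiment')`.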

## Arrange the working environment

In [None]:
!sudo apt-get install -y python3.7
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
!sudo apt-get install -y python3.7-distutils
!sudo apt-get install -y python3-pip
!python -m pip install --upgrade pip --user

In [None]:
!python3.7 -m pip install -q torch==1.9.0 torchvision==0.10.0

In [None]:
# get packages
!pip install tqdm
!pip install timm==0.5.4
!pip install einops
!pip install ml_collections
!pip install ttach
!pip install notifyemail
!pip install psutil
!pip install scipy
!pip install torchsummary
!pip install tensorboardX
!pip install opencv_contrib_python
!pip install matplotlib
!pip install ipykernel

In [None]:
!python --version

In [None]:
!pip list
!pip freeze>requirements.txt
!cp requirements.txt /home/Pathology_Experiment/runs/

# Pre-Training
* set up the paths on the command line
* use argparse to set the hyper-parameters
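
As a rough illustration of the argparse pattern, the sketch below declares a few of the flags that appear in the commands of this notebook. It is a hypothetical mirror written for illustration, not the parser from `PuzzleTuning.py`; the defaults are assumptions taken from the Colab demo command.

```python
import argparse


def get_args_parser():
    """Hypothetical subset of the command-line flags used in this notebook."""
    parser = argparse.ArgumentParser("pre-training demo")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--epochs", type=int, default=10000)
    parser.add_argument("--blr", type=float, default=1.5e-5)
    parser.add_argument("--accum_iter", type=int, default=2)
    parser.add_argument("--input_size", type=int, default=224)
    # store_true flags such as --pin_mem take no value on the command line
    parser.add_argument("--pin_mem", action="store_true")
    parser.add_argument("--data_path", type=str, default="")
    parser.add_argument("--output_dir", type=str, default="")
    return parser


# Parse an explicit list instead of sys.argv so the cell also runs in a notebook
args = get_args_parser().parse_args(
    ["--batch_size", "64", "--blr", "1.5e-4", "--epochs", "200", "--pin_mem"]
)
print(vars(args))
```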

For the data-flow illustration, 10000 epochs will be trained with 400 images.
We suggest using 4 * A100 SXM4 GPUs to train PuzzleTuning with the CPIA dataset.



Our official training script is given here:

In [None]:
# nohup python PuzzleTuning.py --batch_size 64 --group_shuffle_size 16 --blr 1.5e-4 --epochs 200 --accum_iter 2 --print_freq 2000 --check_point_gap 50 --input_size 224 --warmup_epochs 20 --pin_mem --num_workers 32 --strategy loop --PromptTuning Deep --basic_state_dict ../saved_models/ViT_b16_224_Imagenet.pth --data_path ../datasets/All &
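
The flags `--blr` and `--accum_iter` resemble the MAE pre-training recipe, where the actual learning rate is scaled from the base learning rate by the effective batch size. The sketch below assumes that convention (`lr = blr * effective_batch_size / 256`, with `--batch_size` counted per GPU). This notebook does not confirm it, so check `PuzzleTuning.py` before relying on the numbers. The function names are hypothetical.

```python
def effective_batch_size(batch_size, accum_iter, num_gpus):
    """Samples contributing to one optimizer step across all GPUs."""
    return batch_size * accum_iter * num_gpus


def scaled_lr(blr, eff_batch, base=256):
    """MAE-style linear scaling rule (an assumption for PuzzleTuning)."""
    return blr * eff_batch / base


# Flags of the official script on the suggested 4 GPUs
eff = effective_batch_size(batch_size=64, accum_iter=2, num_gpus=4)
print(eff, scaled_lr(1.5e-4, eff))

# Flags of the Colab demo on a single T4
eff_demo = effective_batch_size(batch_size=32, accum_iter=2, num_gpus=1)
print(eff_demo, scaled_lr(1.5e-5, eff_demo))
```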

All following cells are for the data-flow illustration with Colab

In [None]:
# change working dir
import os
os.chdir("/home/Pathology_Experiment/code")
!pwd

Training

In [None]:
!python PuzzleTuning.py --model sae_vit_base_patch16 --PromptTuning Deep --batch_size 32 --group_shuffle_size 8 --strategy loop --blr 1.5e-5 --epochs 10000 --warmup_epochs 20 --accum_iter 2 --print_freq 200 --check_point_gap 10000 --input_size 224 --pin_mem --num_workers 2 --basic_state_dict timm --data_path /data/Pathology_Experiment/dataset/PuzzleTuning_demoset --output_dir /home/Pathology_Experiment/runs --log_dir /home/Pathology_Experiment/imaging_results

Visualization

In [None]:
!python PuzzleTesting.py --model sae_vit_base_patch16 --PromptTuning Deep --Prompt_Token_num 20 --batch_size 8 --fix_position_ratio 0.5 --fix_patch_size 16 --enable_visualize_check --data_path /data/Pathology_Experiment/dataset/PuzzleTuning_demoset --output_dir /home/Pathology_Experiment/imaging_results --log_dir /home/Pathology_Experiment/imaging_results --checkpoint_path /home/Pathology_Experiment/runs/PuzzleTuning_sae_vit_base_patch16_Prompt_Deep_tokennum_20/PuzzleTuning_sae_vit_base_patch16_Prompt_Deep_tokennum_20_checkpoint-9999.pth
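
The value passed to `--checkpoint_path` appears to follow the run name and the index of the last epoch (10000 epochs give `checkpoint-9999`). The sketch below rebuilds the path from those values, so that a changed `--epochs` is less likely to leave a stale path behind. The naming pattern is inferred from the commands in this notebook, and `expected_checkpoint` is a hypothetical helper.

```python
import os


def expected_checkpoint(run_root, model, prompt_mode, token_num, epochs):
    """Rebuild the checkpoint path pattern seen in this notebook (inferred)."""
    run_name = f"PuzzleTuning_{model}_Prompt_{prompt_mode}_tokennum_{token_num}"
    # Assumes epochs are counted from 0, so N epochs end at checkpoint N - 1
    file_name = f"{run_name}_checkpoint-{epochs - 1}.pth"
    return os.path.join(run_root, run_name, file_name)


ckpt = expected_checkpoint(
    "/home/Pathology_Experiment/runs", "sae_vit_base_patch16", "Deep", 20, 10000
)
# Report whether training has already produced the file
print(ckpt, "(found)" if os.path.isfile(ckpt) else "(not found yet)")
```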

Load the ViT prompt weights from the pre-trained checkpoint

In [None]:
os.chdir("/home/Pathology_Experiment/code/utils")

In [None]:
!python transfermodel.py --given_name ViT_b16_224_timm_PuzzleTuning_SAE_CPIAm_Prompt_Deep_tokennum_20_promptstate.pth --model_idx ViT --PromptTuning Deep --Prompt_Token_num 20 --edge_size 224 --checkpoint_path /home/Pathology_Experiment/runs/PuzzleTuning_sae_vit_base_patch16_Prompt_Deep_tokennum_20/PuzzleTuning_sae_vit_base_patch16_Prompt_Deep_tokennum_20_checkpoint-9999.pth --save_model_path /home/Pathology_Experiment/saved_models

# Finetuning and comparison
* set up the paths on the command line
* use argparse to set the hyper-parameters

## Finetuning without PuzzleTuning

In [None]:
os.chdir("/home/Pathology_Experiment/code")

### ViT (with timm weight)

Train

In [None]:
!python Train.py --edge_size 224 --data_augmentation_mode 3 --lr 1e-05 --lrf 0.30 --enable_tensorboard --model_idx ViT_base_timm_401_lf30_finetuning_warwick_CLS --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/404_lf30_warwick --model_path /home/Pathology_Experiment/saved_models

Test

In [None]:
!python Test.py --edge_size 224 --data_augmentation_mode 3 --model_idx ViT_base_timm_401_lf30_finetuning_warwick_CLS --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/404_lf30_warwick --model_path /home/Pathology_Experiment/saved_models

### VPT + finetuning (with timm weight)

Train

In [None]:
!python Train.py --edge_size 224 --data_augmentation_mode 3 --lr 1e-05 --lrf 0.30 --enable_tensorboard --model_idx ViT_base_timm_PromptDeep_20_401_lf30_finetuning_warwick_CLS --PromptTuning Deep --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/404_lf30_warwick --model_path /home/Pathology_Experiment/saved_models

Test

In [None]:
!python Test.py --edge_size 224 --data_augmentation_mode 3 --model_idx ViT_base_timm_PromptDeep_20_401_lf30_finetuning_warwick_CLS --PromptTuning Deep --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/404_lf30_warwick --model_path /home/Pathology_Experiment/saved_models

## Finetuning with PuzzleTuning Prompt

### VPT + finetuning (with timm weight & PuzzleTuning Prompt)

Train

In [None]:
!python Train.py --edge_size 224 --data_augmentation_mode 3 --lr 1e-05 --lrf 0.30 --enable_tensorboard --model_idx ViT_base_timm_PuzzleTuning_SAE_promptstate_PromptDeep_20_401_lf30_finetuning_warwick_CLS --PromptTuning Deep --Prompt_Token_num 20 --PromptUnFreeze --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/runs/SAE-timm-start_promptstate_404_lf30_warwick --Prompt_state_path /home/Pathology_Experiment/saved_models/ViT_b16_224_timm_PuzzleTuning_SAE_CPIAm_Prompt_Deep_tokennum_20_promptstate.pth --model_path /home/Pathology_Experiment/saved_models

Test

In [None]:
!python Test.py --edge_size 224 --data_augmentation_mode 3 --model_idx ViT_base_timm_PuzzleTuning_SAE_promptstate_PromptDeep_20_401_lf30_finetuning_warwick_CLS --PromptTuning Deep --Prompt_Token_num 20 --PromptUnFreeze --dataroot /data/Pathology_Experiment/dataset/warwick_CLS --draw_root /home/Pathology_Experiment/runs/runs/SAE-timm-start_promptstate_404_lf30_warwick --model_path /home/Pathology_Experiment/saved_models

# Check the TensorBoard output

In [None]:
%load_ext tensorboard
%tensorboard --logdir '/home/Pathology_Experiment/runs'

# After the task, save the output to Google Drive


In [None]:
# change working dir
import os
os.chdir("/home/Pathology_Experiment/code/utils")
!python check_log_json.py --enable_notify --draw_root /home/Pathology_Experiment/runs --record_dir /home/Pathology_Experiment/CSV_logs

In [None]:
# make sure the destination folders exist on Google Drive
!mkdir -p /content/drive/MyDrive/Pathology_Experiment/runs /content/drive/MyDrive/Pathology_Experiment/saved_models /content/drive/MyDrive/Pathology_Experiment/imaging_results
# copy tensorboard runs
!/bin/cp -rf /home/Pathology_Experiment/runs/*  /content/drive/MyDrive/Pathology_Experiment/runs/
print('runs copy completed!')
# copy the trained models
!/bin/cp -rf /home/Pathology_Experiment/saved_models/* /content/drive/MyDrive/Pathology_Experiment/saved_models/
print('models copy completed!')
# copy the imaging_results
!/bin/cp -rf /home/Pathology_Experiment/imaging_results/* /content/drive/MyDrive/Pathology_Experiment/imaging_results/
print('imaging_results copy completed!')
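
The same copy step can be sketched in Python, creating any missing destination folders as it goes. `copy_results` is a hypothetical helper, not part of the PuzzleTuning code, and the demo uses temporary folders in place of the Colab and Google Drive paths.

```python
import os
import shutil
import tempfile


def copy_results(src_root, dst_root, names=("runs", "saved_models", "imaging_results")):
    """Copy each result folder from src_root into dst_root; return files copied."""
    copied = 0
    for name in names:
        src = os.path.join(src_root, name)
        if not os.path.isdir(src):
            continue  # nothing was produced for this folder
        for folder, _, files in os.walk(src):
            # Recreate the same relative folder under the destination
            target = os.path.join(dst_root, os.path.relpath(folder, src_root))
            os.makedirs(target, exist_ok=True)
            for file_name in files:
                shutil.copy2(os.path.join(folder, file_name), target)
                copied += 1
    return copied


# Demo with temporary folders standing in for /home/... and /content/drive/...
src_demo, dst_demo = tempfile.mkdtemp(), tempfile.mkdtemp()
os.makedirs(os.path.join(src_demo, "runs", "exp1"))
with open(os.path.join(src_demo, "runs", "exp1", "log.txt"), "w") as f:
    f.write("demo")
print(copy_results(src_demo, dst_demo), "file(s) copied")
```

On Colab the call would be `copy_results('/home/Pathology_Experiment', '/content/drive/MyDrive/Pathology_Experiment')`.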

In [None]:
!date --date='+8 hour'  # CST time zone