# Chip Characterization Test
## YOLOv5 on modified MSTAR Dataset

21 Aug 2022, Alex Denton, Thesis Work

## Before You Start 
YOLOv5 Tutorial: https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/ <br>
My GitHub Repo: https://github.com/awd86/yolov5

Notes on scripts and syntax:

- I recommend doing <i>all of the following</i> on the DGX because the dataset is very large. It takes nearly an hour to transfer (once coupon'd) via USB3.0 on the DGX machines.

- I did <i>not</i> use Jupyter Notebook for this task. I used the PyCharm IDE to write my Python code and executed <i>train.py</i> from the terminal. Some of the required packages are only available on Pip. You can probably use Conda and Jupyter Notebooks, but I haven't tested that.

- "!" means that Jupyter will run that command in a new shell instance (this syntax works in notebooks, not in plain .py files). The shell is created within your virtual environment, but each new "!" is a new shell. If you need to run multiple commands in one instance, put them on one line with ";" separators.

- "python" vs. "python3" depends on your machine's aliasing. If you want to alias "python" to run "python3" instead of your machine's default python release, you can look up how to edit your profile. <i>Do this at your own risk!</i> I have set mine up this way and tend to write "python"...you can safely replace that with "python3" if you're having issues. 
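
The note about each "!" getting its own terminal can be illustrated outside Jupyter. This is a minimal sketch, assuming a POSIX shell, that uses <i>subprocess.run(..., shell=True)</i> as a stand-in for "!"; the helper name `run` and the variable `CHIPCT_DEMO_VAR` are made up for illustration.

```python
import subprocess


def run(cmd: str) -> str:
    """Run cmd in a fresh shell (roughly what "!" does) and return its stdout."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout.strip()


# Two commands on one line share a shell, so echo can see the variable.
same_shell = run('CHIPCT_DEMO_VAR=1; echo "${CHIPCT_DEMO_VAR:-unset}"')

# A second call starts a new shell, so the variable set above is gone.
new_shell = run('echo "${CHIPCT_DEMO_VAR:-unset}"')

print(same_shell)
print(new_shell)
```

The same reasoning applies to `cd` or to activating an environment: anything that changes shell state has to sit on the same line as the command that depends on it.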

### (needed before making venv and launching jupyter notebook)

Clone the YOLOv5 GitHub repo and install requirements.txt in a Python>=3.6.0 environment, including PyTorch>=1.7. Models and datasets download automatically from the latest YOLOv5 release. I'm successfully using Python 3.8 and 3.9 on different machines.<br><br>



NOTE: PyTorch>=1.9 with new torch.distributed.run is recommended (replaces older torch.distributed.launch commands below). See https://pytorch.org/docs/stable/distributed.html for details.

<strike> You'll want to download my .py files into the cloned YOLOv5 Repo (to have to most current version of YOLOv5). Here are the ones you'll need:
    
- HowTo_YOLOv5_xView.ipynb (this document)
- coupons.py  (divides the picture and labels into coupons)
- split_set.py  (to divide images/labels into 'train' and 'val' sets)
- re_classify.py  (to change class names and remove 'None' class)
- vague_classes.py  (where new class names are specified)
- autosplit_txt.py  (replicates the autosplit files created by the YOLO converter, described below)

<strike> The last piece to the puzzle is the file that converts xView labels into YOLO format. That can be found in the /data/ directory (folder) of the YOLOv5 clone'd repo. There is a file in there called 'xView.yaml' that specifies the data structure. At the bottom they've included the python code to convert your xView dataset (awesome of them!). You'll have to copy/paste to a new .py and modify the paths to fit your file structure. You're also going to end up modifying copies of this .yaml to specify the data structure for your runs.</strike>

There isn't much prep work before training. The dataset creation routine takes care of almost everything. Just copy the synthetic datasets into the same directory as this file, then change the absolute paths in the <i>data.yaml</i> files to reflect their actual location.<br> Then ensure you have these other supporting files:

- train.py
- val.py
- split_set.py
- /models/ChipCT.yaml  (determines the architecture)
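
A quick way to confirm the supporting files are in place is sketched below. It assumes all four files sit under a single root directory, which may not match your layout (for example, if <i>train.py</i> lives in the YOLOv5 clone); the function name `missing_support_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the list above; adjust if your layout differs.
REQUIRED_FILES = ("train.py", "val.py", "split_set.py", "models/ChipCT.yaml")


def missing_support_files(root="."):
    """Return the required files that are not present under root."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_support_files(".")
print("missing:", missing if missing else "none")
```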


## Check the local CUDA version

In [1]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130


## Check PyTorch version & GPU Compatibility
- torch >= 1.9
- CudaDeviceProperties should have something under 'name' - this means it is compatible

In [2]:
import torch
print('torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))


torch 1.10.1+cu102 _CudaDeviceProperties(name='Tesla V100-DGXS-32GB', major=7, minor=0, total_memory=32485MB, multi_processor_count=80)
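
The torch >= 1.9 requirement can also be checked in code. The sketch below is one possible approach, not part of YOLOv5: it compares version numbers as integer tuples, because a plain string comparison would rank "1.10" below "1.9". The helper names are made up for illustration.

```python
import re


def version_tuple(version: str):
    """Turn '1.10.1+cu102' into (1, 10, 1), ignoring the local build suffix."""
    public = version.split("+")[0]
    parts = [int(p) for p in re.findall(r"\d+", public)][:3]
    return tuple(parts + [0] * (3 - len(parts)))  # pad '1.9' to (1, 9, 0)


def meets_minimum(version: str, minimum: str = "1.9") -> bool:
    """True if version is at least minimum, compared numerically."""
    return version_tuple(version) >= version_tuple(minimum)


print(meets_minimum("1.10.1+cu102"))  # numeric comparison
print("1.10.1" >= "1.9")              # naive string comparison, for contrast
```

In a live session you would pass `torch.__version__` instead of the literal string.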


## Configure Multi-GPU DistributedDataParallel Mode
https://github.com/ultralytics/yolov5/issues/475

Before specifying GPUs, <a href="https://hsf-training.github.io/hsf-training-ml-gpu-webpage/02-whichgpu/index.html">determine the parameters</a>:




In [3]:
import torch
use_cuda = torch.cuda.is_available()

if use_cuda:
    print('__CUDNN VERSION:', torch.backends.cudnn.version())
    print('__Number CUDA Devices:', torch.cuda.device_count())
    print('__CUDA Device Name:',torch.cuda.get_device_name(0))
    print('__CUDA Device Total Memory [GB]:',torch.cuda.get_device_properties(0).total_memory/1e9)

__CUDNN VERSION: 7605
__Number CUDA Devices: 4
__CUDA Device Name: Tesla V100-DGXS-32GB
__CUDA Device Total Memory [GB]: 34.063712256
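
The memory figure printed here (34.06) and the one printed by the earlier cell (32485MB) describe the same byte count in different units. A short worked conversion, assuming the "MB" in the PyTorch printout means 2^20 bytes:

```python
total_bytes = 34_063_712_256  # the value behind the printed 34.063712256

decimal_gb = total_bytes / 1e9     # what this cell prints (10^9 bytes per GB)
mebibytes = total_bytes // 2**20   # whole units of 2^20 bytes
gibibytes = total_bytes / 2**30    # units of 2^30 bytes

print(decimal_gb)
print(mebibytes)
print(round(gibibytes, 2))
```

The second value comes out to 32485, matching the earlier printout, and the third is a little under 32, which is consistent with the "32GB" in the device name.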


<strike> Note: DGX1 and DGX4 both report "Number of CUDA Devices: 5" but the correct number to specify is "4"
<br>There are only 4 GPUs. This script might be counting the CPU as an additional CUDA device,<i> but train.py won't run</i> if you say "5"</strike>

You will have to pass Python the launcher options -m torch.distributed.run --nproc_per_node N ahead of the script name, along with the usual train.py arguments. The viability run later in this document does this with 4 GPUs and --batch 64.

--nproc_per_node specifies how many GPUs you would like to use. In that run, it is 4.<br>

--batch is the total batch size. It is divided evenly among the GPUs. In that run, it is 64/4=16 per GPU.<br>

A command of this form uses GPUs 0 through N-1.

Notes<br>
- Windows support is untested, Linux is recommended.
- '--batch' must be a multiple of the number of GPUs.
- GPU 0 will take slightly more memory than the other GPUs as it maintains EMA and is responsible for checkpointing etc.
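
The batch-size rule in the notes above can be checked before launching a run. This is a minimal sketch; the function name `per_gpu_batch` is hypothetical and is not part of YOLOv5.

```python
def per_gpu_batch(total_batch: int, num_gpus: int) -> int:
    """Return the batch size each GPU receives, or raise if it does not divide evenly."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if total_batch % num_gpus != 0:
        raise ValueError(
            f"--batch {total_batch} is not a multiple of {num_gpus} GPUs"
        )
    return total_batch // num_gpus


print(per_gpu_batch(64, 4))
```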

If you get RuntimeError: Address already in use, it could be because you are running multiple trainings at a time. To fix this, use a different port number by adding --master_port followed by an unused port (for example, --master_port 1234) to the launcher options.
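
If you would rather not guess a port, one option is to ask the operating system for a free one. This is a sketch using only the standard library; the helper name `find_free_port` is made up, and the port could in principle be taken by another process between this check and the training launch.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


port = find_free_port()
print(f"--master_port {port}")
```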

## Build the Dataset Folder Structure:

NOTE: The folder architecture is very important!!<br>
It doesn't have to be exactly like this, but it does need to be specified in data.yaml<br>

yolov5 (contains all py code)<br>
|_ venv()<br>
|<br>
|_ ChipCharTest<br>
&emsp;   |_ models<br>
&emsp;   | &emsp;|_ ChipCT.yaml * model specification<br>
&emsp;   |<br>
&emsp;   |_ dataset_A<br>
&emsp;   &emsp;   |_ data.yaml * class and image/label location specification<br>
&emsp;   &emsp;   |<br>
&emsp;   &emsp;   |_ train<br>
&emsp;   &emsp;   | &emsp;|_ images()<br>
&emsp;   &emsp;   | &emsp;|_ labels()<br>
&emsp;   &emsp;   |<br>
&emsp;   &emsp;   |_ val<br>
&emsp;   &emsp;   | &emsp;|_ images()<br>
&emsp;   &emsp;   | &emsp;|_ labels()<br>
&emsp;   &emsp;   |<br>
&emsp;   &emsp;   |_ test<br>
&emsp;   &emsp;    &emsp;|_ images()<br>
&emsp;   &emsp;    &emsp;|_ labels()<br>




The script <i>split_set.py</i> should take care of this for you.
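
To confirm a dataset folder matches the tree above, a check along these lines may help. It is a sketch that assumes the exact layout shown (a <i>data.yaml</i> plus train/val/test, each with images and labels); the function name `missing_dataset_dirs` is hypothetical.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
SUBDIRS = ("images", "labels")


def missing_dataset_dirs(dataset_dir):
    """List the expected entries that are absent from dataset_dir."""
    dataset_dir = Path(dataset_dir)
    missing = []
    if not (dataset_dir / "data.yaml").is_file():
        missing.append("data.yaml")
    for split in SPLITS:
        for sub in SUBDIRS:
            if not (dataset_dir / split / sub).is_dir():
                missing.append(f"{split}/{sub}")
    return missing


# Example: report what is missing from the current directory.
print(missing_dataset_dirs("."))
```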


<hr style="border-top: 24px solid #bbb; border-radius: 10px">

#  * * * Prepare Dataset * * *

## Split 'train' dataset into 'train', 'val', and 'test'
The <i>train</i> and <i>validate</i> sets have labels, but <i>test</i> effectively does not (its labels exist, but they will be ignored).

I'm rebuilding the <i>split_set.py</i> file to fully develop the train/test/validate folders based on information from:<br>
https://www.v7labs.com/blog/train-validation-test-set

In [1]:
from split_set import split_set
import os

dataset_dir = 'Set1_M35_2s1_M1'
for dataset in os.listdir(path=dataset_dir):
    src_dir = '/'.join((dataset_dir,dataset))
    split_set(src_dir,[0.7,0.15,0.15],'rand') 


The Set1_M35_2s1_M1/chips_10_black move is complete
The Set1_M35_2s1_M1/chips_05_clutter move is complete
The Set1_M35_2s1_M1/chips_10_clutter move is complete
The Set1_M35_2s1_M1/chips_05_black move is complete
The Set1_M35_2s1_M1/chips_20_clutter move is complete
The Set1_M35_2s1_M1/chips_20_black move is complete
The Set1_M35_2s1_M1/chips_20_noise move is complete
The Set1_M35_2s1_M1/chips_05_noise move is complete
The Set1_M35_2s1_M1/standardized move is complete
The Set1_M35_2s1_M1/chips_10_noise move is complete
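
For readers without <i>split_set.py</i> at hand, the idea of a random 70/15/15 split can be sketched as follows. This is not the author's implementation: the function name `split_names`, the train/val/test ordering of the fractions, and the example file names are all assumptions, and the sketch only partitions a list of names rather than moving files.

```python
import random


def split_names(names, fractions=(0.7, 0.15, 0.15), seed=None):
    """Shuffle names and partition them into (train, val, test) lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


# Hypothetical file names, for illustration only.
chips = [f"chip_{i:03d}.png" for i in range(100)]
train, val, test = split_names(chips, seed=0)
print(len(train), len(val), len(test))
```

Each image's label file would need to follow its image into the same split, which is the part a real script has to handle with care.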


## Define model configuration and architecture (needed in runtime):

Check the 'data.yaml' and 'ChipCT.yaml' (model) files for each run:<br>
- data.yaml : structure of image data folders, number of classes, names of classes
- ChipCT.yaml : repeats the number of classes (must match data.yaml) and specifies the model architecture in PyTorch format
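
The requirement that the class counts match can be checked with a few lines of code. The sketch below assumes both files declare the count on a line of the form `nc: <number>`, as YOLOv5 configs of this era do; the helper name `read_nc` and the sample YAML text are made up for illustration, and a real check would read the two files from disk.

```python
import re


def read_nc(yaml_text: str) -> int:
    """Extract the class count from a line such as 'nc: 3'."""
    match = re.search(r"^\s*nc\s*:\s*(\d+)", yaml_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'nc:' entry found")
    return int(match.group(1))


# Hypothetical file contents standing in for the data and model files.
data_yaml = "train: ./train/images\nval: ./val/images\nnc: 3\nnames: ['a', 'b', 'c']\n"
model_yaml = "nc: 3  # number of classes\ndepth_multiple: 0.33\n"

if read_nc(data_yaml) != read_nc(model_yaml):
    raise SystemExit("class counts differ between the data and model files")
print("class counts match:", read_nc(data_yaml))
```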

## Make an account on WandB.ai to watch training progress and get auto-generated charts

https://wandb.ai/

<hr style="border-top: 24px solid #bbb; border-radius: 10px">

#  * * * Execution * * *

## Execute train.py

Train Custom YOLOv5 Detector!
Here, we are able to pass a number of arguments:<br>

img: define input image size (must be <b>multiple of 32</b>)<br>
  '--rect' allows non-square input images<br>
batch: determine batch size (multiple of number of GPUs)<br>
epochs: define the number of training epochs.<br>
data: <b>set the path to our data.yaml file</b><br>
cfg: <b>specify our model configuration ChipCT.yaml</b><br>
weights: specify a path to pretrained weights if using transfer learning. (Note: some available from Ultralytics)<br>
name: <b>result names</b><br>
nosave: only saves the final checkpoint <b>(not recommended, will keep best model if this is left out)</b><br>
cache: cache images for faster training<br>
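
When launching many runs, it can be convenient to assemble these arguments in Python instead of editing a long command by hand. The sketch below only builds and prints the command; the function name `build_train_command` is hypothetical, and the argument values are the ones used for the viability run in this document.

```python
import shlex


def build_train_command(script, nproc_per_node, **options):
    """Build a torch.distributed.run command line as a list of arguments."""
    cmd = ["python3", "-m", "torch.distributed.run",
           "--nproc_per_node", str(nproc_per_node), script]
    for flag, value in options.items():
        cmd.append(f"--{flag}")
        if value is not True:  # True marks a bare switch such as --cache
            cmd.append(str(value))
    return cmd


cmd = build_train_command(
    "train.py", 4,
    img=78, batch=64, epochs=20,
    data="./Set1_M35_2s1_M1/chips_05_black/data.yaml",
    cfg="./models/ChipCT.yaml",
    weights="", name="ChipCT_viability_1", cache=True,
)
print(shlex.join(cmd))  # shlex.join needs Python 3.8 or newer
```

Keeping the command as a list means it can be passed directly to `subprocess.run(cmd)` without worrying about shell quoting, including the empty string given to --weights.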

## Testing Viability of ChipCT.yaml

In [2]:
!python3 -m torch.distributed.run --nproc_per_node 4 ~/PycharmProjects/yolov5/train.py --img 78 --batch 64 --epochs 20 --data ./Set1_M35_2s1_M1/chips_05_black/data.yaml --cfg ./models/ChipCT.yaml --weights '' --name ChipCT_viability_1  --cache

*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
wandb: Currently logged in as: awd86. Use `wandb login --relogin` to force relogin
train: weights=, cfg=./models/ChipCT.yaml, data=./Set1_M35_2s1_M1/chips_05_black/data.yaml, hyp=../data/hyps/hyp.scratch.yaml, epochs=20, batch_size=64, imgsz=78, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, adam=False, sync_bn=False, workers=8, project=../runs/train, name=ChipCT_viability_1, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, patience=100, freeze=0, save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bb

***IMPORTANT NOTE*** Do <i>not</i> run these sequentially. You must change the name of your label folder. Otherwise you will simply retrain on the previous labels.

## Chip Characterization Test

***IMPORTANT NOTE*** Do <i>not</i> run these sequentially. You must change the name of your label folder. Otherwise you will simply retrain on the previous labels.