# TLT Image Classification -- Steel 304 dataset

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for use on a different task.

NVIDIA Transfer Learning Toolkit (TLT) is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.
This notebook uses TLT to train a neural network on images of various defects in steel welding.
It is based on the classification example from the TLT computer vision example notebooks:
https://docs.nvidia.com/tlt/tlt-user-guide/text/tlt_quick_start_guide.html#running-the-transfer-learning-toolkit

In [None]:
# Setting up env variables for cleaner command line commands.
import os

# insert your personal key to NGC, it can be obtained for free: 
#https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key
%env KEY=
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tlt-experiments/classification
%env DATA_DOWNLOAD_DIR=/workspace/tlt-experiments/data

# Set this path if you don't run the notebook from the samples directory.
%env NOTEBOOK_ROOT=/mnt/sdb/AI/TLT/tlt_cv_samples_v1.0.2/classification_steel304

# Please define the local project directory that needs to be mapped to the TLT docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results of the steps
# in this notebook will be stored in $LOCAL_PROJECT_DIR/classification

# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
os.environ["LOCAL_PROJECT_DIR"] = "/mnt/sdb/AI/TLT/tlt_cv_samples_v1.0.2/classification_steel304"

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "classification"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tlt-experiments/classification/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

In [None]:
# Mapping up the local directories to the TLT docker.
import json
import os
mounts_file = os.path.expanduser("~/.tlt_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tlt-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions":{
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tlt_mounts.json
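A quick sanity check can catch a wrong `LOCAL_PROJECT_DIR` before the launcher starts a container. The following is a minimal sketch and not part of the TLT launcher: `missing_mount_sources` is a hypothetical helper that reads the mounts file written above and reports every `source` directory that does not exist on the host.

```python
import json
import os


def missing_mount_sources(mounts_path):
    """Return the 'source' directories listed in the mounts file that do not exist."""
    with open(mounts_path) as mfile:
        config = json.load(mfile)
    return [m["source"] for m in config.get("Mounts", []) if not os.path.isdir(m["source"])]


# Check the file written by the previous cell, if it is present.
mounts_path = os.path.expanduser("~/.tlt_mounts.json")
if os.path.exists(mounts_path):
    print("Missing sources:", missing_mount_sources(mounts_path))
```

An empty list means every mapped host directory exists.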

In [None]:
# SKIP this cell IF you have already installed the TLT launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tlt

In [None]:
# View the versions of the TLT launcher
!tlt info

## 2. Prepare datasets and pre-trained model <a class="anchor" id="head-2"></a>

We will use the following dataset: https://www.kaggle.com/danielbacioiu/tig-stainless-steel-304
The dataset was created by researchers at the University of Birmingham and covers various defects that occur in steel welding.
The dataset comes already split into train, validation, and test sets. It is important to keep the given split because the images stem from continuous camera runs. Interleaving frames from the same run across the splits would yield good validation and test scores but poor performance in real-world applications.
For further information on the dataset, see the publication: http://www.sciencedirect.com/science/article/pii/S0963869518305942
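
The point about camera runs can be checked mechanically. The sketch below assumes that each image path has the form `<run folder>/<image file>`, which is how the split code in this notebook treats the keys of the JSON label files. The helper names and the example paths are hypothetical. It reports every run whose frames appear in more than one split.

```python
from collections import defaultdict


def runs_by_split(split_to_paths):
    """Map each camera run (top-level folder) to the set of splits it appears in."""
    run_splits = defaultdict(set)
    for split, paths in split_to_paths.items():
        for rel_path in paths:
            run = rel_path.split("/", 1)[0]  # assumed layout: "<run folder>/<image file>"
            run_splits[run].add(split)
    return run_splits


def leaked_runs(split_to_paths):
    """Return the runs that occur in more than one split, sorted by name."""
    return sorted(run for run, splits in runs_by_split(split_to_paths).items() if len(splits) > 1)


# Hypothetical example paths; in the notebook these would be the keys of
# train.json, valid.json and test.json.
example = {
    "train": ["run1/image-0001.png", "run1/image-0002.png", "run2/image-0001.png"],
    "valid": ["run3/image-0001.png"],
    "test": ["run2/image-0500.png"],
}
print(leaked_runs(example))  # run2 is shared by train and test
```

An empty result for the real label files would confirm that no run is shared between splits.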

Download the data from Kaggle into the LOCAL_DATA_DIR.

In [None]:
import os 

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
print(DATA_DIR)

!ls $DATA_DIR

- Unpack the data set 
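
One way to unpack the download from Python is sketched below. The archive name `tig-stainless-steel-304.zip` is an assumption, so replace it with the name of the file you actually downloaded. `unpack_dataset` is a hypothetical helper built on the standard `zipfile` module. The split step later in this notebook expects the images under `$LOCAL_DATA_DIR/ss304`, so check the listing after extraction.

```python
import os
import zipfile


def unpack_dataset(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the top-level entries."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    return sorted(os.listdir(target_dir))


# Hypothetical archive name; adjust to the file you actually downloaded.
data_dir = os.environ.get("LOCAL_DATA_DIR", os.getcwd())
archive = os.path.join(data_dir, "tig-stainless-steel-304.zip")
if os.path.exists(archive):
    print(unpack_dataset(archive, data_dir))
else:
    print("Archive not found:", archive)
```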

In [None]:
# verify the unpacking 
!ls $LOCAL_DATA_DIR/

### A. Split the dataset into train/val/test <a class="anchor" id="head-1-1"></a>

In [None]:
# install pip requirements
!pip3 install tqdm
!pip3 install matplotlib==3.3.3

In [None]:
import json
import os
import shutil
import glob
from collections import Counter

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
TARGET_DIR = os.path.join(DATA_DIR, 'split')
PATH_SOURCE = os.path.join(DATA_DIR, 'ss304')

# The position of a label in this list is the integer class id used in the JSON label files.
label_list = ['good_weld', 'burn_through', 'contamination', 'lack_of_fusion', 'lack_of_shielding_gas', 'high_travel_speed']
dataset_list = ['valid', 'train', 'test']

for data_set in dataset_list:
    counter1 = 0
    # <split>.json maps each relative image path to its integer class id.
    with open(os.path.join(PATH_SOURCE, data_set, data_set + '.json')) as file:
        f_json = json.load(file)

    distribution_classes = Counter(f_json.values())
    nClass = len(f_json)  # number of labelled images in this split; only used by the optional oversampling below

    print(distribution_classes)
    print(distribution_classes[0])  # number of images with class id 0 (good_weld)

    # Create one target directory per class; makedirs also creates the split directories.
    for label in label_list:
        os.makedirs(os.path.join(TARGET_DIR, data_set, label), exist_ok=True)

    suffix = '.png'
    pattern_test = os.path.join(PATH_SOURCE, data_set, '*/*')
    print(pattern_test + suffix)

    # Look up every image of the split in the label dictionary
    # and copy it into the directory of its class.
    for img in glob.glob(pattern_test + suffix):
        image = img.replace(PATH_SOURCE + '/' + data_set + '/', '')

        # This loop can be used to oversample the dataset. Oversampling did not improve
        # the performance by much and led to a much longer training time.
        for j in range(1):  # int((nClass / distribution_classes[f_json[image]])) % distribution_classes[0] + 1):
            copy_path = os.path.join(TARGET_DIR, data_set, label_list[f_json[image]])
            image_mod = image.replace('/', '')
            shutil.copy(img, copy_path + '/' + str(j) + image_mod)
            counter1 += 1

    print("Number of images in", data_set, "dataset:", counter1)



In [None]:
!ls $LOCAL_DATA_DIR/split/test/good_weld
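
The split cell prints the class distribution of each split, and its comment mentions oversampling as one way to handle imbalance. As a rough sketch of how to quantify that imbalance on disk, the hypothetical helpers below count the images per class folder and derive inverse-frequency weights. These weights are purely illustrative; nothing here is passed to TLT.

```python
import os


def count_images(split_dir, suffix=".png"):
    """Count image files in each class subdirectory of split_dir."""
    counts = {}
    for label in sorted(os.listdir(split_dir)):
        label_dir = os.path.join(split_dir, label)
        if os.path.isdir(label_dir):
            counts[label] = sum(1 for name in os.listdir(label_dir) if name.endswith(suffix))
    return counts


def inverse_frequency_weights(counts):
    """Weight each non-empty class by total / (n_classes * count)."""
    non_empty = {label: n for label, n in counts.items() if n > 0}
    total = sum(non_empty.values())
    return {label: total / (len(non_empty) * n) for label, n in non_empty.items()}


# Inspect the training split created above, if it exists.
train_dir = os.path.join(os.environ.get("LOCAL_DATA_DIR", "."), "split", "train")
if os.path.isdir(train_dir):
    counts = count_images(train_dir)
    print(counts)
    print(inverse_frequency_weights(counts))
```

A weight above 1 marks a class that is under-represented relative to a uniform distribution.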

### Look at the data 

In [None]:
import matplotlib.pyplot as plt
from PIL import Image
import os
DATA_DIR = os.environ.get('LOCAL_DATA_DIR')

w, h = 200, 200
# One row of six thumbnails; a 200x200 inch figure is far larger than needed.
fig = plt.figure(figsize=(30, 6))

rows = 1
cols = 6

image = ['/split/test/burn_through/0161214-151210-run7image-0967.png',
         '/split/test/contamination/0160705-121434-50mmLens added slugimage-0782.png',
         '/split/test/good_weld/0160708-115129-50mmLens 200A w.s.Lev12 try joining 5mm Plateimage-0592.png',
         '/split/test/high_travel_speed/0160705-113121-50mmLens w.s.154cm.mimage-0145.png',
         '/split/test/lack_of_fusion/0160708-145105-50mmLens 350A w.s.Lev16 g.f.20L.m try joining 10mm Plateimage-0682.png',
         '/split/test/lack_of_shielding_gas/0160707-111307-50mmLens 200A w.s.11.5cm.m + no shielding gasimage-0155.png']
labels = ["burn_through", "contamination", "good_weld", "high_travel_speed", "lack_of_fusion", "lack_of_shielding_gas"]

for i in range(1, cols*rows + 1):
    ax = fig.add_subplot(rows, cols, i)
    img = Image.open(str(DATA_DIR) + str(image[i-1]))
    # Image.LANCZOS is the same filter as the Image.ANTIALIAS alias, which was removed in Pillow 10.
    img = img.resize((w, h), Image.LANCZOS)
    ax.imshow(img, cmap='gray')
    ax.set_title(labels[i-1], fontsize=20)
    

### B. Download pretrained models <a class="anchor" id="head-1-2"></a>

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_reg_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tlt_pretrained_classification:*

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/

In [None]:
!ls $LOCAL_EXPERIMENT_DIR/

In [None]:
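# Lists a previous download, if any. This directory only exists after the download cell below has run.
!ls $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_classification_vresnet18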

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tlt_pretrained_classification:resnet18 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet18

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_classification_vresnet18

## 3. Provide training specification <a class="anchor" id="head-3"></a>

In [None]:
!cat $LOCAL_SPECS_DIR/classification_spec.cfg

In [None]:
! sudo pip3 install tensorflow 

## 4. Run TLT training <a class="anchor" id="head-4"></a>

In [None]:
!echo $SPECS_DIR
!echo $USER_EXPERIMENT_DIR

In [None]:
%%time
!tlt classification train -e $SPECS_DIR/classification_spec.cfg -r $USER_EXPERIMENT_DIR/output -k $KEY --gpus $NUM_GPUS | tee train_RES18PRE_50EPOC_0_006Learn.out

In [None]:
!docker pull nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

Edit the spec file at `$SPECS_DIR/classification_spec.cfg` to point to the intended model.
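
The edit can also be scripted. The sketch below assumes that the spec file contains an `eval_config { ... }` block without nested braces and with a quoted `model_path` field, as in the TLT sample classification specs. Check your own spec file before relying on it. `set_eval_model_path` is a hypothetical helper, and the example spec text is made up for illustration.

```python
import re


def set_eval_model_path(spec_text, model_path):
    """Replace the model_path value inside the first eval_config block."""
    block = re.search(r"eval_config\s*\{[^}]*\}", spec_text)
    if block is None:
        raise ValueError("no eval_config block found")
    # A function replacement avoids backslash escapes being interpreted in the path.
    updated, n = re.subn(
        r'(model_path\s*:\s*)"[^"]*"',
        lambda m: m.group(1) + '"' + model_path + '"',
        block.group(0),
        count=1,
    )
    if n == 0:
        raise ValueError("eval_config has no model_path field")
    return spec_text[:block.start()] + updated + spec_text[block.end():]


# Made-up spec fragment in the assumed format.
example_spec = '''eval_config {
  eval_dataset_path: "/workspace/tlt-experiments/data/split/test"
  model_path: "/workspace/tlt-experiments/classification/output/weights/resnet_001.tlt"
  top_k: 3
}
'''
print(set_eval_model_path(
    example_spec,
    "/workspace/tlt-experiments/classification/output/weights/resnet_020.tlt",
))
```

To apply it to the real file, read `$LOCAL_SPECS_DIR/classification_spec.cfg`, pass its text through the function, and write the result back.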

In [None]:
%%time    
!tlt classification evaluate -e $SPECS_DIR/classification_spec.cfg -k $KEY  | tee eval_RES50PRE_20EPOC_0_006Learn.out

## 6. Visualize Inferences <a class="anchor" id="head-9"></a>

In [None]:
# Defining the checkpoint epoch number to use for the subsequent steps.
%env EPOCH=020

In [None]:
!tlt classification inference -e $SPECS_DIR/classification_spec.cfg \
                          -m $USER_EXPERIMENT_DIR/output/weights/resnet_$EPOCH.tlt \
                          -k $KEY -b 32 -d $DATA_DOWNLOAD_DIR/split/test/contamination \
                          -cm $USER_EXPERIMENT_DIR/output/classmap.json

In [None]:
import matplotlib.pyplot as plt
from PIL import Image
import os
import csv
import random as rd

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
DATA_DOWNLOAD_DIR = os.environ.get('DATA_DOWNLOAD_DIR')
csv_path = os.path.join(DATA_DIR, 'split', 'test', 'contamination', 'result.csv')

# The csv module expects files to be opened with newline=''.
with open(csv_path, newline='') as csvfile:
    results = list(csv.reader(csvfile))

w, h = 200, 200
fig = plt.figure(figsize=(30, 30))
columns = 5
rows = 1

# Show five randomly chosen predictions with the predicted label and confidence.
for i, index in enumerate(rd.sample(range(1, len(results)), 5)):
    ax = fig.add_subplot(rows, columns, i+1)
    # The CSV stores paths as seen inside the container; map them back to the host.
    img = Image.open(results[index][0].replace(DATA_DOWNLOAD_DIR, DATA_DIR))
    # Image.LANCZOS is the same filter as the Image.ANTIALIAS alias, which was removed in Pillow 10.
    img = img.resize((w, h), Image.LANCZOS)
    ax.imshow(img, cmap='gray')
    ax.set_title(results[index][1] + '\n' + 'Image: ' + str(index) + ' ' + str(round(float(results[index][2]), 3)), fontsize=30)
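
Beyond looking at a few samples, the same `result.csv` can be summarized numerically. The sketch below assumes the column layout used in the cell above (image path, predicted label, confidence). The rows are made-up examples and `summarize_predictions` is a hypothetical helper. Because inference ran on the `contamination` folder, the share of rows predicted as `contamination` is the accuracy for that class.

```python
import csv
import io
from collections import Counter


def summarize_predictions(rows, expected_label):
    """Count predicted labels and the share that matches expected_label."""
    predicted = Counter(row[1] for row in rows if len(row) >= 3)
    total = sum(predicted.values())
    share = predicted[expected_label] / total if total else 0.0
    return predicted, share


# Made-up rows in the assumed layout: image path, predicted label, confidence.
sample = io.StringIO(
    "/data/a.png,contamination,0.98\n"
    "/data/b.png,contamination,0.71\n"
    "/data/c.png,good_weld,0.55\n"
    "/data/d.png,contamination,0.90\n"
)
counts, share = summarize_predictions(list(csv.reader(sample)), "contamination")
print(counts.most_common())
print(round(share, 3))
```

With the real file, pass the `results` list from the cell above instead of the sample rows.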