# Training FCN Model
This script trains a network after the dataset has been constructed. Four modules are used in this training:

1. `input_data.py`: this module loads the training and testing data that was created, and prepares it to be fed into the neural network for training.
2. `models.py`: this module contains different types of neural network architectures. One can choose a model from this module and train it.
3. `training_utilities.py`: this module contains functions to construct models, set up diagnostic files, and train the model.
4. `accuracy.py`: this module helps calculate detection metrics. The accuracy that the FCN outputs is a pixelwise comparison of the predicted label and the true label, (#pixels set to the correct value)/(#pixels in the image). This isn't the best way to determine the success of our model. Rather, we want a metric that shows whether the model is detecting defects (where each defect consists of multiple pixels). Hence we use measures such as the recall, which is the number of true positive defect detections (TP) over the true positives plus the false negatives (FN), R = TP/(TP+FN).
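
As a rough illustration of the metrics described in item 4, the sketch below computes recall, precision, F1, and balanced accuracy from the four confusion counts. The function name `detection_metrics` and the choice of `-1` as the value for an undefined metric are assumptions made here to mirror the values printed during training; this is not the actual code in `accuracy.py`.

```python
def detection_metrics(tp, fp, fn, tn, undefined=-1):
    """Return recall, precision, F1, and balanced accuracy from confusion counts.

    Hypothetical helper: a metric whose denominator is zero is reported as
    `undefined` (-1 by default) instead of raising ZeroDivisionError.
    """
    def ratio(num, den):
        # None marks "not defined" internally; converted to the sentinel on return.
        return num / den if den > 0 else None

    recall = ratio(tp, tp + fn)        # R = TP / (TP + FN)
    precision = ratio(tp, tp + fp)     # P = TP / (TP + FP)
    specificity = ratio(tn, tn + fp)   # TN / (TN + FP)

    # F1 is the harmonic mean of precision and recall; undefined if either
    # is undefined or if both are zero.
    if recall is None or precision is None or recall + precision == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)

    # Balanced accuracy averages recall and specificity.
    if recall is None or specificity is None:
        bal_acc = None
    else:
        bal_acc = (recall + specificity) / 2

    def out(value):
        return undefined if value is None else value

    return {
        "recall": out(recall),
        "precision": out(precision),
        "f1": out(f1),
        "bal_acc": out(bal_acc),
    }


print(detection_metrics(tp=0, fp=0, fn=1525, tn=47115))
```

A model that predicts "no defect" everywhere can score a high pixelwise accuracy while its recall stays at 0 and its balanced accuracy at 0.5, which is why these metrics are tracked alongside the accuracy.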

First, we import these modules.

In [1]:
from input_data import *
from models import *
from training_utilities import *

## Parameters ##
Next, we need to provide parameter information based on the data and model we're using.

**parent_dir**: This is the path to the "stem-learning" code

**data_dir**: This is the path to the data that we created in the preprocessing section

**sess_name**: a folder named after `sess_name` is created in the "results" directory, where all the output will be stored

**N**: This is the pixel width/height of the input images (note we're assuming a square image)

**k_fac**: this is a factor that describes how many channels we want per layer in our FCNs. Whatever the default value is per layer, it is multiplied by k_fac.

**nb_classes**: this is the number of labels that we are learning at once. For example, if our data is just the "2Te" labels, then nb_classes = 2 (2Te and no defect). 

**dropout**: This is the dropout rate at the end of our FCN

**num_steps**: The total number of steps to train on (-1 for infinite loop)
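
To make the role of **k_fac** concrete, here is a minimal sketch of how a multiplier of this kind scales the channel count of each layer. The base widths `(1, 2, 4, 8)` and the helper `scaled_channels` are hypothetical and chosen only for illustration; the real per-layer defaults are defined in `models.py`.

```python
# Hypothetical per-layer base widths; the actual defaults live in models.py.
BASE_CHANNELS = (1, 2, 4, 8)


def scaled_channels(k_fac, base_channels=BASE_CHANNELS):
    """Multiply every layer's default channel count by k_fac."""
    return [base * k_fac for base in base_channels]


print(scaled_channels(16))  # [16, 32, 64, 128]
```

A larger `k_fac` widens every layer at once, which raises model capacity along with memory use and training time.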

In [2]:
parent_dir = 'D:/stem-learning/'
identifier = "20220512_unet_gen_fft_10_reparamed_dist_211_1922"
sess_name  = '1vacancy'
data_dir = parent_dir + "data/WSe/data_for_gan/generated/GANNED_DATA_FOLDER/data_folder_{}/parsed_label_{}/".format(identifier, sess_name)
N          = 256
k_fac      = 16
nb_classes = 2
num_steps  = 200

The variables below define the directory and file paths where the model, its weights, and the diagnostics will be stored.

In [3]:
from os import makedirs
sess_dir = "{}results/results_{}/{}/".format(parent_dir, identifier, sess_name)
makedirs(sess_dir, exist_ok=True)

model_weights_fn = sess_dir + "weights.h5"
model_fn         = sess_dir + "model.json"
diagnostics_fn   = sess_dir + "diagnostics.dat"

Now we create the model and set up a diagnostics file

In [4]:
model = construct_model(N, k_fac, nb_classes, sess_dir, model_fn, model_weights_fn)
step = setup_diagnostics(diagnostics_fn)

creating new session


  layer_config = serialize_layer_fn(layer)


Finally, we train

In [5]:
train(step, data_dir, N, nb_classes, model, diagnostics_fn, model_weights_fn, num_steps=num_steps)

training step: 0	training file: train_00032.p
	grabbing data
	done
32/32 - 15s - loss: 0.8406 - accuracy: 0.5177 - val_loss: 11.2005 - val_accuracy: 0.2689 - 15s/epoch - 460ms/step
	calculating accuracy
0.734653091430664
	done
TP = -1, FP = -1, FN = -1, TN = -1
recall = -1, precision = -1, F1 = -1, bal_acc = -1
training step: 1	training file: train_00023.p
	grabbing data
	done
32/32 - 8s - loss: 0.2804 - accuracy: 0.9797 - val_loss: 5.6417 - val_accuracy: 0.7102 - 8s/epoch - 249ms/step
	calculating accuracy
0.285886287689209
	done
TP = -1, FP = -1, FN = -1, TN = -1
recall = -1, precision = -1, F1 = -1, bal_acc = -1
training step: 2	training file: train_00019.p
	grabbing data
	done
32/32 - 8s - loss: 0.2240 - accuracy: 0.9901 - val_loss: 0.1798 - val_accuracy: 0.9875 - 8s/epoch - 243ms/step
	calculating accuracy
0.003603839874267578
	done
TP = 0, FP = 6, FN = 1525, TN = 47109
recall = 0.0, precision = 0.0, F1 = -1, bal_acc = 0.4999363260108246
training step: 3	training file: train_00019

  conv_label_img = (label_img - np.min(label_img))/np.ptp(label_img)


	done
TP = 0, FP = 0, FN = 1525, TN = 47115
recall = 0.0, precision = -1, F1 = -1, bal_acc = 0.5
training step: 4	training file: train_00011.p
	grabbing data
	done
32/32 - 8s - loss: 0.1320 - accuracy: 0.9911 - val_loss: 0.1345 - val_accuracy: 0.9910 - 8s/epoch - 246ms/step
	calculating accuracy
7.62939453125e-07
	done
TP = 0, FP = 0, FN = 1525, TN = 47115
recall = 0.0, precision = -1, F1 = -1, bal_acc = 0.5
training step: 5	training file: train_00002.p
	grabbing data
	done
32/32 - 8s - loss: 0.1156 - accuracy: 0.9896 - val_loss: 0.1183 - val_accuracy: 0.9910 - 8s/epoch - 247ms/step
	calculating accuracy
4.76837158203125e-07
	done
TP = 0, FP = 0, FN = 1525, TN = 47115
recall = 0.0, precision = -1, F1 = -1, bal_acc = 0.5
training step: 6	training file: train_00008.p
	grabbing data
	done
32/32 - 8s - loss: 0.0955 - accuracy: 0.9916 - val_loss: 0.1004 - val_accuracy: 0.9910 - 8s/epoch - 248ms/step
	calculating accuracy
1.71661376953125e-06
	done
TP = 0, FP = 0, FN = 1525, TN = 47115
recal

KeyboardInterrupt: 