# Colfax TASS Trainer

![Colfax Tass Trainer](https://raw.githubusercontent.com/TechBubbleTechnologies/IoT-JumpWay-Intel-Examples/master/Intel-Colfax/images/tass-trainer.jpg) 

This notebook provides the source code and a walkthrough of the TASS Colfax trainer, a computer vision training program based on TensorFlow's Inception V3 example and trained on the Intel Nervana AI HPC Cluster (Colfax Cluster).

This tutorial is part of the Computer Vision In The Modern World series by TechBubble Technologies founder and Intel Software Innovator, Adam Milton-Barker.

For the full tutorial, please visit the following link:
[Full Colfax Tass Trainer Tutorial](https://github.com/TechBubbleTechnologies/IoT-JumpWay-Intel-Examples/tree/master/Intel-Colfax/Tass-Trainer)

Please note: you do not need to execute any blocks until you reach "Create training job".

## Import the required modules

Import the modules required for this tutorial to work:

In [2]:
import json
import InceptionFlow
import cfxmagic
print("Imported Required Modules")

Imported Required Modules


## Some quick testing

Run a quick test to check which version of TensorFlow is installed and to confirm that the notebook is running on a Colfax Cluster node:

In [3]:
# START TESTING

import tensorflow as tf
print("Tensorflow V"+str(tf.__version__))

import socket
print("Running on Colfax Cluster Node: {}".format(socket.gethostname()))

# END TESTING

Tensorflow V1.3.1
Running on Colfax Cluster Node: c001-n041


## Define the TassColfaxTrainer class

Define the main class used in this example. It wraps InceptionFlow and exposes InitiateTraining, which calls InceptionFlow's trainModel method.

In [4]:
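class TassColfaxTrainer():
    """Thin wrapper around InceptionFlow that starts TASS training."""

    def __init__(self):
        # InceptionFlow is the module imported at the top of this notebook.
        self.InceptionFlow = InceptionFlow.InceptionFlow()

        print("TassColfaxTrainer Initiated")

    def InitiateTraining(self):
        """Train the model by calling InceptionFlow.trainModel()."""
        print("TassColfaxTrainer Training Initiated")
        self.InceptionFlow.trainModel()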

## Initiate TassColfaxTrainer

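Create an instance of the TassColfaxTrainer class:

In [5]:
# Use a separate name for the instance so the class itself is not overwritten
# and the cell can be re-run.
trainer = TassColfaxTrainer()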

TassColfaxTrainer Initiated


## Create training job

The training images (Yoda & Darth Vader) are already provided, so you can begin by creating the training job script with the following block. The %%writefile magic saves the cell contents to a file named colfax-tass-trainer, which can then be submitted to the cluster as a job.

In [3]:
%%writefile colfax-tass-trainer
cd $PBS_O_WORKDIR
echo "* Hello world from compute server `hostname`!"
echo "* The current directory is ${PWD}."
echo "* Compute server's CPU model and number of logical CPUs:"
lscpu | grep 'Model name\|^CPU(s)'
echo "* Python available to us:"
which python
python --version
echo "* This job trains TASS on the Colfax Cluster using images of Yoda & Darth Vader. May the force be with you ;)"
python runit.py
sleep 10
echo "*Adios"
# Remember to have an empty line at the end of the file; otherwise the last command will not run


Overwriting colfax-tass-trainer


## Check training job

Check that the script was created. Run the following block; the file colfax-tass-trainer should appear in the directory listing.

In [36]:
%ls

colfax-tass-trainer                  data/           runit.py
Colfax-TASS-Trainer-Inference.ipynb  InceptionFlow/  training/
Colfax-TASS-Trainer.ipynb            model/          Welcome.ipynb
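
The closing comment in the job script notes that the file must end with an empty line. A minimal sketch of how that could be verified from Python before submitting; the helper name `ends_with_newline` is hypothetical and not part of the original notebook:

```python
import os


def ends_with_newline(path):
    """Return True if `path` is an existing, non-empty file ending in a newline."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as handle:
        data = handle.read()
    # An empty file has no final newline, so endswith() is False for it too.
    return data.endswith(b"\n")


# Hypothetical usage in a notebook cell:
# ends_with_newline("colfax-tass-trainer")
```

If the check returns False, re-running the %%writefile block with a trailing blank line should fix the script.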


## Submit the training job script

Submit the job script with qsub to schedule the training of TASS. The command prints the ID assigned to the job.

In [7]:
!qsub colfax-tass-trainer

28153.c001
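
The ID printed by qsub can be used to locate the job's output once it has run. The sketch below assumes the cluster follows the default PBS/Torque naming scheme, in which standard output and standard error are written to `<job name>.o<number>` and `<job name>.e<number>` in the submission directory, and that the job name is the untruncated script file name. Neither assumption is confirmed by this notebook, and the helper `job_output_files` is hypothetical:

```python
def job_output_files(job_name, job_id):
    """Return the (stdout, stderr) file names expected under default PBS naming.

    Assumption: files are named `<job_name>.o<sequence>` and
    `<job_name>.e<sequence>`, where `<sequence>` is the numeric part of the job ID.
    """
    sequence = job_id.strip().split(".")[0]
    return (
        "{}.o{}".format(job_name, sequence),
        "{}.e{}".format(job_name, sequence),
    )


stdout_file, stderr_file = job_output_files("colfax-tass-trainer", "28153.c001")
print(stdout_file)  # colfax-tass-trainer.o28153
print(stderr_file)  # colfax-tass-trainer.e28153
```

If the files do not appear under these names after the job finishes, list the directory with %ls to see what the scheduler actually produced.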


## Check the status of the job

Monitor the status of the job by running qstat in the following block. The S column shows each job's state (R means running). You may need to run the block several times before the job completes.

In [2]:
!qstat

Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
28193.c001                 ...ub-singleuser u4399           00:25:30 R jupyterhub     
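
The status check above can also be read programmatically. A minimal sketch, assuming the default six-column qstat layout shown in the output above; the function name `parse_qstat` is hypothetical:

```python
def parse_qstat(output):
    """Map each job ID in default `qstat` output to its single-letter state."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator.
        if len(fields) < 6 or fields[0] == "Job" or fields[0].startswith("-"):
            continue
        # Columns: Job ID, Name, User, Time Use, S (state), Queue.
        states[fields[0]] = fields[4]
    return states


sample = """Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
28193.c001                 ...ub-singleuser u4399           00:25:30 R jupyterhub
"""
print(parse_qstat(sample))  # {'28193.c001': 'R'}
```

In a notebook cell, the text could come from `lines = !qstat` followed by `parse_qstat("\n".join(lines))`. The returned dictionary can then be checked for the ID that qsub printed.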
