<center><img src="../images/DLI Header.png" alt="Header" style="width: 400px;"/></center>

# Getting Started with AI on Jetson Nano
### Interactive Classification Tool

This notebook is an interactive data collection, training, and testing tool, provided as part of the NVIDIA Deep Learning Institute (DLI) course, "Getting Started with AI on Jetson Nano". It is designed to be run on the Jetson Nano in conjunction with the detailed instructions provided in the online DLI course pages. 

To start the tool, set the **Camera** and **Task** code cell definitions, then execute all cells. The interactive tool widgets will display at the bottom of the notebook. You can then use the tool to collect data, train the model, and test it, iteratively and interactively.

The explanations in this notebook are intentionally minimal to provide a streamlined experience.  Please see the DLI course pages for detailed information on tool operation and project creation.

### Camera
First, create your camera and set it to `running`.  Uncomment the appropriate camera selection lines, depending on which type of camera you're using (USB or CSI). This cell may take several seconds to execute.
<div style="border:2px solid black; background-color:#e3ffb3; font-size:12px; padding:8px; margin-top: auto;"><i>
    <h4><i>Tip</i></h4>
<p>There can only be one instance of CSICamera or USBCamera at a time.  Before starting a new project and creating a new camera instance, you must first release the existing one. To do so, shut down the notebook's kernel from the JupyterLab pull-down menu: <strong>Kernel->Shutdown Kernel</strong>, then restart it with <strong>Kernel->Restart Kernel</strong>.</p>
<p>The camera cell below also runs <code>sudo systemctl restart nvargus-daemon</code> (password: <code>dlinano</code>) to force a reset of the camera daemon.</p>
</i></div>

In [1]:
# Full reset of the camera
!echo 'dlinano' | sudo -S systemctl restart nvargus-daemon && printf '\n'
# Check device number
!ls -ltrh /dev/video*

# USB Camera (Logitech C270 webcam)
# from jetcam.usb_camera import USBCamera
# camera = USBCamera(width=224, height=224, capture_device=0) # confirm the capture_device number

# CSI Camera (Raspberry Pi Camera Module V2)
from jetcam.csi_camera import CSICamera
camera = CSICamera(width=224, height=224)

camera.running = True
print("camera created")

[sudo] password for dlinano: 
crw-rw----+ 1 root video 81, 0 Aug 14 13:18 /dev/video0
camera created
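
The `capture_device` argument of `USBCamera` should match the number in the `/dev/video*` listing above. As a minimal sketch, the helper below extracts those numbers from device paths using only the standard library. The name `video_device_indices` is hypothetical and is not part of `jetcam`.

```python
import glob
import re

def video_device_indices(paths):
    """Return the sorted device numbers found in paths such as '/dev/video0'."""
    indices = []
    for path in paths:
        match = re.fullmatch(r"/dev/video(\d+)", path)
        if match:
            indices.append(int(match.group(1)))
    return sorted(indices)

# On the Jetson Nano, glob lists the video devices that are currently present.
available = video_device_indices(glob.glob("/dev/video*"))
print("available capture_device values:", available)
```

If the list is empty, no camera device is visible to the system, and creating the camera object is likely to fail.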


### Task
Next, define your project `TASK` and what `CATEGORIES` of data you will collect.  You may optionally define space for multiple `DATASETS` with names of your choosing. 

Uncomment/edit the associated lines for the classification task you're building and execute the cell.
This cell should only take a few seconds to execute.

In [2]:
import torchvision.transforms as transforms
from dataset import ImageClassificationDataset

# TASK = 'thumbs'
TASK = 'emotions'
# TASK = 'fingers'
# TASK = 'diy'

# CATEGORIES = ['thumbs_up', 'thumbs_down']
# CATEGORIES = ['none', 'happy', 'sad', 'angry']
CATEGORIES = ['angry', 'digust', 'fear', 'happy', 'sad', 'surprise', 'neutral']
# CATEGORIES = ['1', '2', '3', '4', '5']
# CATEGORIES = ['diy_1', 'diy_2', 'diy_3']

DATASETS = ['A', 'B']
# DATASETS = ['A', 'B', 'C']

# ColorJitter(0.2, 0.2, 0.2, 0.2) randomly changes the brightness, contrast, saturation, and hue of an image.
# RandomRotation((0.1, 0.5)) rotates the image by a random angle between 0.1 and 0.5 degrees.

TRANSFORMS = transforms.Compose([
    transforms.ColorJitter(0.2, 0.2, 0.2, 0.2),
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.RandomRotation((0.1,0.5)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

datasets = {}

for name in DATASETS:
    datasets[name] = ImageClassificationDataset(TASK + '_' + name, CATEGORIES, TRANSFORMS)
    
print("{} task with {} categories defined".format(TASK, CATEGORIES))

emotions task with ['angry', 'digust', 'fear', 'happy', 'sad', 'surprise', 'neutral'] categories defined
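
The last two steps of the transform pipeline are worth spelling out. `ToTensor` scales 8-bit pixel values to the range [0, 1], and `Normalize` then computes `(value - mean) / std` for each channel. The sketch below reproduces that arithmetic for a single RGB pixel in plain Python; the helper name `normalize_pixel` is illustrative only.

```python
# ImageNet channel statistics, as passed to transforms.Normalize in the cell above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, MEAN, STD)]

# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))
# A mid-gray pixel ends up slightly above zero because 0.5 exceeds each mean.
print(normalize_pixel([0.5, 0.5, 0.5]))
```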


### Data Collection
Execute the cell below to create the data collection tool widget. This cell should only take a few seconds to execute.

In [3]:
import ipywidgets
import traitlets
from IPython.display import display
from jetcam.utils import bgr8_to_jpeg

# initialize active dataset
dataset = datasets[DATASETS[0]]

# unobserve all callbacks from camera in case we are running this cell for second time
camera.unobserve_all()

# create image preview
camera_widget = ipywidgets.Image()
traitlets.dlink((camera, 'value'), (camera_widget, 'value'), transform=bgr8_to_jpeg)

# create widgets
dataset_widget = ipywidgets.Dropdown(options=DATASETS, description='dataset')
category_widget = ipywidgets.Dropdown(options=dataset.categories, description='category')
count_widget = ipywidgets.IntText(description='count')
save_widget = ipywidgets.Button(description='add')

# manually update counts at initialization
count_widget.value = dataset.get_count(category_widget.value)

# sets the active dataset
def set_dataset(change):
    global dataset
    dataset = datasets[change['new']]
    count_widget.value = dataset.get_count(category_widget.value)
dataset_widget.observe(set_dataset, names='value')

# update counts when we select a new category
def update_counts(change):
    count_widget.value = dataset.get_count(change['new'])
category_widget.observe(update_counts, names='value')

# save image for category and update counts
def save(c):
    dataset.save_entry(camera.value, category_widget.value)
    count_widget.value = dataset.get_count(category_widget.value)
save_widget.on_click(save)

data_collection_widget = ipywidgets.VBox([
    ipywidgets.HBox([camera_widget]), dataset_widget, category_widget, count_widget, save_widget
])

# display(data_collection_widget)
print("data_collection_widget created")

data_collection_widget created
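
The widget callbacks above rely on two dataset methods, `save_entry` and `get_count`. The course's `ImageClassificationDataset` lives in `dataset.py` and is not shown in this notebook, so the sketch below uses a hypothetical in-memory stand-in (`InMemoryDataset`) purely to illustrate how pressing **add** changes the count shown for the selected category.

```python
class InMemoryDataset:
    """Hypothetical stand-in: stores entries in a dict instead of image files."""

    def __init__(self, categories):
        self.categories = list(categories)
        self._entries = {category: [] for category in self.categories}

    def save_entry(self, image, category):
        # Keep the image under its category label.
        self._entries[category].append(image)

    def get_count(self, category):
        # Number of images saved so far for this category.
        return len(self._entries[category])


demo = InMemoryDataset(['happy', 'sad'])
demo.save_entry(b'fake-image-bytes', 'happy')  # the role played by the "add" button
print(demo.get_count('happy'), demo.get_count('sad'))
```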


### Model
Execute the following cell to define the neural network and adjust the fully connected layer (`fc`) to match the outputs required for the project.  This cell may take several seconds to execute.

In [4]:
import torch
import torchvision


device = torch.device('cuda')

# ALEXNET
# model = torchvision.models.alexnet(pretrained=True)
# model.classifier[-1] = torch.nn.Linear(4096, len(dataset.categories))

# SQUEEZENET 
# model = torchvision.models.squeezenet1_1(pretrained=True)
# model.classifier[1] = torch.nn.Conv2d(512, len(dataset.categories), kernel_size=1)
# model.num_classes = len(dataset.categories)

# RESNET 18
model = torchvision.models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(512, len(dataset.categories))

# RESNET 34
# model = torchvision.models.resnet34(pretrained=True)
# model.fc = torch.nn.Linear(512, len(dataset.categories))
    
model = model.to(device)

model_save_button = ipywidgets.Button(description='save model')
model_load_button = ipywidgets.Button(description='load model')
model_path_widget = ipywidgets.Text(description='model path', value='my_model.pth')

def load_model(c):
    model.load_state_dict(torch.load(model_path_widget.value))
model_load_button.on_click(load_model)
    
def save_model(c):
    torch.save(model.state_dict(), model_path_widget.value)
model_save_button.on_click(save_model)

model_widget = ipywidgets.VBox([
    model_path_widget,
    ipywidgets.HBox([model_load_button, model_save_button])
])

# display(model_widget)
print("model configured and model_widget created")

model configured and model_widget created
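
Replacing `fc` with `torch.nn.Linear(512, len(dataset.categories))` gives the network one output per category. A linear layer stores an `out_features x in_features` weight matrix plus one bias per output, so the size of the new layer can be worked out by hand. The helper name below is illustrative.

```python
def linear_parameter_count(in_features, out_features):
    """Weights (out_features * in_features) plus one bias per output."""
    return in_features * out_features + out_features

# The seven emotion categories defined in the Task cell.
categories = ['angry', 'digust', 'fear', 'happy', 'sad', 'surprise', 'neutral']

new_head = linear_parameter_count(512, len(categories))
original_head = linear_parameter_count(512, 1000)  # pretrained ImageNet head, 1000 classes
print(new_head, original_head)
```

Only the small replacement layer starts from random weights; the rest of the network keeps its pretrained values.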


### Live Execution
Execute the cell below to set up the live execution widget.  This cell should only take a few seconds to execute.

In [5]:
import threading
import time
from utils import preprocess
import torch.nn.functional as F

state_widget = ipywidgets.ToggleButtons(options=['stop', 'live'], description='state', value='stop')
prediction_widget = ipywidgets.Text(description='prediction')
score_widgets = []
for category in dataset.categories:
    score_widget = ipywidgets.FloatSlider(min=0.0, max=1.0, description=category, orientation='vertical')
    score_widgets.append(score_widget)

# run inference on camera frames for as long as the state toggle is set to 'live'
def live(state_widget, model, camera, prediction_widget, score_widgets):
    global dataset
    while state_widget.value == 'live':
        image = camera.value
        preprocessed = preprocess(image)
        output = model(preprocessed)
        output = F.softmax(output, dim=1).detach().cpu().numpy().flatten()
        category_index = output.argmax()
        prediction_widget.value = dataset.categories[category_index]
        for i, score in enumerate(list(output)):
            score_widgets[i].value = score

# start the inference loop in a background thread so the notebook stays responsive
def start_live(change):
    if change['new'] == 'live':
        execute_thread = threading.Thread(target=live, args=(state_widget, model, camera, prediction_widget, score_widgets))
        execute_thread.start()

state_widget.observe(start_live, names='value')

live_execution_widget = ipywidgets.VBox([
    ipywidgets.HBox(score_widgets),
    prediction_widget,
    state_widget
])

# display(live_execution_widget)
print("live_execution_widget created")

live_execution_widget created
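
The live loop turns the raw model outputs into per-category scores with a softmax and reports the category with the highest score. A minimal pure-Python sketch of those two steps follows; the logits are made-up values for illustration.

```python
import math

def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

categories = ['angry', 'digust', 'fear', 'happy', 'sad', 'surprise', 'neutral']
logits = [0.1, -1.2, 0.3, 2.5, 0.0, -0.4, 1.1]  # made-up values for illustration

scores = softmax(logits)
category_index = scores.index(max(scores))  # same role as output.argmax()
print(categories[category_index], round(scores[category_index], 3))
```

Because the scores always lie between 0 and 1, they map directly onto the `FloatSlider` widgets created above.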


### Training and Evaluation
Execute the following cell to define the trainer, and the widget to control it. This cell may take several seconds to execute.

In [6]:
import numpy as np 
import PIL
from matplotlib import pyplot as plt


max_dataset = 3
BATCH_SIZE = 8

epochs = 20

optimizer = torch.optim.Adam(model.parameters())
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

epochs_widget = ipywidgets.IntText(description='epochs', value=1)
eval_button = ipywidgets.Button(description='evaluate')
train_button = ipywidgets.Button(description='train')
loss_widget = ipywidgets.FloatText(description='loss')
accuracy_widget = ipywidgets.FloatText(description='accuracy')
progress_widget = ipywidgets.FloatProgress(min=0.0, max=1.0, description='progress')

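def train_eval(is_training, use_otherdt=False):
    global BATCH_SIZE, model, dataset, optimizer, eval_button, train_button, accuracy_widget, loss_widget, progress_widget, state_widget

    # by default, train on the dataset collected with the tool
    train_data = dataset

    if use_otherdt:
        # optionally use the first `max_dataset` samples of a pre-saved NumPy dataset;
        # each image is paired with its label so the loader yields (images, labels) batches,
        # and the result stays local so the global `dataset` is not overwritten.
        # Assumption: the arrays are already model-ready (float32 images, int64 labels).
        x_train = np.load("train_X.npy")
        y_train = np.load("train_Y.npy")
        train_data = list(zip(x_train[:max_dataset], y_train[:max_dataset]))

    train_loader = torch.utils.data.DataLoader(
        train_data,
        batch_size=BATCH_SIZE,
        shuffle=True
    )
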
    try:
        state_widget.value = 'stop'
        train_button.disabled = True
        eval_button.disabled = True
        time.sleep(1)

        if is_training:
            model = model.train()
        else:
            model = model.eval()
            
            
        while epochs_widget.value > 0:
            i = 0
            sum_loss = 0.0
            error_count = 0.0

            print("epoch = {}".format(epochs_widget.value))

            for images, labels in iter(train_loader):
                # send data to device
                images = images.to(device)
                labels = labels.to(device)

                if is_training:
                    # zero gradients of parameters
                    optimizer.zero_grad()

                # execute model to get outputs
                outputs = model(images)

                # compute loss
                loss = F.cross_entropy(outputs, labels)

                if is_training:
                    # run backpropagation to accumulate gradients
                    loss.backward()

                    # step optimizer to adjust parameters
                    optimizer.step()

                # update running totals and the progress widgets
                error_count += len(torch.nonzero(outputs.argmax(1) - labels).flatten())
                count = len(labels.flatten())
                i += count
                sum_loss += float(loss)
                progress_widget.value = i / len(train_loader.dataset)
                loss_widget.value = sum_loss / i
                accuracy_widget.value = 1.0 - error_count / i

            if is_training:
                epochs_widget.value = epochs_widget.value - 1
            else:
                break
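    except Exception as e:
        # report the problem; the bare name `e` would itself raise a NameError here
        print("train_eval stopped with an error: {}".format(e))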
    model = model.eval()

    train_button.disabled = False
    eval_button.disabled = False
    state_widget.value = 'live'

    
#train_eval(is_training=True)


train_button.on_click(lambda c: train_eval(is_training=True))
eval_button.on_click(lambda c: train_eval(is_training=False))
    
train_eval_widget = ipywidgets.VBox([
    epochs_widget,
    progress_widget,
    loss_widget,
    accuracy_widget,
    ipywidgets.HBox([train_button, eval_button])
])

# display(train_eval_widget)
print("trainer configured and train_eval_widget created")

trainer configured and train_eval_widget created
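
The progress, loss, and accuracy widgets are all driven by three running totals inside the batch loop. The sketch below mirrors that bookkeeping in plain Python with made-up batches. Note that `F.cross_entropy` returns the mean loss of a batch by default, while the loop divides the accumulated value by the number of samples seen, so the displayed loss is smaller than the per-sample loss by roughly the batch size. The function name `running_metrics` is illustrative.

```python
def running_metrics(batches):
    """Yield (progress, loss, accuracy) after each batch, mirroring the widget updates.

    Each batch is (batch_loss, predictions, labels); the values are illustrative.
    """
    total = sum(len(labels) for _, _, labels in batches)
    seen = 0
    sum_loss = 0.0
    error_count = 0
    for batch_loss, predictions, labels in batches:
        error_count += sum(1 for p, y in zip(predictions, labels) if p != y)
        seen += len(labels)
        sum_loss += batch_loss
        yield seen / total, sum_loss / seen, 1.0 - error_count / seen

batches = [
    (2.0, [0, 1, 2, 3], [0, 1, 0, 0]),  # 2 of 4 predictions wrong
    (1.0, [1, 1, 2, 2], [1, 1, 2, 3]),  # 1 of 4 predictions wrong
]
for progress, loss, accuracy in running_metrics(batches):
    print(progress, loss, accuracy)
```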


### Display the Interactive Tool!

The interactive tool includes widgets for data collection, training, and testing.

<center><img src="../images/classification_tool_key2.png" alt="tool key" width=500/></center>
<br>
<center><img src="../images/classification_tool_key1.png" alt="tool key"/></center>

Execute the cell below to create and display the full interactive widget.  Follow the instructions in the online DLI course pages to build your project.

In [7]:
# Combine all the widgets into one display
all_widget = ipywidgets.VBox([
    ipywidgets.HBox([data_collection_widget, live_execution_widget]), 
    train_eval_widget,
    model_widget
])

display(all_widget)

VBox(children=(HBox(children=(VBox(children=(HBox(children=(Image(value=b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01…


In [1]:
import numpy as np
import PIL.Image
import torch
import torchvision
import torchvision.transforms as transforms


TRANSFORMS = transforms.Compose([
    transforms.ColorJitter(0.2, 0.2, 0.2, 0.2),
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

emotion_dict =  {0: 'Angry', 1: 'Digust', 2: 'Fear', 3: 'Happy', 4: 'Sad', 5: 'Surprise', 6: 'Neutral'}


def train_eval(is_training):
    device = torch.device('cuda')

    # ResNet-18 pretrained on ImageNet, with the final layer resized to the emotion classes
    model = torchvision.models.resnet18(pretrained=True)
    model.fc = torch.nn.Linear(512, len(emotion_dict))
    model = model.to(device)

    BATCH_SIZE = 8
    epochs = 100

    optimizer = torch.optim.Adam(model.parameters())
    # optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

    x_train = np.load("train_X.npy")
    y_train = np.load("train_Y.npy")

    # shuffle=False keeps batches in file order, so labels can be looked up by a running index
    train_x_loader = torch.utils.data.DataLoader(
        x_train,
        batch_size=BATCH_SIZE,
        shuffle=False
    )
        
    if is_training:
        model = model.train()
    else:
        model = model.eval()

    while epochs > 0:
        i = 0
        sum_loss = 0.0
        error_count = 0.0

        print("------ epoch = {}".format(epochs))

        # running index into y_train; restarts at the beginning of every epoch
        indice = 0

        for dt_x  in iter(train_x_loader):
            all_trans = []
            all_labels = []
            
            print("** New bach")
            for j in range(dt_x.shape[0]): 
                #img_to = transforms.ToPILImage(mode="RGB")
                #img_to = img_to(dt_x[j])
                img_to = PIL.Image.fromarray(np.uint8(dt_x[j]*255))
            
                i_t = TRANSFORMS(img_to)
                all_trans.append(i_t)
                all_labels.append(y_train[indice])
                #print("indice = {} ".format(indice))

                indice += 1 

            # stack the per-image tensors into one batch of shape (batch, 3, 224, 224)
            all_img = torch.stack(all_trans)


            # convert the labels to a LongTensor, the type CrossEntropyLoss expects
            all_labels = torch.from_numpy(np.asarray(all_labels)).long()

            # send data to device
            all_img = all_img.to(device)
            all_labels = all_labels.to(device)

            if is_training:
                # zero gradients of parameters
                optimizer.zero_grad()

            # execute model to get outputs
            outputs = model(all_img)


            # compute loss
            criterion = torch.nn.CrossEntropyLoss()
            loss = criterion(outputs, all_labels)

            if is_training:
                # run backpropagation to accumulate gradients
                loss.backward()

                # step optimizer to adjust parameters
                optimizer.step()

            # update running totals and report them
            error_count += len(torch.nonzero(outputs.argmax(1) - all_labels).flatten())
            count = len(all_labels.flatten())
            i += count
            sum_loss += float(loss)
            print("loss = {}".format(sum_loss / i))
            print("accuracy = {}".format(1.0 - error_count / i))

                
                
                
                
                
        # decrement after each full pass over the data; when evaluating, one pass is enough
        if is_training:
            epochs = epochs - 1
        else:
            break

        
        
    model = model.eval()
    
    torch.save(model.state_dict(), "model_50.pth")

    #train_button.disabled = False
    #eval_button.disabled = False
    #state_widget.value = 'live'

    
train_eval(is_training=True)

print("run finished; weights saved to model_50.pth")

------ epoch = 100
** New bach
loss = 0.2610681653022766
accuracy = 0.0
** New bach
loss = 0.2617570608854294
accuracy = 0.0625
** New bach
loss = 0.2721181909243266
accuracy = 0.125
** New bach
loss = 0.28766847401857376
accuracy = 0.125
** New bach
loss = 0.29144487977027894
accuracy = 0.125
** New bach
loss = 0.27619990209738415
accuracy = 0.16666666666666663
** New bach
loss = 0.29846993940217154
accuracy = 0.1428571428571429
** New bach
loss = 0.3016877882182598
accuracy = 0.15625
** New bach
loss = 0.3020038836532169
accuracy = 0.16666666666666663
** New bach
loss = 0.3113661348819733
accuracy = 0.16249999999999998
** New bach
loss = 0.30403913963924756
accuracy = 0.19318181818181823
** New bach
loss = 0.3088577712575595
accuracy = 0.1875
** New bach
loss = 0.30417274511777437
accuracy = 0.1923076923076923
** New bach
loss = 0.30636945366859436
accuracy = 0.2053571428571429
** New bach
loss = 0.3024927278359731
accuracy = 0.20833333333333337
** New bach
loss = 0.3022481333464384
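
A quick sanity check on the numbers above, offered as a sketch rather than a definitive diagnosis: `CrossEntropyLoss` averages over the batch by default, and the loop divides the accumulated batch losses by the number of samples seen, so the printed `loss` is roughly the per-batch loss divided by the batch size of 8. Undoing that division on the first printed value gives a figure close to `ln(7)`, the loss of a classifier that assigns equal probability to all seven classes. That is about what one would expect from a freshly replaced `fc` layer.

```python
import math

BATCH_SIZE = 8    # batch size used by the DataLoader above
NUM_CLASSES = 7   # number of entries in emotion_dict

first_printed_loss = 0.2610681653022766  # first "loss =" line in the output

# Recover the mean loss of the first batch from the printed value.
first_batch_loss = first_printed_loss * BATCH_SIZE

# Cross-entropy of a uniform prediction over NUM_CLASSES classes.
uniform_loss = math.log(NUM_CLASSES)

print(first_batch_loss, uniform_loss)
```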


In [1]:
! pwd

/home/dlinano/nvdli-nano/classification
