# TRAINING A DANCING ROBOT 

# Part 1: Detecting Human Poses #

Our first step is to set up the human pose detection model, trt_pose, which can be found here: https://github.com/NVIDIA-AI-IOT/trt_pose . This model is already trained on a large collection of human pose images, so we will not train it in this workshop. We will only load it onto the Dancebot.

1. Run the code cell below to load the model.

In [1]:
#Load the human pose keypoints
import json
import trt_pose.coco
import trt_pose.models
import torch

with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)

topology = trt_pose.coco.coco_category_to_topology(human_pose)

num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])

#Load the pose detection model
import torch2trt
from torch2trt import TRTModule
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt_2.pth'
model_trt = TRTModule()
model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))

import cv2
import torchvision.transforms as transforms
import PIL.Image
import time

mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()
std = torch.Tensor([0.229, 0.224, 0.225]).cuda()
device = torch.device('cuda')

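def preprocess(image):
    """Convert a BGR camera frame into a normalized 1x3xHxW tensor on the GPU."""
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV frames are BGR
    image = PIL.Image.fromarray(image)
    image = transforms.functional.to_tensor(image).to(device)
    # Normalize each colour channel with the mean and std defined above
    image.sub_(mean[:, None, None]).div_(std[:, None, None])
    return image[None, ...]  # add a batch dimension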

from trt_pose.draw_objects import DrawObjects
from trt_pose.parse_objects import ParseObjects

parse_objects = ParseObjects(topology)
draw_objects = DrawObjects(topology)

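def get_keypoints(image, human_pose, topology, object_counts, objects, normalized_peaks):
    """Collect the detected keypoints into a dictionary of pixel coordinates.

    Args:
        image: cv2 image; its height and width are used to scale the coordinates
        human_pose: dictionary loaded from human_pose.json (keypoint names and skeleton)
        topology: topology tensor built from human_pose
        object_counts: number of detected people, as returned by parse_objects
        objects: keypoint indices for each person, as returned by parse_objects
        normalized_peaks: keypoint locations scaled to [0, 1], as returned by parse_objects

    Returns:
        dictionary: keys are keypoint names and values are the (x, y) pixel coordinates.
        If several people are detected, later detections overwrite earlier ones.
    """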
    height = image.shape[0]
    width = image.shape[1]
    keypoints = {}
    K = topology.shape[0]
    count = int(object_counts[0])

    for i in range(count):
        obj = objects[0][i]
        C = obj.shape[0]
        for j in range(C):
            k = int(obj[j])
            if k >= 0:
                peak = normalized_peaks[0][j][k]
                x = round(float(peak[1]) * width)
                y = round(float(peak[0]) * height)
                keypoints[human_pose["keypoints"][j]] = (x, y)

    return keypoints

Now we will turn on the camera of the Dancebot.

2. Run the cell below to turn the camera on. The camera's light should turn on if this step was successful.

In [2]:
from jetcam.usb_camera import USBCamera

from jetcam.utils import bgr8_to_jpeg
import traceback


camera = USBCamera(width=640, height=480, capture_fps=30)

camera.running = True

Now we will view what the Dancebot sees and detects using the camera and the human pose detection model.

3. Run the cell below. This will set up a window for viewing what the Dancebot sees. An empty frame should appear if this step was successful.

In [3]:
import ipywidgets
from IPython.display import display
from PIL import Image


from torchvision import transforms as trans
import numpy as np
import cv2

def execute(change):
    """Run pose detection on each new camera frame and show the result."""
    image = change['new']
    height, width, _ = image.shape

    # Crop the frame to a centered square (assumes width >= height)
    diff = width - height
    image = image[:, int(diff/2):height+int(diff/2), :]
    resized_img = cv2.resize(image, dsize=(224, 224))  # model input size
    data = preprocess(resized_img)

    # Predict keypoint confidence maps (cmap) and part affinity fields (paf)
    cmap, paf = model_trt(data)
    cmap, paf = cmap.detach().cpu(), paf.detach().cpu()

    # Group the detected keypoints into people and draw them on the frame
    counts, objects, peaks = parse_objects(cmap, paf)
    draw_objects(resized_img, counts, objects, peaks)

    resized_img = cv2.resize(resized_img, dsize=(480, 480))

    # Mirror the image horizontally before displaying it
    image_w.value = bgr8_to_jpeg(resized_img[:, ::-1, :])


image_w = ipywidgets.Image(format='jpeg')
reward = 0

location = 1
pose = 0
action = None
prevstate = None
display(image_w)

Image(value=b'', format='jpeg')

This is the exciting part!

4. Run the cell below to play what the Dancebot sees on the frame above.

In [4]:
camera.observe(execute, names='value')

5. When you are done viewing, you can run the cell below to stop viewing.

In [5]:
camera.unobserve_all()

# Part 2: Teaching the Robot How to Dance #

Now we need to decide on the dance rules. 

The Dancebot learns by trial and error. Each human pose should make the robot perform a particular dance move. When the robot does the wrong move, it receives punishment points; when it does the right move, it receives reward points. Over many attempts, the robot learns how to maximize its reward points and minimize its punishment points.

You can train the robot to behave in different ways. For example, you can encourage it to keep dancing when you are not performing any poses by punishing it for being still. You can also keep the dance moves from becoming repetitive by punishing the robot for performing the same dance move twice in a row.
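To make these rules concrete, here is a minimal sketch of how the four values could be combined into a single score. It is only an illustration: the function name `dance_reward`, the way the penalties are added together, and the assumption that pose number `i` matches move number `i` are ours, and the real rules inside `DanceSession` may differ.

```python
def dance_reward(pose, move, prev_move,
                 right_reward=100, wrong_punishment=500,
                 stillness_punishment=0, repetitiveness_punishment=0):
    """Hypothetical scoring rule, not the actual DanceSession logic.

    Poses and moves are numbered 0..3 and we assume pose i expects move i
    (0 = arms down / doNothing, 1 = wiggle, 2 = shuffle, 3 = donut).
    """
    # Base score: reward the expected move, punish any other move
    score = right_reward if move == pose else -wrong_punishment

    if move == 0:
        # The robot stayed still
        score -= stillness_punishment
    elif move == prev_move:
        # The robot repeated its previous (non-still) move
        score -= repetitiveness_punishment
    return score


# Right move with the default values
print(dance_reward(pose=1, move=1, prev_move=0))
# Right move, but repeated, with a repetitiveness punishment of 30
print(dance_reward(pose=1, move=1, prev_move=1, repetitiveness_punishment=30))
```

With a rule like this, raising the stillness punishment makes staying still less attractive even when the pose is neutral, which is the idea behind a restless robot.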

1. Run the cell below to define the reward and punishment points for the Dancebot. Sliders will appear where you can set your desired reward/punishment values.

In [7]:
import ipywidgets as widgets

rightDanceReward = widgets.IntSlider(
    value=100,
    min=0,
    max=500,
    step=1,
    description='',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

wrongDancePunishment = widgets.IntSlider(
    value=500,
    min=0,
    max=500,
    step=1,
    description='',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

stillnessPunishment = widgets.IntSlider(
    value=0,
    min=0,
    max=500,
    step=1,
    description='',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

repetitivenessPunishment = widgets.IntSlider(
    value=0,
    min=0,
    max=500,
    step=1,
    description='',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

print('Move the sliders to your desired values. We recommend trying the default values first.')
print('Reward for the right dance move:')
display(rightDanceReward)
print('Punishment for the wrong dance move:')
display(wrongDancePunishment)
print('Punishment for being still:')
display(stillnessPunishment)
print('Punishment for repeating the last move:')
display(repetitivenessPunishment)
#print(rightDanceReward.value)

Move the sliders to your desired values. We recommend trying the default values first.
Reward for the right dance move:


IntSlider(value=100, max=500)

Punishment for the wrong dance move:


IntSlider(value=500, max=500)

Punishment for being still:


IntSlider(value=0, max=500)

Punishment for repeating the last move:


IntSlider(value=0, max=500)

We can finally load the dancing AI model and train it to dance correctly based on the rules we defined above.

2. Run the cell below to load the dancing AI model.

In [8]:
#Initialize robot for performing moves
from robot import Robot
import numpy as np
dancebot = Robot()

#Initialize AI model and dance session to start training
from qlearning_agent_v2 import QLearningAgent

danceAgent = QLearningAgent()
danceAgent.epsilon = 0.3
danceAgent.qvalues = np.zeros((16, 4))
from dance_session_v2 import DanceSession
sesh = DanceSession(danceAgent, wrongDancePunishment.value, stillnessPunishment.value,
                rightDanceReward.value, repetitivenessPunishment.value, [(0, 'doNothing'), (1, 'wiggle'), (2, 'shuffle'), (3, 'donut')])

WARNNIG: Jetson.GPIO library has not been verified with this carrier board,


We will create another view of what the Dancebot sees. We will use it to make sure the Dancebot can see us while we train it.

3. Run the cell below to set up a window for viewing what the Dancebot sees. An empty frame should appear if this step was successful.

In [9]:
import ipywidgets
from IPython.display import display
from PIL import Image

from torchvision import transforms as trans
import numpy as np
import cv2
runtime = 0
def execute(change):
    """Detect poses on each frame and run a dance step on every 10th frame."""
    global runtime
    image = change['new']
    height, width, _ = image.shape

    # Crop the frame to a centered square and resize it to the model input size
    diff = width - height
    image = image[:, int(diff/2):height+int(diff/2), :]
    resized_img = cv2.resize(image, dsize=(224, 224))
    data = preprocess(resized_img)

    cmap, paf = model_trt(data)
    cmap, paf = cmap.detach().cpu(), paf.detach().cpu()

    counts, objects, peaks = parse_objects(cmap, paf)
    draw_objects(resized_img, counts, objects, peaks)

    resized_img = cv2.resize(resized_img, dsize=(480, 480))

    # Every 10th frame, pass the detected keypoints to the dance session
    runtime += 1
    if runtime % 10 == 0:
        kp = get_keypoints(resized_img, human_pose, topology, counts, objects, peaks)
        sesh.run(counts, kp, dancebot)
        runtime = 0
    image_w.value = bgr8_to_jpeg(resized_img[:, ::-1, :])
    
    
image_w = ipywidgets.Image(format='jpeg')
reward = 0

location = 1
pose = 0
action = None
prevstate = None
display(image_w)

Image(value=b'', format='jpeg')

The dance session begins here! When you run the cell below, the AI model will start watching you and trying different dance moves based on the poses you perform. Every time it does a wrong move, it receives the punishment you defined. Every time it does a right move, it receives the reward you defined. Through trial and error, it will learn to dance correctly according to your rules.

Poses you should do:

- Arms down: neutral pose, the robot should do nothing

- Left hand up: the robot should wiggle side to side

- Right hand up: the robot should shuffle back and forth

- Both hands up: the robot should spin (the 'donut' move)
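The four poses can be told apart from the keypoint dictionary that `get_keypoints` returns. The sketch below is one possible way to do it, not the code used by `DanceSession`. It assumes the keypoint names `left_wrist`, `right_wrist`, `left_shoulder` and `right_shoulder` exist in `human_pose.json`, and it treats a hand as "up" when the wrist is higher in the image than the shoulder. The function name `classify_pose` and the pose numbering are hypothetical.

```python
def classify_pose(keypoints):
    """Return 0 (arms down), 1 (left hand up), 2 (right hand up) or 3 (both up).

    keypoints maps keypoint names to (x, y) pixel coordinates. In image
    coordinates y grows downwards, so "higher" means a smaller y value.
    """
    def hand_up(side):
        wrist = keypoints.get(side + "_wrist")
        shoulder = keypoints.get(side + "_shoulder")
        if wrist is None or shoulder is None:
            # A missing keypoint is treated as "hand not up" (an assumption)
            return False
        return wrist[1] < shoulder[1]

    left, right = hand_up("left"), hand_up("right")
    if left and right:
        return 3
    if left:
        return 1
    if right:
        return 2
    return 0


# Example: left wrist above the left shoulder, right wrist below the right shoulder
example = {
    "left_shoulder": (300, 200), "left_wrist": (320, 120),
    "right_shoulder": (180, 200), "right_wrist": (170, 330),
}
print(classify_pose(example))
```

Comparing wrists to shoulders only needs four keypoints, so the rest of the body and the background do not matter for this decision.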

The Dancebot will remember everything it learns by filling a table of environment states and robot actions. A very simple dancing robot would have a table like this:

<img src="example.png" width="400" />

Our Dancebot is more sophisticated: in addition to the human pose it sees, it also remembers the previous dance move it performed, so that it can be trained not to be repetitive. An example environment state for our Dancebot would be [Pose: Arms down, Previous dance move: Wiggle].
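With 4 poses and 4 possible previous moves there are 16 environment states, which matches the 16 rows of the table created earlier with `np.zeros((16, 4))`. One simple way to turn a (pose, previous move) pair into a row number is sketched below. The ordering of the rows and the helper names `encode_state` and `decode_state` are assumptions made for illustration; the workshop code may number its states differently.

```python
NUM_POSES = 4   # arms down, left hand up, right hand up, both hands up
NUM_MOVES = 4   # doNothing, wiggle, shuffle, donut


def encode_state(pose, prev_move):
    """Map a (pose, previous move) pair to a table row between 0 and 15."""
    return pose * NUM_MOVES + prev_move


def decode_state(state):
    """Recover the (pose, previous move) pair from a table row."""
    return divmod(state, NUM_MOVES)


# [Pose: Arms down (0), Previous dance move: Wiggle (1)]
row = encode_state(0, 1)
print(row, decode_state(row))
```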

As you train the Dancebot, it will update the zeros in the table using the punishments and rewards it earns. When you are finished with the training, it will pick and perform the action with the highest value in this table for the state it is in.
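The table update can be sketched with the standard Q-learning rule. This is a generic illustration, not the code inside `QLearningAgent`: the function name `update_q` and the learning rate `alpha` and discount `gamma` values are assumptions chosen for the example.

```python
import numpy as np


def update_q(qvalues, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the table in place and return it."""
    # Best value the robot could expect from the state it ends up in
    best_next = np.max(qvalues[next_state])
    # Move the old estimate part of the way towards reward + discounted future value
    target = reward + gamma * best_next
    qvalues[state, action] += alpha * (target - qvalues[state, action])
    return qvalues


q = np.zeros((16, 4))                                       # 16 states, 4 dance moves
update_q(q, state=1, action=0, reward=100, next_state=0)    # a right move
update_q(q, state=1, action=2, reward=-500, next_state=2)   # a wrong move

# After training, the robot picks the action with the highest value in the row
best_action = int(np.argmax(q[1]))
print(q[1], best_action)
```

Each update only changes one cell of the table, so the robot needs to see every state several times before the whole table is useful.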

4. Run the cell below to start training the Dancebot by dancing with it!

Tip: the robot only looks at your poses between dance moves, not while it is performing one.

In [10]:
camera.observe(execute, names='value')

5. When you are done viewing and training, you can run the cell below to stop viewing.

In [11]:
camera.unobserve_all()

# Part 3: Dancing with the fully trained Dancebot #

Once you are happy with your training, you can dance with the fully trained Dancebot. You can choose to dance with the robot you trained or with one of the ready-made, fully trained Dancebots that have different behaviours.

You can choose:

- My Dancebot: the Dancebot that you trained.

- Restless Dancebot: a Dancebot trained to perform random, nonrepetitive dance moves if it doesn't see any poses other than the neutral pose.

- Calm Dancebot: a Dancebot trained to only perform dance moves based on the poses it sees.

1. Run the cell below to choose a Dancebot variation to dance with. It should create a dropdown menu upon successful execution.

In [14]:
# create ipywidget for selection
dancebot_choice = ipywidgets.Dropdown(
    options=['My Dancebot', 'Restless Dancebot', 'Calm Dancebot'],
    value='Calm Dancebot',
    description='Dancebot:',
    disabled=False,
)

display(dancebot_choice)

Dropdown(description='Dancebot:', index=2, options=('My Dancebot', 'Restless Dancebot', 'Calm Dancebot'), valu…

2. Run the cell below to set up a window for viewing what the Dancebot sees. An empty frame should appear if this step was successful.

In [24]:
#load selected dancebot
danceAgent.epsilon = 0
danceAgent.qvalues = danceAgent.variations[dancebot_choice.value]

# reuse the execute function defined in Part 2; reset its frame counter and create a new view
runtime = 0    
image_w = ipywidgets.Image(format='jpeg')
display(image_w)

Image(value=b'', format='jpeg')
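The cell above sets `danceAgent.epsilon = 0`, while training used `danceAgent.epsilon = 0.3`. In Q-learning, epsilon usually means the chance of trying a random action instead of the best known one. The sketch below shows that common "epsilon-greedy" rule. It is our assumption that `QLearningAgent` uses epsilon this way, and the function name `choose_action` is hypothetical.

```python
import numpy as np


def choose_action(qvalues, state, epsilon, rng=None):
    """Pick a random action with probability epsilon, else the best known one."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < epsilon:
        # Explore: try any of the available dance moves
        return int(rng.integers(qvalues.shape[1]))
    # Exploit: use the move with the highest value for this state
    return int(np.argmax(qvalues[state]))


q_table = np.zeros((16, 4))
q_table[5, 2] = 1.0   # pretend the robot learned that move 2 is best in state 5

print(choose_action(q_table, state=5, epsilon=0))     # always the best known move
print(choose_action(q_table, state=5, epsilon=0.3))   # sometimes a random move
```

With epsilon at 0 the robot stops experimenting and only performs what it has already learned, which is what we want when dancing with a fully trained Dancebot.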

3. Run the cell below to start dancing with the Dancebot!

In [25]:
camera.observe(execute, names='value')

4. Run the cell below to end the dance session.

In [26]:
camera.unobserve_all()

# Part 4: Questions and Experiments #

Congratulations! You completed a successful dance session with the Dancebot!

Here are some questions about how the Dancebot learns and perceives the world. You can discuss them with your friends and ask the workshop instructors about them.

1. What is the minimum amount of information that the robot needs to know about the environment to be a calm Dancebot?

Hint: The poses only depend on the arms being up or down, so does the Dancebot need to know about the background, or where the legs are?

2. How would you make an even less repetitive version of the Dancebot that never repeats a dance move within any 3 consecutive dance moves?

Hint: What information does the robot need to make sure it doesn't do the same dance move that it performed 1 dance move ago? What about 2 dance moves ago? 

3. What reward and punishment configuration would you use to train a restless Dancebot? You can scroll up and try different reward/punishment values and train your bot again.