# Real-Time Edge AI on FPGAs - Live Pokemon Card Recognition!

## Setup


### Make sure the USB Camera is plugged into **the PYNQ FPGA Dev Board, not the normal computer (Intel NUC)!**

### Cable Setup (Sanity Check) 
If you're reading this, the board is most likely already plugged in and set up correctly, but make sure that:
* The ethernet cable is plugged straight into the Intel NUC, and the Intel NUC has a static IP set to `192.168.2.XX`, where `XX` is any number from 1 to 254 **that is not 99**, as the dev board itself is set to `192.168.2.99`.
    * If we're at DEFCON, we don't want to take any chances by plugging this into a real network! We just need one computer to access and run this notebook.
* The HDMI cable is plugged in and the monitor is powered on **before running the notebook**.
* The USB camera is plugged into the PYNQ dev board itself, not the host computer!
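
As a quick sanity check of the addressing rule above, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `is_valid_host_ip` is made up for illustration, and the `/24` netmask is an assumption (the setup notes only give the `192.168.2.XX` pattern).

```python
import ipaddress

BOARD_IP = ipaddress.ip_address("192.168.2.99")    # dev board address from the setup notes
DEMO_NET = ipaddress.ip_network("192.168.2.0/24")  # assumed /24 netmask


def is_valid_host_ip(candidate: str) -> bool:
    """Return True if `candidate` is a usable static IP for the host computer."""
    ip = ipaddress.ip_address(candidate)
    if ip not in DEMO_NET:
        return False
    # .0 is the network address and .255 the broadcast address of a /24
    if ip in (DEMO_NET.network_address, DEMO_NET.broadcast_address):
        return False
    # The host must not collide with the board itself
    return ip != BOARD_IP


print(is_valid_host_ip("192.168.2.10"))  # a typical host address
print(is_valid_host_ip("192.168.2.99"))  # the board's own address
```

The first call reports a usable address, while the second is rejected because it collides with the board.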


### Camera setup & tweaks
Once all cables are verified as connected, press the "Restart and Run all cells" button at the top of the notebook (the fast forward icon/two arrows next to each other). This will start the actual video output of the USB webcam, which you will need in order to set the camera position, zoom, focus, etc.
 * The goal is to position the camera so that the art of the card, when placed on the table in the printed box under the camera, takes up the entire frame and is in focus.
 * Use the three adjustable rings on the camera lens itself to adjust zoom, aperture (brightness), and focus - in that order. You can use the little screws on each ring to lock it in place once properly set.
 * If desired, you can set the script to display a "camera calibration mode" below by setting `calibrate_camera_mode = True`, but remember to set it back to `False` once you're done, then Restart and Run All cells to output the actual display image.
 
Additionally, depending on the monitor/setup used, you might have to change the color mode. BGR is used by default (`color_mode_bgr = True`). If the colors look weird, set `color_mode_bgr` to `False` to switch to RGB mode (or back to `True` to return to BGR mode), then Restart & Run All cells.
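
Switching between BGR and RGB only reverses the order of the three color channels of each pixel. Here is a toy sketch in plain Python, where nested lists stand in for the camera frame so it runs without OpenCV. `swap_channels` is a hypothetical helper, not part of the notebook.

```python
def swap_channels(image):
    """Reverse the channel order of every pixel (BGR -> RGB or RGB -> BGR)."""
    return [[tuple(reversed(pixel)) for pixel in row] for row in image]


# A 1x2 "image": one pure-blue and one pure-red pixel, stored in BGR order
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = swap_channels(bgr_image)
print(rgb_image)  # the same two pixels, now stored in RGB order
```

Applying the swap twice returns the original image. This is why a display set to the wrong mode shows reds and blues exchanged while greens look normal. On a NumPy array the same swap can be written as `frame[..., ::-1]`.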

## Running the Demo

Once setup is complete, the demo hardware itself should be pretty hands-off! It's up to you to talk to folks and walk them through the concept and idea of what's going on. More material for that is provided down below/alongside this info!

Folks are free to play with, rotate, etc. the fake Pokemon cards to see what happens with the demo. Even though they're just printed on cardstock, they're not intended to be taken! We have a decent number of extras, but not enough to hand out. If some do disappear, there should be a stock of replacements nearby. Try to keep a decent balance of the Pokemon that are out on the table!

### Troubleshooting

If you run into issues, try the following. If all else fails, restart and run all cells, and if that doesn't work, restart the whole dev board and/or call Ben Hawks (contact info left with AI Village staff) 

* The output froze!
    * Try restart & run all cells. If that doesn't work, stop the cells and the kernel, unplug and power-cycle the display (if possible), plug it back in, then restart and run all cells. 
* After multiple iterations/restarts of the notebook, the display output image is shifting and wrapping!
    * This is a known issue with an unknown cause. Restarting the whole dev board is the only reliable way of fixing this. HDMI is weird. 
* The colors are weird!
    * The display probably expects a different color mode. If this happens randomly, try restarting the display and/or Restart & Run All cells in the notebook. If that fails, restart the dev board.
* The USB Camera isn't displaying/opening/connecting properly!
    * Double check the normal USB connection, but also the weird barrel jack connector in the middle of the cable. Also make sure it's plugged into the USB port on the PYNQ dev board, not the host computer/Intel NUC.
* One of the cells is repeatedly giving an error!
    * If all your connections are okay, and all the cells above that one have been run (and without any errors), call Ben Hawks. This shouldn't happen, probably?!
* It's not predicting the right Pokemon!
    * Yeah, it does that sometimes. The model is very small and was trained on a somewhat bad dataset. It should be able to get most cards after some rotating and movement, but if it seems like it's _never_ getting anything right, even with perfect framing and trying multiple cards of the same Pokemon (Onix is pretty reliable), call Ben Hawks.
* I can't access the Jupyter server/webpage to start the notebook!
    * The CPU itself is pretty small (dual-core ARM), so pages might take a minute to load. If the connection is timing out, try restarting the dev board, making sure the ethernet cable is plugged in *before* powering it on. If the ethernet cable isn't connected at boot, the board gets a bit weird and doesn't automatically assign itself an IPv4 address until the `networking` service is restarted (`sudo systemctl restart networking`). You can do this over the USB serial connection (115200 baud, via the micro USB port next to the ethernet port) if you'd like, but the simplest way is to just restart the whole dev board.
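
To tell a slow page load apart from a board that is not reachable at all, you can probe the Jupyter port from the host computer. This is a rough sketch: `port_is_open` is a hypothetical helper, and port `9090` (used in the usage note) is an assumption based on the default PYNQ image, so adjust it if your board is configured differently.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False
```

For example, `port_is_open("192.168.2.99", 9090)` returning `False` suggests the board has no IP address yet (or the host's static IP is wrong), while `True` means the server is up and the page is probably just slow.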
    
    
**TL;DR - Try restarting the notebook, if that fails the dev board, and if that fails, call Ben Hawks (AIV Staff have contact info)**

In [1]:
# Set to false if colors seem weird on display - likely an issue with display using BGR or RGB mode... 
color_mode_bgr = True

# Set to true if you need to set/calibrate the USB webcam position, zoom, focus, etc. 
# outputs just a simplified view of the whole webcam frame if true. 
# !!! Must set back to False then Restart & Run all cells to go back to normal operation! !!!
calibrate_camera_mode = False

# Count of how many frames to average the prediction over
# A higher number will likely give more "accurate" results, but adds a noticeable lag and makes it harder to demonstrate some issues
# A lower number will have less lag when changing/placing new cards, but be less "accurate" overall
# 10 seemed to be a good "sweet spot" (currently set to 15 below)
rolling_predict_frames = 15
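
The lag/stability trade-off described in the comments above can be seen with a small pure-Python sketch of a rolling majority vote. The names `majority_vote` and `window` are illustrative. The main loop further down does the equivalent with NumPy arrays.

```python
from collections import deque


def majority_vote(window, num_classes=10):
    """Return the most common class index in `window` (lowest index wins ties)."""
    counts = [0] * num_classes
    for label in window:
        counts[label] += 1
    return counts.index(max(counts))


frames = 5
window = deque([0] * frames, maxlen=frames)  # starts filled with class 0

votes = []
for label in [6, 6, 6, 6, 6]:  # class 6 is now predicted on every frame
    window.append(label)       # the oldest prediction drops out automatically
    votes.append(majority_vote(window))

print(votes)  # → [0, 0, 6, 6, 6]
```

With a window of 5, the displayed prediction only switches on the third frame. A longer window takes proportionally longer to switch, but is harder to flip with a single bad frame.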


In [2]:
# initialize camera from OpenCV
import cv2 as cv

#videoIn = cv.VideoCapture(cv.CAP_V4L)
videoIn = cv.VideoCapture(-1)
#videoIn = cv.VideoCapture('/dev/v4l/by-id/usb-HD_Web_Camera_HD_Web_Camera_Ucamera001-video-index0')
# Keep retrying until the camera opens (this loops forever if no camera is attached)
while not videoIn.isOpened():
    videoIn.release()
    #videoIn = cv.VideoCapture(cv.CAP_V4L)
    videoIn = cv.VideoCapture(-1)
    #videoIn = cv.VideoCapture('/dev/v4l/by-id/usb-HD_Web_Camera_HD_Web_Camera_Ucamera001-video-index0')

# Request 640x480 frames; the cells below assume this frame size
videoIn.set(cv.CAP_PROP_FRAME_WIDTH, 640)
videoIn.set(cv.CAP_PROP_FRAME_HEIGHT, 480)


print("Capture device is open: " + str(videoIn.isOpened()))

Capture device is open: True


In [3]:
# Load the Neural network FPGA firmware
import numpy as np
from axi_stream_driver import NeuralNetworkOverlay

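X_shape = (32, 32, 3)  # 32x32 pixels, 3 color channels
y_shape = (10)         # ten output classes (note: this is the plain int 10, not a tuple)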

nn = NeuralNetworkOverlay('../Video_v3.bit', X_shape, y_shape)

In [4]:
# Load the background image for HDMI output
bg_img = cv.imread("hls4ml_pokemon_demo_bg.png") # cv.imread returns BGR; uncomment below if an RGB image is needed
# bg_img = cv.cvtColor(bg_img, cv.COLOR_BGR2RGB)

In [5]:
# Set up and start HDMI output; make sure it's connected before running the cell!
from pynq.lib.video import *

# monitor configuration: 1280*720 @ 60Hz
Mode = VideoMode(1280,720,24)
hdmi_out = nn.video.hdmi_out

if color_mode_bgr:
    hdmi_out.configure(Mode,PIXEL_BGR)
else:
    hdmi_out.configure(Mode,PIXEL_RGB)
    
hdmi_out.start()

<contextlib._GeneratorContextManager at 0xa5372930>

In [6]:
import time
start = time.time()
for NumOfFrames in range (50): 
    ret, frame = videoIn.read()
    if (ret):
        outframe = hdmi_out.newframe()
        outframe[0:480,0:640,:] = frame[0:480,0:640,:]        
        hdmi_out.writeframe(outframe)
    else:
        raise RuntimeError("Failed to read from camera.")
end = time.time()
print("Frames per second:     " + str(50 / (end - start)))

Frames per second:     25.90793503645379
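
The cell above averages over a fixed 50 frames. For a live readout, a counter that refreshes once per interval is more convenient, and the main loop further down does this inline. The sketch below wraps the same idea in a hypothetical `FpsCounter` class. It uses `time.monotonic` instead of `time.time`, and the `clock` parameter exists only so the logic can be exercised without waiting.

```python
import time


class FpsCounter:
    """Report frames per second, refreshed once every `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.start = clock()
        self.frames = 0
        self.fps = 0.0

    def tick(self):
        """Call once per frame; returns the most recently computed FPS."""
        self.frames += 1
        elapsed = self.clock() - self.start
        if elapsed >= self.interval:
            # Interval is over: publish a new value and start counting again
            self.fps = self.frames / elapsed
            self.frames = 0
            self.start = self.clock()
        return self.fps
```

Inside a capture loop this would be used as `fps = counter.tick()` after each successful frame read, with `counter = FpsCounter()` created before the loop.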


In [7]:
# Used to calibrate the camera! When done, stop the cell, set "calibrate_camera_mode" to False, then Restart & Run All cells.
while calibrate_camera_mode: 
    ret, frame = videoIn.read()
    if (ret):
        outframe = hdmi_out.newframe()
        outframe[0:480,0:640,:] = frame[0:480,0:640,:]        
        hdmi_out.writeframe(outframe)
    else:
        raise RuntimeError("Failed to read from camera.")

In [None]:
# Actual Running mode!

# Capture webcam video and display to HDMI Output
%matplotlib inline 
from matplotlib import pyplot as plt
import numpy as np
rolling_predict = np.zeros(rolling_predict_frames, dtype=np.int32)
#Prediction Text Location - Center Bottom of Screen (720p)
x,y,w,h = 550,660,150,40
start_time = time.time()
# FPS update time in seconds
display_time = 1
fc = 0
FPS = 0
fps_disp = ""
while (1):
    # Read in image from webcam
    ret, frame = videoIn.read()
    if (ret):
        
        # Calculate FPS 
        fc+=1
        TIME = time.time() - start_time
        if (TIME) >= display_time:
            FPS = fc / (TIME)
            fc = 0
            start_time = time.time()
            fps_disp = "FPS: "+str(FPS)[:5]

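        # Preprocess image before passing to neural network: crop to square and resize to 32x32 px
        outframe = hdmi_out.newframe()
        cropped_frame = frame[0:480, 0:480]
        # OpenCV returns camera frames in BGR order. The image chosen here is used for both the
        # display and the network input, so the channel order the model sees follows color_mode_bgr.
        if color_mode_bgr:
            RGB_img = cropped_frame # keep BGR as-is for a display in BGR mode
        else:
            RGB_img = cv.cvtColor(cropped_frame, cv.COLOR_BGR2RGB) # swap channels for a display in RGB mode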
        resized = cv.resize(RGB_img, (32, 32))
        resized_scaled = resized/255.
        
        # Send pre-processed image to FPGA Neural Network!
        #y_hw, latency, throughput = nn.predict(resized_scaled, debug=False, profile=True)
        y_hw = nn.predict(resized_scaled, debug=False, profile=False)
        
        # Get prediction back from FPGA, determine which pokemon to display as our prediction (rolling prediction)
        percentage = np.array(y_hw)
        last_predict = np.argmax(percentage)
        if rolling_predict_frames > 1:
            rolling_predict = rolling_predict[1:] # pop oldest element from our predictions
            rolling_predict = np.append(rolling_predict, last_predict) # add newest prediction
            percentage_max = np.bincount(rolling_predict).argmax() # get the most common prediction over the last N frames
        else: # skip the whole array operations if set rolling pred is = 1, just for the sake of speed (+ ~0.25-0.5 FPS)
            percentage_max = last_predict
        # Map the class index to its Pokemon name ("NA" if the index is out of range)
        pokemon_names = ("Bulbasaur", "Charmander", "Eevee", "Gengar", "Jigglypuff",
                         "Mewtwo", "Onix", "Pikachu", "Snorlax", "Squirtle")
        if 0 <= percentage_max < len(pokemon_names):
            text = pokemon_names[percentage_max]
        else:
            text = "NA"
        
        # Construct our output HDMI video frame
        scaled_input = cv.resize(resized, (480, 480),interpolation=cv.INTER_AREA) # upscale the 32x32 model input to 480x480 for display (still only 32x32 pixels of detail)
        outframe[0:720,0:1280,:] = bg_img # Set background image
        outframe[0:480,0:480,:] = RGB_img # Add full resolution input image
        outframe[0:480,800:1280,:] = scaled_input # Add actual model input resolution image
        cv.rectangle(outframe, (x,y), (x + w, y + h), (255,255,255), -1) # Add rectangle & prediction text
        cv.putText(img=outframe, text=text,org=(x+int(w/10),y+int(4*h/5)), fontFace=cv.FONT_HERSHEY_DUPLEX, fontScale=1, color=(255,0,0), thickness=1)
        cv.putText(outframe, fps_disp, (10, 25), cv.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        hdmi_out.writeframe(outframe) # Output the final constructed frame
    else:
        print("Failed to read from camera.")
        break

#### hdmi_out.close()
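
If the loop above exits (a failed camera read, or the kernel being interrupted), neither the camera nor the HDMI output is released, since the `hdmi_out.close()` line is commented out. One possible way to guarantee cleanup is sketched below. `run_with_cleanup` is a hypothetical helper: it only assumes the camera object has a `release()` method (as `cv.VideoCapture` does) and the display object has a `close()` method (as in the commented-out line above). Whether this helps with the display issues listed under Troubleshooting has not been tested.

```python
from contextlib import ExitStack


def run_with_cleanup(camera, display, loop):
    """Run `loop()` and always release `camera` and close `display` afterwards."""
    with ExitStack() as stack:
        # Callbacks run in reverse order of registration, even if loop() raises
        stack.callback(display.close)   # registered first, runs last
        stack.callback(camera.release)  # registered last, runs first
        loop()
```

In the notebook this would be called as `run_with_cleanup(videoIn, hdmi_out, main_loop)`, where `main_loop` is the `while` loop above moved into a function (also hypothetical).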