Getting Started

Overview

In this tutorial we will provide a basic demonstration of how to incorporate the AIGamedevToolkit into a Unity project. By the end, you will know how to run a deep learning model inside a Unity scene, without any additional coding.

Background Information

For our purposes, a model can be thought of like a function in traditional programming. A model takes in input and returns an output. Unlike in traditional programming, the desired function is not explicitly coded, but approximated based on a sample dataset. The sample dataset provides a mapping of input values to output values. As an example, a model could approximate a function that takes in an image as input and returns the locations and types of objects in the image based on a sample dataset of annotated images.
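
To make the analogy concrete, here is a rough C# sketch of an object detection model viewed as a function. The types and names below are purely illustrative and are not part of the toolkit:

```csharp
using UnityEngine;

// Illustrative only: a detection "model" treated as a function from an image to predictions.
public struct Detection
{
    public string label;     // predicted object type, e.g. "person"
    public float confidence; // how confident the model is in the prediction
    public Rect box;         // where the object is located in the image
}

public interface IObjectDetector
{
    // Input: the pixel data for an image. Output: the objects found in it.
    Detection[] Detect(Texture2D image);
}
```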

A deep learning model consists of sequences of functions called layers where output from the previous layer serves as input for the next. The combination of the types and arrangement of functions is referred to as the model's architecture. These internal functions have parameters that are learned based on the sample data through a process called training. The goal of training is to find the values for these internal parameters that enable the model to best approximate the desired function.

In theory, this approach can be used to approximate any function. However, it is best suited to applications where it would be infeasible to explicitly define the desired function. It can also be useful for approximating more efficient versions of computationally intensive functions, typically with some decrease in accuracy.

In the AIGamedevToolkit, a model's functionality is accessed through an InferenceFeature asset. Inference refers to using a trained deep learning model to make predictions on new inputs. In the toolkit, an inference feature implements the required steps for performing inference with a specific type of deep learning model. For this tutorial, we shall use some predefined inference features that are included with the toolkit. We will cover how to implement new inference features for custom pretrained models in a future tutorial.

Clone Toolkit Repository

First, we need to clone the GitHub repository for the toolkit. Make sure to clone the repository rather than downloading it as a .zip archive, as compressing the project folder can break some of the binary files included in the toolkit.

Create Unity Project

Next, we need to create a Unity project to use the toolkit. We can stick with the default template for a 3D project.

create_new_unity_project

Add Toolkit Folder

Once the Unity Editor has loaded, we can add the toolkit folder. Open the AIGamedevToolkit repository folder and select the AIGamedevToolkit subfolder.

select-toolkit-folder

Drag the toolkit folder into the Project → Assets directory.

add_toolkit_folder

Add Inference Features

The toolkit comes with some predefined InferenceFeature assets for both Intel's OpenVINO and Unity's Barracuda inference libraries.

An inference library refers to the code used to perform the internal operations in a trained model. The OpenVINO inference library is optimized to speed up model inference on Intel hardware while the Barracuda library provides cross-platform support.

The predefined assets are located in the Assets → AIGamedevToolkit → ScriptableObjects → InferenceFeatures folder.

builtin-inference-features

Here, we can see there are three predefined assets.

  • COCO_YOLOX: An OpenVINO inference feature which uses object detection models that were trained on the COCO (Common Objects in Context) image dataset. The model can detect 80 different types of real world objects in an input image. YOLOX refers to the name of the specific architecture used by the object detection models.
  • StyleTransfer_Barracuda: A Barracuda inference feature which uses models that were trained to apply the artistic style of a particular image to an arbitrary input image.
  • StyleTransfer_OpenVINO: An OpenVINO inference feature which uses models that were trained to apply the artistic style of a particular image to an arbitrary input image.

The toolkit adds a graphical user interface (GUI) for adding inference features to a scene. This GUI can be accessed from the Window → AI Gamedev Toolkit submenu. Open the submenu and select Add Inference Features. The toolkit will automatically look for any InferenceFeature assets and display them in a popup window.

toolkit_menu

Allow Unsafe Code

As we can see in the new window, there is a warning that unsafe code needs to be enabled to use OpenVINO inference features. This is because the memory location containing the pixel data for an input image needs to be accessed from a DLL plugin.

toolkit-inference-feature-window
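
To see why this requires unsafe code, the sketch below shows the general pattern of pinning a frame's pixel data and handing a raw pointer to a native plugin. The PerformInference entry point and plugin name are hypothetical; only the pattern matters here:

```csharp
using System;
using System.Runtime.InteropServices;
using UnityEngine;

public class NativePluginExample : MonoBehaviour
{
    // Hypothetical entry point in a native OpenVINO plugin; the real plugin's
    // function names and parameters may differ.
    [DllImport("OpenVINO_Plugin")]
    private static extern void PerformInference(IntPtr pixelData, int width, int height);

    public unsafe void RunInference(Texture2D frame)
    {
        // Get the managed array of pixel values for the current frame.
        Color32[] pixels = frame.GetPixels32();

        // 'fixed' pins the array so the garbage collector cannot move it while the
        // native code reads from the raw pointer; this is why unsafe code is required.
        fixed (Color32* ptr = pixels)
        {
            PerformInference((IntPtr)ptr, frame.width, frame.height);
        }
    }
}
```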

If we try to view the options for an OpenVINO inference feature, we can see the same message.

toolkit-inference-feature-window-2

We can enable unsafe code in the Player Settings. Open the Edit menu and select Project Settings....

open-project-settings

Open the Player submenu and scroll down to the Allow 'unsafe' Code parameter. Tick the checkbox for the parameter and close the Project Settings window.

scroll-to-unsafe-code-option

The toolkit will automatically detect that the setting was enabled and unlock the inference feature settings after a few seconds. We will explore the settings for the inference features in greater detail in another post. For now, we can stick with the default settings.

Note: The OpenVINO inference library is optimized for Intel hardware. By default, OpenVINO inference features will only run if Intel hardware is detected.

toolkit-inference-feature-window-3

Install Barracuda Package

If we try to view the options for a Barracuda inference feature, we see a similar warning message indicating that the Barracuda package is not installed.

barracuda-package-not-installed-message

We can install the Barracuda package from the Package Manager window. Select the Package Manager tab and click on the + sign in the top-left corner.

package-manager-tab

Select the Add package from git URL... option from the dropdown.

add-package-from-git

Type com.unity.barracuda into the text field and click Add.

add-barracuda-from-git

The latest stable version of Barracuda at the time of writing is version 2.0.0.

installed-barracuda-package
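
Once the package is installed, Barracuda models can also be executed directly from script. The following is a minimal sketch using Barracuda's ModelLoader and WorkerFactory APIs; the toolkit's Barracuda inference features wrap a similar workflow, though the details differ:

```csharp
using Unity.Barracuda;
using UnityEngine;

public class BarracudaExample : MonoBehaviour
{
    public NNModel modelAsset; // assign an imported .onnx model in the Inspector

    private IWorker worker;

    void Start()
    {
        // Build a runtime model and create a worker on the best available backend.
        Model runtimeModel = ModelLoader.Load(modelAsset);
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.Auto, runtimeModel);
    }

    public Tensor Run(Texture2D input)
    {
        // Convert the texture to a 3-channel tensor, execute the model, and read back the result.
        using (Tensor inputTensor = new Tensor(input, 3))
        {
            worker.Execute(inputTensor);
            return worker.PeekOutput();
        }
    }

    void OnDestroy()
    {
        worker?.Dispose();
    }
}
```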

As with enabling unsafe code, the toolkit will automatically detect that the Barracuda package was installed and unlock the inference feature settings. If we go back to the AI Gamedev Toolkit window, we can now see and interact with the Barracuda inference feature.

toolkit-inference-feature-window-4

We can stack multiple inference features, but we'll stick with adding just one for now. If you are running on Intel hardware stick with the default COCO_YOLOX selection. Otherwise, deselect the COCO_YOLOX inference feature and select the StyleTransfer_Barracuda option. Click on Apply now to add the inference feature to the scene.

apply-inference-feature-to-scene

The toolkit makes some assumptions about the scene when adding inference features through the AI Gamedev Toolkit window.

When adding computer vision inference features, which take in images as input, the toolkit will attach a helper script to the in-game camera. This script is responsible for getting the texture data for the current frame and making it accessible to any computer vision inference features.

camera-texture-helper-component
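
The sketch below illustrates the general idea of such a helper: intercept the rendered frame and copy it into a texture the inference features can read. The class and field names are illustrative and do not match the toolkit's actual component:

```csharp
using UnityEngine;

// Illustrative sketch: copies each rendered camera frame into a RenderTexture
// that inference features can use as input.
[RequireComponent(typeof(Camera))]
public class CameraFrameProvider : MonoBehaviour
{
    public RenderTexture frameTexture; // shared with the inference features

    void OnRenderImage(RenderTexture source, RenderTexture destination)
    {
        // Copy the current frame so it can be used as model input.
        Graphics.Blit(source, frameTexture);

        // Pass the frame through unchanged so the scene still renders normally.
        Graphics.Blit(source, destination);
    }
}
```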

Some inference features like the COCO_YOLOX asset will only use the current camera frame as input for their deep learning model. Others, like the two style transfer inference features, will also modify the texture data for the current camera frame like a post-processing effect.

The toolkit will also create a new GameObject called Inference Manager with an InferenceManager component and attach the selected inference features to the component. The Inference Manager provides a central place to configure and initialize any Inference Features in the Unity scene.

inference-manager-component
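
Conceptually, the manager initializes each attached feature once and then runs it every frame. The simplified sketch below illustrates that pattern; the class names and method signatures are illustrative, not the toolkit's actual API:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Illustrative only: a minimal "feature" base class and a manager that drives it.
// The toolkit's actual InferenceFeature and InferenceManager classes differ in detail.
public abstract class ExampleInferenceFeature : ScriptableObject
{
    public bool active = true;

    public abstract void Initialize();
    public abstract void Inference(RenderTexture frame);
    public abstract void CleanUp();
}

public class ExampleInferenceManager : MonoBehaviour
{
    public List<ExampleInferenceFeature> features; // the features added through the toolkit window
    public RenderTexture cameraFrame;              // filled by the camera helper script

    void Start()
    {
        // Load models and allocate any resources once, up front.
        foreach (var feature in features) feature.Initialize();
    }

    void Update()
    {
        // Run every active feature on the latest camera frame.
        foreach (var feature in features)
        {
            if (feature.active) feature.Inference(cameraFrame);
        }
    }

    void OnDisable()
    {
        foreach (var feature in features) feature.CleanUp();
    }
}
```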

When adding object detection inference features like the COCO_YOLOX asset, an additional GameObject called Bounding Box Manager will be added. This GameObject has a BoundingBoxManager component, to which any object detection inference features in the scene are attached. It is responsible for drawing boxes around any detected objects and indicating their predicted object type.

bounding-box-manager-component
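
As a rough illustration of what that involves, the sketch below draws a labeled box for each detection using Unity's immediate-mode GUI. The toolkit's BoundingBoxManager uses its own data types and rendering approach:

```csharp
using UnityEngine;

// Illustrative sketch of drawing detection results on screen.
public class ExampleBoundingBoxDrawer : MonoBehaviour
{
    [System.Serializable]
    public struct BoundingBox
    {
        public Rect rect;    // box position and size in screen space
        public string label; // predicted object type
    }

    public BoundingBox[] boxes; // filled in by an object detection inference feature

    void OnGUI()
    {
        if (boxes == null) return;

        foreach (var box in boxes)
        {
            // Draw an outlined box with the predicted label for each detected object.
            GUI.Box(box.rect, box.label);
        }
    }
}
```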

Add Input Data

At this point, we could go ahead and press play and the object detection model would take input from the in-game camera. However, since the scene is blank, we won't see any bounding boxes displayed. We need to add some input data.

We could either create a game environment containing objects to detect or get input from a video feed. Since the object detection model being used was trained exclusively on real-world images, we will go with a video feed for now.

The toolkit provides built-in options for creating a video feed using a video file or an attached webcam. For either option, we first need to create a screen to view the video feed.

Right-click an empty area in the Hierarchy tab and select 3D Object → Quad from the popup menu.

add-quad

We can just name the new object VideoScreen. With the video screen object selected, click Add Component in the Inspector tab.

click-add-component

Type Video Screen Manager into the search bar and press Enter.

empty-video-screen-manager

Drag the VideoScreen object from the Hierarchy tab into the Video Screen field for the VideoScreenManager component. This will allow the Quad dimensions to be updated based on the current video dimensions.

attach-video-screen

Next we need an input texture to store the pixel data for the video feed. Click on the little circle icon at the end of the Input Texture field.

open-input-texture-selection

Select the VideoPlayer_Texture asset in the popup window.

select-video-player-texture

Lastly, drag the in-game camera from the Hierarchy tab into the Target Camera field. This will allow the camera to be repositioned based on the current video dimensions. The Video Dims field will also be updated to reflect the current video dimensions.

attach-game-camera
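
The logic performed by the component is roughly what the sketch below illustrates: scale the quad to match the video's dimensions and move the camera back far enough to frame it. This is a simplified stand-in, not the actual VideoScreenManager code:

```csharp
using UnityEngine;

// Simplified stand-in for a video screen manager: size the quad to match the
// video feed and move the camera back far enough to frame the whole screen.
public class ExampleVideoScreenManager : MonoBehaviour
{
    public Transform videoScreen; // the Quad created above
    public Camera targetCamera;   // the in-game camera
    public Vector2Int videoDims;  // current video width and height in pixels

    public void UpdateScreen()
    {
        // Scale the quad so its aspect ratio matches the video feed.
        videoScreen.localScale = new Vector3(videoDims.x, videoDims.y, 1f);

        // Position the camera so the whole quad fits in its vertical field of view.
        float halfFov = targetCamera.fieldOfView * 0.5f * Mathf.Deg2Rad;
        float distance = videoDims.y / (2f * Mathf.Tan(halfFov));
        targetCamera.transform.position = videoScreen.position - new Vector3(0f, 0f, distance);
    }
}
```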

We can use either a video file or a webcam feed by adding a Video Manager or a Webcam Manager component, respectively, to the VideoScreen.

attach-video-webcam-managers

In both cases, we need to assign the same VideoPlayer_Texture asset as is used for the Video Screen Manager component to the Video Texture field.

When using a Video Manager we also need to populate the Video Clips field with video files. Feel free to use your own video files or use the sample videos included in the pre-configured demo project.

add-video-clips

The last component we need to add to the VideoScreen is a VideoPlayer to play the video clips. Make sure to enable the Loop parameter for the Video Player component.

add-video-player-component
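
For reference, routing a webcam feed into a RenderTexture only takes a few lines. The sketch below is a simplified stand-in for what a webcam manager component does; the toolkit's Webcam Manager differs in its details:

```csharp
using UnityEngine;

// Simplified stand-in for a webcam manager: starts the default webcam and
// copies each new frame into the shared video texture.
public class ExampleWebcamManager : MonoBehaviour
{
    public RenderTexture videoTexture; // the same VideoPlayer_Texture asset used above

    private WebCamTexture webcamTexture;

    void Start()
    {
        webcamTexture = new WebCamTexture();
        webcamTexture.Play();
    }

    void Update()
    {
        if (webcamTexture.didUpdateThisFrame)
        {
            // Copy the latest webcam frame into the texture the inference features read from.
            Graphics.Blit(webcamTexture, videoTexture);
        }
    }

    void OnDisable()
    {
        if (webcamTexture != null) webcamTexture.Stop();
    }
}
```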

Now if we press play, we should see that bounding boxes are drawn around objects detected in the video.

test-yolox-plane

test-yolox-bus

test-yolox-person

test-yolox-zebra

The full list of object types that can be detected using the COCO_YOLOX inference feature is provided below.

COCO Dataset Object Classes

Index Name
0 person
1 bicycle
2 car
3 motorcycle
4 airplane
5 bus
6 train
7 truck
8 boat
9 traffic light
10 fire hydrant
11 stop sign
12 parking meter
13 bench
14 bird
15 cat
16 dog
17 horse
18 sheep
19 cow
20 elephant
21 bear
22 zebra
23 giraffe
24 backpack
25 umbrella
26 handbag
27 tie
28 suitcase
29 frisbee
30 skis
31 snowboard
32 sports ball
33 kite
34 baseball bat
35 baseball glove
36 skateboard
37 surfboard
38 tennis racket
39 bottle
40 wine glass
41 cup
42 fork
43 knife
44 spoon
45 bowl
46 banana
47 apple
48 sandwich
49 orange
50 broccoli
51 carrot
52 hot dog
53 pizza
54 donut
55 cake
56 chair
57 couch
58 potted plant
59 bed
60 dining table
61 toilet
62 tv
63 laptop
64 mouse
65 remote
66 keyboard
67 cell phone
68 microwave
69 oven
70 toaster
71 sink
72 refrigerator
73 book
74 clock
75 vase
76 scissors
77 teddy bear
78 hair drier
79 toothbrush

Note: If you went with one of the style transfer inference features you should see something like the image below.

test-style-transfer-plane

As mentioned before, we can also stack multiple inference features, although this might not be advisable for performance reasons, depending on the models being used.

stack-inference-features

Prepare Streaming Assets

When using OpenVINO inference features, there are some additional steps we need to take before building a project. Unity does not include the .xml and .bin files used by OpenVINO inference features in builds, so these need to be copied to the StreamingAssets folder before building the project.
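
The reason for the copy is that a build can only load these files from disk at runtime, typically via Application.streamingAssetsPath. The sketch below shows how such a path is usually resolved; the subfolder and file names are examples, not the toolkit's actual values:

```csharp
using System.IO;
using UnityEngine;

// Illustrative: how a build locates OpenVINO model files at runtime.
public static class ExampleModelPaths
{
    public static string GetModelPath(string subfolder, string modelFileName)
    {
        // StreamingAssets content is copied into builds as-is, unlike regular Assets content.
        return Path.Combine(Application.streamingAssetsPath, subfolder, modelFileName);
    }
}

// Example usage (hypothetical file name): ExampleModelPaths.GetModelPath("models", "yolox_tiny.xml")
```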

Select the Inference Manager object in the Hierarchy tab. Open the Model Assets dropdown menu for any OpenVINO inference feature in the Inspector tab.

inference-manager-model-assets

Right-click one of the model assets and select Properties. A new window will appear with the properties for the selected model asset.

select-model-asset-properties

Here, we can see there is a field called Streaming Assets Path. This field needs to contain the name of a subfolder located in the StreamingAssets folder. There may already be a folder name in the input field.

model-asset-propterties-window

We can also click the Browse button to select or create a different folder inside the StreamingAssets folder. Make sure there is a StreamingAssets folder before changing the subfolder name.

Once a folder is selected, we can click the Copy to StreamingAssets button to copy the files for the model asset to the StreamingAssets folder.

copy-model-files-to-streaming-assets

Clicking this button will also create a StreamingAssets folder if one did not already exist. We need to repeat this for any Model Assets we want to keep in the build.

Note there is also a Copy to StreamingAssets button for the parent inference feature. Clicking this button will copy the model files for all attached Model Assets to their selected StreamingAssets subfolders.

inference-feature-copy-to-streaming-assets

We can verify the desired model files were copied by looking in the Assets → StreamingAssets folder.

verify-streaming-assets

Build the Project

There is one last step required before we can build the project. The video screen uses the Unlit/Texture shader, which is not included in project builds by default. We need to manually include it in the Project Settings.
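
The problem is easiest to see from script: a shader that is only referenced at runtime, for example through Shader.Find, gets stripped from builds unless it is listed under Always Included Shaders or referenced by an asset in the build. The sketch below shows that failure mode:

```csharp
using UnityEngine;

public class ExampleScreenMaterial : MonoBehaviour
{
    void Start()
    {
        // Shader.Find only succeeds in a build if the shader was included, for
        // example through the Always Included Shaders list configured below.
        Shader unlitTexture = Shader.Find("Unlit/Texture");

        if (unlitTexture == null)
        {
            Debug.LogError("The Unlit/Texture shader was stripped from the build.");
            return;
        }

        GetComponent<Renderer>().material = new Material(unlitTexture);
    }
}
```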

Open the Project Settings and select the Graphics submenu.

project-settings-graphics

Scroll down to the Always Included Shaders field.

always-included-shaders

Increase the Size value by one to add a new element.

always-included-shaders-new-element

Click on the little circle icon at the end of the new shader element.

shader-selectiong-button

Type Unlit/Texture into the search field of the Select Shader window and select Unlit/Texture from the available options. We can then close the Select Shader window.

select-unlit-texture-shader

We should now be ready to build the project. Press Ctrl+B to build and run the project. Create a build folder if one does not already exist and click Select Folder.

create-build-folder