
Dynamic difficulty in virtual reality games through hand motion biofeedback

Allan Pichardo, Abhijit Gupta

XR:MTL, Galilei, Ubisoft, ConcordAI, Concordia University, Montreal, QC, Canada

Abstract

This project uses Artificial Intelligence to make Virtual Reality games more dynamic and unique. The VR experience is a zombie survival game in which the player must shoot down waves of zombies. The player's hand movements are analyzed to predict the player's arousal, and the difficulty of the game is adjusted accordingly. The game prompts the player to stay calm; the more aroused the player becomes, the harder the game gets. For example, an excited player will have reduced vision. Additionally, an NPC tells the player how they are performing. The game is developed in Unity3D and uses Unity's ML-Agents machine learning toolkit to analyze the biofeedback. The AI was trained with reinforcement learning on data obtained from the CMU Graphics Lab Motion Capture Database. The player's arousal is determined by the movement and velocity of the hands with respect to the head.

Introduction

The objective of this research was to experiment with different ways of obtaining biofeedback through virtual reality (VR) headsets and to use that feedback to modify the user experience so that each playthrough is unique and interesting for the player. Our experiment focuses on the feedback that can be obtained from the hand movements of the player and attempts to adjust the difficulty of the game based on it. Because user inputs (biofeedback) can be obtained through most standard VR headsets, the experiment places particular emphasis on merging VR with Artificial Intelligence (AI). The AI analyzes the user inputs and classifies them to build relationships between the virtual world and the movements of the player. The game we developed is a "Zombie-Shooter" in which the player must survive waves of incoming zombies. The difficulty of the game increases based on the arousal (excitement/nervousness) of the player as well as the progression of the game. The arousal of the player is determined by the AI through analysis of hand movements. Through this experiment we hope to better understand the interplay between these two emerging fields of technology: Virtual Reality and Artificial Intelligence.

Reinforcement Learning with Unity

Reinforcement learning is a form of machine learning in which an agent explores its environment in search of optimal solutions to non-trivial problems. The agent collects observations from the environment and produces a corresponding action. Based on the action's effectiveness, a reward is granted to either encourage or discourage it. Through an iterative process, the agent's policy converges on a set of actions that maximizes its rewards.

[Figure: the reinforcement learning loop of observations, actions, and rewards. Source: CS 294 Deep Reinforcement Learning (UC Berkeley)]

Unity has developed a framework for creating reinforcement learning environments in the Unity editor, which allows a developer to easily obtain observations from within a game. The framework uses the TensorFlow library to train a model that can later be imported into the game for inference. The framework offers a multitude of hyperparameters for tweaking the AI's training; we tuned these to obtain our best results.
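To give a sense of how observations, actions, and rewards are wired together in the framework, the sketch below shows a minimal Agent subclass. The class name and fields are ours, and the exact override signatures differ between ML-Agents releases (this assumes a recent 1.x-style API), so it should be read as a template rather than our exact training code.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;

// Illustrative minimal agent: collects observations each decision step,
// emits one continuous action (the excitement prediction), and is rewarded for it.
public class ExampleArousalAgent : Agent
{
    public Transform head;       // tracked HMD
    public Transform leftHand;   // tracked left controller
    public Transform rightHand;  // tracked right controller

    public override void CollectObservations(VectorSensor sensor)
    {
        // Observations appended here are fed to the policy network.
        sensor.AddObservation(head.InverseTransformPoint(leftHand.position));
        sensor.AddObservation(head.InverseTransformPoint(rightHand.position));
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // A single continuous action in [-1, 1]: the predicted excitement value.
        float prediction = actions.ContinuousActions[0];

        // The reward for this prediction is assigned here based on the
        // prediction error (see the Reward Function section below).
        AddReward(ComputeReward(prediction));
    }

    // Placeholder; the shaped reward r(d) is discussed in the Reward Function section.
    private float ComputeReward(float prediction)
    {
        return 0f;
    }
}
```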

Training Dataset

For our experiment, we required a dataset that was not only large but also varied. We decided that using an existing motion capture database would be beneficial because we could automate the extraction of the features we wished to use for training. Additionally, this allowed us to iterate on and compare various feature sets rapidly with minimal changes to our training scene. The Carnegie Mellon Graphics Lab Motion Capture Database contains 2605 scenarios organized by category. These scenarios are distributed in the text-based .asf/.amc format, which can be readily parsed to obtain the relative position and rotation of any joint in the body.

Alternatively, the motion captures are available in .fbx format, which is compatible with Unity's mecanim system. This proved beneficial because the movements are easily scaled to player-sized proportions. For this reason, and because we were only interested in the movement of the player's hands and head, the .fbx files were our best option. In Unity, the .fbx animations were added to a rigged humanoid model from which we captured the head and hands' positions and rotations.
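As a rough sketch (the component layout of our actual training scene may differ), the head and hand transforms can be pulled from a rigged humanoid through Unity's standard Animator bone lookup:

```csharp
using UnityEngine;

// Sketch: locate the head and hand bones of a humanoid rig so their positions
// and rotations can be sampled while an .fbx animation plays back.
public class MocapJointSampler : MonoBehaviour
{
    private Transform head;
    private Transform leftHand;
    private Transform rightHand;

    void Start()
    {
        // Humanoid (Mecanim) rigs expose their bones through the Animator.
        var animator = GetComponent<Animator>();
        head = animator.GetBoneTransform(HumanBodyBones.Head);
        leftHand = animator.GetBoneTransform(HumanBodyBones.LeftHand);
        rightHand = animator.GetBoneTransform(HumanBodyBones.RightHand);
    }

    void Update()
    {
        // Sample the joints as the animation plays; in a training scene these
        // values would be handed to the agent rather than logged.
        Debug.Log($"Head {head.position}  L {leftHand.position}  R {rightHand.position}");
    }
}
```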

Feature Selection

Careful consideration should be given to observing only the specific features that are absolutely necessary for the agent to achieve the desired result. A naive selection of features may lead to unpredictable and surprising results.

During our first experiments, we trained the agent on observations of the world-space position and rotation of the player's hands and head. This resulted in the agent incorrectly inferring player sentiment from the areas of the map in which the player stood and looked. A better solution is to transform the hand positions so that they are relative to the head position. Still, this leaves the problem of head rotation: a naive observation of head rotation may cause the agent to treat the user looking in different directions as significant.

For our final implementation, because we were ultimately interested in the player's hand jitter, we observed only the velocity of each hand relative to the head. To achieve this, we transformed the hand positions into the head's local space and normalized the vectors. We then took the difference between the current and previous position vectors and divided it by Time.fixedDeltaTime to obtain the velocity. It is important to use Time.fixedDeltaTime rather than Time.deltaTime because Unity runs up to 100 times faster than normal during training, which affects the time scale.
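A minimal sketch of that computation is shown below; the variable names are illustrative, but the important details are the head-relative transform, the normalization, and the use of Time.fixedDeltaTime inside FixedUpdate.

```csharp
using UnityEngine;

// Sketch: head-relative hand velocity, computed in FixedUpdate so that
// Time.fixedDeltaTime stays valid even at the accelerated training time scale.
public class RelativeHandVelocity : MonoBehaviour
{
    public Transform head;
    public Transform hand;

    private Vector3 previousRelative;

    void Start()
    {
        previousRelative = head.InverseTransformPoint(hand.position).normalized;
    }

    void FixedUpdate()
    {
        Vector3 currentRelative = head.InverseTransformPoint(hand.position).normalized;

        // Velocity of the hand expressed in the head's local frame.
        Vector3 velocity = (currentRelative - previousRelative) / Time.fixedDeltaTime;
        previousRelative = currentRelative;

        // velocity is what the agent observes for this hand.
    }
}
```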

Reward Function

Aside from feature selection, the reward function is the most important aspect of reinforcement learning. If the rewards are too sparse, the agent may have a difficult time determining what actions are actually relevant to achieving the desired goal. If the rewards are too frequent, then the agent may learn ways to exploit the system to rack up rewards.

Our dataset consists of scenarios which we manually tagged with a corresponding excitement value between -1.0 and 1.0. The goal of the agent is to make 50 observations of hand velocities and predict an excitement value. In a real-world application, it may be acceptable for the agent's prediction to be inexact as long as it falls within an acceptable threshold. In other words, if the agent guesses 0.9 when we expected 1.0, it should not be penalized as heavily as if it had guessed -0.2. Therefore, given a desired excitement value A and a prediction p, we require a reward function that approaches 1.0 as p approaches A, but approaches -1.0 otherwise.

We derived a reward function r(d) with this property, where d is the absolute difference between the predicted value and the expected value.
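The exact formula is not reproduced here, but a function with the required shape can be built from a simple exponential decay on d. The sketch below is only an illustration; in particular, the decay constant k is an assumption, not necessarily the value we used.

```csharp
using UnityEngine;

public static class ArousalReward
{
    // Hedged example with the required shape: r(0) = 1.0 and r(d) -> -1.0 as d grows.
    // The decay constant k is an illustrative assumption.
    public static float R(float predicted, float expected)
    {
        float d = Mathf.Abs(predicted - expected);  // absolute prediction error
        const float k = 3.0f;                       // assumed steepness of the falloff
        return 2.0f * Mathf.Exp(-k * d) - 1.0f;     // d = 0 -> 1.0, large d -> toward -1.0
    }
}
```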

Results

Our training strategy involved creating a curriculum in which the acceptable error threshold was tightened each time the agent's accuracy rose above 95%. This proved much more effective than simply training without a curriculum. Concretely, the agent received a sparse reward of 1.0 any time r(d) exceeded the current threshold, and a lesson was passed once this happened 95% of the time over 1000 steps. The lessons of the curriculum are as follows:

Lesson Threshold
1 0.5
2 0.6
3 0.65
4 0.7
5 0.75
6 0.8
7 0.825
8 0.85
9 0.9
10 0.91
11 0.93
12 0.95

We trained the agent with both a sparse reward and a frequent reward. The sparse reward of 1.0 was given only when the agent guessed within the desired threshold, otherwise 0.0. The frequent reward was given each step according to the result of the r(d) function. To our surprise, the sparse reward outperformed the frequent reward in both speed and accuracy.
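The sketch below contrasts the two schemes, reusing the illustrative r(d) from the previous section. The curriculum threshold is assumed to arrive as an ML-Agents environment parameter named "threshold"; both the parameter name and the API call are assumptions based on a recent release of the toolkit.

```csharp
using UnityEngine;
using Unity.MLAgents;

// Sketch of the two reward schemes compared above.
public static class RewardSchemes
{
    // Frequent reward: the shaped value r(d) is granted on every prediction.
    public static float Frequent(float d)
    {
        const float k = 3.0f;                     // illustrative decay constant
        return 2.0f * Mathf.Exp(-k * d) - 1.0f;
    }

    // Sparse reward: 1.0 only when r(d) clears the current curriculum threshold.
    public static float Sparse(float d)
    {
        float threshold = Academy.Instance.EnvironmentParameters
            .GetWithDefault("threshold", 0.5f);   // lesson 1 value from the table above
        return Frequent(d) >= threshold ? 1.0f : 0.0f;
    }
}
```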

The training curves showed that the agent trained with the sparse reward reached lesson 10 and then plateaued, while the frequent-reward agent could not pass lesson 8.

Arousal Prediction Library

We created a simple API for using the arousal detection model and integrating the agent into any existing Unity game. To use the agent, import the Unity package file from https://github.com/allanpichardo/dontpanic/releases/tag/v1.0 into an existing project. To receive predictions, create a script that extends the HandEmpathyAgent class. This abstract class declares an abstract method void OnNewPrediction(Vector3 inference), which is called whenever a prediction has been made and is ready to consume. Be advised that the library depends on the ML-Agents framework and the Vive Input Utility.
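A minimal consumer might look like the following; HandEmpathyAgent and OnNewPrediction come from the package, while the class name, the access modifiers, and the way the prediction is interpreted are illustrative assumptions.

```csharp
using UnityEngine;

// Illustrative consumer of the package: HandEmpathyAgent and OnNewPrediction come
// from the library described above; how the prediction is used here is an example.
public class DifficultyController : HandEmpathyAgent
{
    public override void OnNewPrediction(Vector3 inference)
    {
        float arousal = inference.x;   // assumed layout of the prediction vector
        Debug.Log($"Predicted arousal: {arousal}");
        // A game would use this value to adjust difficulty, e.g. zombie spawn rate.
    }
}
```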

Conclusion

To conclude, we succeeded in using AI to control the difficulty of a virtual reality game. This was achieved by analyzing the hand movements of the player and processing them with an ML-Agents brain trained by reinforcement learning in Unity3D. The brain makes predictions about the player's arousal based on the biofeedback it receives. Due to a time constraint of ten weeks, we had to abandon the idea of interpreting emotion more broadly based solely on the player's hands. We also did not have the opportunity to implement further NPC/player interaction or to merge our AI with those created by other teams. In a future experiment, it would be interesting to add another emotional dimension to obtain a better estimate. It would also be interesting to see how this approach works with other forms of biofeedback, such as heart rate or eye movement. We hope that our research provides some guidance to the next team of researchers.

References

Unity ML-Agents Toolkit. https://github.com/Unity-Technologies/ml-agents

CMU Graphics Lab Motion Capture Database. https://mocap.cs.cmu.edu. The data used in this project was obtained from this database, which was created with funding from NSF EIA-0196217.

Pierre Fournier, Mohamed Chetouani, Pierre-Yves Oudeyer, Olivier Sigaud. Accuracy-based Curriculum Learning in Deep Reinforcement Learning. https://arxiv.org/pdf/1806.09614.pdf

Jevgenij Martinkevic. Navigating in a Simulated Environment with Curriculum-based Reinforcement Learning. https://projekter.aau.dk/projekter/files/287611793/MS_Thesis_JM_2018.pdf
