<a href="https://colab.research.google.com/github/hmukesh5/dino-ai/blob/main/dinoai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Hemanth's Chrome-Dino AI**
Welcome to my implementation of an AI that plays the Chrome Dino game! I'm planning to train a Deep Q-Network (DQN) agent on the official chrome://dino game, using screenshots as observations. Hopefully this produces a model that can achieve crazy high scores! Similar projects are out there, but most either don't use the official dino game or aren't trained very well.

I'm using this project mainly as a way to get experience with Machine Learning and Google Colab / Jupyter Notebook.

Inspired by [this interesting YouTube video](https://www.youtube.com/watch?v=DcYLT37ImBY). Concepts were also heavily drawn from [this one](https://www.youtube.com/watch?v=vahwuupy81A).

**This project is licensed under the GNU GPL 3.0 license. Please cite this repo as the source when copying/modifying/publishing this code.**

## **Development Terminal**


## **OS Check**

Below is the current version of the runtime we are using. This is useful if I want to move to another server, and it also confirms that PyVirtualDisplay (a Linux-only library) will work.

In [None]:
!cat /etc/os-release

PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
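
The same check can be made from Python, which is handy if the notebook ever moves to another server. This is a minimal sketch using only the standard library; the function name `is_linux_runtime` is my own and not part of any library.

```python
import platform


def is_linux_runtime() -> bool:
    """Return True when this notebook is running on Linux."""
    return platform.system() == "Linux"


if not is_linux_runtime():
    # PyVirtualDisplay needs an X server backend such as Xvfb.
    print("Warning: PyVirtualDisplay will likely not work on", platform.system())
```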


## **Dependencies and Downloads**
First, I'm going to install some dependencies. A lot of these come pre-installed in Colab, so if I take this code somewhere else, I might have to add more to this section.

In [1]:
!pip install selenium pyvirtualdisplay gymnasium scikit-image stable-baselines3
!sudo apt-get install -y xvfb

# i thought i needed these for chromedriver, guess not lol
# !apt-get install -y chromium-chromedriver
# !cp /usr/lib/chromium-browser/chromedriver /usr/bin

Collecting selenium
  Downloading selenium-4.22.0-py3-none-any.whl (9.4 MB)
Collecting pyvirtualdisplay
  Downloading PyVirtualDisplay-3.0-py3-none-any.whl (15 kB)
Collecting gymnasium
  Downloading gymnasium-0.29.1-py3-none-any.whl (953 kB)
Collecting stable-baselines3
  Downloading stable_baselines3-2.3.2-py3-none-any.whl (182 kB)
Collecting trio~=0.17 (from selenium)
  Downloading trio-0.25.1-py3-none-any.whl (467 kB)
Collecting trio-websocket~=0.9 (from selenium)
  Downloading trio_websocket
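
If this notebook runs somewhere other than Colab, a quick way to see what still needs installing is to ask Python which modules it can find. A small sketch, assuming the import names listed below (they differ from some pip package names); `missing_modules` is a hypothetical helper of mine.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this runtime."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names, not pip names: scikit-image -> skimage,
# stable-baselines3 -> stable_baselines3, opencv -> cv2, Pillow -> PIL.
required = ["selenium", "pyvirtualdisplay", "gymnasium", "skimage",
            "stable_baselines3", "cv2", "numpy", "PIL", "matplotlib"]
print("missing:", missing_modules(required))
```

Anything printed in the list would need a matching `pip install` line in the cell above.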

## **Environment**
Here, I'm going to make a Gymnasium environment (Gymnasium is the maintained successor to OpenAI Gym) for the agent to interact with.
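
For reference, the interface such an environment has to provide is small: `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. Below is a toy stand-in with the same shape. It is only a sketch: `CountdownEnv` and its dynamics are made up for illustration, and it deliberately skips `gym.Env` and the spaces so it runs without any dependencies. The reward values match the ones `DinoEnv` uses.

```python
class CountdownEnv:
    """Toy environment (hypothetical): the episode ends after `lives` steps."""

    def __init__(self, lives=3):
        self.lives = lives
        self.remaining = lives

    def reset(self, seed=None, options=None):
        # Gymnasium convention: return (observation, info).
        self.remaining = self.lives
        return self.remaining, {}

    def step(self, action):
        # Gymnasium convention: (observation, reward, terminated, truncated, info).
        self.remaining -= 1
        terminated = self.remaining <= 0
        reward = -10 if terminated else 1
        return self.remaining, reward, terminated, False, {}


toy_env = CountdownEnv(lives=3)
obs, info = toy_env.reset()
total = 0
terminated = False
while not terminated:
    obs, reward, terminated, truncated, info = toy_env.step(0)
    total += reward
print("episode return:", total)
```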

### imports

In [2]:
import gymnasium as gym

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--disable-infobars")
chrome_options.add_argument("--disable-notifications")
chrome_options.add_argument("--mute-audio")
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--no-default-browser-check")
chrome_options.add_argument("--no-first-run")
chrome_options.add_argument("--bwsi")
chrome_options.add_argument("--force-dark-mode")

from pyvirtualdisplay import Display
from pyvirtualdisplay.smartdisplay import SmartDisplay

import cv2
import numpy as np
import time

from PIL.ImageGrab import grab

from matplotlib import pyplot as plt

from skimage.metrics import structural_similarity as ssim

### env class

In [15]:
class DinoEnv(gym.Env):

  def __init__(self):
    super().__init__()
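    # Gymnasium expects the key 'render_modes' (the old Gym key was 'render.modes')
    self.metadata = {
        'render_modes': ['rgb_array']
    }
    self.render_mode = 'rgb_array'
    self.resize_len = 140     # observation height in pixels
    self.resize_width = 490   # observation width in pixels

    self.action_space = gym.spaces.Discrete(3)   # 0 = no-op, 1 = jump, 2 = duck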
    self.observation_space = gym.spaces.Box(
        low=0,
        high=255,
        shape=(1, self.resize_len, self.resize_width),
        dtype=np.uint8
    )

    # bounding box
    #          left, top, right, bottom
    self.gamebox = (20, 200, 720, 390)
    self.donebox = (270, 250, 390, 275)
    self.doneimg = None

    self.disp = SmartDisplay(visible=False, size=(800, 800))
    self.disp.start()
    self.driver = webdriver.Chrome(options=chrome_options)
    self.driver.set_network_conditions(
        offline=True,
        latency=0,
        download_throughput=0,
        upload_throughput=0
    )
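    try:
      self.driver.get('chrome://dino')
    except Exception:
      # loading chrome://dino while offline may raise even though the game page loads
      pass
    self.body = self.driver.find_element(By.CSS_SELECTOR, 'body')

    # start a game, wait for the run to end, then capture the game-over screen
    print("running first time...")
    self._jump()
    self._noop()
    time.sleep(6)
    self._get_done()   # first call stores the reference image in self.doneimg
    print("finished init")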


  def step(self, action):
    if action == 0:
      self._noop()
    elif action == 1:
      self._jump()
    elif action == 2:
      self._down()

    next_observation = self._get_observation()

    is_done = bool(self._get_done())

    truncated = False
    reward = -10 if is_done else 1
    info = {}

    return next_observation, reward, is_done, truncated, info

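  def reset(self, seed=None, options=None):
    super().reset(seed=seed)   # seeds self.np_random, as the Gymnasium API expects
    time.sleep(0.5)
    self._jump()               # space restarts the game after a crash
    self._noop()
    time.sleep(0.05)
    first_observation = self._get_observation()
    info = {}
    return first_observation, info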

  def render(self):
    if self.render_mode == 'rgb_array':
      img = np.array(grab(xdisplay=self.disp.new_display_var, bbox=self.gamebox))
      return img

  def close(self):
    if self.driver is not None:
      self.driver.quit()
      self.driver = None
      print("driver quit")
    else:
      print("driver already quit")
    if self.disp is not None:
      self.disp.stop()
      self.disp = None
      print("display quit")
    else:
      print("display already quit")

  def curr_score(self):
    score = self.driver.execute_script("return Runner.instance_.distanceRan * Runner.instance_.distanceMeter.config.COEFFICIENT")
    return int(round(score))

  def curr_high_score(self):
    high_score = self.driver.execute_script("return Runner.instance_.highestScore * Runner.instance_.distanceMeter.config.COEFFICIENT")
    return int(round(high_score))

  def _get_observation(self):
    img = np.array(grab(xdisplay=self.disp.new_display_var, bbox=self.gamebox))
    grayscale = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    blurred = cv2.GaussianBlur(grayscale, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    resized = cv2.resize(edges, (self.resize_width, self.resize_len))
    expanded = np.expand_dims(resized, axis=0)
    return expanded

  def _get_done(self) -> bool:
    img = np.array(grab(xdisplay=self.disp.new_display_var, bbox=self.donebox))
    grayscale = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    filtered = cv2.bilateralFilter(grayscale, 9, 75, 75)
    post_done_img = filtered

    if self.doneimg is None:
      self.doneimg = post_done_img
      return False

    score = ssim(post_done_img, self.doneimg)
    return score > 0.9

  def _send_key_event(self, event_type, key):
    # event_type must be keyUp or keyDown
    code = {'space': 32, 'down': 40}.get(key, 0)
    self.driver.execute_cdp_cmd('Input.dispatchKeyEvent', {
        'type': event_type,
        'windowsVirtualKeyCode': code,
        'nativeVirtualKeyCode': code,
        'text': ""
    })

  def _jump(self):
    self._send_key_event('keyUp', 'down')
    self._send_key_event('keyDown', 'space')

  def _down(self):
    self._send_key_event('keyUp', 'space')
    self._send_key_event('keyDown', 'down')

  def _noop(self):
    self._send_key_event('keyUp', 'space')
    self._send_key_event('keyUp', 'down')

  def _test_action(self):
    start_time = time.time()
    self._jump()
    self._noop()
    end_time = time.time()
    print("time taken:", end_time - start_time)
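
The game-over check in `_get_done` compares the current crop of the screen with a reference crop stored on the first call, and treats an SSIM score above 0.9 as "game over". To give a feel for what that score measures, here is a simplified, dependency-free sketch that computes SSIM once over the whole image. This is not what `skimage.metrics.structural_similarity` does internally (it averages the score over small sliding windows); `global_ssim` is a hypothetical helper, the pixel lists are made-up data, and the constants follow the usual SSIM definition for 8-bit images.

```python
from statistics import fmean


def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM between two equally sized lists of pixel values."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    c1 = (0.01 * data_range) ** 2   # stabilises the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilises the contrast/structure term
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean((x - mu_a) ** 2 for x in a)
    var_b = fmean((y - mu_b) ** 2 for y in b)
    cov = fmean((x - mu_a) * (y - mu_b) for x, y in zip(a, b))
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


game_over = [0, 255, 255, 0, 0, 255, 0, 255]   # flattened reference crop (made up)
same_crop = list(game_over)                    # identical to the reference
running = [255, 0, 0, 255, 255, 0, 255, 0]     # inverted pattern

print(global_ssim(game_over, same_crop) > 0.9)
print(global_ssim(game_over, running) > 0.9)
```

An identical crop scores 1, while a crop with a different structure scores far lower, which is why a fixed threshold can separate the two cases.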




## **Agent**
Here is the DQN agent that will learn to play the game.

First, import what's needed and test the environment.

In [4]:
import os
from stable_baselines3.common.callbacks import BaseCallback
from stable_baselines3.common import env_checker
from stable_baselines3 import DQN

In [16]:
# test env from stable baselines 3
env = DinoEnv()
env_checker.check_env(env)

running first time...
finished init


I'm going to write a callback to periodically log and save the model.

In [None]:
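# write callback function here
class CustomCallback(BaseCallback):
  def _init_callback(self) -> None:
    pass

  def _on_step(self) -> bool:
    # TODO: periodic logging and saving goes here
    return True   # a falsy return value (including None) stops training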
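
The callback skeleton above still needs its "periodically log and save" logic. One possible shape for it, kept free of Stable-Baselines3 so it can be tried on its own, is a small helper that decides from the step counter when a checkpoint is due and what to call it. `CheckpointSchedule`, `save_freq`, and the path layout are all my assumptions. Inside `_on_step` it could be driven by `self.n_calls` and followed by `self.model.save(path)`. Stable-Baselines3 also ships a ready-made `CheckpointCallback`, which may make a custom one unnecessary.

```python
import os


class CheckpointSchedule:
    """Decide when a checkpoint is due (hypothetical helper)."""

    def __init__(self, save_freq, save_dir="checkpoints", prefix="dino_dqn"):
        if save_freq <= 0:
            raise ValueError("save_freq must be positive")
        self.save_freq = save_freq
        self.save_dir = save_dir
        self.prefix = prefix

    def path_if_due(self, n_calls):
        """Return a checkpoint path every save_freq calls, otherwise None."""
        if n_calls > 0 and n_calls % self.save_freq == 0:
            return os.path.join(self.save_dir, f"{self.prefix}_{n_calls}_steps")
        return None


schedule = CheckpointSchedule(save_freq=1000)
for n_calls in (1, 999, 1000, 2000):
    print(n_calls, schedule.path_if_due(n_calls))
```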

In [None]:
callback = CustomCallback()

Here's where the magic will happen

In [None]:
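# DQN needs the environment; the callback is passed to learn(), not the constructor.
# The default replay buffer (1,000,000 transitions) may be too large for Colab RAM here.
model = DQN('CnnPolicy', env)

In [None]:
# total_timesteps is required; 100_000 is a placeholder, not a tuned value
model.learn(total_timesteps=100_000, callback=callback)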

## **Test Trained Model**

In [None]:
num_runs = 1
for run in range(num_runs):
  obs, _ = env.reset()
  done = False
  avg_time = 0
  timesteps = 0

  while not done:
    start_time = time.time()
    # action = env.action_space.sample()   # uncomment for a random agent
    action, _ = model.predict(obs)
    # store the new observation in obs so the next prediction sees the current frame
    obs, reward, done, truncated, info = env.step(action)
    end_time = time.time()
    avg_time += end_time - start_time
    timesteps += 1

  print("Score:", env.curr_score())
  print("High Score:", env.curr_high_score())
  print("seconds per frame:", avg_time / timesteps)
  print("fps:", timesteps / avg_time)

In [None]:
env.close()