## Starter / test code for RL group project using computer vision and custom environment

Set up a custom environment (Stable Baselines framework).

Included in the custom environment setup:
- Get observation (MSS: https://pypi.org/project/mss/1.0.2/)
    - Screen capture to observe the game
    - Run through OpenCV for preprocessing to get the game shape
- Send commands (pydirectinput: https://pypi.org/project/PyDirectInput/)
- Get game-over state with embedded ML (pytesseract OCR: https://pypi.org/project/pytesseract/)
    - Google's tesseract-ocr engine
    - Determine when the game is done by extracting text from the end-game state
- Reward function

Please play around with the code and get comfortable with the following libraries, as we will most likely use them to build and train our RL model for the game 'Getting Over It'. To begin, test the code with the Chrome dino game ( chrome://dino/ ) and adjust the game_location and game_over_location values to match your screen resolution. After this, check out the Scratch version of Getting Over It that Issac found: https://scratch.mit.edu/projects/389464290/
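
If hand-tuning the capture regions gets tedious, one option is to scale them from the resolution they were tuned on to your own. This is only a sketch: `scale_region` is a hypothetical helper, and the reference resolution of 3840x2160 is an assumption (the notebook does not record which screen the default values were picked on).

```python
def scale_region(region, from_size, to_size):
    """Scale an mss-style region dict from one screen size to another."""
    from_w, from_h = from_size
    to_w, to_h = to_size
    sx, sy = to_w / from_w, to_h / from_h
    return {
        'top': round(region['top'] * sy),
        'left': round(region['left'] * sx),
        'width': round(region['width'] * sx),
        'height': round(region['height'] * sy),
    }

# Assumed reference resolution; replace with the screen the values were tuned on.
reference_size = (3840, 2160)
game_location = {'top': 800, 'left': 0, 'width': 1200, 'height': 1000}

# Example: rescale the game region for a 1920x1080 screen.
scaled = scale_region(game_location, reference_size, (1920, 1080))
print(scaled)
```

The scaled values are a starting point only; check them against an actual capture, since the browser window position also affects where the game appears.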

The following code is derived from this RL YouTube course: https://www.youtube.com/watch?v=dWmJ5CXSKdw&t=30489s&ab_channel=NicholasRenotte

### Install and Import Dependencies

In [None]:
# install pytorch (larger download)
# !pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# install stable-baselines + protobuf
# !pip install stable-baselines3[extra] protobuf==3.20.*

# install mss - cross-platform multiple screenshot module in pure python
# !pip install --upgrade mss

# install pydirectinput
# !pip install pydirectinput

# install pytesseract (python wrapper for google tesseract; google tesseract must be installed before the wrapper)
#
# prereqs: python 3.6+, PIL or Pillow, google tesseract ocr (download binary from github)
#
# pip install --upgrade Pillow
# install google tesseract binary https://tesseract-ocr.github.io/tessdoc/Installation.html
# youtube install guide: https://www.youtube.com/watch?v=2kWvk4C1pMo&ab_channel=JayMartMedia
# 
# !pip install pytesseract

# install openai gym
# !pip install gym

In [None]:
import pydirectinput    # sending commands
import cv2              # frame processing
import pytesseract      # ocr for reading text -- game over
import time
import numpy as np
from mss import mss     # screen capture
from matplotlib import pyplot as plt

# environment
from gym import Env
from gym.spaces import Box, Discrete



### Build the Environment (Custom OpenAI Gym env)

In [None]:
class ScreenCaptureGame(Env):
    def __init__(self):
        super().__init__()
        # TODO: 
        # setup spaces 
        self.observation_space = Box(low=0, high=255, shape=(1, 83, 100), dtype=np.uint8)
        # TODO: 
        # number of actions
        self.action_space = Discrete(3)
        # define extraction params for game
        self.cap = mss()

        
        # TODO: adjust these values; they depend on screen resolution
        # location for game captures
        self.game_location = {'top': 800, 'left': 0, 'width': 1200, 'height': 1000}
        self.game_over_location = {'top': 950, 'left': 1000, 'width': 1200, 'height': 200}
        
    # TODO: determine actions, look into pydirectinput and test different action maps
    # hint: limit actions to improve performance, can we use x y mouse values to simulate cw and ccw rotation? what about jumping up?
    # step: this is what happens every frame
    def step(self, action):
        # TODO:
        # Action keys: 0 = space, 1 = down, 2 = no action
        action_map = {
            0: 'space',
            1: 'down',
            2: 'no_op'
        }
        if action != 2:
            pydirectinput.press(action_map[action])
        # check if game is done
        is_game_over, game_over_capture = self.get_game_over()
        # get new observation
        new_observation = self.get_observation()
        # TODO: what should our reward be? can we use pytesseract to extract height/score?
        # reward - get point for every frame
        reward = 1
        # info dictionary, required by stable baselines
        info = {}
        
        # classic gym API: (observation, reward, done, info)
        return new_observation, reward, is_game_over, info
        
    # TODO: can we improve this? do we need this? will this work better in a python script?
    # visualize game
    def render(self):
        cv2.imshow('Game', np.array(self.cap.grab(self.game_location))[:,:,:3])
        if cv2.waitKey(1) & 0xFF == ord('q'):
            self.close()
            
    # reset the game
    # TODO: how do we reset our game?
    def reset(self):
        time.sleep(1)
        pydirectinput.click(x=150, y=150)
        pydirectinput.press('space')
        return self.get_observation()
    
    # close render window
    def close(self):
        cv2.destroyAllWindows()
        
    def get_observation(self):
        # get screen cap
        raw = np.array(self.cap.grab(self.game_location))[:,:,:3]
        # grayscale 3 channels -> single channel
        gray = cv2.cvtColor(raw, cv2.COLOR_BGR2GRAY)
        # resize (width, height)
        resized = cv2.resize(gray, (100, 83))
        # add channels first --> required by stable baselines
        channel = np.reshape(resized, (1, 83, 100))
        return channel
    
    # TODO: how to determine game over? pytesseract to extract text data from screen?
    # hint: scratch game allows displaying game variable data
    # get game over text using OCR to signal game over
    def get_game_over(self):
        game_over_capture = np.array(self.cap.grab(self.game_over_location))[:,:,:3]
        is_game_over = False
        # game over text -- test and add possible misspelled 'game' results
        valid_strings = ['GAME', 'GAHE', 'GANE', 'GAAM']
        # OCR, takes image and extracts text (not 100% accurate)
        ocr_res = pytesseract.image_to_string(game_over_capture)[:4]
        if ocr_res in valid_strings:
            is_game_over = True
        return is_game_over, game_over_capture
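
The `valid_strings` list in `get_game_over` has to grow every time OCR misreads 'GAME' in a new way. A possible alternative is a fuzzy comparison using `difflib` from the standard library. This is a sketch only: `looks_like_game_over` and its `threshold` value are hypothetical, and the threshold would need tuning against real OCR output.

```python
from difflib import SequenceMatcher

def looks_like_game_over(ocr_text, target='GAME', threshold=0.75):
    """Return True if the start of the OCR text is close enough to the target word."""
    # Compare only the first len(target) characters, as the notebook does with [:4].
    prefix = ocr_text.strip().upper()[:len(target)]
    if not prefix:
        return False
    # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
    return SequenceMatcher(None, prefix, target).ratio() >= threshold

for sample in ['GAME OVER', 'GAHE', 'GAAM', '00123', '']:
    print(repr(sample), looks_like_game_over(sample))
```

With a threshold of 0.75, a four-letter reading with one wrong character still counts as a match, which covers the misspellings already listed in `valid_strings`.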
    

In [None]:
env = ScreenCaptureGame()

In [None]:
env.render()

In [None]:
env.close()

In [None]:
env.reset()

In [None]:
# env.get_observation()[0].shape
# plt.imshow(env.get_observation()[0])

# show game state - observation is single-channel grayscale, so convert gray -> RGB for display
plt.imshow(cv2.cvtColor(env.get_observation()[0], cv2.COLOR_GRAY2RGB))
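
To see what shapes `get_observation` works with, without needing a live screen capture, here is a NumPy-only sketch on a synthetic frame. It is not the OpenCV pipeline itself: `preprocess` is a hypothetical stand-in that uses a weighted sum for grayscale and nearest-neighbour sampling in place of `cv2.resize`, and the fake frame assumes the 4-channel BGRA layout that mss captures produce.

```python
import numpy as np

def preprocess(frame_bgra, out_width=100, out_height=83):
    """Turn a (H, W, 4) BGRA frame into a (1, out_height, out_width) uint8 array."""
    # Drop the alpha channel, same as the [:, :, :3] slice in the notebook.
    bgr = frame_bgra[:, :, :3].astype(np.float32)
    # Weighted sum in B, G, R order to get a single gray channel.
    gray = bgr @ np.array([0.114, 0.587, 0.299], dtype=np.float32)
    # Nearest-neighbour resize by index sampling (cv2.resize interpolates instead).
    rows = np.arange(out_height) * frame_bgra.shape[0] // out_height
    cols = np.arange(out_width) * frame_bgra.shape[1] // out_width
    small = np.clip(gray[rows][:, cols].round(), 0, 255)
    # Channels first: (1, height, width), matching observation_space.
    return small.astype(np.uint8).reshape(1, out_height, out_width)

# Synthetic frame with the same height/width as the default game_location.
fake_frame = np.random.randint(0, 256, size=(1000, 1200, 4), dtype=np.uint8)
obs = preprocess(fake_frame)
print(obs.shape, obs.dtype)
```

Note the ordering: `cv2.resize` takes `(width, height)`, but the resulting array is indexed `(height, width)`, which is why the observation shape is `(1, 83, 100)`.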


In [None]:
plt.imshow(np.array(env.get_game_over()[1]) )

In [None]:
env.action_space.sample()

In [None]:
is_game_over, game_over_capture = env.get_game_over()

is_game_over

In [None]:
plt.imshow(env.observation_space.sample()[0])

### Test Env

In [None]:
env = ScreenCaptureGame()

In [None]:
observation = env.get_observation()

In [None]:
plt.imshow(cv2.cvtColor(observation[0], cv2.COLOR_GRAY2RGB))

In [None]:
is_game_over, game_over_capture = env.get_game_over()

#is_game_over
# plt.imshow(game_over_capture)
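
Before training, it can help to run a few episodes with random actions to check that reset, step, and the game-over signal work together. The loop below is a sketch under assumptions: it expects `step` to return `(observation, reward, done, info)` as the classic Gym API does, and it uses `StubEnv`, a hypothetical stand-in, so it runs without a game on screen. With the real environment you would pass `ScreenCaptureGame()` instead.

```python
import random

class StubEnv:
    """Hypothetical stand-in with the same reset/step contract as the custom env."""
    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.frames = 0

    def reset(self):
        self.frames = 0
        return [0]  # placeholder observation

    def step(self, action):
        self.frames += 1
        done = self.frames >= self.episode_length  # fake game over
        return [self.frames], 1, done, {}

def run_episode(env, n_actions=3, max_steps=1000):
    """Play one episode with random actions and return the total reward."""
    env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = random.randrange(n_actions)  # stands in for env.action_space.sample()
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

print(run_episode(StubEnv(episode_length=5)))
```

The `max_steps` cap is there so a missed game-over detection cannot leave the loop sending key presses forever.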

### TODO: Train Model

### TODO: Test Model