# Reinforcement Learning With Python | Part 1 | Creating The Environment

Blog Link: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/

Explanation of the game rules | The game played by a human
:-------------------------:|:-------------------------:
![](images/gifs/envExp.gif)  |  ![](images/gifs/EnvPlayed.gif)

In this tutorial series, we go through every step of building an expert Reinforcement Learning (RL) agent that is capable of playing games.

**This series is divided into three parts:**
- **Part 1:** Designing and Building the Game Environment. In this part, we build a game environment and customize it so that the RL agent can train on it.
- **Part 2:** Build and Train the Deep Q Neural Network (DQN). In this part, we define and build the different layers of the DQN and train it.
- **Part 3:** Test and Play the Game.

We might also try making another simple game environment and using Q-Learning to create an agent that can play it.

## Designing the Environment:

For this environment, we want the agent to develop a sense of its body and how to change its body features to avoid losing the game.

### First: The Elements of The Environment:
The Elements of The Environment |  .
:-------------------------:|:-------------------------:
<img src="images/EnvExp.jpg" width="100%">

#### 1- The Field:
Contains all the other elements. We represent it in code with a class named "Field", as follows:

In [19]:
class Field:
    def __init__(self, height=10, width=5):
        self.width      = width
        self.height     = height
        self.body       = np.zeros(shape=(self.height, self.width))
    
    def update_field(self,walls, player):
        try:
            # Clear the field:
            self.body = np.zeros(shape=(self.height, self.width))
            # Put the walls on the field:
            for wall in walls:
                if not wall.out_of_range :
                    self.body[wall.y:min(wall.y+wall.height,self.height),:] = wall.body

            # Put the player on the field:
            self.body[player.y:player.y+player.height,
                      player.x:player.x+player.width] += player.body 
        except :
            pass

**Field attributes:**

* ***width (int)*** : the width of the field (not in pixels)

* ***height (int)*** : the height of the field (not in pixels)

* ***body (np.array)*** : holds the array representation of the game elements (player and walls) 

This array is passed to the DQN, and also used to draw the interface using pygame.
<br/><br/>
**Field methods:**

* ***update_field***(self,walls, player) : clears the field, then draws the walls that are still in range and the player onto it.
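
As a rough illustration of how `update_field` composes the array, the sketch below rebuilds the same idea with plain NumPy. The field size, the hole position, and the variable names are made up for this example; only the body values (1 for walls, 2 for the player) come from the classes in this tutorial.

```python
import numpy as np

# Illustrative sizes only; the real values come from the Environment class.
field_height, field_width = 6, 5
wall_unit, player_unit = 1, 2          # body_unit values used by Wall and Player

body = np.zeros((field_height, field_width))

# A one-row wall at y = 3 with a two-cell hole at columns 1 and 2.
wall_row = np.ones(field_width) * wall_unit
wall_row[1:3] = 0
body[3, :] = wall_row

# A one-row player of width 2 at x = 2, on the same row as the wall.
# Column 2 is inside the hole, column 3 overlaps the wall.
body[3, 2:4] += player_unit

# A cell equal to wall_unit + player_unit means the player overlaps a wall.
collision = (wall_unit + player_unit) in body
print(body[3], collision)
```

Because the player is added on top of the wall values, an overlap shows up as the sum of the two body units, which is how the environment later detects a hit.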

#### 2- The Walls:

In [20]:
class Wall:        
    def __init__(self, height = 5, width=100,  hole_width = 20,
                 y = 0, speed = 1, field = None):
        self.height       = height
        self.width        = width
        self.hole_width   = hole_width
        self.y            = y
        self.speed        = speed
        self.field        = field
        self.body_unit     = 1
        self.body         = np.ones(shape = (self.height, self.width))*self.body_unit
        self.out_of_range = False
        self.create_hole()
    def create_hole(self):
        # Zero out a randomly placed span of columns to form the hole.
        hole_pos = randint(0,self.width-self.hole_width)
        self.body[ : , hole_pos:hole_pos+self.hole_width] = 0
    def move(self):
        # Move the wall down and flag it once it extends past the bottom of the field.
        self.y += self.speed
        self.out_of_range = (self.y + self.height) > self.field.height

The Wall |  .
:-------------------------:|:-------------------------:
<img src="images/wall.jpg" width="100%">

**Wall attributes:**

|Attribute   |Type       |Description                                                                         |
|------------|-----------|------------------------------------------------------------------------------------|
|height      |int        |the wall's height                                                                   |
|width       |int        |the wall's width ( the same value as the field's width)                             |
|hole_width  |int        |the hole's width (max value of hole_width should be field.width or wall.width)      |
|y           |int        |the vertical coordinate of the wall (y axis) (max value of y should be field.height)|
|speed       |int        |speed of the wall (rows/step)                                                       |
|field       |Field      |the field that contains the wall                                                    |
|body_unit   |int ; float|the number used to represent the wall in the array representation (in field.body)   |
|body        |np.array   |the wall's body                                                                     |
|out_of_range|bool       |A flag used to delete the wall when it moves out of the field range.                |

**Wall methods:**

* ***create_hole***(self): Creates a hole in the wall whose width equals self.hole_width.
* ***move***(self): Moves the wall vertically (each call moves the wall down by n rows, where n = self.speed).
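
The two wall behaviours can be tried out in isolation. The following is a minimal sketch, not the `Wall` class itself: `make_wall_body` is a hypothetical helper, and the loop only mimics the `out_of_range` check with the default sizes used later by the environment (wall height 2, field height 20).

```python
import numpy as np
from random import randint

def make_wall_body(height, width, hole_width, body_unit=1):
    """Return a wall array with one randomly placed hole (hypothetical helper)."""
    body = np.ones((height, width)) * body_unit
    hole_pos = randint(0, width - hole_width)
    body[:, hole_pos:hole_pos + hole_width] = 0
    return body

wall = make_wall_body(height=2, width=10, hole_width=4)

# Move a wall of height 2 down a field of height 20, one row per move,
# until its lower edge would pass the bottom of the field.
y, speed, wall_height, field_height = 0, 1, 2, 20
moves = 0
while not (y + wall_height > field_height):
    y += speed
    moves += 1

print(wall)
print(moves, y)
```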

#### 3- The Player :

In [21]:
class Player:
    def __init__(self, height = 5, max_width = 10 , width=2,
                 x = 0, y = 0, speed = 2):
        self.height        = height
        self.max_width     = max_width
        self.width         = width
        self.x             = x
        self.y             = y
        self.speed         = speed
        self.body_unit     = 2
        self.body          = np.ones(shape = (self.height, self.width))*self.body_unit
        self.stamina       = 20
        self.max_stamina   = 20
    def move(self, field, direction = 0 ):
        '''
        Moves the player :
         - No change          = 0
         - left, if direction  = 1
         - right, if direction = 2
        '''
        val2dir   = {0:0 , 1:-1 , 2:1}
        direction = val2dir[direction]
        next_x = (self.x + self.speed*direction)
        if not (next_x + self.width > field.width or next_x < 0):
            self.x += self.speed*direction
            self.stamina -= 1 
    def change_width(self, action = 0):
        '''
        Change the player's width:
         - No change          = 0
         - narrow by one unit = 3
         - widen by one unit  = 4
        '''
        val2act   = {0:0 , 3:-1 , 4:1}
        action = val2act[action]
        new_width = self.width+action
        player_end = self.x + new_width
        if new_width <= self.max_width and new_width > 0 and player_end <= self.max_width:
            self.width = new_width
            self.body  = np.ones(shape = (self.height, self.width))*self.body_unit

**Player attributes:**

|Attribute   |Type       |Description                                                                         |
|------------|-----------|------------------------------------------------------------------------------------|
|height      |int        |player's height                                                                     |
|max_width   |int        |player's maximum width (must be less than or equal to field.width)                  |
|width       |int        |player's width (must be less than or equal to max_width and greater than 0)         |
|x           |int        |player's x coordinate in the field                                                  |
|y           |int        |player's y coordinate in the field                                                  |
|speed       |int        |player's speed (how many horizontal units it moves per step)                        |
|body_unit   |int ; float|the number used to represent the player in the array representation (in field.body) |
|body        |np.array   |the player's body                                                                   |
|stamina     |int ; float|player's energy (stamina) (when a player's energy hits zero the player dies)        |
|max_stamina |int ; float|maximum value for player's stamina                                                  |

**Player methods:**
* ***move***(self, field, direction = 0 ): Moves the player :
    - direction = 0 -> No change 
    - direction = 1 -> left 
    - direction = 2 -> right
* ***change_width***(self, action = 0):
    - action = 0 -> No change
    - action = 3 -> narrow by one unit
    - action = 4 -> widen by one unit
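
To make the bounds checks easier to follow, here is a small sketch that expresses the same rules as pure functions. `try_move` and `try_resize` are hypothetical names introduced for this example, and the stamina cost of moving is left out.

```python
def try_move(x, width, field_width, speed, direction_code):
    """Return the new x, or the old x if the move would leave the field."""
    step = {0: 0, 1: -1, 2: 1}[direction_code]   # same codes as Player.move
    next_x = x + speed * step
    if next_x < 0 or next_x + width > field_width:
        return x
    return next_x

def try_resize(x, width, max_width, action_code):
    """Return the new width, or the old width if the change is not allowed."""
    step = {0: 0, 3: -1, 4: 1}[action_code]      # same codes as Player.change_width
    new_width = width + step
    if 0 < new_width <= max_width and x + new_width <= max_width:
        return new_width
    return width

print(try_move(x=0, width=2, field_width=10, speed=1, direction_code=1))  # blocked at the left edge
print(try_move(x=0, width=2, field_width=10, speed=1, direction_code=2))  # moves right
print(try_resize(x=8, width=2, max_width=10, action_code=4))              # blocked at the right edge
```

An invalid request leaves the player unchanged rather than raising an error, so the agent can pick any action at any time.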

---

## The "Environment" Class :
This class facilitates communication between the environment and the agent. It is designed to work with either an RL agent or a human player.

### Main Components Needed by the RL Agent:
- ***ENVIRONMENT_SHAPE*** attribute : used by the DQN to set the shape of the input layer.
- ***ACTION_SPACE*** attribute : used by the DQN to set the shape of the output layer.
- ***PUNISHMENT*** and ***REWARD*** : set the values of both punishment and reward, used to train the agent (we use these values to tell the agent if its previous actions were good or bad).
- ***reset*** method : to reset the environment.
- ***step*** method: takes an action as an argument and returns the next state, the reward, and a boolean variable named game_over that tells us whether the game is over (the player lost) or not.

This environment is no different from other RL environments: it provides all of the required components and more.
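
The sketch below shows how an agent typically drives such an interface. `ToyEnvironment` is a hypothetical stand-in that only mimics the `reset`/`step` contract described above; it does not implement the game, and the random action choice stands in for a trained DQN.

```python
import random

class ToyEnvironment:
    """Hypothetical stand-in that mimics the reset/step interface only."""
    ACTION_SPACE = [0, 1, 2, 3, 4]
    REWARD = 10
    PUNISHMENT = -100

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        self.steps = 0
        return [0.0]                               # placeholder observation

    def step(self, action):
        self.steps += 1
        game_over = self.steps >= self.max_steps   # the toy game always ends after max_steps
        reward = self.PUNISHMENT if game_over else self.REWARD
        return [float(self.steps)], reward, game_over

env = ToyEnvironment()
state = env.reset()
total_reward, game_over = 0, False
while not game_over:
    action = random.choice(env.ACTION_SPACE)       # a real agent would query its DQN here
    state, reward, game_over = env.step(action)
    total_reward += reward

print(total_reward)
```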

In [22]:
class Environment:
    P_HEIGHT      = 2  # Height of the player
    F_HEIGHT      = 20 # Height of the field
    W_HEIGHT      = 2  # Height of the walls
    WIDTH         = 10 # Width of the field and the walls
    MIN_H_WIDTH   = 2  # Minimum width of the holes
    MAX_H_WIDTH   = 6  # Maximum width of the holes
    MIN_P_WIDTH   = 2  # Minimum Width of the player
    MAX_P_WIDTH   = 6  # Maximum Width of the player
    HEIGHT_MUL    = 30 # Height Multiplier (used to draw np.array as blocks in pygame )
    WIDTH_MUL     = 40 # Width Multiplier (used to draw np.array as blocks in pygame )
    WINDOW_HEIGHT = (F_HEIGHT+1) * HEIGHT_MUL # Height of the pygame window
    WINDOW_WIDTH  = (WIDTH) * WIDTH_MUL       # Width of the pygame window
    
    ENVIRONMENT_SHAPE = (F_HEIGHT,WIDTH,1)
    ACTION_SPACE      = [0,1,2,3,4]
    ACTION_SPACE_SIZE = len(ACTION_SPACE)
    PUNISHMENT        = -100  # Punishment increment
    REWARD            = 10    # Reward increment
    score             = 0     # Initial Score
    
    MOVE_WALL_EVERY   = 4     # Every how many frames the wall moves.
    WALL_SPEED        = MOVE_WALL_EVERY  # Name read by step(); can be overridden per instance
    MOVE_PLAYER_EVERY = 1     # Every how many frames the player moves.
    frames_counter    = 0

    def __init__(self):
        # Colors:
        self.BLACK      = (25,25,25)
        self.WHITE      = (255,255,255)
        self.RED        = (255, 80, 80)
        self.BLUE       = (80, 80, 255)
        self.field = self.walls = self.player = None
        self.current_state = self.reset()
        self.val2color  = {0:self.WHITE, self.walls[0].body_unit:self.BLACK,
                           self.player.body_unit:self.BLACK, self.MAX_VAL:self.RED}
    def reset(self):
        self.score          = 0
        self.frames_counter = 0
        self.game_over      = False
        
        self.field = Field(height=self.F_HEIGHT, width=self.WIDTH )
        w1    = Wall( height = self.W_HEIGHT, width=self.WIDTH,
                      hole_width = randint(self.MIN_H_WIDTH,self.MAX_H_WIDTH),
                     field = self.field)
        self.walls = deque([w1])
        p_width = randint(self.MIN_P_WIDTH,self.MAX_P_WIDTH)
        self.player    = Player( height = self.P_HEIGHT, max_width = self.WIDTH,
                                width = p_width,
                                x = randint(0,self.field.width-p_width),
                                y = int(self.field.height*0.7), speed = 1)
        self.MAX_VAL = self.player.body_unit + w1.body_unit
        # Update the field :
        self.field.update_field(self.walls, self.player)
        
        observation = self.field.body/self.MAX_VAL
        return observation
    def print_text(self, WINDOW = None, text_cords = (0,0), center = False,
                   text = "", color = (0,0,0), size = 32):
        pygame.init()
        font = pygame.font.Font('freesansbold.ttf', size) 
        text_to_print = font.render(text, True, color) 
        textRect = text_to_print.get_rect()
        if center:
            textRect.center = text_cords
        else:
            textRect.x = text_cords[0]
            textRect.y = text_cords[1]
        WINDOW.blit(text_to_print, textRect)
        
    def step(self, action):
        global score_increased

        self.frames_counter += 1
        reward = 0

        # If the performed action is (move) then player.move method is called:
        if action[0] in [1,2]:
            self.player.move(direction = action[0], field = self.field)
        # If the performed action is (change_width) then player.change_width method is called:
        if action[1] in [3,4]:
            self.player.change_width(action = action[1])                
        
        # Move the wall one step (one step every WALL_SPEED frames):
        if self.frames_counter % self.WALL_SPEED == 0:
            # move the wall one step
            self.walls[-1].move()
            # reset the frames counter
            self.frames_counter = 0
        
        # Update the field :
        self.field.update_field(self.walls, self.player)

        # If the player passed a wall successfully, add REWARD to the reward and the score
        if ((self.walls[-1].y) == (self.player.y + self.player.height)) and not score_increased :
            reward += self.REWARD
            self.score  += self.REWARD
            
            # Increase player's stamina every time it passed a wall successfully  
            self.player.stamina = min(self.player.max_stamina, self.player.stamina+10)
            # score_increased : a flag to make sure that reward increases once per wall 
            score_increased = True
            
        
        #  Lose Conditions : 
        # C1 : The player hits a wall
        # C2 : Player's width was far thinner than hole's width
        # C3 : Player fully consumed its stamina (energy)
        lose_conds = [self.MAX_VAL in self.field.body,
                      ((self.player.y == self.walls[-1].y) and (self.player.width < (self.walls[-1].hole_width-1))),
                      self.player.stamina <=0]
        

        # If one or more lose conditions are met, the game ends:
        if True in lose_conds:
            reward = self.PUNISHMENT
            self.game_over = True
            return self.field.body/self.MAX_VAL, reward, self.game_over

        # Check if a wall moved out of the scene:
        if self.walls[-1].out_of_range:
            # Create a new wall
            self.walls[-1] = Wall( height = self.W_HEIGHT, width = self.WIDTH,
                                   hole_width = randint(self.MIN_H_WIDTH,self.MAX_H_WIDTH),
                                   field = self.field)

            score_increased = False

        
        # Return New Observation , reward, game_over(bool)
        return self.field.body/self.MAX_VAL, reward, self.game_over
    
    def render(self, WINDOW = None, human=False):
        if human:
            ################ Check Actions #####################
            action = 0
            events = pygame.event.get()
            for event in events:
                if event.type == pygame.QUIT:
                    self.game_over = True
                if event.type == pygame.KEYDOWN:
                    if event.key == pygame.K_LEFT:
                        action = 1
                    if event.key == pygame.K_RIGHT:
                        action = 2

                    if event.key == pygame.K_UP:
                        action = 4
                    if event.key == pygame.K_DOWN:
                        action = 3
            ################## Step ############################            
            # step() indexes action[0] and action[1], so pass the key's action in both slots.
            _,reward, self.game_over = self.step([action, action])
        ################ Draw Environment ###################
        WINDOW.fill(self.WHITE)
        self.field.update_field(self.walls, self.player)
        for r in range(self.field.body.shape[0]):
            for c in range(self.field.body.shape[1]):
                pygame.draw.rect(WINDOW,
                                 self.val2color[self.field.body[r][c]],
                                 (c*self.WIDTH_MUL, r*self.HEIGHT_MUL, self.WIDTH_MUL, self.HEIGHT_MUL))

        self.print_text(WINDOW = WINDOW, text_cords = (self.WINDOW_WIDTH // 2, int(self.WINDOW_HEIGHT*0.1)),
                       text = str(self.score), color = self.RED, center = True)
        self.print_text(WINDOW = WINDOW, text_cords = (0, int(self.WINDOW_HEIGHT*0.9)),
                       text = str(self.player.stamina), color = self.RED)
        
        pygame.display.update()

**Environment attributes:**

|Attribute   |Type       |Description                                                                         |
|------------|-----------|------------------------------------------------------------------------------------|
|P_HEIGHT    |int        |Height of the player                                                                |
|F_HEIGHT    |int        |Height of the field                                                                 |
|W_HEIGHT    |int        |Height of the walls                                                                 |
|WIDTH       |int        |Width of the field and the walls                                                    |
|MIN_H_WIDTH |int        |Minimum width of the holes                                                          |
|MAX_H_WIDTH |int        |Maximum width of the holes                                                          |
|MIN_P_WIDTH |int        |Minimum Width of the player                                                         |
|MAX_P_WIDTH |int        |Maximum Width of the player                                                         |
|HEIGHT_MUL  |int        |Height Multiplier (used to draw np.array as blocks in pygame )                      |
|WIDTH_MUL   |int        |Width Multiplier (used to draw np.array as blocks in pygame )                       |
|WINDOW_HEIGHT|int        |Height of the pygame window                                                         |
|WINDOW_WIDTH|int        |Width of the pygame window                                                          |
|ENVIRONMENT_SHAPE|tuple      |(field height ; field width ; 1)                                                    |
|ACTION_SPACE|list       |list of actions an agent can perform                                                |
|ACTION_SPACE_SIZE|int        |number of actions an agent can perform                                              |
|PUNISHMENT  |int ; float|Punishment increment                                                                |
|REWARD      |int ; float|Reward increment                                                                    |
|score       |int ; float|Initial Score                                                                       |
|MOVE_WALL_EVERY|int        |Every how many frames the wall moves.                                               |
|MOVE_PLAYER_EVERY|int        |Every how many frames the player moves.                                             |
|frames_counter|int        |used to handle the wall speed                                                       |
|field       |Field      | the field object that holds walls and players                                      |
|walls       |double ended queue of Wall objects|a queue of walls                                                                    |
|player      |Player     |the player object                                                                   |
|current_state|np.array   |holds the current state of the field (the array representation of the game field)   |
|val2color   |dictionary |used to color the blocks depending on their values (ex: if you want to color the player RED you will put 'self.player.body_unit:RED' in val2color dictionary)|
|MAX_VAL     |int ; float| used to detect collisions between walls and players (MAX_VAL = self.player.body_unit + self.wall.body_unit) |
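
To show how `ENVIRONMENT_SHAPE` and `MAX_VAL` fit together, here is a small sketch that builds a made-up field, normalizes it, and reshapes it into a batch of one observation. The wall and player positions are arbitrary; the constants match the values in the class above.

```python
import numpy as np

F_HEIGHT, WIDTH = 20, 10
ENVIRONMENT_SHAPE = (F_HEIGHT, WIDTH, 1)
MAX_VAL = 3                               # player.body_unit (2) + wall.body_unit (1)

# Made-up field contents: one wall with a hole near the top, one player lower down.
field_body = np.zeros((F_HEIGHT, WIDTH))
field_body[0:2, :] = 1
field_body[0:2, 3:6] = 0
field_body[14:16, 4:7] = 2

# Dividing by MAX_VAL keeps every cell in the range [0, 1].
observation = field_body / MAX_VAL

# The leading -1 adds a batch dimension; the trailing 1 is the single channel.
batch = observation.reshape(-1, *ENVIRONMENT_SHAPE)
print(batch.shape)
```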


***Environment methods:***
* \__***init***__(self) : initializes the environment by initializing some attributes and calling the reset method.
* ***reset***(self) : resets the environment and returns the state of the game field after resetting it.
* ***print_text***(self, WINDOW = None, text_cords = (0,0), center = False, text = "", color = (0,0,0), size = 32): prints a text in a given pygame.display (WINDOW) with the given features.
---
**+ step(self, action):**

1. Call the player's move method to move the player.
2. Call the player's change_width method to change the player's width.
3. Move the wall one step.
4. Update the field.
5. Check if the player passed a wall successfully. If so, give the player a reward and increase its stamina.
6. Check the three losing conditions: the player loses the game if at least one of these three conditions is met.

**Losing Conditions**:

|Condition   |Explanation|Code                                                                                |
|------------|-----------|------------------------------------------------------------------------------------|
|C1          |The player hits a wall|self.MAX_VAL in self.field.body                                                     |
|C2          |Player's width was far thinner than hole's width|((self.player.y == self.walls[-1].y) and (self.player.width < (self.walls[-1].hole_width-1)))|
|C3          |Player fully consumed its stamina (energy)|self.player.stamina <=0                                                             |


When the player loses, the returned reward equals PUNISHMENT, and the game state indicator (game_over) changes from False to True.
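
The three conditions can also be read as one small function. This is a sketch for illustration only: `lose_conditions` is a hypothetical helper, and the field and positions in the usage lines are invented.

```python
import numpy as np

def lose_conditions(field_body, max_val, player_y, player_width,
                    wall_y, hole_width, stamina):
    """Return [C1, C2, C3] as booleans, mirroring the table above."""
    c1 = bool(max_val in field_body)                               # player overlaps a wall
    c2 = (player_y == wall_y) and (player_width < hole_width - 1)  # player too thin for the hole
    c3 = stamina <= 0                                              # out of stamina
    return [c1, c2, c3]

empty_field = np.zeros((20, 10))

# The player is level with the wall, but far thinner than the hole: C2 fires.
too_thin = lose_conditions(empty_field, 3, player_y=14, player_width=2,
                           wall_y=14, hole_width=6, stamina=5)

# A cell holding MAX_VAL means the player and a wall share that cell: C1 fires.
hit_field = empty_field.copy()
hit_field[14, 4] = 3
hit = lose_conditions(hit_field, 3, player_y=14, player_width=5,
                      wall_y=14, hole_width=6, stamina=5)

print(too_thin, hit)
```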

7. Check whether the current wall has moved past the bottom of the field. When that happens, the out-of-range wall is replaced by a new wall.
8. Return the normalized next state, the reward, and game_over.
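
Step 3 relies on a frame counter so that the wall moves more slowly than the player. A minimal sketch of that timing, assuming the counter is reset after each wall move as in `step` (the function name `simulate_wall` is made up for this example):

```python
def simulate_wall(frames, move_every, speed=1):
    """Return the wall's y position after each frame."""
    y, counter = 0, 0
    positions = []
    for _ in range(frames):
        counter += 1
        if counter % move_every == 0:
            y += speed        # the wall moves only on every move_every-th frame
            counter = 0       # reset, as step() does
        positions.append(y)
    return positions

print(simulate_wall(frames=8, move_every=4))
```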
---
**+render**(self, WINDOW = None, human=False):

**Arguments:**
* ***WINDOW*** (pygame.display): the pygame.display that the game will be rendered on.
* ***human*** (bool): Set to True when a human plays the game. In this case, pygame captures the pressed keyboard keys to determine the action to perform.

**Explanation of render method line by line:**
1. Check if the player is a human. If so, get the pressed key and translate it to the corresponding action (ex: if the right arrow is pressed, set action = 2, which moves the player one step to the right), then call the step method to perform the chosen action.
2. Update the field then start drawing the walls and the player as blocks.
3. Print the score and the player's stamina.
4. Finally, update the display to show the rendered screen.
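
The drawing step maps each array cell to a rectangle in the window. The sketch below reproduces that arithmetic without pygame, so it can be checked on its own. `cell_rect` and `KEY_TO_ACTION` are hypothetical names, and the string keys stand in for the pygame key constants.

```python
HEIGHT_MUL, WIDTH_MUL = 30, 40    # block size in pixels, as in the Environment class

def cell_rect(row, col):
    """Return (left, top, width, height) in pixels for one field cell."""
    return (col * WIDTH_MUL, row * HEIGHT_MUL, WIDTH_MUL, HEIGHT_MUL)

# Arrow keys mapped to the action codes used by render().
KEY_TO_ACTION = {"left": 1, "right": 2, "down": 3, "up": 4}

print(cell_rect(0, 0))      # top-left cell
print(cell_rect(19, 9))     # bottom-right cell of a 20 x 10 field
print(KEY_TO_ACTION["right"])
```

With a 20 x 10 field, the bottom-right block ends at pixel (400, 600), which fits inside the 400 x 630 window computed from `WINDOW_WIDTH` and `WINDOW_HEIGHT`.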

## Finally : Put it all together
Now we are going to use everything we explained and play the game:

The following code repeats the game until the player either wins, by reaching a score greater than or equal to winning_score, or quits the game.

In [23]:
import numpy as np 
from random import randint
from collections import deque
import pygame
from time import sleep




In [24]:
'''# Make an environment object
env            = Environment()
# Set the wall speed to 4 (one step every 4 frames)
env.WALL_SPEED = 4

# Initialize some variables 
WINDOW          = pygame.display.set_mode((env.WINDOW_WIDTH, env.WINDOW_HEIGHT))
clock           = pygame.time.Clock()
win             = False
winning_score   = 100

# Repeat the game until the player wins (reaches a score of winning_score) or quits the game.
while not win:
  score_increased = False
  game_over       = False
  _ = env.reset()
  pygame.display.set_caption("Game")
  while not game_over:
      clock.tick(25)
      env.render(WINDOW = WINDOW, human=True)
      game_over = env.game_over
  #####################################################
  sleep(1)
  WINDOW.fill(env.WHITE)
  if env.score >= winning_score:
    win = True
    env.print_text(WINDOW = WINDOW, text_cords = (env.WINDOW_WIDTH // 2, env.WINDOW_HEIGHT// 2),
                       text = f"You Win - Score : {env.score}", color = env.RED, center = True)
  else:
    env.print_text(WINDOW = WINDOW, text_cords = (env.WINDOW_WIDTH // 2, env.WINDOW_HEIGHT// 2),
                       text = f"Game Over - Score : {env.score}", color = env.RED, center = True)
  pygame.display.update()'''


In [25]:
from keras.models import load_model

# EX : say you have the model in a folder called "models" and model's name is "myModel.model"
model_name_ = "M2__32C__32D__32D__ECC2__1A-5Ac___Eps(100)__max(_-90.00)__avg(_-99.00)__min(-100.00).model.keras"
model = load_model(f'models/{model_name_}')


# Now We can use the trained agent to play the game :
# Next code initializes the environment and the agent 
# then uses the trained agent to play 20 rounds of the game and 
# records the score and time of each round 
env2 = Environment()
env2.WALL_SPEED = 2
import time

start = time.time()
for _ in range(20):
    WINDOW          = pygame.display.set_mode((env2.WINDOW_WIDTH, env2.WINDOW_HEIGHT))
    clock           = pygame.time.Clock()
    score_increased = False
    game_over       = False
    _ = env2.reset()
    pygame.display.set_caption("Game")
    while not game_over:
        clock.tick(27)
        prd = model.predict((env2.field.body/env2.MAX_VAL).reshape(-1, *env2.ENVIRONMENT_SHAPE))
        actions = [np.argmax(prd[0]),np.argmax(prd[0])]
        _,reward, game_over = env2.step(actions)
        env2.render(WINDOW = WINDOW)


    #####################################################
    a = int(time.time()-start)
    print(f"Score {env2.score} in {a//60}:{a%60}")
    sleep(0.5)
    WINDOW.fill(env2.WHITE)
    env2.print_text(WINDOW = WINDOW, text_cords = (env2.WINDOW_WIDTH // 2, env2.WINDOW_HEIGHT// 2),
                       text = f"Game Over - Score : {env2.score}", color = env2.RED, center = True)
    pygame.display.update()
    sleep(0.5)
pygame.quit()
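
The action selection in the loop above can be looked at separately. In this sketch the prediction is a made-up array rather than real model output; it shows why passing the same greedy action in both slots triggers at most one of the two player methods per step.

```python
import numpy as np

# Made-up stand-in for model.predict(...): one row with one value per action.
q_values = np.array([[0.1, -0.4, 0.7, 0.2, 0.0]])

best = int(np.argmax(q_values[0]))   # index of the highest-valued action
actions = [best, best]               # same action in the move slot and the width slot

moves = actions[0] in [1, 2]         # would call player.move
resizes = actions[1] in [3, 4]       # would call player.change_width
print(best, moves, resizes)
```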


In [26]:
'''import numpy as np
import pygame
import time
from keras.models import load_model

# Load the trained model
model_name_ = "M2__32C__32D__32D__ECC2__1A-5Ac___Eps(100)__max(_-90.00)__avg(_-99.00)__min(-100.00).model.keras"
model = load_model(f'models/{model_name_}')

# Initialize the environment
env2 = Environment()
env2.WALL_SPEED = 1

# Initialize the game
WINDOW = pygame.display.set_mode((env2.WINDOW_WIDTH, env2.WINDOW_HEIGHT))
clock = pygame.time.Clock()

# Play 20 rounds of the game
start = time.time()
for _ in range(20):
    score_increased = False
    game_over = False
    _ = env2.reset()
    pygame.display.set_caption("Game")
    while not game_over:
        clock.tick(27)

        # Get model's prediction for the action probabilities
        prd = model.predict((env2.field.body / env2.MAX_VAL).reshape(-1, *env2.ENVIRONMENT_SHAPE))

        # Clip and normalize the outputs so they form valid probabilities (this is not a true softmax)
        action_probs = prd[0]
        action_probs = np.clip(action_probs, 0, None)  # Ensure no negative probabilities

        # Check if sum is zero to avoid division by zero errors
        action_probs_sum = np.sum(action_probs)
        if action_probs_sum == 0:
            # If sum is zero, assign equal probability to all actions
            action_probs = np.ones_like(action_probs) / len(action_probs)
        else:
            # Normalize the probabilities to sum to 1
            action_probs /= action_probs_sum

        # Introduce randomness: choose actions based on probabilities
        actions = np.random.choice(len(action_probs), size=2, p=action_probs)  # Pick actions based on probabilities

        # Perform each action separately
        for action in actions:
            _, reward, game_over = env2.step([action, action])  # step() expects a pair; take one action at a time
            env2.render(WINDOW=WINDOW)

    # After the round ends, print the score and time taken
    a = int(time.time() - start)
    print(f"Score {env2.score} in {a // 60}:{a % 60}")
    
    # Show the result on the screen
    sleep(0.5)
    WINDOW.fill(env2.WHITE)
    if env2.score >= 100:
        env2.print_text(WINDOW=WINDOW, text_cords=(env2.WINDOW_WIDTH // 2, env2.WINDOW_HEIGHT // 2),
                        text=f"You Win - Score: {env2.score}", color=env2.RED, center=True)
    else:
        env2.print_text(WINDOW=WINDOW, text_cords=(env2.WINDOW_WIDTH // 2, env2.WINDOW_HEIGHT // 2),
                        text=f"Game Over - Score: {env2.score}", color=env2.RED, center=True)
    pygame.display.update()
    sleep(1)

pygame.quit()'''
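
The disabled cell above samples actions instead of always taking the best one. Its clip-and-normalize step can be sketched on its own as follows. `to_probabilities` is a hypothetical helper, the input values are made up, and the seeded generator is only there to keep the example repeatable.

```python
import numpy as np

def to_probabilities(q_values):
    """Clip negatives to zero and rescale so the values sum to 1."""
    probs = np.clip(np.asarray(q_values, dtype=float), 0, None)
    total = probs.sum()
    if total == 0:
        # Every value was zero or negative: fall back to a uniform distribution.
        return np.ones_like(probs) / len(probs)
    return probs / total

probs = to_probabilities([0.5, -1.0, 1.5, 0.0, 0.0])

rng = np.random.default_rng(0)
action = int(rng.choice(len(probs), p=probs))   # only actions 0 and 2 have non-zero probability
print(probs, action)
```

Unlike a softmax, this scheme gives zero probability to every action with a non-positive value, so such actions are never explored unless all values are non-positive.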



You can get the full code [HERE](https://github.com/ModMaamari/reinforcement-learning-using-python)