### Random Agent and Environment using Python ###

In [1]:
## Importing the necessary library ##

import random

In this project we will develop a simple environment and an agent without using any open-source RL library.

So, what will we build in this project?

- [ ] We will build **2 classes**. The first is the **Environment class** and the second is the **Agent class**.
- [ ] In the constructor of the **Environment class** we will initialize an instance variable, *total_step*, with a value of 15. Apart from that, the Environment class needs five other methods.
    - The first is the *observation_space(self)* method, which returns the current observation that the agent sees in the environment. For our project this is a list of 3 random values. The agent never uses them, so their content does not matter.
    - The second is the *action_space(self)* method, which returns the actions an agent can take in the environment. For our project it is a list with 2 values, 0. and 1.
    - Next is the *done(self)* method, which checks whether our agent has reached the terminal state. For our project it returns True once *total_step* has dropped to 0.
    - We also need a *reset(self)* method, which resets the environment to its initial state. For our project this simply sets *total_step* back to 15.
    - Finally, the *do_action(self, action)* method takes an action from the action space and returns a reward. For our project it simply decrements *total_step* and returns a random reward. The action itself is ignored.
- [ ] The **Agent class** is much simpler than the **Environment class**. In the constructor, we initialize the accumulated reward, *reward*, to zero. We also create one method, *step(self, environment)*, which takes the environment as its argument. The step method reads the current observation, picks a random action from the action space, performs it, and adds the reward it receives to *reward*.


In [2]:
## Creation of the Environment ##

class Environment:
    '''
    Creates a random environment.
    '''
    
    def __init__(self):
        self.total_step = 15
        
        
    def observation_space(self):
        '''
        Returns the current observation space.
        '''
        
        # Three random values in [0, 1), each rounded to two decimals
        return [round(random.random(), 2) for _ in range(3)]
    
    def action_space(self):
        '''
        Returns the actions available for the agent.
        '''
        
        return [0. , 1.]
    
    def done(self):
        '''
        Returns True if the agent has reached the terminal state.
        '''
        
        return self.total_step == 0
    
    def reset(self):
        '''
        Resets the environment to its initial state.
        '''
        
        self.total_step = 15
        
    def do_action(self , action):
        '''
        Alters the environment by doing an action by the agent.
        '''
        
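        if self.done():
            raise RuntimeError('Game Over!')
            
        # The action is ignored: every step just uses up one remaining step
        self.total_step -= 1
        
        # The reward is a random float in [0, 1), independent of the action
        return random.random()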

In [3]:
## Building the Agent Class ##

class Agent:
    '''
    Develops a random agent.
    '''
    def __init__(self):
        self.reward = 0.0
        
    def step(self , environment):
        '''
        An agent step in the environment.
        '''
        
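        # Read the current observation (this random agent does not use it)
        observation = environment.observation_space()
        
        # Pick one of the available actions uniformly at random
        action = random.choice(environment.action_space())
        
        # Perform the action and accumulate the reward it yields
        self.reward += environment.do_action(action)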

In [4]:
## Setting the agent in the wild environment ##

env = Environment()

agent = Agent()

steps_taken = 0

while not env.done():
    
    agent.step(env)
    
    steps_taken += 1
    
print('Total rewards gained is :' , agent.reward)
print('Total steps is :' , steps_taken)

Total rewards gained is : 6.007433063018077
Total steps is : 15
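
The run above plays a single episode and never calls *reset*. As a rough sketch of how *reset* could be used to play many episodes in a row, the block below repeats the same loop 2000 times and averages the totals. The classes are condensed copies of the ones defined earlier so that the cell runs on its own. The helper `run_episodes`, its `seed` parameter and the episode count are assumptions made for illustration and are not part of the original notebook. Since each of the 15 steps returns a uniform random reward in [0, 1), the average total should land near 15 × 0.5 = 7.5.

```python
import random


class Environment:
    """Condensed copy of the Environment class defined above."""

    def __init__(self):
        self.total_step = 15

    def observation_space(self):
        return [round(random.random(), 2) for _ in range(3)]

    def action_space(self):
        return [0., 1.]

    def done(self):
        return self.total_step == 0

    def reset(self):
        self.total_step = 15

    def do_action(self, action):
        if self.done():
            raise Exception('Game Over!')
        self.total_step -= 1
        return random.random()


class Agent:
    """Condensed copy of the Agent class defined above."""

    def __init__(self):
        self.reward = 0.0

    def step(self, environment):
        observation = environment.observation_space()
        action = random.choice(environment.action_space())
        self.reward += environment.do_action(action)


def run_episodes(n_episodes, seed=0):
    """Hypothetical helper: play n_episodes and return each episode's total reward."""
    random.seed(seed)  # assumption: a fixed seed so repeated runs give the same numbers
    env = Environment()
    totals = []
    for _ in range(n_episodes):
        env.reset()      # restore total_step to 15 before each episode
        agent = Agent()  # fresh agent, so its reward starts at 0.0
        while not env.done():
            agent.step(env)
        totals.append(agent.reward)
    return totals


totals = run_episodes(2000)
print('Average reward per episode :', sum(totals) / len(totals))
```

Creating a new Agent for every episode keeps the totals separate. Reusing one agent would instead accumulate the reward across all episodes, because the Agent class has no reset of its own.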


Easy-breezy. That was a simple start to our RL journey. It was very short, but it is an important first step. In the next notebook we will build our agents and environments with open-source RL libraries such as OpenAI Gym. I will also try to develop the theoretical intuition as we go along.