The anatomy of the RL components:

Agent: An entity that observes the state of the environment and takes an action based on the current state
and a policy. In practice, the agent is a piece of code that implements the policy and decides what action
is needed at every time step, given the observations.

Environment: A model of the world that is external to the agent. It is responsible for providing the agent with observations and rewards, and it changes its state based on the actions the agent takes.

The following excerpts from the book represent important basic concepts about the RL model. 
The environment could be an extremely complicated physics model, and an agent could easily be a large neural network implementing the latest RL algorithm.
However, the basic pattern stays the same: on every step, an agent takes some observations from the environment, does its calculations, and selects the action to issue. The result of this action is a reward and new observations.
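
The per-step pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: `CountdownEnv`, `choose_action`, and `run_episode` are hypothetical names that do not come from the book, and the reward rule (1.0 for action 1, otherwise 0.0) is invented for the example. The point it shows is that each step returns both a reward and a new observation.

```python
import random


class CountdownEnv:
    """Hypothetical toy environment: the observation is the number of steps left."""

    def __init__(self, steps=5):
        self.steps_left = steps

    def observe(self):
        return self.steps_left

    def step(self, action):
        # Apply the action, then return the reward together with the new observation.
        self.steps_left -= 1
        reward = 1.0 if action == 1 else 0.0  # invented reward rule, for illustration only
        return reward, self.observe()


def choose_action(observation):
    # Stand-in for the agent's calculations: a random policy that ignores the observation.
    return random.choice([0, 1])


def run_episode(env):
    total_reward = 0.0
    observation = env.observe()
    while observation > 0:
        action = choose_action(observation)      # the agent selects an action
        reward, observation = env.step(action)   # the environment answers with reward and new observation
        total_reward += reward
    return total_reward


print(run_episode(CountdownEnv()))
```

A real agent would replace `choose_action` with a policy that actually uses the observation; the loop itself would stay the same.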

In [2]:
import random

In [9]:
# An example of an environment that gives the agent random rewards for a limited number of steps,
# regardless of the agent's actions

class Environment:
    # Initialize the environment's internal state
    def __init__(self):
        # A counter that limits the number of time steps the agent is allowed to interact with the env.
        self.steps_left = 10

    def get_observation(self):
        """
        Return the current state of the environment
        """
        return [0.0, 0.0, 0.0]

    def get_actions(self):
        """
        Allows the agent to query the set of actions it can execute.
        Two actions are allowed in this scenario, encoded as the integers 0 and 1.
        """
        return [0, 1]

    def is_done(self):
        """
        This method signals the end of the episode to the agent.
        """
        return self.steps_left == 0

    def action(self, action):
        """
        The central piece of the environment's functionality:
        - Handles the agent's action
        - Returns the reward for the action
        In this example the reward is random and the action is discarded.
        """
        if self.is_done():
            raise Exception("Game is over")
        self.steps_left -= 1
        return random.random()


class Agent:
    def __init__(self):
        self.total_reward = 0.0
        
    def step(self, env):
        """
        The step function accepts the environment instance and allows the agent to:
        - Observe the environment
        - Make a decision about the action to take based on the observations
        - Submit the action to the environment
        - Get the reward for the current step
        """
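        # Observe the environment (this random agent does not use the observation)
        current_obs = env.get_observation()
        # Choose one of the allowed actions at random and submit it
        actions = env.get_actions()
        reward = env.action(random.choice(actions))
        # Accumulate the reward received for this step
        self.total_reward += reward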

In [19]:
# The glue code to create both classes and run one episode
if __name__ == "__main__":
    env = Environment()
    agent = Agent()
    
    while not env.is_done():
        agent.step(env)
    
    print(f'Total reward is: {agent.total_reward:.4f}')

Total reward is: 4.5876
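
The total changes from run to run, because each of the 10 rewards is drawn by `random.random()`, which is uniform on [0, 1); the expected total is therefore 10 × 0.5 = 5.0. The sketch below checks this empirically by averaging over many episodes. It repeats compact versions of the two classes so that it runs on its own; `run_episode` is a hypothetical helper added for the illustration, and the seed value and episode count are arbitrary choices.

```python
import random


class Environment:
    def __init__(self):
        self.steps_left = 10

    def get_actions(self):
        return [0, 1]

    def is_done(self):
        return self.steps_left == 0

    def action(self, action):
        if self.is_done():
            raise Exception("Game is over")
        self.steps_left -= 1
        return random.random()  # the reward ignores the action


class Agent:
    def __init__(self):
        self.total_reward = 0.0

    def step(self, env):
        reward = env.action(random.choice(env.get_actions()))
        self.total_reward += reward


def run_episode():
    # Hypothetical helper: play one full episode and return the accumulated reward.
    env, agent = Environment(), Agent()
    while not env.is_done():
        agent.step(env)
    return agent.total_reward


random.seed(0)  # fixing the seed makes the sequence of rewards repeatable
totals = [run_episode() for _ in range(1000)]
mean_total = sum(totals) / len(totals)
print(f"Mean total reward over {len(totals)} episodes: {mean_total:.4f}")
```

Since the reward is independent of the action, no policy can do better than this average in this environment; the example only exercises the interaction loop.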
