# Reward for falling down in `BipedalWalker-v3`

The goal in `BipedalWalker-v3` is for the **robot to walk forward efficiently**. `gym` has engineered the reward function so that maximizing the cumulative reward corresponds to achieving this real-world goal.

The reward function is explained in the `BipedalWalker` class's docstring in the [source code](https://github.com/openai/gym/blob/master/gym/envs/box2d/bipedal_walker.py). The reward for each step is made up of the following terms.

- Rewards proportional to the displacement of the robot in the step (this rewards forward movement)
- Small negative rewards proportional to motor torque applied in the step (more motor torque = less efficient)
- A big negative reward of -100 if the robot is lying on the ground at the end of the step (this punishes non-walking states, e.g. lying down, crawling etc.). If this happens, neither the displacement-related reward nor the motor-torque-related reward applies for this step.
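
To make the structure of the three terms above concrete, here is a minimal toy sketch. It is not the code from the `gym` source: the function name `stepwise_reward` and the coefficients `progress_coef` and `torque_coef` are hypothetical placeholders chosen for illustration. Only the -100 penalty and the rule that it replaces the other two terms come from the description above.

```python
def stepwise_reward(displacement, torques, has_fallen,
                    progress_coef=1.0, torque_coef=0.01, fall_penalty=-100.0):
    """Toy model of the three reward terms listed above.

    progress_coef and torque_coef are placeholder values chosen for
    illustration; they are NOT the constants used in the gym source.
    """
    if has_fallen:
        # The fall penalty replaces the other two terms for this step.
        return fall_penalty
    # Reward proportional to forward displacement in this step.
    progress_reward = progress_coef * displacement
    # Small cost proportional to the total absolute motor torque.
    torque_cost = torque_coef * sum(abs(t) for t in torques)
    return progress_reward - torque_cost


# Upright, no forward progress, two motors at full torque: small negative reward.
print(stepwise_reward(0.0, [1., 0., 1., 0.], has_fallen=False))
# Fallen: the penalty applies regardless of displacement or torque.
print(stepwise_reward(0.5, [1., 0., 1., 0.], has_fallen=True))
```

With these placeholder coefficients, an upright robot that spends torque without moving forward gets a small negative reward, which is the behavior the exercise below asks you to observe in the real environment.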

You can convince yourself that maximizing the cumulative reward would be equivalent to the robot walking forward efficiently.

In this exercise, we are going to check if the robot really gets a reward of -100 when it falls down.

In [None]:
# Fill in the blanks and run the cell
import gym
import numpy as np

env = gym.make("BipedalWalker-v3")
obs = env.reset()
for _ in range(200):
    ____, ____, ____, ____ = ____    # Take the action [1., 0., 1., 0.] and store the stepwise reward
    print(____)    # Print the reward
    env.render()
env.close()    

When you run the cell, you should see the robot fall down after a while in the graphical window. It looks kind of funny actually.

The printed rewards should reflect that the robot has fallen down. 

While the robot is still upright, it doesn't make much forward progress but it does spend motor torque. So it should mostly get small negative rewards in this phase.

After it falls down, it should receive -100 for every step that it's lying down.

Check if the printed rewards change from small (usually negative) values to -100 at some point. 

If so, hurray! You have verified that the stepwise reward when the robot is lying on the ground is -100. 
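
If you would rather check this programmatically than by scanning the printed output, one option is to append each stepwise reward to a list inside the loop and then look for the first -100. The helper below is a small sketch of that idea. The name `first_fall_step` and the tolerance `tol` are hypothetical, and the values in `example_rewards` are made up for illustration, not real output from the environment.

```python
def first_fall_step(rewards, fall_reward=-100.0, tol=1e-6):
    """Return the index of the first step whose reward equals fall_reward.

    Returns None if no step in the list received the fall penalty.
    """
    for step, reward in enumerate(rewards):
        # Compare with a tolerance because rewards are floating-point numbers.
        if abs(reward - fall_reward) <= tol:
            return step
    return None


# Made-up rewards: a few small negative values, then the fall penalty.
example_rewards = [-0.03, -0.05, -0.02, -100.0, -100.0]
print(first_fall_step(example_rewards))
```

In the exercise, you would pass in the list of rewards you collected from the environment instead of `example_rewards`.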