# Create Reward function

## In this exercise we will create our own simple reward function
The reward function describes immediate feedback (as a score for reward or penalty) when the vehicle takes an action to move from a given position on the track to a new position. Its purpose is to encourage the vehicle to make moves along the track to reach its destination quickly. The model training process will attempt to find a policy which maximizes the average total reward the vehicle experiences.

Let's begin designing our reward function based on the principle of __following the centre line__.
As our DeepRacer races along the track, we would like it to stick to the middle of the track by following the centre line.
This can be done by placing __markers__ that reward the DeepRacer when it drives near the centre line and penalise it as it drifts away from the centre line.

We will take __3 markers__ to denote these points on the track.

<img src="images/Markers.PNG">

### Setting up Markers

Steps to follow:
> 1. Marker 1 denotes the area nearest to the centre line. We can calibrate the width of Marker 1 by changing its value on the __Marker 1__ slider.
> 2. Similarly, we set __Marker 2__ (further out than Marker 1) and __Marker 3__ (further out than Marker 2).

Note: the placement of each marker is calculated with a simple formula, where __Marker value__ is the value chosen by the user on the Marker 1, Marker 2 or Marker 3 slider.

`Marker position = Marker value X Track width`

Track width is a preprogrammed value that DeepRacer supplies to the reward function as `params['track_width']`.
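
As a minimal sketch of this formula, the snippet below converts the three marker values into distances from the centre line. The helper name `marker_positions` and the 0.8 m track width are assumptions made for illustration; in a real reward function the width comes from `params['track_width']`.

```python
def marker_positions(marker_values, track_width):
    """Return each marker's distance from the centre line.

    Hypothetical helper: applies marker position = marker value x track width.
    """
    return [value * track_width for value in marker_values]


# Assumed example values: the slider defaults and an illustrative 0.8 m wide track.
positions = marker_positions([0.1, 0.25, 0.5], track_width=0.8)
for number, position in enumerate(positions, start=1):
    print(f"Marker {number}: {position:.3f} m from the centre line")
```

With a marker value of 0.5, Marker 3 sits half a track width from the centre line, which is the edge of the track.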

In [2]:
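from IPython.display import display  # Jupyter also provides display() by default
from myfunctions import sliders_init

# Create one slider per marker and show them in the notebook
m1, m2, m3 = sliders_init()
display(m1)
display(m2)
display(m3)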

FloatSlider(value=0.1, continuous_update=False, description='Marker1:', max=0.15, min=0.01, step=0.01)

FloatSlider(value=0.25, continuous_update=False, description='Marker2:', max=0.35, min=0.16, step=0.01)

FloatSlider(value=0.5, continuous_update=False, description='Marker3:', max=0.5, min=0.36, step=0.01)

Tips for selecting Marker values:

1. Have a healthy gap between markers
2. Do not give very small values to marker 1

Reason: markers that sit too close together, or a very narrow Marker 1, leave the DeepRacer very little room to explore better paths that let it go faster around the track.

Good values for the markers would be:

Marker 1: 0.1<br>
Marker 2: 0.25<br>
Marker 3: 0.5

### Setting up Reward Values

After setting up the distance markers, we will now configure reward points based on how far our DeepRacer is from the centre line. Our aim is to give the DeepRacer the maximum reward when it is closest to the centre line and the least when it is far away from it.

Reward 1 denotes the reward value we will give to our DeepRacer when it is inside the Marker 1 width<br>
Reward 2 denotes the reward value we will give to our DeepRacer when it is inside the Marker 2 width but wider than Marker 1<br>
Reward 3 denotes the reward value we will give to our DeepRacer when it is inside the Marker 3 width but wider than Marker 2

Reward 4 denotes the reward value we will give to our DeepRacer when it is outside the Marker 3 width and has most probably crashed or is close to going off track.
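
The four reward bands can be sketched as a small lookup. This is an illustration only: `pick_reward` is a hypothetical helper, and the reward values in the example (1.0, 0.5, 0.1, 0.001) are assumed, chosen so that the reward falls as the car moves away from the centre line.

```python
def pick_reward(distance_from_center, markers, rewards):
    """Return the reward for the band that contains distance_from_center.

    markers: three distances from the centre line, in increasing order.
    rewards: four values (Reward 1 to Reward 4), one per band.
    Hypothetical helper, not part of the notebook's own modules.
    """
    if len(markers) != 3 or len(rewards) != 4:
        raise ValueError("expected 3 markers and 4 rewards")
    if list(markers) != sorted(markers):
        raise ValueError("markers must be in increasing order")
    # Check the bands from the centre outwards; the first match wins.
    for marker, reward in zip(markers, rewards):
        if distance_from_center <= marker:
            return float(reward)
    return float(rewards[-1])  # beyond Marker 3


# Assumed example: marker positions for a 1.0 m wide track and decreasing rewards.
markers = [0.1, 0.25, 0.5]
rewards = [1.0, 0.5, 0.1, 0.001]
for distance in (0.05, 0.2, 0.4, 0.6):
    print(distance, pick_reward(distance, markers, rewards))
```

A distance that falls exactly on a marker counts as inside that marker, matching the `<=` comparisons in the generated reward function.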

In [1]:
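from IPython.display import display  # Jupyter also provides display() by default
from myfunctions_rewards import sliders_init_reward

# Create one slider per reward value and show them in the notebook
r1, r2, r3, r4 = sliders_init_reward()
display(r1)
display(r2)
display(r3)
display(r4)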

FloatSlider(value=0.5, continuous_update=False, description='Reward1:', max=1.0, min=0.1, step=0.01)

FloatSlider(value=0.5, continuous_update=False, description='Reward2:', max=1.0, min=0.1, step=0.01)

FloatSlider(value=0.5, continuous_update=False, description='Reward3:', max=1.0, min=0.1, step=0.01)

FloatSlider(value=0.005, continuous_update=False, description='Reward4:', max=0.05, readout_format='.3f', step…

### Creating our reward function

Run the code below. Its output is your customized reward function, generated from the marker and reward values you selected.

Copy the output and paste it inside the code editor on your AWS Console.

In [4]:
from print_reward import print_reward_func

# Print a reward function built from the selected marker and reward values
print_reward_func(m1, m2, m3, r1, r2, r3, r4)

def reward_function(params): 

    track_width = params['track_width'] 
    distance_from_center = params['distance_from_center']

    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 0.5
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.5
    else:
        reward = 0.0455

    return float(reward)
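
Before pasting the function into the console, it can be checked locally. The sketch below repeats the generated function shown above and calls it with hand-made `params` dictionaries. The 1.0 m track width and the sample distances are assumptions for illustration, and a real `params` dictionary from DeepRacer contains more keys than the two used here.

```python
def reward_function(params):
    # Copy of the generated function above, using the same values.
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']

    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 0.5
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.5
    else:
        reward = 0.0455

    return float(reward)


# Hand-made inputs (assumed values): one point near the centre, one past Marker 3.
near_centre = {'track_width': 1.0, 'distance_from_center': 0.05}
past_marker_3 = {'track_width': 1.0, 'distance_from_center': 0.6}

print(reward_function(near_centre))    # 0.5
print(reward_function(past_marker_3))  # 0.0455
```

Because Reward 1, Reward 2 and Reward 3 are all 0.5 in this output, the function rewards every position inside Marker 3 equally. Choosing decreasing values on the three sliders makes it favour the centre line.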


<img src="images/AWS_Code_editor.PNG">

Select the __Validate__ button in your AWS code editor to check that the code is correct.

If there is an error, select __Reset__, rerun the code above, and paste the output again.
If the error persists, kindly reach out to us.