# 3. AWS DeepRacer Reward Function Examples - Advanced

Based on the experiences of participants at AWS DeepRacer events around the world, we have collated a set of advanced reward functions that could help you achieve faster lap times.

#### __Source- [Github- scottpletcher/deepracer](https://github.com/scottpletcher/deepracer)__

The following are the author's top reward functions. Follow the link above to learn more.

__PurePursuit__

Based on a 1992 academic paper by R. Craig Coulter titled "Implementation of the Pure Pursuit Tracking Algorithm".

When we drive a real car, we don't look out the side window to check that we are a set distance from the edge of the road. Rather, we identify a point down the road and use it to orient ourselves.

All hyperparameters were left at their defaults, and training ran for about 4 hours.

__Note__: This function is not an exact replica of the one at the source, because the parameter list has changed over time.

In [None]:
def reward_function(params):
    
    import math
    
    reward = 1e-3
        
    rabbit = [0,0]
    pointing = [0,0]
    
    # Read input variables
    waypoints = params['waypoints']
    closest_waypoints = params['closest_waypoints']
    heading = params['heading']
    x=params['x']
    y=params['y']
    # Reward the car when its heading points at the next waypoint in front of it.
    
    # The "rabbit" is the next waypoint ahead of the car
    rabbit = waypoints[closest_waypoints[1]]
    
    radius = math.hypot(x - rabbit[0], y - rabbit[1])
    
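    # heading is reported in degrees, so convert it before using trig functions
    heading_rad = math.radians(heading)
    pointing[0] = x + (radius * math.cos(heading_rad))
    pointing[1] = y + (radius * math.sin(heading_rad))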
    
    vector_delta = math.hypot(pointing[0] - rabbit[0], pointing[1] - rabbit[1])
    
    # Max distance for pointing away will be the radius * 2
    # Min distance means we are pointing directly at the next waypoint
    # We can setup a reward that is a ratio to this max.
        
    if vector_delta == 0:
        reward += 1
    else:
        reward += ( 1 - ( vector_delta / (radius * 2)))

    return reward
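
As a quick sanity check of the geometry, the sketch below isolates the pointing ratio used in the function above. It assumes `heading` is given in degrees, and the helper name `pointing_score` is hypothetical, introduced only for illustration.

```python
import math


def pointing_score(x, y, heading_deg, target):
    """Return 1.0 when the car points straight at `target`, 0.0 when it points directly away.

    Hypothetical helper for illustration; not part of the DeepRacer API.
    """
    radius = math.hypot(x - target[0], y - target[1])
    if radius == 0:
        # The car is sitting on the target, so any heading counts as on target
        return 1.0

    # Project a point `radius` ahead of the car along its heading
    heading_rad = math.radians(heading_deg)
    px = x + radius * math.cos(heading_rad)
    py = y + radius * math.sin(heading_rad)

    # The projected point is at most 2 * radius away from the target
    delta = math.hypot(px - target[0], py - target[1])
    return 1.0 - delta / (2.0 * radius)


if __name__ == "__main__":
    target = (1.0, 0.0)  # waypoint one metre ahead along the x-axis
    for heading in (0, 45, 90, 180):
        print(heading, round(pointing_score(0.0, 0.0, heading, target), 3))
```

Because the projected point and the target lie on the same circle around the car, the score works out to 1 - sin(θ/2), where θ is the angle between the heading and the direction to the target. The score therefore drops smoothly from 1 (aligned) to 0 (pointing directly away).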

__SelfMotivator__

_With supervised learning, your model will only be as good as the ground truth you have to give it. With reinforcement learning, the model has the potential to become better than anything or anyone has ever done that thing._

Trust the reinforcement learning process to figure out the best way around the track.<br>
The author wrote a simple function that motivates the model to stay on the track and get around it in as few steps as possible.

The model was trained for about 3 hours.

In [None]:
def reward_function(params):

    if params["all_wheels_on_track"] and params["steps"] > 0:
        reward = ((params["progress"] / params["steps"]) * 100) + (params["speed"]**2)
    else:
        reward = 0.01
        
    return float(reward)
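
To see how the two terms trade off, the sketch below evaluates the same formula for a few hypothetical states. The helper name `self_motivator` and the sample numbers are assumptions for illustration, not values from the source.

```python
def self_motivator(progress, steps, speed, on_track=True):
    """Same formula as the reward function above, with plain arguments.

    Hypothetical helper for illustration only.
    """
    if on_track and steps > 0:
        # progress / steps rewards covering more of the track per step,
        # and speed ** 2 rewards raw speed
        return (progress / steps) * 100 + speed ** 2
    return 0.01


if __name__ == "__main__":
    # Half the lap completed in 100 steps versus 200 steps, both at speed 2
    print(self_motivator(50, 100, 2.0))
    print(self_motivator(50, 200, 2.0))
    # Off track: the reward collapses to the small constant
    print(self_motivator(50, 100, 2.0, on_track=False))
```

With the same progress and speed, taking twice as many steps roughly halves the progress term, which is what pushes the model toward finishing in fewer steps.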

<hr style="border:1px solid gray"> </hr>

#### __Source- [Medium- beSharp](https://medium.com/proud2becloud/deepracer-our-journey-to-the-top-ten-257ff69922e)__

The following is the author's top reward function. Follow the link above to learn more.

In [2]:
import math
def reward_function(params):
    
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    steering = abs(params['steering_angle'])
    direction_stearing=params['steering_angle']
    speed = params['speed']
    steps = params['steps']
    progress = params['progress']
    all_wheels_on_track = params['all_wheels_on_track']
    ABS_STEERING_THRESHOLD = 15
    SPEED_TRESHOLD = 5
    TOTAL_NUM_STEPS = 85
    
    # Read input variables
    waypoints = params['waypoints']
    closest_waypoints = params['closest_waypoints']
    heading = params['heading']
    
    reward = 1.0
        
    if progress == 100:
        reward += 100
    
    # Calculate the direction of the center line based on the closest waypoints
    next_point = waypoints[closest_waypoints[1]]
    prev_point = waypoints[closest_waypoints[0]]
    
    # Calculate the direction in radians; atan2(dy, dx) returns a value in (-pi, pi]
    track_direction = math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0]) 
    
    # Convert to degrees
    track_direction = math.degrees(track_direction)
    
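    # Calculate the difference between the track direction and the heading direction of the car
    direction_diff = abs(track_direction - heading)
    # Angles wrap at +/-180 degrees, so keep the difference within 0 to 180
    if direction_diff > 180:
        direction_diff = 360 - direction_diff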
    
    # Penalize the reward if the difference is too large
    DIRECTION_THRESHOLD = 10.0
    
    malus=1
    
    if direction_diff > DIRECTION_THRESHOLD:
        malus=1-(direction_diff/50)
        if malus<0 or malus>1:
            malus = 0
        reward *= malus
    
    return reward
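
The direction check can be tried out on its own. The following is a minimal sketch that assumes both angles are in degrees in the range -180 to 180; the helper name `direction_malus` and its keyword arguments are hypothetical. The wrap-around step ensures that, for example, 170 and -170 degrees are treated as 20 degrees apart rather than 340.

```python
def direction_malus(track_direction, heading, threshold=10.0, scale=50.0):
    """Multiplier in [0, 1] that shrinks as the heading deviates from the track direction.

    Hypothetical helper for illustration only.
    """
    diff = abs(track_direction - heading)
    # Angles wrap at +/-180 degrees, so take the shorter way around
    if diff > 180:
        diff = 360 - diff

    if diff <= threshold:
        return 1.0  # small deviations are not penalised

    malus = 1 - diff / scale
    # Deviations of `scale` degrees or more zero out the reward
    return malus if 0 <= malus <= 1 else 0.0


if __name__ == "__main__":
    for track, heading in ((0, 5), (0, 25), (0, 60), (170, -170)):
        print(track, heading, round(direction_malus(track, heading), 2))
```

Note that the multiplier jumps from 1.0 to below 0.8 as soon as the deviation exceeds the 10 degree threshold, then falls linearly to 0 at 50 degrees.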

<hr style="border:1px solid gray"> </hr>

#### __Source- [Medium- Sarah Lueck](https://medium.com/axel-springer-tech/how-to-win-aws-deepracer-ce15454f594a)__

The following is the author's top reward function. Follow the link above to learn more.

In [3]:
import math


def reward_function(params):

    # Read input parameters
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    all_wheels_on_track = params['all_wheels_on_track']
    is_left_of_center = params['is_left_of_center']
    steering_angle = params['steering_angle']
    speed = params['speed']
    
    if is_left_of_center:
        distance_from_center *= -1

    # implementation of reward function for distance from center
    reward = (1 / (math.sqrt(2 * math.pi * (track_width*2/15) ** 2)) * math.exp(-((
            distance_from_center + track_width/20) ** 2 / (4 * track_width*2/15) ** 2))) *(track_width*1/3)
    
    if not all_wheels_on_track:
        reward = 1e-3

    # implementation of reward function for steering angle
    STEERING_THRESHOLD = 14.4
    
    if abs(steering_angle) < STEERING_THRESHOLD:
        steering_reward = math.sqrt(- (8 ** 2 + steering_angle ** 2) + math.sqrt(4 * 8 ** 2 * steering_angle ** 2 + (12 ** 2) ** 2) ) / 10
    else:
        steering_reward = 0

    # additional reward if the car is not steering too much
    reward *= steering_reward

    # reward for the car taking fast actions (speed is in m/s)
    reward *= math.sin(speed/math.pi * 5/6)
    
    # same reward for going slow with a greater steering angle as for going fast straight ahead
    reward *= math.sin(0.4949 * (0.475 * (speed - 1.5241) + 0.5111 * steering_angle ** 2))

    return float(reward)
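
To get a feel for the shape of the first two terms, the sketch below evaluates them on their own. The helper names `center_term` and `steering_term` are hypothetical; the formulas are copied from the function above, and the track width of 0.6 is an assumed value chosen only for illustration.

```python
import math


def center_term(signed_distance, track_width):
    """Bell-shaped reward on the signed distance from the center line (negative = left)."""
    sigma = track_width * 2 / 15
    return (1 / math.sqrt(2 * math.pi * sigma ** 2)
            * math.exp(-((signed_distance + track_width / 20) ** 2 / (4 * sigma) ** 2))
            * (track_width / 3))


def steering_term(steering_angle, threshold=14.4):
    """Steering reward: highest when driving straight, zero at or beyond the threshold."""
    if abs(steering_angle) >= threshold:
        return 0.0
    s2 = steering_angle ** 2
    return math.sqrt(-(8 ** 2 + s2) + math.sqrt(4 * 8 ** 2 * s2 + (12 ** 2) ** 2)) / 10


if __name__ == "__main__":
    width = 0.6  # assumed track width, for illustration only
    for d in (-0.2, -0.03, 0.0, 0.2):
        print("distance", d, "->", round(center_term(d, width), 3))
    for angle in (0, 5, 10, 14, 20):
        print("steering", angle, "->", round(steering_term(angle), 3))
```

Under these formulas the center term peaks slightly left of the center line, at a signed distance of `-track_width / 20`, and the steering term decreases steadily from about 0.89 at 0 degrees to a small value just below the threshold, after which it is cut to 0.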

#### __Source- [Github- VilemR/AWS_DeepRacer](https://github.com/VilemR/AWS_DeepRacer)__

Learn more by clicking on the link