# Reward building

## Overview of ideas

Utility / potential / shaping combined reward function candidates:
- Align ball:
	- expected values are higher near the player's own goal, which is not ideal, so the weight must be kept low
	- could be enhanced by:
		- distance ball to goal: 
			- alignment + near goal means higher probability of scoring while also securing
			- problem: the opponent team could tackle the player and send the ball toward the player's own goal in a straight line very quickly
			- idea: use the ball distance difference between the two goals. This way alignment is scored higher near the center and lower near the goals.
		- velocity player to ball:
			- velocity + alignment could throw the ball directly into the goal
			- problem: moving forward at high velocity and then having to move backward if the ball is not scored takes time and is costly
- Ball to goal wall distance / y coord:
	- the higher the y axis value the more the team pushes the opponent team to defend
- Distance ball to goal: 
	- one of the two best possible rewards
- Distance player to ball:
	- increases control strength
- Velocity ball to goal: 
	- one of the two best possible rewards
	- increases probability of scoring
- Velocity player to ball:
	- mediocre as a reward on its own, but
	- idea: enhance it with the alignment between the player-to-ball and ball-to-goal offensive vectors
	- idea extended: include the player-to-ball distance, since players that are far away have a very low chance of scoring. Reward name idea: offensive potential.
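
The two ideas above (weighting alignment by the ball distance difference between the goals, and the "offensive potential") can be sketched roughly as follows. This is only an illustration under stated assumptions: the goal coordinates, `CAR_MAX_SPEED`, the normalisation by field length, and the function names `center_weight` and `offensive_potential` are all hypothetical and not taken from `utils.analysis.reward_functions`.

```python
import numpy as np

# Assumed constants in unreal units, hard-coded to keep the sketch self-contained.
ORANGE_GOAL = np.array([0.0, 5120.0, 0.0])  # goal the blue team attacks
BLUE_GOAL = np.array([0.0, -5120.0, 0.0])   # goal the blue team defends
CAR_MAX_SPEED = 2300.0
FIELD_LEN = float(np.linalg.norm(ORANGE_GOAL - BLUE_GOAL))


def unit(v):
    """Scale v to unit length; a zero vector stays zero."""
    v = np.asarray(v, dtype=float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else np.zeros_like(v)


def center_weight(ball_pos):
    """Weight for the alignment term: 1 at midfield, falling to 0 at either goal."""
    d_own = np.linalg.norm(ball_pos - BLUE_GOAL)
    d_opp = np.linalg.norm(ball_pos - ORANGE_GOAL)
    return 1.0 - abs(d_own - d_opp) / FIELD_LEN


def offensive_potential(player_pos, player_vel, ball_pos):
    """Player-to-ball speed, scaled by offensive alignment and by closeness to the ball."""
    to_ball = ball_pos - player_pos
    # Speed toward the ball, ignoring motion away from it.
    speed_term = max(0.0, float(np.dot(player_vel, unit(to_ball)))) / CAR_MAX_SPEED
    # Cosine between the player-to-ball and ball-to-goal directions.
    alignment = float(np.dot(unit(to_ball), unit(ORANGE_GOAL - ball_pos)))
    # Far-away players contribute little.
    closeness = max(0.0, 1.0 - float(np.linalg.norm(to_ball)) / FIELD_LEN)
    return speed_term * alignment * closeness
```

With these assumptions, a player directly behind the ball and driving at it toward the opponent goal gets a positive value, while a player approaching the ball from the opponent side gets a negative one.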

Reward:
- Touch ball with acceleration (difference between current and previous ball linear velocity) toward goal (current ball linear velocity vector)
- Save boost reward inversely weighted by the absolute value of the utility score:
	- the further the utility is from zero, in either direction, the more boost use is in demand, so the less saving boost is rewarded
- Event rewards:
	- Demolish and demolished
	- Goal / team goal and concede
	- Shot (as detected by the game)
	- Save (as detected by the game)
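
A minimal sketch of the first two reward ideas, under assumptions of my own: the touch reward is read here as the change in ball velocity projected onto the ball-to-goal direction, the utility is assumed to be normalised to [-1, 1], and boost to [0, 1]. The function names, the goal position, and `BALL_MAX_SPEED` are hypothetical placeholders, not the project's actual implementation.

```python
import numpy as np

# Assumed constants in unreal units.
ORANGE_GOAL = np.array([0.0, 5120.0, 0.0])  # goal the blue team attacks
BALL_MAX_SPEED = 6000.0


def touch_acceleration_reward(prev_ball_vel, ball_vel, ball_pos, touched):
    """Reward a touch by how much it accelerated the ball toward the goal."""
    if not touched:
        return 0.0
    to_goal = ORANGE_GOAL - ball_pos
    dist = np.linalg.norm(to_goal)
    if dist == 0:
        return 0.0
    # Velocity change between two consecutive steps, projected on the goal direction.
    delta_v = ball_vel - prev_ball_vel
    return float(np.dot(delta_v, to_goal / dist) / BALL_MAX_SPEED)


def save_boost_reward(boost_amount, utility):
    """Reward held boost, scaled down as |utility| grows (utility assumed in [-1, 1])."""
    demand = min(1.0, abs(utility))  # 0: calm situation, 1: boost is needed now
    return float(boost_amount * (1.0 - demand))
```

The linear scaling in `save_boost_reward` is the simplest choice that matches the "inversely weighted" description; any other decreasing function of the absolute utility would fit the same idea.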

------

- What weights should each reward get?
- How should reward be distributed between multiple agents with different purposes?


Import needed libraries

In [1]:
import sys
from pathlib import Path

# Modify this as needed; it is used for importing custom project packages such as `utils`
project_path = str(Path.home() / "Projects" / "rlgym_demo")

sys.path.insert(0, project_path)

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from utils.analysis import plotting, generate
from utils.analysis.reward_functions import common, custom
from rlgym.utils import common_values
import pandas as pd
import seaborn as sns

Retrieve arena positions

In [3]:
arena_positions = plotting.arena_positions

Generate uniformly distributed dummy data

In [4]:
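grid_positions = generate.grid_positions()
# 8 velocities from maximum to zero; 1327.9 per axis has a magnitude of about 2300
player_velocities = np.linspace((1327.9,) * 3, (0,) * 3, 8)
# same for the ball; 3464.1 per axis has a magnitude of about 6000
ball_velocities = np.linspace((3464.1,) * 3, (0,) * 3, 8)
forward_vectors = generate.sphere_points()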

Training measurements

In [5]:
# maximum episode length in seconds
max_episode_len = 300
# default frame skip
frame_skip = 8
# number of seconds after which a reward is discounted to half its value
half_life_seconds = 10
# the game physics engine runs at 120 Hz, so this is the number of agent steps per second
fps = 120 // frame_skip
# solve gamma ** (fps * half_life_seconds) == 0.5 for gamma
gamma = np.exp(np.log(0.5) / (fps * half_life_seconds))

In [6]:
gamma

0.9953896791032291
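
As a quick sanity check on the value above, the half-life definition can be verified directly by raising gamma to the number of steps in one half-life. The `effective_horizon_steps` figure uses the common 1 / (1 - gamma) rule of thumb, which is an addition here and not something the notebook relies on.

```python
import numpy as np

frame_skip = 8
half_life_seconds = 10
steps_per_second = 120 // frame_skip                     # 15 agent steps per second
half_life_steps = steps_per_second * half_life_seconds   # 150 steps

gamma = np.exp(np.log(0.5) / half_life_steps)

# A reward received half_life_steps in the future should be weighted by 0.5.
discount_at_half_life = gamma ** half_life_steps

# Rule-of-thumb horizon: the sum of the geometric series 1 + gamma + gamma**2 + ...
effective_horizon_steps = 1.0 / (1.0 - gamma)
```

The horizon comes out at roughly 217 steps, or about 14.5 seconds of game time at this frame skip.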

Assumed average values for a normal game

In [None]:
average_ball_vel = common_values.BALL_MAX_SPEED // 2 // fps
average_car_vel = common_values.CAR_MAX_SPEED // 2 // fps

average_ball_height = 900
average_car_height = 700
# Assume a case in which the blue team is on offense and about to score a goal
average_ball2goal_vel = 1500 // fps
average_car2ball_vel = 850 // fps



## Utility function