This project uses Unity's Tennis environment. Two agents control rackets to bounce a ball over a net; the goal of each agent is to keep the ball in play for as many time steps as possible.
- Rewards:
  - +0.1 if an agent hits the ball over the net
  - -0.01 if an agent lets the ball hit the ground
  - -0.01 if an agent hits the ball out of bounds
  - Each player receives its own reward
- Observation space: continuous, 8 dimensions
  - The dimensions correspond to:
    - positions (2)
    - velocities (2)
    - the above for both the ball and the racket (x2)
  - Each player observes independently
- Action space: continuous, 2 dimensions
  - the racket's movement toward the net, and jumping
  - Each player takes actions independently
  - Each action value has a range of [-1, 1]
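Because every action value must lie in [-1, 1], a common pattern in DDPG is to clip the actor's output after adding exploration noise. A minimal sketch (function names are illustrative, not from this repository):

```python
import random

def clip(x, lo=-1.0, hi=1.0):
    """Clamp a single action value into the valid range."""
    return max(lo, min(hi, x))

def noisy_action(raw_action, noise_scale=0.1):
    """Add Gaussian exploration noise, then clip back into [-1, 1]."""
    return [clip(a + random.gauss(0.0, noise_scale)) for a in raw_action]
```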
- Solving criterion: achieve an average episode score of more than +0.5
  - the look-back period for the average is 100 consecutive episodes
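The 100-episode rolling-average check above can be sketched as follows (a minimal stdlib version; the function name is my own):

```python
from collections import deque

def solved(episode_scores, target=0.5, window=100):
    """Return True once the mean of the most recent `window`
    episode scores exceeds `target`."""
    recent = deque(maxlen=window)
    for i, score in enumerate(episode_scores, start=1):
        recent.append(score)
        if i >= window and sum(recent) / window > target:
            return True
    return False
```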
- The agents are trained with the DDPG method
- A simple NN is used for each of the actor and critic networks, as follows:
  - actor network
    - 3 fully connected layers, with ReLU activations on the hidden layers and Tanh on the output
    - the first two layers have size 128
    - the last layer's output size = action size (2)
  - critic network
    - 3 fully connected layers with ReLU activations on the hidden layers
    - the first layer size = 128
    - the second layer's input size = 128 + action size (2)
    - the last layer's output size = 1
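A minimal PyTorch sketch of the architecture described above, using the layer sizes from the list (class and variable names are my own; this is an illustration, not the repository's exact code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_SIZE, ACTION_SIZE = 8, 2  # per the observation/action spaces above

class Actor(nn.Module):
    """Maps a state to an action in [-1, 1]^ACTION_SIZE."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(STATE_SIZE, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, ACTION_SIZE)

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        return torch.tanh(self.fc3(x))  # Tanh keeps actions in [-1, 1]

class Critic(nn.Module):
    """Maps a (state, action) pair to a scalar Q-value; the action
    is concatenated into the second layer's input."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(STATE_SIZE, 128)
        self.fc2 = nn.Linear(128 + ACTION_SIZE, 128)
        self.fc3 = nn.Linear(128, 1)

    def forward(self, state, action):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(torch.cat([x, action], dim=1)))
        return self.fc3(x)
```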
The agents learned to play tennis and reached an average score of +0.5 in around 900 episodes.
- Architecture: only a simple NN was used. Going forward, a more complex architecture may improve the score.
- Hyperparameters: the parameters have not been fully optimized, which is another opportunity for improvement (e.g., decay of the exploration noise could be introduced).
- Replay buffer: since the playing field is symmetric, training efficiency may improve if left-right flipped transitions are added to the training samples.
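The mirror-augmentation idea above could be sketched as follows. Note that which observation and action components change sign under a left-right flip depends on the environment's actual state layout; the sign vectors below are illustrative assumptions, not the real Tennis layout:

```python
# Assumed sign flips for a left-right mirror (hypothetical indices):
MIRROR_OBS_SIGNS = [-1, 1, -1, 1, -1, 1, -1, 1]  # assume x-components flip
MIRROR_ACT_SIGNS = [-1, 1]                        # assume move-toward-net flips, jump doesn't

def mirrored(transition):
    """Return the left-right flipped copy of a (s, a, r, s2, done) transition."""
    s, a, r, s2, done = transition
    flip = lambda vec, signs: [v * sg for v, sg in zip(vec, signs)]
    return (flip(s, MIRROR_OBS_SIGNS), flip(a, MIRROR_ACT_SIGNS), r,
            flip(s2, MIRROR_OBS_SIGNS), done)

buffer = []

def store(transition):
    """Add both the original and the mirrored transition to the buffer."""
    buffer.append(transition)
    buffer.append(mirrored(transition))
```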
- The trained model is saved as `checkpoint_actor.pth`. You can use it with the Unity environment mentioned below.
- Download the Unity environment from one of the links below
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
- Place the file in this repository and unzip it