Sample project "tennis learning" trains slowly; does the script need fixing? #1739

@Hamachan7

Hi.
I tried to run the sample project "tennis learning", but the agents seem to train slowly.
The result below was obtained with the original hyperparameters.

[terminal output]

INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 1000. Mean Reward: 0.009. Std of Reward: 0.038. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 10000. Mean Reward: 0.030. Std of Reward: 0.053. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 20000. Mean Reward: 0.048. Std of Reward: 0.057. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 25000. Mean Reward: 0.057. Std of Reward: 0.063. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 30000. Mean Reward: 0.085. Std of Reward: 0.112. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 40000. Mean Reward: 0.978. Std of Reward: 0.927. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 45000. Mean Reward: 1.332. Std of Reward: 1.000. Training.

I noticed that both agentRb and ballRb refer to the agent's Rigidbody in TennisAgent.cs.
From the original script:

Line 28   agentRb = GetComponent<Rigidbody>();
Line 29   ballRb = GetComponent<Rigidbody>();
Line 30   var canvas = GameObject.Find(CanvasName);

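Should line 29 be "ballRb = ball.GetComponent<Rigidbody>();" instead?
Judging from the rest of the script, the agent needs the ball's velocity to decide its next action, but as written it reads its own Rigidbody's velocity in place of the ball's.
With that line fixed I got the results below, which look better.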

INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 1000. Mean Reward: 0.006. Std of Reward: 0.034. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 10000. Mean Reward: 0.052. Std of Reward: 0.076. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 20000. Mean Reward: 0.153. Std of Reward: 0.148. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 25000. Mean Reward: 0.309. Std of Reward: 0.331. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 30000. Mean Reward: 0.833. Std of Reward: 0.728. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 40000. Mean Reward: 1.315. Std of Reward: 0.988. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 45000. Mean Reward: 1.408. Std of Reward: 1.075. Training.
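
To make the two runs easier to compare, here is a minimal Python sketch that extracts the mean reward per step from log lines in the format shown above and prints the difference. The helper name `mean_rewards` and the regular expression are my own assumptions, not part of ml-agents, and only three lines from each run are copied in to keep it short. Each log is a single training run, so the numbers are only suggestive.

```python
import re

# Assumed pattern for the trainer log lines quoted in this issue.
LOG_LINE = re.compile(r"Step: (\d+)\. Mean Reward: (-?\d+\.\d+)\.")


def mean_rewards(log_text):
    """Return {step: mean_reward} for every matching log line (hypothetical helper)."""
    return {int(step): float(reward) for step, reward in LOG_LINE.findall(log_text)}


# Three lines copied from the run with the original script.
original_log = """
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 10000. Mean Reward: 0.030. Std of Reward: 0.053. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 30000. Mean Reward: 0.085. Std of Reward: 0.112. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 45000. Mean Reward: 1.332. Std of Reward: 1.000. Training.
"""

# The same steps copied from the run with the fixed script.
fixed_log = """
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 10000. Mean Reward: 0.052. Std of Reward: 0.076. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 30000. Mean Reward: 0.833. Std of Reward: 0.728. Training.
INFO:mlagents.trainers: tennis-0: TennisLearning: Step: 45000. Mean Reward: 1.408. Std of Reward: 1.075. Training.
"""

before = mean_rewards(original_log)
after = mean_rewards(fixed_log)

# Print step, reward before the fix, reward after the fix, and the difference.
for step in sorted(before.keys() & after.keys()):
    diff = after[step] - before[step]
    print(f"{step:>6}  {before[step]:.3f}  {after[step]:.3f}  {diff:+.3f}")
```

Pasting the full terminal output into the two strings works the same way, since lines that do not match the pattern are ignored.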

Thank you.

Labels: bug (Issue describes a potential bug in ml-agents.)
