# Introduction
The aim of this notebook is to study the <b>crawler</b> example of the [ml-agents](https://github.com/Unity-Technologies/ml-agents) repository. The crawler is an agent that has a main body and 4 legs, each composed of 2 limbs, and moves in a plane to reach a target.

We will explain how the example works, then train a modified version of it and compare the results with the unmodified environment using the information that TensorBoard provides. All of the work done here is focused on the `CrawlerDynamicLearning` scene from the Unity project.
## Team Information
### David Pérez Gallego
    Student at ENTI-UB: Interactive digital content
    Email: davidperezgallego@enti.cat  
<img width='150px' align='center' src='img/david.jpg'>

### Eduard Arnau Romeu
    Student at ENTI-UB: Interactive digital content
    Email: eduardarnauromeu@enti.cat    
<img width='150px' align='center' src='img/eduard.jpg'>

# Case analysis
First of all, we will take a look at the crawler as a GameObject and all the parts and scripts involved in it:

<img width='900px' align='center' src='img/crawler-gameobject.png'>

<center><cite>Crawler GameObject in the ml-agents Unity project</cite></center>

As we can see, the crawler is formed by <b>4 legs</b> (each one with its corresponding foreleg) and a <b>body</b>.

### Legs
Each leg part is attached to its parent limb by a `ConfigurableJoint`, which applies an angular constraint on each axis and prevents any change in position between the leg part and the component it is anchored to.

The upper leg part can rotate around the X and Y axes, while the lower part can only rotate around the X axis.

The foreleg is attached to the leg which is attached to the main body.

#### <center> Main Body ←(Joint)← Leg ←(Joint)← Foreleg </center>

Apart from the movement, each leg also has a script called `GroundContact.cs` which checks the collision with the `ground` layer. This script allows us to use the collision with the ground to either punish the agent, set the agent as done or use the collision flag as an observation for the agent.

The <b>forelegs are used as observations</b> when they collide with the ground, and <b>the upper parts of the legs are used for punishment</b>, since we don't want the agent to use the upper parts of the legs to move towards the target.
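
The three possible uses of a ground collision can be sketched in Python as follows. This is only an illustrative model, not a port of `GroundContact.cs`: the class layout, the field names and the penalty value are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GroundContact:
    """Hypothetical stand-in for the per-part contact settings."""
    penalize_on_contact: bool = False     # punish the agent on ground contact
    end_episode_on_contact: bool = False  # set the agent as done on ground contact
    contact_penalty: float = -1.0         # assumed value, not taken from the project
    touching_ground: bool = False         # flag that can be used as an observation

    def on_collision_enter(self, layer: str) -> Tuple[float, bool]:
        """Return the (reward, done) pair caused by a collision with `layer`."""
        if layer != "ground":
            return 0.0, False
        self.touching_ground = True
        reward = self.contact_penalty if self.penalize_on_contact else 0.0
        return reward, self.end_episode_on_contact

    def on_collision_exit(self, layer: str) -> None:
        """Clear the observation flag once the part leaves the ground."""
        if layer == "ground":
            self.touching_ground = False


# Forelegs only expose the flag; upper legs additionally punish the agent.
foreleg = GroundContact()
upper_leg = GroundContact(penalize_on_contact=True)
```

With this configuration a foreleg touching the ground yields no penalty but sets `touching_ground`, while an upper leg touching the ground returns the (assumed) penalty.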

### Body
The body also has the `GroundContact.cs` code attached. To prevent the agent from dragging its body while walking, <b>the agent is punished whenever the body collides with the ground.</b>

### Controller
The agent's behaviour is implemented in a script called `CrawlerAgent.cs`, while the script `JointDriveController.cs` manages the relevant body parts and stores the information about each of them that is needed for acting and learning.

This last script allows the agent to reset the joints, set their target rotation and their strength in order to achieve the desired behaviour. Each joint has the

The controller overrides certain functions from the `Agent.cs` class. The functionality added by each of those functions is the following:

##### InitializeAgent()
This function <b>initializes all the agent body parts</b>.

First, it stores a reference to the `JointDriveController.cs` script attached to the agent, and then it initializes the body parts of the agent. <b>Each body part is stored in a dictionary</b> with the transform of the body part as the key and an instance of a custom class, `BodyPart`, as the value. 
`BodyPart` is a class which belongs to `JointDriveController.cs` and contains all the relevant information of the body part and allows easy access and modification to the `ConfigurableJoint` attached to the GameObject.
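
A minimal Python sketch of this dictionary could look like the block below. It is an assumption-laden illustration: the field names, the method name `setup_body_part` and the use of plain strings in place of Unity transforms are all hypothetical, chosen only to show the key/value layout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BodyPart:
    """Hypothetical subset of the data kept for each body part."""
    name: str
    current_strength: float = 0.0
    touching_ground: bool = False


class JointDriveController:
    """Keeps one BodyPart per transform (a string stands in for the transform)."""

    def __init__(self) -> None:
        self.body_parts: Dict[str, BodyPart] = {}

    def setup_body_part(self, transform: str) -> None:
        # Register the part so it can later be looked up by its transform.
        self.body_parts[transform] = BodyPart(name=transform)


controller = JointDriveController()
controller.setup_body_part("body")
for i in range(4):
    controller.setup_body_part(f"leg{i}")
    controller.setup_body_part(f"foreleg{i}")
```

Looking up `controller.body_parts["foreleg2"]` then gives direct access to the data of that part, which is the convenience the dictionary provides.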

##### CollectObservations()
This function <b>collects observations for the agent brain so it can learn</b>.

It starts by observing the current position relative to the target (referenced as `dirToTarget` in the code), and then it stores the body orientation (up and forward vectors) and its Y position. Remember that the agent is punished whenever the body touches the ground.

Now the function proceeds to analyze each body part. For each part, the crawler agent observes whether it is touching the ground (using the `GroundContact.cs` script explained previously), its velocity and its angular velocity. When the part being observed is not the body (which is the root of the agent), it also stores the position of the part relative to the body, its current rotation (in each axis) and how much strength the joint is applying, expressed as a value between 0 and 1 relative to the maximum force that the joint can apply.
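
The per-part observations described above can be sketched as a small Python function. The function name, the parameter names and the ordering of the values are assumptions made for illustration; only the list of observed quantities comes from the description above.

```python
from typing import List, Sequence


def part_observations(
    touching_ground: bool,
    velocity: Sequence[float],
    angular_velocity: Sequence[float],
    is_root: bool,
    position_rel_to_body: Sequence[float] = (0.0, 0.0, 0.0),
    rotation: Sequence[float] = (0.0, 0.0, 0.0),
    strength: float = 0.0,
    max_strength: float = 1.0,
) -> List[float]:
    """Build the observation list for one body part."""
    obs = [1.0 if touching_ground else 0.0]  # ground contact flag
    obs.extend(velocity)                     # linear velocity (x, y, z)
    obs.extend(angular_velocity)             # angular velocity (x, y, z)
    if not is_root:
        obs.extend(position_rel_to_body)     # position relative to the body
        obs.extend(rotation)                 # current rotation in each axis
        obs.append(strength / max_strength)  # applied strength scaled to [0, 1]
    return obs


body_obs = part_observations(False, (0, 0, 0), (0, 0, 0), is_root=True)
limb_obs = part_observations(
    True, (0.1, 0, 0), (0, 0, 0), is_root=False,
    position_rel_to_body=(1, 0, 0), rotation=(0.5, 0, 0),
    strength=50.0, max_strength=100.0,
)
```

In this sketch the root body contributes fewer values than a limb, because the relative position, rotation and strength only make sense for parts attached through a joint.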

##### AgentAction()
This function <b>checks if the agent has reached the target</b>, <b>updates the joints based on the decision frequency and input action</b> and finally <b>rewards or punishes the agent.</b>

This function starts by checking if any body part has touched the target. In case any is touching it, the agent gets a substantial reward and the target is set to a new random position in the environment.

After checking if the target is reached, the direction to the target (referenced as dirToTarget in the code) is updated.


#### <center>dirToTarget = target.position - body.position</center>

The agent checks whether it has to take a decision in the current step, and if the decision flag is set to true it takes action.
The agent proceeds to apply torque in two axes to the upper limbs (X and Y, as stated in the Legs section above) and in one axis to the lower limbs (the X axis). After applying torque, it sets the strength of each joint for this decision step.
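
To make the shape of the action vector more concrete, here is a hedged Python sketch that splits a flat action list into the three groups described above. The ordering (upper limbs first, then lower limbs, then strengths), the helper names and the rescaling from [-1, 1] to [0, 1] are assumptions for this example and may not match the layout used in `CrawlerAgent.cs`.

```python
from typing import List, Sequence, Tuple

N_LEGS = 4  # the crawler described above has four legs


def split_actions(
    actions: Sequence[float],
) -> Tuple[List[Tuple[float, float]], List[float], List[float]]:
    """Split a flat action vector into upper-limb, lower-limb and strength parts."""
    expected = N_LEGS * 2 + N_LEGS + N_LEGS * 2  # (X, Y) + X + one strength per joint
    if len(actions) != expected:
        raise ValueError(f"expected {expected} actions, got {len(actions)}")
    values = iter(actions)
    upper = [(next(values), next(values)) for _ in range(N_LEGS)]  # X and Y per upper limb
    lower = [next(values) for _ in range(N_LEGS)]                  # X per lower limb
    strengths = [next(values) for _ in range(2 * N_LEGS)]          # one value per joint
    return upper, lower, strengths


def to_unit_interval(action: float) -> float:
    """Map an action in [-1, 1] to [0, 1], e.g. to scale a joint strength."""
    return (action + 1.0) * 0.5
```

Under these assumptions the agent needs 20 continuous actions per decision: 8 for the upper limbs, 4 for the lower limbs and 8 joint strengths.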

The function now proceeds to reward or punish the agent depending on 3 factors:

- The agent moving towards the objective.
- The agent body facing the target.
- The time taken by the agent to reach the target.
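
One way to combine these three factors is sketched below in Python. The use of dot products against the normalized direction to the target and the three weights are illustrative assumptions, not values confirmed from the project; the function and parameter names are hypothetical as well.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Sequence[float]) -> Vec3:
    """Return v scaled to unit length (or the zero vector if v is zero)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def step_reward(
    body_position: Sequence[float],
    target_position: Sequence[float],
    body_velocity: Sequence[float],
    body_forward: Sequence[float],
    move_weight: float = 0.03,    # assumed weight
    facing_weight: float = 0.01,  # assumed weight
    time_penalty: float = 0.001,  # assumed per-step penalty
) -> float:
    """Reward moving towards and facing the target; penalize elapsed time."""
    # dirToTarget = target.position - body.position
    dir_to_target = normalize(
        [t - b for t, b in zip(target_position, body_position)]
    )
    moving = move_weight * dot(body_velocity, dir_to_target)
    facing = facing_weight * dot(body_forward, dir_to_target)
    return moving + facing - time_penalty
```

With this formulation, an agent that walks straight at the target while facing it gets a positive reward every step, whereas one that moves or looks away accumulates a negative one; the constant time penalty encourages reaching the target quickly.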

Last but not least, the function increments the decision timer, which updates the decision flag and lets the agent know whether it will have to decide in the next step.
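
The decision timer can be modelled with a simple counter, as in the following Python sketch. The class and attribute names are hypothetical, and the exact counting scheme is an assumption; the sketch only shows how a step counter can drive a decision flag.

```python
class DecisionTimer:
    """Raises a flag once every `steps_between_decisions` steps."""

    def __init__(self, steps_between_decisions: int) -> None:
        if steps_between_decisions < 1:
            raise ValueError("steps_between_decisions must be at least 1")
        self.steps_between_decisions = steps_between_decisions
        self.current_step = 1
        self.is_new_decision_step = True  # the agent decides on its first step

    def increment(self) -> None:
        """Advance one step and update the decision flag for the next step."""
        if self.current_step >= self.steps_between_decisions:
            self.current_step = 1
            self.is_new_decision_step = True
        else:
            self.current_step += 1
            self.is_new_decision_step = False


timer = DecisionTimer(steps_between_decisions=3)
flags = []
for _ in range(6):
    flags.append(timer.is_new_decision_step)  # would the agent decide this step?
    timer.increment()
```

Here `flags` records, for six consecutive steps, whether the agent would take a new decision, so a period of 3 produces a decision on the first and fourth steps.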




# Performance Analysis

TensorFlow analysis

# New Case Proposal

Decide whether we add another leg, make it jump or do something else, and explain which procedure we use.