# Input Output. Delivery One

<img src="../docs/images/image-banner.png" align="middle" width="3000"/>

## G01: Victor Armisen and David Recuenco

## 1. Introduction
The purpose of this notebook is to explain our work with the "Tennis" environment from the ml-agents examples.
As its name suggests, this example simulates a tennis match between two agents that follows the rules of real tennis.
Our goal was to use a single agent and train it on three cases:
* Make the agent play paddle alone while following the game's rules: the ball has to touch the ground once before being hit, and the point is lost if the ball touches the ground twice in a row without touching the front wall.
* Make the agent do keepy-ups without letting the ball touch the ground.
* The same keepy-ups as above but with wind, making it harder for the agent to keep the ball in the air.

### The team
Name | Enti email | Picture
--- | --- | ---
Victor Armisen | victorarmisencapo@enti.cat | placeholder picture
David Recuenco | davidrecuencooliver@enti.cat | <img src="../docs/images/DRO.png" width="100"/>

## 2. Case Analysis
The example is managed by 3 scripts:

* **TennisArea**

Manages the ball's physics and provides the function that resets the match, spawning the ball on a random side of the court.

In [None]:
    public void MatchReset()
    {
        var ballOut = Random.Range(6f, 8f);
        var flip = Random.Range(0, 2);
        if (flip == 0)
        {
            ball.transform.position = new Vector3(-ballOut, 6f, 0f) + transform.position;
        }
        else
        {
            ball.transform.position = new Vector3(ballOut, 6f, 0f) + transform.position;
        }
        m_BallRb.velocity = new Vector3(0f, 0f, 0f);
        ball.transform.localScale = new Vector3(.5f, .5f, .5f);
        ball.GetComponent<HitWall>().lastAgentHit = -1;
    }

* **TennisAgent**

Manages the agent, which in this case is the racket. The script contains the heuristic, the movement and the reset logic.
For the keepy-ups case we slightly change the agent's properties to simulate the touches.
In all cases, we use the values in vectorActions to change the agent's speed and rotation.

In [None]:
// Add some code here
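
As a rough illustration of the description above, the following sketch maps two continuous actions to a horizontal speed and a racket tilt. It is plain C# rather than the project's Unity script: the class name `ActionMapping`, the order of the actions and the clamping to [-1, 1] are assumptions, while the factors 30 and 55 are taken from the keepy-ups snippets later in this notebook.

```csharp
using System;

public static class ActionMapping
{
    // Scale factors borrowed from the keepy-ups snippets in this notebook.
    public const float MaxSpeed = 30f;
    public const float MaxTiltDegrees = 55f;

    // Hypothetical layout: vectorAction[0] = horizontal move, vectorAction[1] = rotation.
    public static (float speedX, float tiltZ) Map(float[] vectorAction)
    {
        // Clamp so an out-of-range network output cannot exceed the limits.
        float moveX = Math.Clamp(vectorAction[0], -1f, 1f);
        float rotate = Math.Clamp(vectorAction[1], -1f, 1f);
        return (moveX * MaxSpeed, rotate * MaxTiltDegrees);
    }

    public static void Main()
    {
        // The second action is out of range on purpose to show the clamping.
        var (speedX, tiltZ) = Map(new[] { 0.5f, -2f });
        Console.WriteLine($"speedX={speedX}, tiltZ={tiltZ}");
    }
}
```

In the real agent these two values would feed the rigidbody's velocity and the racket's rotation; the sketch only covers the scaling step.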

* **HitWall**

Checks the ball's collisions with the objects in the environment and gives the corresponding rewards for these events.
## CHANGE THIS

In [None]:
            else if (collision.gameObject.name == "wallB")
            {
                // Agent B hits into wall or agent A hit a winner
                if (lastAgentHit == 1 || lastFloorHit == FloorHit.FloorBHit)
                {
                    AgentAWins();
                }
                // Agent A hits long
                else
                {
                    AgentBWins();
                }
            }
            else if (collision.gameObject.name == "floorA")
            {
                // Agent A hits into floor, double bounce or service
                if (lastAgentHit == 0 || lastFloorHit == FloorHit.FloorAHit || lastFloorHit == FloorHit.Service)
                {
                    AgentBWins();
                }
                else
                {
                    lastFloorHit = FloorHit.FloorAHit;
                    //successful serve
                    if (!net)
                    {
                        net = true;
                    }
                }
            }

### Rewards:
We give positive rewards when the ball touches the racket, according to distance and time.
We give negative rewards if the ball hits the ground or the invisible walls that mark the limits of the court.
## CHANGE THIS

In [None]:
    void AgentAWins()
    {
        m_AgentA.SetReward(1);
        m_AgentB.SetReward(-1);
        m_AgentA.score += 1;
        Reset();

    }

    void AgentBWins()
    {
        m_AgentA.SetReward(-1);
        m_AgentB.SetReward(1);
        m_AgentB.score += 1;
        Reset();

    }

### States:
We use states to check more specific events and thus achieve better results.
For example, they let us check whether the ball bounces on the ground, whether it hits the wall, and so on.
## CHANGE THIS AND EXPLAIN USING THE EXAMPLE

In [None]:
    public enum FloorHit
    {
        Service,
        FloorHitUnset,
        FloorAHit,
        FloorBHit
    }

    public FloorHit lastFloorHit;

### Training:
We check learning progress at around 100K steps. From then on, if the results look correct, we consider the model worth keeping.
We check these results through the variables used and the graphs that we obtain locally with TensorBoard.

## 3. Performance Analysis
Explanation

Add pictures here

## 4. New case proposal

As mentioned in the introduction, we have three different cases to train the agent on.
To make this possible, we had to remove one of the scene's agents and adapt the scripts to a single agent, since they were written for two.

The ball's spawn had to be modified, since the ball originally spawned randomly on either side of the court.

In [None]:
        var ballOut = Random.Range(-6f, -8f); // distance along x
        ball.transform.position = new Vector3(ballOut, 8f, 0f) + transform.position;

### Paddle:

<img src="../docs/images/Paddle.png" align="middle"/>

For the Paddle case, everything focused on the ball's collisions. An enum was used as a state machine to track what the ball last collided with.

In [None]:
    void OnCollisionEnter(Collision collision) {
        switch (state) {
            case Status.Floor:
                if (collision.gameObject.name == "Agent") {
                    state = Status.Agent;
                } 
                else Death();
                break;

            case Status.Agent:
                if (collision.gameObject.name == "WallFront") {
                    if (!firstGame) {
                        currentTouches++;
                        GivePositiveReward();
                    }
                    else {
                        firstGame = false;
                        GivePositiveReward_Less();
                    }
                    state = Status.Wall;
                }
                else Death();
                break;

            case Status.Wall:
                if (collision.gameObject.name == "Floor") state = Status.Floor;
                else Death();
                break;
        }
    }

There are two types of reward given to the agent:
* **GivePositiveReward()** Gives a reward of value 1. Used for the normal touches.
* **GivePositiveReward_Less()** Gives a reward of value 0.5. Used for the first successful touch since the next ones are the ones that count.
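
To see how the state machine and the two rewards interact, here is a small plain C# model that can be run outside Unity. It mirrors the collision logic above, but several parts are assumptions made for illustration: the class name `PaddleRally`, the initial state (`Status.Floor`, meaning the next legal contact is the racket) and the behaviour of `Death()` (a penalty of 1 that ends the rally).

```csharp
using System;

public enum Status { Floor, Agent, Wall }

public class PaddleRally
{
    public Status State = Status.Floor; // assumption: the rally starts expecting a racket hit
    public bool FirstGame = true;
    public int CurrentTouches;
    public float TotalReward;
    public bool Lost;

    public void OnCollision(string name)
    {
        if (Lost) return; // ignore events once the point is over
        switch (State)
        {
            case Status.Floor:
                if (name == "Agent") State = Status.Agent;
                else Death();
                break;
            case Status.Agent:
                if (name == "WallFront")
                {
                    // First successful touch is worth 0.5, later ones 1.
                    if (FirstGame) { FirstGame = false; TotalReward += 0.5f; }
                    else { CurrentTouches++; TotalReward += 1f; }
                    State = Status.Wall;
                }
                else Death();
                break;
            case Status.Wall:
                if (name == "Floor") State = Status.Floor;
                else Death();
                break;
        }
    }

    // Assumption: losing the point costs 1 and ends the rally.
    void Death() { TotalReward -= 1f; Lost = true; }

    public static void Main()
    {
        var rally = new PaddleRally();
        foreach (var hit in new[] { "Agent", "WallFront", "Floor", "Agent", "WallFront" })
            rally.OnCollision(hit);
        Console.WriteLine($"reward={rally.TotalReward}, touches={rally.CurrentTouches}");
    }
}
```

With the sequence in `Main`, the first wall hit adds 0.5 and the second adds 1, for a total of 1.5 and one counted touch. A sequence such as racket then floor breaks the expected order and ends the rally with the penalty.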

As for the results, the agent learns slowly at first but, as shown in the graphs below, its learning then speeds up sharply.
<img src="../docs/images/Paddle_ELO.png" align="middle"/>
<img src="../docs/images/Paddle_AR.png" align="middle"/>
<img src="../docs/images/Paddle_EL.png" align="middle"/>

One trick used to help the agent learn faster had nothing to do with the rewards: I changed the height at which the ball spawns so the agent does not need to wait for the ball to fall. The ball spawns close to the agent so that it can hit the ball, leaving enough space for the ball to reach the wall and then the floor without the agent easily hitting it midway. Another trick was keeping the rewards constant, so the agent gets used to the training without the rewards changing while it learns.

### Keepy-ups
<img src="../docs/images/KeepsUps.png" align="middle"/>
In the first keepy-ups case there is no rotation: the racket has to learn to approach the ball, which is instantiated at a random position, and keep the touches going.
In the second case, the racket also has to learn to control its rotation, and we send the ball to different places in XYZ. When instantiating the ball, we apply a force to it so that it does not simply fall.

In [None]:
    // Code Keepy-ups
    //EX3: Wind and Rotation
    // Unit direction from the agent to the ball
    Vector3 dir = ball.transform.position - transform.position;
    dir.Normalize();

    // Reward the agent for staying within 2 units of the ball along x
    distance = ball.transform.position.x - transform.position.x;
    distance = Mathf.Abs(distance);
    if (distance < 2.0f)
    {
        AddReward(1);
    }
    else
    {
        AddReward(-1);
    }
    m_AgentRb.velocity = new Vector3(moveX * dir.x * 30.0f, m_AgentRb.velocity.y, 0f);

    //EX3: Wind and Rotation
    // Move in x and z, and tilt the racket around z by 55 * rotate degrees
    m_AgentRb.velocity = new Vector3(moveX * dir.x * magnitude, m_AgentRb.velocity.y, moveX * dir.z * magnitude);
    m_AgentRb.transform.rotation = Quaternion.Euler(-180f, -180f, 55f * rotate);
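
The launch force mentioned above is not shown in the notebook, so the following is only a sketch of one way it could work, written in plain C# with `System.Numerics` instead of Unity types. The class name `BallLaunch` and all ranges are hypothetical. In Unity the equivalent step would typically be a call to `Rigidbody.AddForce` with `ForceMode.Impulse`.

```csharp
using System;
using System.Numerics;

public static class BallLaunch
{
    // Returns the velocity a ball of the given mass gets from a random impulse.
    // All ranges below are illustrative assumptions, not the project's values.
    public static Vector3 InitialVelocity(Random rng, float mass)
    {
        float angle = (float)(rng.NextDouble() * 2.0 * Math.PI); // horizontal heading
        float sideways = 1f + (float)rng.NextDouble();           // impulse between 1 and 2
        float upward = 4f + 2f * (float)rng.NextDouble();        // impulse between 4 and 6
        var impulse = new Vector3(
            sideways * MathF.Cos(angle),
            upward,
            sideways * MathF.Sin(angle));
        return impulse / mass; // an impulse changes velocity by impulse / mass
    }

    public static void Main()
    {
        var velocity = InitialVelocity(new Random(42), 1f);
        Console.WriteLine(velocity);
    }
}
```

Because the upward component is always positive, the ball rises before it starts to fall, and the random heading sends it to a different place on each reset.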

### Keepy-ups with wind
<img src="../docs/images/EX3_Tennis_ELO.png" align="middle"/>
<img src="../docs/images/EX3_Tennis_E.png" align="middle"/>
<img src="../docs/images/EX3_Tennis_C.png" align="middle"/>


In [None]:
// Wind code, I guess
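
The wind implementation is not included in this notebook. As a hedged sketch of the idea, wind can be modelled as a constant force added to gravity at every physics step. The plain C# example below uses `System.Numerics` and explicit Euler integration; the class name `WindSketch`, the wind vector, the mass and the time step are all assumptions. In Unity the same effect would usually come from calling `Rigidbody.AddForce` on the ball inside `FixedUpdate`.

```csharp
using System;
using System.Numerics;

public static class WindSketch
{
    static readonly Vector3 Gravity = new Vector3(0f, -9.81f, 0f);

    // One explicit Euler step for a ball under gravity plus a constant wind force.
    public static (Vector3 pos, Vector3 vel) Step(
        Vector3 pos, Vector3 vel, Vector3 windForce, float mass, float dt)
    {
        Vector3 accel = Gravity + windForce / mass;
        vel += accel * dt;
        pos += vel * dt;
        return (pos, vel);
    }

    public static void Main()
    {
        var pos = new Vector3(0f, 8f, 0f);
        var vel = Vector3.Zero;
        var wind = new Vector3(2f, 0f, 0f); // hypothetical wind along +x
        for (int i = 0; i < 50; i++)        // 50 steps of 0.02 s = 1 s
            (pos, vel) = Step(pos, vel, wind, 1f, 0.02f);
        Console.WriteLine($"drift in x after 1 s: {pos.X}");
    }
}
```

Without wind the ball falls straight down, so the agent only has to stay under it. With wind the ball drifts sideways while it falls, which is what makes this case harder.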