# Data Analysis

This section presents the analysis of our data. We focus on the main characteristics of our environment (i.e. arrival times, departure times, idle times, etc.) and compare them with previous work, namely *Paper 2* and *Paper 3*. The last part of the section describes the behaviour observed in the EnergyVille PV data and how it can be exploited in our approach. The ElaadNL dataset is not studied here because it has already been analyzed in *Paper 2* and *Paper 3*.

First of all, it is important to keep in mind that our environment is an example of a *charge near work* cluster. This means that the transactions in our data were generated by EnergyVille workers using the charging stations in the parking lot. Belonging to the *charge near work* cluster already gives us an idea of the characteristics the transactions will have; nevertheless, we proceed with the analysis to confirm the assumed hypotheses and to detect possible anomalies.

The analysis is organized as follows:

1. Analysis of times<br>
    *1.1 Arrival times*<br>
    *1.2 Departure times*<br>
    *1.3 Sojourn times*<br>
    *1.4 Charging times*<br>
    *1.5 Idle times*<br>
    *1.6 Top-6 user behaviour*
2. Analysis of the power flow during a transaction
3. EnergyVille PV data analysis

## 1. Analysis of times

This section studies the time distributions in our data. Our goal is to find patterns that accurately describe our environment and help us estimate when the charging load could be delayed and how much potential this has.

From *Paper 2* and *Paper 3*, the following characteristics are found:

* **Arrival times**: Early morning (around 6-9am).
* **Departure times**: Late afternoon.
* **Sojourn times**: Average around 9 hours. Min-Max = [5.00, 18.52]. 
* **Charging times**: Charging normally happens throughout the day; car sleepovers are an exception.
* **Idle times**: Average is 5h 30min. Min-Max = [0, 15.54].

Now let's compare these assumptions with our data.

### 1.1 Arrival times

Below we plot the arrival times distribution:

<img src="../../../img/startTime_densityfunction.png" alt="Arrival Times Density Function" width="700" style="float:left;"/>

As the graph shows, the arrival-time assumption from *Paper 2* and *Paper 3* (i.e. early morning, around 6-9am) holds here. The mean is **08:05h** and the interquartile range covers **[06:41h, 07:52h]**. The mean lies above the upper quartile, which indicates a tail of later arrivals.
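The summary statistics quoted for a time-of-day distribution can be computed along the following lines. This is a minimal sketch using only the Python standard library; the function names and the sample timestamps are made up for illustration and are not taken from our dataset or code.

```python
from datetime import datetime
from statistics import mean, quantiles


def hours_since_midnight(ts: datetime) -> float:
    """Convert a timestamp to fractional hours since midnight."""
    return ts.hour + ts.minute / 60 + ts.second / 3600


def as_clock(hours: float) -> str:
    """Format fractional hours as HH:MM."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


def time_of_day_summary(timestamps):
    """Mean and quartiles of the time of day, ignoring the date."""
    hours = [hours_since_midnight(ts) for ts in timestamps]
    # method="inclusive" interpolates linearly between the sorted samples.
    q1, _median, q3 = quantiles(hours, n=4, method="inclusive")
    return {"mean": as_clock(mean(hours)), "q1": as_clock(q1), "q3": as_clock(q3)}


# Made-up arrival timestamps, for illustration only.
arrivals = [
    datetime(2019, 3, 4, 6, 30),
    datetime(2019, 3, 4, 7, 0),
    datetime(2019, 3, 5, 7, 30),
    datetime(2019, 3, 5, 8, 0),
    datetime(2019, 3, 6, 11, 0),
]
summary = time_of_day_summary(arrivals)
print(summary)
```

In this toy sample a single late arrival pulls the mean up to the upper quartile, which is the kind of skew that can place a mean at or beyond the edge of the interquartile range.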

### 1.2 Departure times

Below we plot the departure times distribution:

<img src="../../../img/stopTime_densityfunction.png" alt="Departure Times Density Function" width="700" style="float:left;"/>

The assumption stated above for departure times (i.e. late afternoon) also holds in our case. The mean is **14:41h** and the interquartile range covers **[14:15h, 16:00h]**.

### 1.3 Sojourn times

Below we plot the sojourn times distribution:

<img src="../../../img/timediff_densityfunction.png" alt="Sojourn Times Density Function" width="700" style="float:left;"/>

The sojourn times also match the earlier assumptions. The related work reports an average sojourn time of around 9 hours; in our environment the mean is **8.3 hours**, the min-max range is **[0.1, 15] hours** and the interquartile range is **[5.3, 8.9] hours**.

One remark: 4 points (230.47, 46.85, 38.4 and 22.84 hours) are considered outliers and were excluded when plotting this distribution. They are considered outliers because they represent only 0.017% of the transactions. Cars that stay overnight are treated as exceptions and are therefore not taken into account.
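Excluding overnight stays before plotting could look like the sketch below. The 20-hour cutoff is a hypothetical value chosen only because it separates the four outliers from the reported maximum of 15 hours; the text does not state the threshold that was actually used. The four large values are the outliers listed above, and the remaining sample values are invented.

```python
from statistics import mean

# Hypothetical cutoff (assumption): any value between the reported maximum
# (15 h) and the smallest outlier (22.84 h) would give the same split.
MAX_SOJOURN_HOURS = 20.0


def split_overnight(sojourn_hours, cutoff=MAX_SOJOURN_HOURS):
    """Separate regular sojourn times from overnight outliers."""
    kept = [h for h in sojourn_hours if h <= cutoff]
    excluded = [h for h in sojourn_hours if h > cutoff]
    return kept, excluded


# Invented regular values followed by the four outliers named in the text.
sample = [5.0, 7.0, 8.0, 9.0, 11.0, 22.84, 38.4, 46.85, 230.47]
kept, excluded = split_overnight(sample)

print("excluded:", excluded)
print("mean of kept values:", mean(kept))
print("min-max of kept values:", (min(kept), max(kept)))
```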

### 1.4 Charging times

Below we plot the charging times distribution:

<img src="../../../img/chargingtimes_distribution.png" alt="Charging Times Distribution" width="900" style="float:left;"/>

As mentioned in the previous section, car sleepovers are an exception and are not taken into account. The Charging Times Distribution shows that, since the vast majority of arrivals are in the morning, charging takes place during the day. The mean is **2.62 hours**, the interquartile range is **[1.9, 3.1] hours** and the min-max range is **[0, 6.8] hours**.

The method used to estimate the end of charging is the following. First, we retrieve all power data from the meter values table for the given transaction. We then iterate through the power values, checking whether a value has reached 0 W. When it has, we save that timestamp and inspect the power values for the following 30 minutes. If the power has stayed at 0 W during that whole period, we conclude that the saved timestamp is the end of charging. If not, we discard it and continue with the next timestamp.
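The procedure described above can be sketched as follows. This is an illustrative reimplementation, not the code used in the analysis: the function name, the `(timestamp, power)` tuple layout and the sample readings are assumptions. It also assumes that a zero reading closer than 30 minutes to the last available reading cannot be confirmed and is therefore not reported, a case the description above does not cover.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple

Reading = Tuple[datetime, float]  # (timestamp, power in W); layout is assumed


def estimate_end_of_charging(
    readings: Sequence[Reading],
    hold: timedelta = timedelta(minutes=30),
) -> Optional[datetime]:
    """Return the first timestamp where power is 0 W and stays at 0 W for `hold`."""
    ordered = sorted(readings)
    if not ordered:
        return None
    last_ts = ordered[-1][0]
    for i, (ts, power) in enumerate(ordered):
        if power != 0:
            continue
        window_end = ts + hold
        # Assumption: without data covering the full window, the candidate
        # cannot be confirmed, and neither can any later one.
        if last_ts < window_end:
            break
        stays_zero = True
        for later_ts, later_power in ordered[i + 1:]:
            if later_ts > window_end:
                break
            if later_power != 0:
                stays_zero = False
                break
        if stays_zero:
            return ts
    return None


# Made-up readings every 10 minutes, with a brief return to 900 W at 08:50.
start = datetime(2019, 3, 4, 8, 0)
powers = [3400, 3400, 1500, 0, 0, 900, 0, 0, 0, 0, 0]
readings = [(start + timedelta(minutes=10 * i), float(p)) for i, p in enumerate(powers)]
end = estimate_end_of_charging(readings)
print(end)
```

In this example the zero readings at 08:30 and 08:40 are rejected because the power rises again within 30 minutes, so 09:00 is reported as the end of charging. A transaction whose power never settles at 0 W yields `None`.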

### 1.5 Idle times

Below we plot the idle times distribution:

<img src="../../../img/idletimes_distribution.png" alt="Idle Times Distribution" width="900" style="float:left;"/>

Once again, the assumptions from the related work fit the characteristics of our dataset well. The mean is **5.2 hours** (compared to 5.5 hours in *Paper 3*) and the min-max range is **[0.5, 20.1] hours**. Additionally, the interquartile range is **[3.8, 6.5] hours**.

Idle times define the flexibility available to delay the charging of EVs, expressed in time. Because of their relevance to our goal of balancing the load against a renewable energy source, we investigate them further and visualize the idle times of the top-6 users with the most transactions.

(Idle times shorter than 15 minutes have been removed from the dataset for better visualization, as is done in *Paper 3*.)
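As a minimal sketch, the idle time of a transaction can be derived as the sojourn time minus the charging time, dropping values below the 15-minute threshold. The function name and the sample pairs are illustrative assumptions, not part of our dataset or code.

```python
MIN_IDLE_HOURS = 0.25  # 15 minutes, the threshold mentioned above


def idle_times(transactions, min_idle_hours=MIN_IDLE_HOURS):
    """Compute idle time (sojourn minus charging) and drop very short ones.

    `transactions` is an iterable of (sojourn_hours, charging_hours) pairs.
    """
    idle = []
    for sojourn_h, charging_h in transactions:
        idle_h = sojourn_h - charging_h
        if idle_h >= min_idle_hours:
            idle.append(idle_h)
    return idle


# Made-up (sojourn, charging) pairs in hours, for illustration only.
sample = [(8.5, 2.5), (9.0, 3.0), (4.0, 3.9), (7.0, 7.0)]
result = idle_times(sample)
print(result)
```

The last two sample transactions have idle times of about 6 minutes and 0 minutes, so they are filtered out.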

### 1.6 Top-6 user behaviour

In this section we show the behaviour of the top-6 users with the most transactions.

<table>
<tr>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user1.png"></td>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user2.png"></td>
</tr>
<tr>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user3.png"></td>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user4.png"></td>
</tr>
<tr>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user5.png"></td>
    <td><img src="../../../img/user_chargingsojourn_per_user/chargingsojourn_user6.png"></td>
</tr>
</table>

These graphs show the gap between the sojourn and charging times for each transaction of every user. The y axis represents the number of hours and the x axis the transaction number (i.e. 1st, 2nd, 3rd, etc., not a transaction id). The blue dots correspond to the charging times and the red dots to the sojourn times; the distance from a blue dot to the corresponding red dot is the idle time.

In general, the gap between sojourn and charging times, i.e. the idle time, is quite significant for the majority of transactions. This means there is considerable flexibility for delaying the charging. Some red dots lie very close to blue dots: here the charging and sojourn times are almost equal, so the transaction time was optimal and there is no flexibility. The blue dots that reach the x axis (i.e. 0 hours) are biased points, indicating that the EV left before charging was finished.
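Selecting the users with the most transactions and preparing the points for plots like the ones above could be sketched as follows. The dictionary keys (`user_id`, `sojourn_h`, `charging_h`), the function names and the sample records are hypothetical; the transactions are assumed to be in chronological order already.

```python
from collections import Counter


def top_users(transactions, n=6):
    """Return the ids of the n users with the most transactions."""
    counts = Counter(t["user_id"] for t in transactions)
    return [user for user, _count in counts.most_common(n)]


def per_user_points(transactions, users):
    """For each user, list (transaction number, sojourn, charging, idle) in hours."""
    points = {user: [] for user in users}
    for t in transactions:  # assumed to be in chronological order
        user = t["user_id"]
        if user in points:
            number = len(points[user]) + 1  # 1st, 2nd, 3rd, ... (not an id)
            idle_h = t["sojourn_h"] - t["charging_h"]
            points[user].append((number, t["sojourn_h"], t["charging_h"], idle_h))
    return points


# Made-up records, for illustration only.
sample = [
    {"user_id": "A", "sojourn_h": 8.0, "charging_h": 2.5},
    {"user_id": "B", "sojourn_h": 9.0, "charging_h": 3.0},
    {"user_id": "A", "sojourn_h": 8.5, "charging_h": 8.5},
    {"user_id": "C", "sojourn_h": 6.0, "charging_h": 2.0},
    {"user_id": "B", "sojourn_h": 7.5, "charging_h": 0.0},
    {"user_id": "A", "sojourn_h": 9.5, "charging_h": 3.0},
]
top = top_users(sample, n=2)
points = per_user_points(sample, top)
print(top)
print(points)
```

In the sample, user A's second transaction has equal sojourn and charging times (no flexibility), while user B's second transaction has a charging time of 0 hours, the case in which no end of charging was detected before the EV left.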

## 2. Analysis of the power flow during a transaction

One of the research questions we wanted to answer is: is it possible to predict the state of charge of an EV once a transaction starts, knowing only the current power level and not the battery capacity? In this section, we try to answer that question.

We will study 6 transactions completed by user 04974FFAB63780.

<table>
<tr>
    <td><img src="../../../img/powerflows/powerflow1.png"></td>
    <td><img src="../../../img/powerflows/powerflow2.png"></td>
</tr>
<tr>
    <td><img src="../../../img/powerflows/powerflow3.png"></td>
    <td><img src="../../../img/powerflows/powerflow4.png"></td>
</tr>
<tr>
    <td><img src="../../../img/powerflows/powerflow5.png"></td>
    <td><img src="../../../img/powerflows/powerflow7.png"></td>
</tr>
</table>

These graphs show the power flow from the start of the transaction to its end. The y axis represents power in watts and the x axis the connection time.

Transactions 1 to 5 (i.e. top row, middle row and bottom-left) are examples of finished transactions. Transaction 6 (i.e. bottom-right) is an example of an unfinished transaction. From these graphs we can describe the behaviour of the power flow from the start of a transaction to its end: right after the EV is plugged in, the power reaches a maximum level and stays there for a period that depends on the EV's level of charge. When the battery is close to being fully charged, the power decreases until it reaches 0 W, at which point the EV is charged. We therefore conclude that it is not possible to predict the state of charge of an EV once a transaction starts from the current power level alone, without knowing the battery capacity. The power level is not proportional to the state of charge: it stays stable for most of the transaction and then drops quickly.
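One way to make this argument explicit is a simple energy balance. The relation below is a textbook-style sketch rather than a model fitted to our data; the symbols $C$ (usable battery capacity), $\eta$ (charging efficiency) and $t_0$ (start of the transaction) are introduced here as assumptions.

$$
\mathrm{SoC}(t) = \mathrm{SoC}(t_0) + \frac{\eta}{C} \int_{t_0}^{t} P(\tau)\, d\tau
$$

The meter values only provide $P(\tau)$. Since both $\mathrm{SoC}(t_0)$ and $C$ are unknown, two EVs with different capacities or different initial levels can produce the same power readings during the stable phase while being at different states of charge.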

Transaction 4 (i.e. middle-right) differs from the other transactions in two ways. First, its power level reaches 6500 W, while the other transactions do not even reach 3500 W. This is most likely because the user drove a different EV that day, with different features and a different battery capacity. Second, after the power decreases and reaches 0 W, it increases and decreases again a few times. That is precisely one of the reasons why our algorithm for estimating the end of charging, described in section 1.4, requires the power to stay at 0 W for at least 30 minutes. This way we avoid estimating an incorrect end of charging time.