# EV Data: Electric Nation - CrowdCharge 

The Electric Nation EV data comes in one large file. This notebook assumes that the data in the original file (_CrowdChargeMeterValues.csv_) has been split into two files according to charger type (3.6 or 7 kW) using the Jupyter notebook **_EV_crowdCharge_separate_data_by_charger.ipynb_**. These files should be in the folder `raw_data`. The timeseries data is sorted by charger, i.e. the timeseries for charger 1 over the whole duration of the Electric Nation (EN) project is followed by the timeseries for charger 2, and so on.

The CrowdCharge data provides the charging current, which is resampled and converted to power assuming a voltage of 230 V. To generate the charging profiles, the data for the specified charger type is filtered for the given time period. The function `resample_timeseries()` then resamples the current timeseries at the resolution set by the parameter `time_resolution` and converts the current to power. The function is based on `pd.DataFrame.resample()`, and in this notebook `mean()` is used to aggregate the data over each resampling interval.
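
The resampling and conversion described above can be sketched on a small synthetic series. The current values below are made up for illustration and are not project data. Only the 230 V assumption and the `MaxAmpsUsed` column name come from this notebook.

```python
import pandas as pd

# Synthetic one-minute current readings in amps (made-up values, not project data)
index = pd.date_range("2018-03-04 18:00", periods=60, freq="min")
current = pd.Series([32.0] * 30 + [16.0] * 30, index=index, name="MaxAmpsUsed")

voltage = 230  # volts, the assumption used in this notebook

# Average the current over each 30-minute interval, then convert to kW
mean_current = current.resample("30min").mean()
power_kw = mean_current * voltage / 1000.0
print(power_kw)
```

Because the current is averaged before the conversion, each value represents the mean power over the interval under the constant-voltage assumption.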

<hr>

- Time resolution for CrowdCharge data: Initially (prior to a controller software update), a value was sent every minute while a car was connected to the charger. Following the software update, values were sent every minute when the Status field = 2 (i.e. car connected and charging) and every half hour when Status = 1 (car connected, not charging).


- Time period the data covers (CrowdCharge): 04/03/2017 to 16/12/2018 (dd/mm/yyyy)


- The number of participant IDs is greater than the number of charger IDs.
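
One way to inspect the reporting intervals noted above is to look at the gaps between consecutive records of each charger. The following is a minimal sketch on synthetic records. The column name `Status` is an assumption based on the field name mentioned above, and the values are made up.

```python
import pandas as pd

# Synthetic records (made-up values): one-minute reports while charging,
# half-hourly reports while connected but not charging
records = pd.DataFrame({
    "chargerID": ["A", "A", "A", "A", "A"],
    "Timestamp": pd.to_datetime([
        "2018-03-04 18:00", "2018-03-04 18:01", "2018-03-04 18:02",
        "2018-03-04 18:32", "2018-03-04 19:02",
    ]),
    "Status": [2, 2, 2, 1, 1],  # assumed column name
})

records = records.sort_values(["chargerID", "Timestamp"])
# Time elapsed since the previous record of the same charger
records["gap"] = records.groupby("chargerID")["Timestamp"].diff()
# How often each gap length occurs
gap_counts = records["gap"].value_counts()
print(gap_counts)
```

Gaps much longer than half an hour would typically indicate that no car was connected, since no values are sent in that case.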


In [None]:
from pathlib import Path
import pandas as pd
from datetime import datetime
import logging

# Set up logging
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

## Parameters

Specify the time period for the charging profiles, the time resolution and the charger type (3.6 or 7 kW).

In [None]:
# set parameters
charger_type = 7 # 3.6 or 7 kW, depending on the charger
time_resolution = "30min"
start_date = "2018-03-04"
end_date = "2018-03-30"
save_file = False

dates = pd.date_range(start=start_date,end=end_date).to_pydatetime()
dates = [x.date() for x in dates]

## Function

In [None]:
def resample_timeseries(df, date, charger_id, voltage=230):
    """Resample the current for one charger and day and convert it to power.
    
    The current is averaged over the interval given by the global parameter
    `time_resolution` before it is converted to power.
    
    Parameters
    ----------
    df: pd.DataFrame
        The dataframe containing the timeseries of the charging current of
        one charger and day, with the columns "Timestamp" and "MaxAmpsUsed".
    date: datetime.date
        Date of the timeseries.
    charger_id: int or str
        ID of the charger.
    voltage: int, optional
        Voltage of the network in volts, used to convert current into power.
        
    Returns
    -------
    timeseries: pd.DataFrame
        Dataframe containing the resampled power in kW.
    """
    timeseries = pd.DataFrame(df[["Timestamp", "MaxAmpsUsed"]].set_index("Timestamp").resample(time_resolution).mean())
    timeseries = timeseries.assign(timestamps = timeseries.index.time)
    timeseries = timeseries.set_index("timestamps")
    # convert current into power in kW
    timeseries["MaxAmpsUsed"] = timeseries["MaxAmpsUsed"]*voltage/1000.0
    timeseries = timeseries.rename(columns={"MaxAmpsUsed": f"{charger_id}_{date}"})
    timeseries.fillna(0, inplace=True)
    return timeseries



## Load data

Load data for the defined charger type from `.csv`.

In [None]:
filename = f"EV_crowdCharge_data_{charger_type}kW.csv"
path_to_file = Path.cwd() / "raw_data"/filename
data = pd.read_csv(path_to_file, low_memory=False)
# convert timestamp to datetime
data['Timestamp'] = pd.to_datetime(data['Timestamp'], format='%Y-%m-%d %H:%M:%S.%f')
# add extra columns with the date only and the time only
data['Date'] = data['Timestamp'].dt.date
data['Time'] = data['Timestamp'].dt.time

## Initialise dataframe for resampled charging profiles

In [None]:
# create timestamps with defined time resolution
timestamps = pd.date_range("00:00", "23:59", freq=time_resolution) # change freq to modify time resolution

charging_profiles = pd.DataFrame(timestamps, columns=["Timestamps"]) 
charging_profiles["Timestamps"] = charging_profiles["Timestamps"].apply(lambda x: x.time() )
charging_profiles.set_index("Timestamps", inplace=True)

## Re-sample charging current

Filter the data for the provided time period and re-sample the data of each charger and day individually.

In [None]:
for date in dates:
    data_filtered_for_date = data[data['Date'] == date]
    charger_ids = pd.unique( data_filtered_for_date['chargerID'] )
    for charger in charger_ids:
        df = data_filtered_for_date[data_filtered_for_date['chargerID'] == charger]
        if df['MaxAmpsUsed'].sum() > 0:
            timeseries = resample_timeseries(df, date, charger)
            charging_profiles = charging_profiles.join(timeseries)
        else:
            logger.info("No charging current recorded for charger %s on %s", charger, date)
charging_profiles.fillna(0, inplace=True)           
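
The loop above relies on joining each resampled profile onto a frame indexed by time of day. The following is a minimal sketch of that step with a hypothetical charger name and made-up power values. Time slots without data become `NaN` after the join and are then filled with 0.

```python
from datetime import time
import pandas as pd

# Empty frame indexed by time of day at 30-minute resolution
slots = pd.date_range("00:00", "23:59", freq="30min").time
profiles = pd.DataFrame(index=pd.Index(slots, name="Timestamps"))

# A hypothetical resampled profile covering only the evening (values in kW)
evening = pd.DataFrame(
    {"charger1_2018-03-04": [7.36, 3.68]},
    index=pd.Index([time(18, 0), time(18, 30)], name="timestamps"),
)

# Left join on the time-of-day index; slots without data become NaN, then 0
profiles = profiles.join(evening).fillna(0)
print(profiles.loc[time(17, 30):time(19, 0)])
```

Each charger and day ends up as one column, so profiles from different days line up on the same daily time axis.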

In [None]:
charging_profiles

In [None]:
if save_file:
    charging_profiles.to_csv(f"EV-crowdCharge--{charger_type}kW--start-{start_date}--end-{end_date}--{time_resolution}.csv", index=False)