# Analysis of the Public Electric Vehicle (EV) Charging Station Network in British Columbia


**Dataset Description and Overview:** The Electric Vehicle (EV) Charging Network dataset for British Columbia (BC) can be obtained as a CSV file by applying filters such as location, fuel type, and station parameters from this [link](https://natural-resources.canada.ca/energy-efficiency/transportation-alternative-fuels/electric-charging-alternative-fuelling-stationslocator-map/20487#/analyze?region=CA-BC&fuel=ELEC&status=E&status=P). 

This dataset comprises a comprehensive range of information about the BC EV charging network. It includes geographical features such as the address, latitude, and longitude of each station, as well as the facility in which they are located. The dataset also provides the EV Connector Types used, number of Level 2 and DC Fast charging ports available at each station, their respective pricing structures, and the network to which each station belongs. Additionally, it contains data on the operational date of each station and their hours of accessibility. Detailed information about each column is available [here](https://developer.nrel.gov/docs/transportation/alt-fuel-stations-v1/all/). 

## Data Cleaning

Import libraries required for data cleaning and analysis 

In [None]:
import math
import re
import folium
import numpy as np
import pandas as pd
import pandas_profiling
from datetime import datetime
import matplotlib.pyplot as plt
import plotly.graph_objs as go
import plotly.express as px
from sklearn.neighbors import KernelDensity



Read the csv file into the ev_stations dataframe using pandas

In [None]:
ev_stations = pd.read_csv("BC_alt_fuel_stations.csv")

Overview of the dataset

In [None]:
ev_stations.head()

Unnamed: 0,Fuel Type Code,Station Name,Street Address,Intersection Directions,City,State,ZIP,Plus4,Station Phone,Status Code,...,CNG PSI,CNG Vehicle Class,LNG Vehicle Class,EV On-Site Renewable Source,Restricted Access,RD Blends,RD Blends (French),RD Blended with Biodiesel,RD Maximum Biodiesel Level,NPS Unit Name
0,ELEC,City of Nanaimo - Underground Parking Lot,101 Gordon St,,Nanaimo,BC,V9R 5J6,,250-754-4251,E,...,,,,,False,,,,,
1,ELEC,Fulford Community Hall,2591 Fulford-Ganges Rd,,Salt Spring Island,BC,V8K 1Z4,,,E,...,,,,,False,,,,,
2,ELEC,Poets Cove Resort and Spa,9801 Spalding Rd,,Pender Island,BC,V0N 2M3,,250-629-2100,E,...,,,,,False,,,,,
3,ELEC,City of Merritt - City Hall,2185 Voght St,Located on the front posts of the building,Merritt,BC,V1K 1R6,,250-378-4224,E,...,,,,,False,,,,,
4,ELEC,North Shore Kia,855 W 1st St,,North Vancouver,BC,V7P 1A4,,,E,...,,,,,False,,,,,


Data Preparation

In [None]:
# Filter out stations whose expected opening date is still in the future (not yet operational as of today)
ev_stations['Expected Date'] = pd.to_datetime(ev_stations['Expected Date'])
today = datetime.today()
ev_stations = ev_stations[(ev_stations['Expected Date'] <= today) | (ev_stations['Expected Date'].isna())].copy()

# Assume that stations with an 'Expected Date' in the past are operational but not yet updated,
# and fill missing 'Open Date' values with that 'Expected Date'.
# The result is assigned back: fillna(inplace=True) on a chained expression only changes a temporary copy.
ev_stations['Open Date'] = pd.to_datetime(ev_stations['Open Date']).fillna(ev_stations['Expected Date'])

#Remove empty columns
ev_stations = ev_stations.dropna(axis=1, how="all")

#Remove unused French language columns
french_columns = ev_stations.filter(like='French').columns
ev_stations = ev_stations.drop(french_columns, axis=1)

# Remove columns with missing values more than 80%
ev_stations = ev_stations.loc[:, ev_stations.isnull().mean() < 0.8]

## Data Analysis and Visualization

Create a pandas profile report to understand each column and distribution of the dataset

In [None]:
ev_stations.profile_report()




From the above pandas profile report, we can easily observe the following:

1. The majority of the stations are available 24 hours

2. ChargePoint Network and FLO are among the most common EV networks in BC

3. J1772 is the most available connector type in BC

**Analysis 1**: Distribution of charging port types Level 2 and DC Fast EVSE ports

In [None]:
# Fill the blank ports with 0
ev_stations[['EV Level2 EVSE Num', 'EV DC Fast Count']] = ev_stations[['EV Level2 EVSE Num', 'EV DC Fast Count']].fillna(0)

# Select only the two columns of interest: Level 2 and DC Fast charging ports
ev_ports = ev_stations[['EV Level2 EVSE Num', 'EV DC Fast Count']]

# Sum the count of ports in each column
sum_ports = ev_ports.sum()

new_labels = ['Level 2 EVSE', 'DC Fast EVSE']

# Create a bar plot using plotly.graph_objs
data = [go.Bar(
    x=new_labels,
    y=sum_ports,
    marker=dict(color=['blue', 'orange']),
)]

layout = go.Layout(
    title='Total EV Charging Ports by Type in BC',
    xaxis=dict(title='Charging Port Type'),
    yaxis=dict(title='Total Count'),
    hovermode='closest'
)

fig = go.Figure(data=data, layout=layout)

# Display the plot
fig.show()


There are close to 3,500 publicly accessible Level 2 ports and close to 1,000 DC Fast charging ports in BC.

**Analysis 2:** Distribution of Charging Ports by City

In [None]:
# Convert all city names to title case; the .str accessor leaves missing values untouched
ev_stations['City'] = ev_stations['City'].str.title()

# Group by city and calculate sum of Level2 and DC fast charging ports
ev_ports_city = ev_stations.groupby('City').agg({'EV Level2 EVSE Num': 'sum', 'EV DC Fast Count': 'sum'})

# Calculate the total number of charging ports per city
ev_ports_city['Total'] = ev_ports_city.sum(axis=1)

# Sort the DataFrame by total number of charging ports per city
ev_ports_city = ev_ports_city.sort_values(by='Total', ascending=False).head(10)

# Create a stacked bar plot
trace1 = go.Bar(
    x=ev_ports_city.index,
    y=ev_ports_city['EV Level2 EVSE Num'],
    name='Level 2 EVSE',
    marker=dict(color='blue')
)

trace2 = go.Bar(
    x=ev_ports_city.index,
    y=ev_ports_city['EV DC Fast Count'],
    name='DC Fast Count',
    marker=dict(color='orange')
)

data = [trace1, trace2]

layout = go.Layout(
    title='Top 10 Cities by Total EV Charging Ports',
    xaxis=dict(title='City'),
    yaxis=dict(title='Total Number of Charging Ports'),
    barmode='stack',
    hovermode='closest'
)

fig = go.Figure(data=data, layout=layout)

# Display the plot
fig.show()

Vancouver has the highest number of charging ports in BC, followed by Victoria. For an electric vehicle owner, trips between Vancouver, Victoria, Burnaby, Surrey, and the other cities shown above require little advance planning, since all of these cities have a high concentration of charging ports.

To-do: Plot the count of charging stations by facilities. 

**Analysis 3**: Time series analysis of the charging stations

In [None]:
# Convert Open Date strings to datetime objects
ev_stations['Open Date'] = pd.to_datetime(ev_stations['Open Date'])

# Count the number of stations opened by year
stations_opened = ev_stations.groupby(ev_stations['Open Date'].dt.year)['Open Date'].count().reset_index(name='count')

# Create an interactive line plot
fig = px.line(stations_opened, x='Open Date', y='count', title='Number of Charging Stations Opened by Year')
fig.update_traces(mode='markers+lines')
fig.update_layout(xaxis_title='Year', yaxis_title='Number of Stations')
fig.show()


The highest number of stations opened in 2021-2022, which is consistent with this [news article](https://biv.com/article/2022/06/bc-electric-vehicle-sales-tops-north-america) reporting that 13% of all new light vehicle sales in B.C. in 2021 were zero-emission vehicles, the highest rate on the continent on a per capita basis. The growth in charging stations likely reflects the need to accommodate these sales.

**Analysis 4**: Spatial Analysis of the charging ports in the province of BC

In [None]:
stations_grouped_spatial = ev_stations.groupby(['Latitude', 'Longitude']).agg({'EV Level2 EVSE Num': 'sum', 'EV DC Fast Count': 'sum'})

# Calculate the density of charging stations using kernel density estimation (KDE)
kde = KernelDensity(bandwidth=0.02, metric='haversine')
points = np.vstack([stations_grouped_spatial.index.get_level_values('Latitude'), stations_grouped_spatial.index.get_level_values('Longitude')]).T
kde.fit(np.radians(points))
density = np.exp(kde.score_samples(np.radians(points)))

# Add the KDE results to the DataFrame
stations_grouped_spatial['density'] = density

# Define the cutoffs for high and low-density areas
high_cutoff = np.percentile(stations_grouped_spatial['density'], 75)
low_cutoff = np.percentile(stations_grouped_spatial['density'], 25)

# Plot the data onto the map, with high and low-density areas highlighted
map_bc = folium.Map(location=[49.2827, -123.1207], zoom_start=7)

for lat, lon, level2, dc, dens in zip(stations_grouped_spatial.index.get_level_values('Latitude'),
                                      stations_grouped_spatial.index.get_level_values('Longitude'),
                                      stations_grouped_spatial['EV Level2 EVSE Num'],
                                      stations_grouped_spatial['EV DC Fast Count'],
                                      stations_grouped_spatial['density']):
    # Color the markers based on the density of charging stations
    if dens > high_cutoff:
        color = 'red' # highest density
    elif dens < low_cutoff:
        color = 'green' # lowest density
    else:
        color = 'blue' 
    
    icon = folium.Icon(color=color)
    
    folium.Marker(location=[lat, lon],
                  popup=f'Level 2 Ports: {int(level2)}, DC Ports: {int(dc)}, Density: {int(dens)}',icon=icon).add_to(map_bc)

display(map_bc)

It can be observed that the density of charging ports is highest (red) in Vancouver city and decreases (blue and then green) as we move away from it.  
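
To make the colour coding above more concrete, here is a minimal pure-Python sketch of what a Gaussian kernel density estimate with a haversine metric computes at each point. It is an illustration, not the scikit-learn implementation: the normalising constant is dropped, so the values are only comparable to each other, and the four coordinates are approximate, hand-picked locations rather than rows from the dataset. The names `haversine_rad` and `relative_density` are hypothetical helpers.

```python
import math

BANDWIDTH = 0.02  # radians, the same value passed to KernelDensity above


def haversine_rad(p, q):
    """Great-circle distance in radians between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(a))


def relative_density(point, points, bandwidth=BANDWIDTH):
    """Sum of Gaussian kernels centred on every point (unnormalised)."""
    return sum(math.exp(-0.5 * (haversine_rad(point, other) / bandwidth) ** 2)
               for other in points)


# Approximate coordinates, for illustration only (assumption: not taken from the dataset)
sample_points = {
    "Vancouver": (49.2827, -123.1207),
    "Burnaby": (49.2488, -122.9805),
    "Surrey": (49.1913, -122.8490),
    "Fort Nelson": (58.8050, -122.6972),
}

densities = {name: relative_density(coords, sample_points.values())
             for name, coords in sample_points.items()}

# Points with close neighbours accumulate more kernel mass than isolated ones
for name, value in sorted(densities.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.2f}")
```

The three Metro Vancouver points reinforce each other and score well above the isolated northern point, which is the pattern the red and green markers capture on the map.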

**Analysis 5:** Distribution of Charging ports by Facility

In [None]:
# Group by Facility Type and sum the Level 2 and DC fast charging ports
ev_facility = ev_stations.groupby('Facility Type').agg({'EV Level2 EVSE Num': 'sum', 'EV DC Fast Count': 'sum'})

# Create a stacked bar chart
trace1 = go.Bar(x=ev_facility.index, y=ev_facility['EV Level2 EVSE Num'], name='Level 2 EVSE')
trace2 = go.Bar(x=ev_facility.index, y=ev_facility['EV DC Fast Count'], name='DC Fast Count')

data = [trace1, trace2]

layout = go.Layout(title='Facilities by Level 2 and DC Fast Charging Ports',
                   xaxis=dict(title='Facility Type'),
                   yaxis=dict(title='Total Count'),
                   barmode='stack',
                   hovermode='closest',
                   width=1000,
                   height=600)

fig = go.Figure(data=data, layout=layout)

# Display the plot
fig.show()


Hotels have more Level 2 chargers than DC Fast chargers, since guests tend to stay overnight and can afford longer charging times, whereas shopping centers, where people typically spend 2-3 hours, have more DC Fast chargers.

**Analysis 6**: Electric Vehicle charging cost comparison with Gas vehicles

In [None]:
# The 'EV Pricing' column holds free-text pricing for both Level 2 and DC Fast charging ports
# .copy() avoids pandas' SettingWithCopyWarning when new columns are added below
ev_pricing = ev_stations[['EV Pricing']].copy()

# Create new columns 'L2_rate' and 'DC_Fast_Charge_rate'
ev_pricing['L2_rate'] = np.nan
ev_pricing['DC_Fast_Charge_rate'] = np.nan


# Loop through each row and extract the rate for each port type
for i, row in ev_pricing.iterrows():
    if pd.isna(row['EV Pricing']):  # Skip rows with no pricing information
        continue

    pricing = str(row['EV Pricing']).strip()  # Convert the value to a string and remove leading/trailing whitespace

    if not pricing:  # If the string is empty, skip the iteration
        continue

    is_free = 'free' in pricing.lower() or 'parking fee' in pricing.lower()

    # Raw strings keep regex escapes such as \$ and \d from being read as string escapes
    match_hour = re.search(r'\$([\d\.]+) per hour|\/Hr', pricing)
    match_minute = re.search(r'\$([\d\.]+) per minute', pricing)
    match_dcfc = re.search(r'DCFC: \$([\d\.]+) per minute', pricing)
    match_special_pricing = re.search(r'\$0\.44 per minute above 60 kW and \$0\.22 per minute at or below 60 kW', pricing)

    # If the pricing is free, set both columns to 0
    if is_free:
        ev_pricing.at[i, 'L2_rate'] = 0
        ev_pricing.at[i, 'DC_Fast_Charge_rate'] = 0

    # If the pricing is expressed as "$ per hour", store the $ value in the L2_rate column
    if match_hour and match_hour.group(1) and not is_free:
        ev_pricing.at[i, 'L2_rate'] = float(match_hour.group(1))

    # If the pricing is expressed as "$ per minute", store the $ value in the DC_Fast_Charge_rate column
    if match_minute and not match_dcfc and match_minute.group(1):
        ev_pricing.at[i, 'DC_Fast_Charge_rate'] = float(match_minute.group(1))

    if match_dcfc and match_dcfc.group(1):
        ev_pricing.at[i, 'DC_Fast_Charge_rate'] = float(match_dcfc.group(1))

    # When the pricing is "$0.44 per minute above 60 kW and $0.22 per minute at or below 60 kW",
    # store the weighted average of $0.352/min in the DC_Fast_Charge_rate column
    if match_special_pricing:
        ev_pricing.at[i, 'DC_Fast_Charge_rate'] = 0.352
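
The parsing rules in the loop above can also be written as a small standalone function, which makes them easier to check on individual strings. The sketch below is not part of the original notebook: `parse_rates` is a hypothetical helper, the sample strings are made up to resemble the patterns the regular expressions target, and the tiered 60 kW case is left out for brevity. The real `EV Pricing` column likely contains formats that neither version handles.

```python
import re

# Patterns modelled on the ones used in the loop above
HOURLY = re.compile(r"\$(\d+(?:\.\d+)?) per hour")
PER_MINUTE = re.compile(r"\$(\d+(?:\.\d+)?) per minute")
DCFC = re.compile(r"DCFC: \$(\d+(?:\.\d+)?) per minute")


def parse_rates(pricing):
    """Return (l2_rate_per_hour, dc_rate_per_minute); None where no rate is found."""
    text = pricing.strip()
    is_free = "free" in text.lower() or "parking fee" in text.lower()

    # Free stations start at 0 for both port types
    l2_rate = dc_rate = 0.0 if is_free else None

    hourly = HOURLY.search(text)
    if hourly and not is_free:
        l2_rate = float(hourly.group(1))

    # An explicit "DCFC:" rate takes precedence over a generic per-minute rate
    dcfc = DCFC.search(text)
    per_minute = PER_MINUTE.search(text)
    if dcfc:
        dc_rate = float(dcfc.group(1))
    elif per_minute:
        dc_rate = float(per_minute.group(1))

    return l2_rate, dc_rate


# Hypothetical sample strings, for illustration only
samples = [
    "L2: $1.00 per hour; DCFC: $0.21 per minute",
    "Free",
    "$0.27 per minute",
    "Pay at kiosk",
]
for sample in samples:
    print(f"{sample!r} -> {parse_rates(sample)}")
```

Strings that match none of the patterns come back as `(None, None)`, which corresponds to the `NaN` entries left in `L2_rate` and `DC_Fast_Charge_rate`.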






In [None]:
# Estimate the charging cost for a Tesla Model 3, assuming 8 hours on a Level 2 charger and 60 minutes on a DC Fast charger

charging_time_L2 = 8 # hours
charging_time_DC_fast = 60 # minutes

average_L2_rate = ev_pricing['L2_rate'].mean()
average_DC_fast_charge_rate = ev_pricing['DC_Fast_Charge_rate'].mean()

cost_L2 = average_L2_rate * charging_time_L2
cost_DC_fast = average_DC_fast_charge_rate * charging_time_DC_fast

print(f"The average cost to charge a Tesla Model 3 using L2 charging is ${cost_L2:.2f} and using DC fast charging is ${cost_DC_fast:.2f}")





The average cost to charge a Tesla Model 3 using L2 charging is $1.45 and using DC fast charging is $4.06


The calculation above suggests that, for an average distance of 450 km, a person would spend 45 CAD on fuel in a gas car as opposed to only 4 CAD on charging an electric car, a significant cost saving. The environmental impact is also notably lower: an electric car produces only 0.02 tonnes of CO2 compared to 0.09 tonnes of CO2 for a gas car. This highlights the financial and environmental benefits of choosing electric vehicles over their gas-powered counterparts. (See the [carbon footprint calculator](https://www.carbonfootprint.com/calculator.aspx) for the CO2 calculations.)
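
The comparison above reduces to a few lines of arithmetic. The snippet below only restates the figures quoted in the paragraph (450 km, 45 CAD versus 4 CAD, 0.09 versus 0.02 tonnes of CO2); it does not derive them, and the variable names are illustrative.

```python
# Figures quoted in the text above; nothing here is re-derived from the dataset
distance_km = 450
gas_cost_cad, ev_cost_cad = 45.0, 4.0
gas_co2_t, ev_co2_t = 0.09, 0.02

# Cost per kilometre for each vehicle type
gas_cost_per_km = gas_cost_cad / distance_km
ev_cost_per_km = ev_cost_cad / distance_km

# Absolute savings and relative CO2 reduction over the same distance
savings_cad = gas_cost_cad - ev_cost_cad
co2_reduction_pct = 100 * (gas_co2_t - ev_co2_t) / gas_co2_t

print(f"Gas: {gas_cost_per_km:.3f} CAD/km, EV: {ev_cost_per_km:.3f} CAD/km")
print(f"Savings over {distance_km} km: {savings_cad:.0f} CAD")
print(f"CO2 reduction: {co2_reduction_pct:.0f}%")
```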

# Conclusion

With more time and resources, as well as access to additional datasets to combine with this data, I would have explored the following:

1. Consult the webpages of EV charging networks, such as FLO and ChargePoint Network, to gather more information for the EV Pricing column, standardize it, and improve the accuracy of the pricing analysis.

2. Forecast the number of EV stations over time using the open date, to check whether the projected station count is compatible with the target of 100% electric vehicle sales by 2040.

3. Calculate the carbon emissions saved using data on each station's electricity source, which could be used for marketing and for motivating individuals to adopt electric vehicles, in addition to the cost savings on fuel.

4. Determine the current usage rate of the EV stations and whether they can cope with public demand.

5. Since most Level 2 connectors are of type J1772, which is not directly compatible with Teslas, find out whether customers are satisfied with having to spend an additional $50 on an SAE J1772 charging adapter, or whether these charging stations rent adapters to customers who don't have one.



**Concerns with B.C.’s public EV charging station network:**

In [None]:
# Calculation of average distance between two stations along the map of BC 

# Define the Haversine formula function
def haversine(lat1, lon1, lat2, lon2):
    R = 6371 # Earth's radius in km
    dLat = math.radians(lat2 - lat1)
    dLon = math.radians(lon2 - lon1)
    lat1 = math.radians(lat1)
    lat2 = math.radians(lat2)

    a = math.sin(dLat/2)**2 + math.sin(dLon/2)**2 * math.cos(lat1) * math.cos(lat2)
    c = 2 * math.asin(math.sqrt(a))

    return R * c

# Calculate the distances between all pairs of locations on the map
distances = []
locations = stations_grouped_spatial.index.tolist()
for i in range(len(locations)):
    for j in range(i+1, len(locations)):
        lat1, lon1 = locations[i]
        lat2, lon2 = locations[j]
        distance = haversine(lat1, lon1, lat2, lon2)
        distances.append(distance)

# Calculate the average distance between stations
avg_distance = sum(distances) / len(distances)
print("The average distance between stations is:", int(avg_distance), "km")


The average distance between stations is: 217 km
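
Note that the figure above is a mean over all pairs of locations, so it reflects the geographic extent of the province as well as the spacing between stations. A complementary measure, not computed in the original notebook, is the mean distance from each location to its nearest neighbour. The sketch below contrasts the two on three approximate city coordinates chosen purely for illustration; `haversine_km`, `mean_pairwise_km`, and `mean_nearest_km` are hypothetical helper names.

```python
import math
from itertools import combinations


def haversine_km(p, q, radius_km=6371):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def mean_pairwise_km(points):
    """Mean distance over all unordered pairs (the quantity computed above)."""
    dists = [haversine_km(p, q) for p, q in combinations(points, 2)]
    return sum(dists) / len(dists)


def mean_nearest_km(points):
    """Mean distance from each point to its closest other point."""
    nearest = [min(haversine_km(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)]
    return sum(nearest) / len(nearest)


# Approximate coordinates, for illustration only (assumption: not taken from the dataset)
cities = [
    (49.2827, -123.1207),  # Vancouver
    (48.4284, -123.3656),  # Victoria
    (53.9171, -122.7497),  # Prince George
]

print(f"Mean pairwise distance: {mean_pairwise_km(cities):.0f} km")
print(f"Mean nearest-neighbour distance: {mean_nearest_km(cities):.0f} km")
```

On the full dataset, `stations_grouped_spatial.index.tolist()` could be passed in place of `cities`, although the nested loop in `mean_nearest_km` scales quadratically with the number of locations.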


**Sparse Distribution of Charging Stations:** From the above calculation, the average distance between two charging stations, taken over all pairs of station locations, is 217 km, so long-distance travel requires careful planning. Adding more charging stations can help address range anxiety for EV users.

**Uneven Distribution:** From the Spatial Analysis of the charging ports in the province of BC, it is clear that most EV charging stations are located in Vancouver, and their density decreases as we move away from Vancouver city into outer areas (places like Jaffray, Kitwanga, Fort Nelson, Elkford, Langley City, Dome Creek, Tete Jaune Cache, and Roberts Creek have only 1 charging station). This not only makes it less feasible for people in rural areas to purchase electric vehicles but also makes EVs less suitable for long-distance travel. Ensuring that charging stations are available in both urban and rural areas can help improve accessibility.

**Limited Fast Charging Stations:** The Distribution of Charging Ports by City plot shows that the majority of chargers are of the Level 2 type, which take around 10 hours for a full charge. This is not convenient for many people, and wait times might increase even more during peak hours.

**Suggestions to improve the B.C. public EV charging station network:**

**Increase the number of fast-charging stations:** Expanding the availability of DC Fast Charging stations can significantly reduce charging times and make long-distance travel more feasible for electric vehicle users.

**Expand coverage in rural areas:** Ensuring that charging stations are available in both urban and rural areas can improve accessibility and encourage more people to consider electric vehicles.

**Provide real-time information:** Providing real-time information on charging station availability, wait times, and pricing can help users plan their trips more efficiently and minimize downtime while waiting for a charging station to become available.

**Partner with local businesses and organizations:** Working with local businesses, such as gas stations, hotels, restaurants, and shopping centers, to install charging stations can help increase the overall charging infrastructure while providing additional amenities for customers.

**Comments and Suggestions on the quality and format of the data provided through the Electric Charging and Alternative Fuelling Stations Locator:** 

The data provided through the Electric Charging and Alternative Fuelling Stations Locator is well-organized and comprehensive, featuring spatial information such as station location coordinates, addresses, accessibility, and charging station specifications like network details, different types of ports, and connector types present at the EV stations. However, there are a few suggestions that could be made to improve the data quality and format:

1. **Standardize the EV Pricing column**: The current format is not standard and hence difficult to parse, and 77.7% of the values are missing. Addressing these issues will make it easier to draw pricing conclusions and prevent inaccurate analysis.

2. **Update the Expected Date of Opening**: The majority of stations have an expected date in 2022. It's crucial to update the database to reflect whether these stations are now operational.

3. **Enhance the EV On-Site Renewable Source column**: This is an important metric for accurately calculating carbon emissions. Providing more details in this column will help calculate carbon emissions more efficiently, which can be used as a motivation tool for promoting the adoption of electric vehicles.