# Distribution_Network_Analysis for "Power Charge Utilities"

In [None]:
# Import Libraries
#!pip install geopandas
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gdp

In [None]:
# Load Data into Notebook
Dd_ev = pd.read_csv('AMDARi/synthetic_ev_distribution_data.csv')
Gd_ev = pd.read_csv('AMDARi/synthetic_geospatial_data.csv')
Wd_ev = pd.read_csv('AMDARi/synthetic_weather_data.csv')

In [None]:
Dd_ev.head()

In [None]:
Gd_ev.head()

In [None]:
Wd_ev.head()

In [None]:
# Check Data Types and Missing Values
Dd_ev.info()

# Description of Numerical Columns
Dd_ev.describe()

In [None]:
# Check Data Types and Missing Values
Gd_ev.info()

# Description of Numerical Columns
Gd_ev.describe()

In [None]:
# Check Data Types and Missing Values
Wd_ev.info()

# Description of Numerical Columns
Wd_ev.describe()

## Exploratory Data Analysis
Covers both univariate (single-variable) and bivariate (two-variable) data analysis.

### Univariate Data Analysis
- Visualize the distribution of electricity consumption.
- Analyze the distribution of EV types, charging habits, and customer types.


### Bivariate Data Analysis
- Use geospatial data to visualize the locations of substations and EV charging stations.
- Analyze the capacity of transmission lines.

### Univariate Analysis

In [None]:
# Set the style and color palette (for uniformity across the subplot grid)
sns.set(style='whitegrid')
sns.set_palette('pastel')

# Create a 2 x 2 subplot grid
fig, axes = plt.subplots(2, 2, figsize=(10, 6))

# Plot distribution of electricity consumption
sns.histplot(data=Dd_ev, x='Electricity_Consumption (kWh)', bins=30, kde=True, ax=axes[0, 0])
axes[0, 0].set_title('Distribution of Electricity Consumption')
axes[0, 0].set_xlabel('Electricity Consumption (kWh)')
axes[0, 0].set_ylabel('Frequency')

# Plot distribution of EV types
sns.countplot(data=Dd_ev, y='EV_Type', ax=axes[0, 1])
axes[0, 1].set_title('Distribution of EV Types')
axes[0, 1].set_xlabel('Count')
axes[0, 1].set_ylabel('EV Type')

# Plot distribution of charging habits
sns.countplot(data=Dd_ev, y='Charging_Habit', ax=axes[1, 0])
axes[1, 0].set_title('Distribution of Charging Habits')
axes[1, 0].set_xlabel('Count')
axes[1, 0].set_ylabel('Charging Habit')

# Plot distribution of customer types
sns.countplot(data=Dd_ev, y='Customer_Type', ax=axes[1, 1])
axes[1, 1].set_title('Distribution of Customer Type')
axes[1, 1].set_xlabel('Count')
axes[1, 1].set_ylabel('Customer Type')

# Adjust layout
plt.tight_layout()

# Show plots
plt.show()

### Insights
The subplots above show the following:
- Electricity consumption is centered around 500 kWh, with instances of higher and lower consumption.
- The most common EV type is the electric scooter with a count of 200, followed by electric cars with 150 and electric bikes with about 120.
- Most customers charge daily, followed by occasionally and weekly.
- Most customers are commercial, followed by residential, with industrial being the least common.

### Bivariate Analysis

In [None]:
#Extract Lat and Long For EV charging stations
Dd_ev['ev_latitude'] = Dd_ev['EV_Charging_Station_Location'].apply(
    lambda x: float(x.split(",")[0].replace("(", "").strip()))
Dd_ev['ev_longitude'] = Dd_ev['EV_Charging_Station_Location'].apply(
    lambda x: float(x.split(",")[1].replace(")", "").strip()))

# Extract latitude and longitude for substation locations
Gd_ev['Substation_latitude'] = Gd_ev['Substation_Location'].apply(
    lambda x: float(x.split(",")[0].replace("(", "").strip()))
Gd_ev['Substation_longitude'] = Gd_ev['Substation_Location'].apply(
    lambda x: float(x.split(",")[1].replace(")", "").strip()))

#Drop the original location columns to clean the dataframes
Dd_ev = Dd_ev.drop(columns = ['EV_Charging_Station_Location'])
Gd_ev = Gd_ev.drop(columns = ['Substation_Location'])
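
The two pairs of lambdas above repeat the same string handling. As a minimal sketch, assuming every location is stored as a `"(lat, lon)"` string (which is what the lambdas imply), the parsing could live in one helper. The name `parse_location` and the sample string are hypothetical and only for illustration.

```python
def parse_location(text):
    """Split a '(lat, lon)' string into a (lat, lon) tuple of floats."""
    # Drop surrounding whitespace and parentheses, then split on the comma
    lat_text, lon_text = text.strip().strip("()").split(",")
    # float() tolerates leading/trailing spaces around each number
    return float(lat_text), float(lon_text)


# Made-up location string, for illustration only
print(parse_location("(45.12, -93.5)"))
```

With pandas, such a helper could be applied once per column, for example `Dd_ev['EV_Charging_Station_Location'].apply(parse_location)`, and the resulting tuples unpacked into the latitude and longitude columns.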

In [None]:
Dd_ev.head()

In [None]:
Gd_ev.head()

In [None]:
# Convert dataframes to Geodataframe
ev_gdf = gdp.GeoDataFrame(Dd_ev,
                          geometry=gdp.points_from_xy(Dd_ev.ev_longitude, Dd_ev.ev_latitude))

substation_gdf = gdp.GeoDataFrame(Gd_ev,
                          geometry=gdp.points_from_xy(Gd_ev.Substation_longitude, Gd_ev.Substation_latitude))

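# Load the world map data
# Note: geopandas.datasets was deprecated in GeoPandas 0.14 and removed in 1.0; on newer versions,
# read a locally downloaded Natural Earth shapefile with gdp.read_file(...) instead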
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))

#Filter the map to North America
north_america = world[world['continent'] == 'North America']

#Plot the map for north America
fig, ax = plt.subplots(figsize = (10, 5))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Plot the substations on the map
substation_gdf.plot(ax=ax, marker= 's', markersize = 100, color = 'blue', label = 'substations')

#Plot ev charging stations on the map
ev_gdf.plot(ax=ax, markersize = 10, color = 'red', label = 'ev charging station', alpha = 0.5)

# Set title and axis labels

plt.title('Locations of Substations and Their Associated EV Charging Stations in North America')
plt.xlabel('Longitude')
plt.ylabel('Latitude')

plt.legend()

plt.tight_layout()
plt.show()

### Observations:
- From the map above, the stations are widely distributed throughout North America.

In [None]:
# Import LineString to draw the connections between EV charging stations and substations
from shapely.geometry import LineString

In [None]:
# Incorporate shapely lines
# Zoom in to see the connections between a substation and its EV charging stations
# Filter for the first substation
selected_substation = Gd_ev.iloc[0]
associated_ev = Dd_ev[Dd_ev['Substation_ID'] == selected_substation['Substation_ID']]

# Convert to GeoDataFrame
ev_gdf_selected = gdp.GeoDataFrame(
    associated_ev,
    geometry=gdp.points_from_xy(associated_ev.ev_longitude, associated_ev.ev_latitude)
)

substation_gdf_selected = gdp.GeoDataFrame(
    selected_substation.to_frame().transpose(),
    geometry=gdp.points_from_xy(
    [selected_substation['Substation_longitude']],
    [selected_substation['Substation_latitude']]
    )
)


lines_selected = [
    (ev_row['ev_longitude'], ev_row['ev_latitude'],
     selected_substation['Substation_longitude'],
     selected_substation['Substation_latitude'])
    for _, ev_row in associated_ev.iterrows()
]


line_gdf_selected = gdp.GeoDataFrame(
    geometry=[LineString([(line[0], line[1]), (line[2], line[3])]) for line in lines_selected]
)


#Load the World Map Data
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))

#Filter the map to North America
north_america = world[world['continent'] == 'North America']

# Determine the bounding box for the zoomed area
buffer = 10 # degree of buffer
minx, miny, maxx, maxy = line_gdf_selected.total_bounds
xlim = [minx - buffer, maxx + buffer]
ylim = [miny - buffer, maxy + buffer]


#Plot the map for North America
fig, ax = plt.subplots(figsize = (8, 5))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Plot the substations on the map
substation_gdf_selected.plot(ax=ax, marker= 's', markersize = 100, color = 'blue', label = 'Selected Substations')

#Plot ev charging stations on the map
ev_gdf_selected.plot(ax=ax, markersize = 50, color = 'red', label = 'Associated EV Charging Stations', alpha = 0.7)

#Plotting Lines Connecting ev_charging Station to Substation
line_gdf_selected.plot(ax=ax, linewidth = 0.5, color = 'grey', label = 'connections')

#Set to Zoomed limits
ax.set_xlim(xlim)
ax.set_ylim(ylim)

# Set title and axis labels
plt.title(f'Zoomed-in View: Connections between {selected_substation["Substation_ID"]} and Associated EV Charging Stations')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend()
plt.tight_layout()
plt.show()
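
The zoomed-in cell is repeated for several substations with only the row index changing. The geometry part of it (one line per charging station, plus a padded bounding box) can be sketched in plain Python. The function names `connection_segments` and `padded_bounds` and all coordinates are hypothetical, and the 10-degree default mirrors the `buffer` used in the cell above.

```python
def connection_segments(substation, stations):
    """Return one ((lon, lat), (lon, lat)) segment per station, station end first."""
    sub_lon, sub_lat = substation
    return [((lon, lat), (sub_lon, sub_lat)) for lon, lat in stations]


def padded_bounds(segments, buffer=10):
    """Bounding box (minx, miny, maxx, maxy) of all endpoints, padded by `buffer` degrees."""
    xs = [x for segment in segments for x, _ in segment]
    ys = [y for segment in segments for _, y in segment]
    return min(xs) - buffer, min(ys) - buffer, max(xs) + buffer, max(ys) + buffer


# Made-up (lon, lat) coordinates, for illustration only
substation = (-100.0, 40.0)
stations = [(-95.0, 35.0), (-110.0, 45.0)]

segments = connection_segments(substation, stations)
print(padded_bounds(segments))  # (-120.0, 25.0, -85.0, 55.0)
```

Each segment could then be wrapped as `LineString(list(segment))` before building the GeoDataFrame, so a single function call per substation would replace the copied cells.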

In [None]:
# Incorporate shapely lines
# Zoom in to see the connections between a substation and its EV charging stations
# Filter for the second substation
selected_substation = Gd_ev.iloc[1]
associated_ev = Dd_ev[Dd_ev['Substation_ID'] == selected_substation['Substation_ID']]

# Convert to GeoDataFrame
ev_gdf_selected = gdp.GeoDataFrame(
    associated_ev,
    geometry=gdp.points_from_xy(associated_ev.ev_longitude, associated_ev.ev_latitude)
)

substation_gdf_selected = gdp.GeoDataFrame(
    selected_substation.to_frame().transpose(),
    geometry=gdp.points_from_xy(
    [selected_substation['Substation_longitude']],
    [selected_substation['Substation_latitude']]
    )
)


lines_selected = [
    (ev_row['ev_longitude'], ev_row['ev_latitude'],
     selected_substation['Substation_longitude'],
     selected_substation['Substation_latitude'])
    for _, ev_row in associated_ev.iterrows()
]


line_gdf_selected = gdp.GeoDataFrame(
    geometry=[LineString([(line[0], line[1]), (line[2], line[3])]) for line in lines_selected]
)


#Load the World Map Data
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))

#Filter the map to North America
north_america = world[world['continent'] == 'North America']

# Determine the bounding box for the zoomed area
buffer = 10 # degree of buffer
minx, miny, maxx, maxy = line_gdf_selected.total_bounds
xlim = [minx - buffer, maxx + buffer]
ylim = [miny - buffer, maxy + buffer]


#Plot the map for North America
fig, ax = plt.subplots(figsize = (8, 5))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Plot the substations on the map
substation_gdf_selected.plot(ax=ax, marker= 's', markersize = 100, color = 'blue', label = 'Selected Substations')

#Plot ev charging stations on the map
ev_gdf_selected.plot(ax=ax, markersize = 50, color = 'red', label = 'Associated EV Charging Stations', alpha = 0.7)

#Plotting Lines Connecting ev_charging Station to Substation
line_gdf_selected.plot(ax=ax, linewidth = 0.5, color = 'grey', label = 'connections')

#Set to Zoomed limits
ax.set_xlim(xlim)
ax.set_ylim(ylim)

# Set title and axis labels
plt.title(f'Zoomed-in View: Connections between {selected_substation["Substation_ID"]} and Associated EV Charging Stations')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend()
plt.tight_layout()
plt.show()

### Observations:
From the connections between substations and their associated EV charging stations in North America,
we observe that each selected substation is connected to multiple EV charging stations.

In [None]:
# Incorporate shapely lines
# Zoom in to see the connections between a substation and its EV charging stations
# Filter for the third substation
selected_substation = Gd_ev.iloc[2]
associated_ev = Dd_ev[Dd_ev['Substation_ID'] == selected_substation['Substation_ID']]

# Convert to GeoDataFrame
ev_gdf_selected = gdp.GeoDataFrame(
    associated_ev,
    geometry=gdp.points_from_xy(associated_ev.ev_longitude, associated_ev.ev_latitude)
)

substation_gdf_selected = gdp.GeoDataFrame(
    selected_substation.to_frame().transpose(),
    geometry=gdp.points_from_xy(
    [selected_substation['Substation_longitude']],
    [selected_substation['Substation_latitude']]
    )
)


lines_selected = [
    (ev_row['ev_longitude'], ev_row['ev_latitude'],
     selected_substation['Substation_longitude'],
     selected_substation['Substation_latitude'])
    for _, ev_row in associated_ev.iterrows()
]


line_gdf_selected = gdp.GeoDataFrame(
    geometry=[LineString([(line[0], line[1]), (line[2], line[3])]) for line in lines_selected]
)


#Load the World Map Data
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))

#Filter the map to North America
north_america = world[world['continent'] == 'North America']

# Determine the bounding box for the zoomed area
buffer = 10 # degree of buffer
minx, miny, maxx, maxy = line_gdf_selected.total_bounds
xlim = [minx - buffer, maxx + buffer]
ylim = [miny - buffer, maxy + buffer]


#Plot the map for North America
fig, ax = plt.subplots(figsize = (8, 5))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Plot the substations on the map
substation_gdf_selected.plot(ax=ax, marker= 's', markersize = 100, color = 'blue', label = 'Selected Substations')

#Plot ev charging stations on the map
ev_gdf_selected.plot(ax=ax, markersize = 50, color = 'red', label = 'Associated EV Charging Stations', alpha = 0.7)

#Plotting Lines Connecting ev_charging Station to Substation
line_gdf_selected.plot(ax=ax, linewidth = 0.5, color = 'grey', label = 'connections')

#Set to Zoomed limits
ax.set_xlim(xlim)
ax.set_ylim(ylim)

# Set title and axis labels
plt.title(f'Zoomed-in View: Connections between {selected_substation["Substation_ID"]} and Associated EV Charging Stations')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend()
plt.tight_layout()
plt.show()

In [None]:
# Incorporate shapely lines
# Zoom in to see the connections between a substation and its EV charging stations
# Filter for the substation at index 49 (the 50th row)
selected_substation = Gd_ev.iloc[49]
associated_ev = Dd_ev[Dd_ev['Substation_ID'] == selected_substation['Substation_ID']]

# Convert to GeoDataFrame
ev_gdf_selected = gdp.GeoDataFrame(
    associated_ev,
    geometry=gdp.points_from_xy(associated_ev.ev_longitude, associated_ev.ev_latitude)
)

substation_gdf_selected = gdp.GeoDataFrame(
    selected_substation.to_frame().transpose(),
    geometry=gdp.points_from_xy(
    [selected_substation['Substation_longitude']],
    [selected_substation['Substation_latitude']]
    )
)


lines_selected = [
    (ev_row['ev_longitude'], ev_row['ev_latitude'],
     selected_substation['Substation_longitude'],
     selected_substation['Substation_latitude'])
    for _, ev_row in associated_ev.iterrows()
]


line_gdf_selected = gdp.GeoDataFrame(
    geometry=[LineString([(line[0], line[1]), (line[2], line[3])]) for line in lines_selected]
)


#Load the World Map Data
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))

#Filter the map to North America
north_america = world[world['continent'] == 'North America']

# Determine the bounding box for the zoomed area
buffer = 10 # degree of buffer
minx, miny, maxx, maxy = line_gdf_selected.total_bounds
xlim = [minx - buffer, maxx + buffer]
ylim = [miny - buffer, maxy + buffer]


#Plot the map for North America
fig, ax = plt.subplots(figsize = (8, 5))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Plot the substations on the map
substation_gdf_selected.plot(ax=ax, marker= 's', markersize = 100, color = 'blue', label = 'Selected Substations')

#Plot ev charging stations on the map
ev_gdf_selected.plot(ax=ax, markersize = 50, color = 'red', label = 'Associated EV Charging Stations', alpha = 0.7)

#Plotting Lines Connecting ev_charging Station to Substation
line_gdf_selected.plot(ax=ax, linewidth = 0.5, color = 'grey', label = 'connections')

#Set to Zoomed limits
ax.set_xlim(xlim)
ax.set_ylim(ylim)

# Set title and axis labels
plt.title(f'Zoomed-in View: Connections between {selected_substation["Substation_ID"]} and Associated EV Charging Stations')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend()
plt.tight_layout()
plt.show()

### Observations:
From the drill-down views of the connections between substations and their associated EV charging stations in North America:
- The substations appear to be far from their associated EV charging stations.
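
The maps only suggest visually that the stations are far from their substations. A rough way to put a number on it is the great-circle (haversine) distance between two latitude/longitude pairs. The sketch below is not part of the original analysis; it assumes a spherical Earth with a mean radius of 6371 km, and the function name `haversine_km` is hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# One degree of longitude along the equator is roughly 111 km
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))
```

Applied to each charging station and its substation (`ev_latitude`, `ev_longitude`, `Substation_latitude`, `Substation_longitude`), this would give a per-connection distance that could be summarized instead of judged by eye.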

In [None]:
Dd_ev.head(1)

In [None]:
## Map of Locations Grouped by Locations and EV type, and Count of Stations
grouped_d = Dd_ev.groupby(['ev_latitude', 'ev_longitude', 'EV_Type']).size().reset_index(name='count')

#convert grouped data to GeoDataframe
grouped_df = gdp.GeoDataFrame(grouped_d,
                              geometry=gdp.points_from_xy(grouped_d.ev_longitude, grouped_d.ev_latitude))

#Load the World map data and filter for North America
world = gdp.read_file(gdp.datasets.get_path('naturalearth_lowres'))
north_america = world[world['continent'] == 'North America']

#Plot with Zoom for North America
fig, ax = plt.subplots(figsize = (12, 6))
north_america.boundary.plot(ax=ax, linewidth = 0.5, color = 'black')
north_america.plot(ax=ax, color= 'lightblue', edgecolor = 'black')

#Define Colours of EV Type:
colors = {'Electric Car': 'red', 'Electric Scooter': 'blue', 'Electric Bike': 'green'}

# Plot EV Type
for ev_type, color in colors.items():
    sub_df = grouped_df[grouped_df['EV_Type'] == ev_type]  
    sub_df.plot(ax=ax, markersize=sub_df['count']*20, color=color, label=ev_type, alpha=0.7)

# Set title and axis labels
plt.title('Distributions of Charging Stations by Type and Frequency in North America')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.legend()
plt.tight_layout()
plt.show()

### Observation:
From the map above:
- EV charging stations of every type are widely spread out across North America.

### Network Capacity Assessment:
To carry out the network capacity assessment of the substations (using the geospatial data and the distribution data):
- Calculate the total electricity consumption for each substation.
- Compare the total electricity consumption with the transmission line capacity.

In [None]:
#Group the EV Distribution data by the Substation_ID and Calculate the total electricity consumption for each substation
total_consumption_per_substation = Dd_ev.groupby('Substation_ID')['Electricity_Consumption (kWh)'].sum().reset_index()

#Merge the total consumption data with geospatial data.
network_capacity_data = pd.merge(Gd_ev, total_consumption_per_substation, on = 'Substation_ID')

#Rename Column for better understanding
network_capacity_data.rename(columns= {'Electricity_Consumption (kWh)': 'Total_Consumption (kWh)'}, inplace = True)

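# Calculate the ratio of total consumption to transmission line capacity (a metric for whether a substation is overloaded or underloaded)
# Unit note: 1 MW = 1000 kW, so a line running at capacity for one hour delivers capacity_MW * 1000 kWh.
# The ratio below therefore implicitly treats the summed consumption as energy used over a one-hour window.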

network_capacity_data['Consumption_to_Capacity_Ratio'] = network_capacity_data['Total_Consumption (kWh)']/(network_capacity_data['Transmission_Line_Capacity (MW)']* 1000)


In [None]:
network_capacity_data.head()
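
Because consumption is an energy (kWh) and line capacity is a power (MW), the ratio only makes sense once a time window is fixed. The sketch below makes that window explicit. The `hours` parameter is an assumption added for illustration (the notebook's formula corresponds to `hours=1`), and the numbers are made up.

```python
def consumption_to_capacity_ratio(total_kwh, capacity_mw, hours=1.0):
    """Energy consumed divided by the energy the line could deliver in `hours`."""
    # MW -> kW (x 1000), then kW x hours = kWh deliverable in the window
    deliverable_kwh = capacity_mw * 1000 * hours
    return total_kwh / deliverable_kwh


# Made-up values: 4500 kWh consumed on a 10 MW line
print(consumption_to_capacity_ratio(4500, 10))            # 0.45 over a one-hour window
print(consumption_to_capacity_ratio(4500, 10, hours=24))  # 0.01875 over a full day
```

If the timestamps in the distribution data span more than one hour per substation, the one-hour assumption would overstate the loading, so the window is worth checking before reading the ratios as utilization.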

In [None]:
# Import Point from shapely to build the geometry column
from shapely.geometry import Point

In [None]:
# Create GeoDataFrame for the network Capacity for the dataframe
from shapely.geometry import Point

# Assuming network_capacity_data contains columns 'Substation_longitude' and 'Substation_latitude'
geometry_network_capacity = [Point(lon, lat) for lon, lat in zip(network_capacity_data['Substation_longitude'], network_capacity_data['Substation_latitude'])]


gdf_network_capacity = gdp.GeoDataFrame(network_capacity_data, geometry=geometry_network_capacity)

# Plot in choropleth style
fig, ax = plt.subplots(figsize = (8, 12))
north_america.plot(ax=ax, color = 'lightgrey', edgecolor = 'black')
gdf_network_capacity.plot(column= 'Consumption_to_Capacity_Ratio', cmap= 'coolwarm', legend = True,
                         marker='s', markersize=100, ax = ax, legend_kwds={'label': "Consumption to Capacity Ratio", 'orientation': "horizontal"})
ax.set_title("Consumption to Capacity Ratio of Substations")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
plt.tight_layout()
plt.show()

## Observation:
- The plot shows substations with a higher consumption-to-capacity ratio (potential overload) in red and those with a lower ratio in blue, where network capacity appears sufficient.
- Further investigation is needed for the areas in red to quantify whether they are truly overloaded or only at risk of overload.
- Next, check the correlation between the number of EVs and the consumption-to-capacity ratio.

In [None]:
Gd_ev.head(1)

In [None]:
# Investigate
# Groupby Substation_ID for number of EVs
ev_counts = Dd_ev.groupby('Substation_ID')['Number_of_EVs'].sum().reset_index()

# Merge network_capacity_data with ev_counts
final_data = pd.merge(ev_counts, network_capacity_data, on= "Substation_ID")

#CORRELATION
correlation_ratio = final_data['Number_of_EVs'].corr(final_data['Consumption_to_Capacity_Ratio'])

correlation_ratio

In [None]:
# Scatter plot with regression line
plt.figure(figsize = (10, 5))
sns.regplot(x='Number_of_EVs', y = 'Consumption_to_Capacity_Ratio', data = final_data, scatter_kws={'s': 50}, line_kws={'color':'red'}, ci=None)
plt.xlabel('Number of EVs')
plt.ylabel('Consumption to Capacity Ratio')
plt.grid(True)
plt.tight_layout()
plt.show()

## Observation:
The plot shows no clear linear trend, indicating a weak correlation.
- Hence, the number of EVs attached to a substation does not necessarily lead to overload.
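
For reference, `Series.corr` uses the Pearson coefficient by default, which measures only linear association. The sketch below computes the same quantity by hand on made-up numbers. The function name `pearson_r` is hypothetical, and the sample lists are not taken from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: a perfectly linear pair and a pair with no linear trend
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(pearson_r([1, 2, 3, 4], [3, 1, 1, 3]))  # 0.0
```

The second pair has a clear pattern but a coefficient of zero, which is why a weak Pearson value is best read as "no linear relationship" rather than "no relationship".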

## Identifying Bottlenecks:
- By analyzing the map, we can identify the substations and areas that are potential bottlenecks in the distribution network. These are the areas where the Consumption_to_Capacity_Ratio is high.
- Filtering for substations with a Consumption_to_Capacity_Ratio close to or greater than 1 shows where immediate action and investment are necessary to prevent overload and ensure reliable delivery of electricity.


In [None]:
# Filter for a Consumption_to_Capacity_Ratio close to or greater than 1 (0.9 is used as the cutoff)
bottleneck_substation = network_capacity_data[network_capacity_data['Consumption_to_Capacity_Ratio'] >= 0.9]
bottleneck_substation
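
A single cutoff only separates bottlenecks from everything else. As a hedged sketch, the same idea can be extended to a small labeling function. The 0.9 cutoff comes from the filter above, while the 0.7 "watch" level, the function name `load_status`, and the substation IDs are assumptions chosen purely for illustration.

```python
def load_status(ratio, watch=0.7, bottleneck=0.9):
    """Label a consumption-to-capacity ratio; both cutoffs are adjustable."""
    if ratio >= bottleneck:
        return "bottleneck"
    if ratio >= watch:
        return "watch"
    return "ok"


# Made-up substation IDs and ratios, for illustration only
example_ratios = {"S1": 0.35, "S2": 0.74, "S3": 0.93}
for substation_id, ratio in example_ratios.items():
    print(substation_id, load_status(ratio))
```

In the notebook, something like `network_capacity_data['Consumption_to_Capacity_Ratio'].apply(load_status)` would add such a label as a column.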

## Observation
- The filtered data frame contains no substation with a Consumption_to_Capacity_Ratio close to or greater than 1, hence there are no immediate critical bottlenecks.
- The visual of the Consumption to Capacity Ratio of Substations is shown again below:

In [None]:
# Assuming network_capacity_data contains columns 'Substation_longitude' and 'Substation_latitude'
geometry_network_capacity = [Point(lon, lat) for lon, lat in zip(network_capacity_data['Substation_longitude'], network_capacity_data['Substation_latitude'])]


gdf_network_capacity = gdp.GeoDataFrame(network_capacity_data, geometry=geometry_network_capacity)

# Plot in choropleth style
fig, ax = plt.subplots(figsize = (8, 12))
north_america.plot(ax=ax, color = 'lightgrey', edgecolor = 'black')
gdf_network_capacity.plot(column= 'Consumption_to_Capacity_Ratio', cmap= 'coolwarm', legend = True,
                         marker='s', markersize=100, ax = ax, legend_kwds={'label': "Consumption to Capacity Ratio", 'orientation': "horizontal"})
ax.set_title("Consumption to Capacity Ratio of Substations")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
plt.tight_layout()
plt.show()

### The map shows a need for a closer look at the substations colored red to avoid future bottlenecks

## Optimizing Network Upgrades
To optimize network upgrades, focus on substations with the potential for a high Consumption_to_Capacity_Ratio. Upgrading the transmission lines or adding capacity in these areas can help manage potential load increases effectively and ensure reliability.

Notes from the analysis:
- Geographic distribution of EV charging stations: the charging stations are quite far from their substations.

Consideration should also be given to:
- Potential future growth in EV adoption in different areas.
- Costs associated with different upgrade options.

In [None]:
top_5_substations = network_capacity_data.nlargest(5, 'Consumption_to_Capacity_Ratio')
top_5_substations
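
To connect the ranking above with future EV growth, one rough approach is to ask how long a substation's ratio would take to reach a cutoff under steady load growth. The sketch below assumes compound annual growth and unchanged line capacity. The function name `years_until_threshold`, the 10% growth rate, and the 0.45 ratio are made-up illustration values, not results from the data.

```python
import math


def years_until_threshold(ratio, annual_growth, threshold=0.9):
    """Years of compound load growth before `ratio` reaches `threshold`."""
    if ratio >= threshold:
        return 0.0  # already at or above the cutoff
    if annual_growth <= 0:
        return math.inf  # load is flat or shrinking, so the cutoff is never reached
    # Solve ratio * (1 + g) ** t = threshold for t
    return math.log(threshold / ratio) / math.log(1 + annual_growth)


# Made-up example: ratio 0.45 today, load growing 10% per year
print(round(years_until_threshold(0.45, 0.10), 1))  # about 7.3 years
```

Running this over the top-ranked substations with a few growth scenarios would give a simple ordering of which upgrades are most time-sensitive.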

## Correlation with Weather Data Analysis
- Analyzing the correlation between weather data and electricity consumption can provide insights into the weather conditions affecting the distribution network.

In [None]:
Wd_ev.head(1)

In [None]:
Dd_ev.head(1)

In [None]:
# Merge weather Data with Distribution Data
merged_data = pd.merge(Dd_ev, Wd_ev, on= ['Timestamp', 'Substation_ID'])

# Calculate the correlation between weather conditions and electricity consumption
correlation_matrix = merged_data[['Electricity_Consumption (kWh)', 'Temperature (°C)', 'Precipitation (mm)']].corr()

# Display the correlation matrix
correlation_matrix

In [None]:
# Display the plot
# Electricity Consumption Vs temp
plt.figure(figsize=(8, 5))
sns.scatterplot(data=merged_data, x= 'Temperature (°C)', y='Electricity_Consumption (kWh)', alpha = 0.6)
plt.title('Electricity Consumption vs Temperature')
plt.xlabel('Temperature (°C)')
plt.ylabel('Electricity Consumption (kWh)')
plt.show()



# Electricity Consumption Vs Precipitation
plt.figure(figsize=(8, 5))
sns.scatterplot(data=merged_data, x= 'Precipitation (mm)', y='Electricity_Consumption (kWh)', alpha = 0.6)
plt.title('Electricity Consumption vs Precipitation')
plt.xlabel('Precipitation (mm)')
plt.ylabel('Electricity Consumption (kWh)')
plt.show()

### Observations:
There does not seem to be any clear correlation in either plot:
- For electricity consumption vs temperature, there is no clear relationship, as the data points are widely scattered.
- The same applies to electricity consumption vs precipitation, where the data points are also scattered.

In the current dataset, temperature and precipitation show no meaningful linear correlation with electricity consumption, which suggests that these factors are not major drivers of consumption in the distribution network.
It is still important to consider weather data such as temperature and precipitation in network analysis, as extreme weather conditions can affect the distribution network and its components, potentially leading to outages and other issues.
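
A correlation coefficient near zero rules out a linear trend but not, for example, consumption that rises at both cold and hot extremes. One quick check is to average consumption within temperature bins and look at the shape. The sketch below uses made-up values, and the function name `mean_by_bin` and the 10-degree bin width are assumptions for illustration.

```python
from collections import defaultdict


def mean_by_bin(xs, ys, bin_width):
    """Average y within fixed-width bins of x; keys are the bins' lower edges."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        # Floor division maps each x to the lower edge of its bin
        groups[(x // bin_width) * bin_width].append(y)
    return {edge: sum(values) / len(values) for edge, values in sorted(groups.items())}


# Made-up temperatures (°C) and consumption (kWh), for illustration only
temperatures = [-5, 2, 12, 18, 27, 33]
consumption = [620, 600, 480, 500, 610, 640]

print(mean_by_bin(temperatures, consumption, 10))
# {-10: 620.0, 0: 600.0, 10: 490.0, 20: 610.0, 30: 640.0}
```

If the binned means in the real data were as flat as the scatter plots suggest, that would strengthen the conclusion that weather has little effect on consumption here.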

## INSIGHTS:

a. Electricity Consumption: Electricity consumption is mostly centered around 500 kWh, with instances of higher and lower consumption. This indicates varied demand at different times and locations.

b. EV Types and Charging Habits: Electric scooters are the most common type of EV. Most customers charge their EVs daily, indicating a consistent daily load on the distribution network.

c. Customer Type: Commercial customers make up the largest customer group.

d. Geospatial Distribution: The spatial distribution of substations and EV charging stations is widespread.

e. Geospatial Distribution: EV charging stations appear to be far from their corresponding substations.

f. Network Capacity: Some substations have a comparatively high Consumption_to_Capacity_Ratio, indicating potential bottlenecks and overloads in the network. The correlation between the number of EVs per substation and the Consumption_to_Capacity_Ratio is weak, which suggests that the number of EVs alone is not a driver of overload.

g. Weather Correlation: The correlation between weather conditions (temperature and precipitation) and electricity consumption is weak in the current dataset, suggesting that other factors might be more influential in affecting electricity consumption.

## The Optimization Strategy/Recommendation
Based on the analysis of the business problem at hand, the following should be incorporated into the business:
1. Potential Substation Upgrades: Prioritize upgrades at substations where the Consumption_to_Capacity_Ratio is high, indicating potential overloads. Upgrade the transmission lines, because the EV charging stations are far from their corresponding substations.

2. Geospatial Analysis for Upgrade Planning: Use geospatial analysis to determine the optimal locations for new substations or upgrades to existing ones. Consider factors like proximity to high-demand areas (areas with a high consumption-to-capacity ratio) and geographical constraints.

3. Demand-Side Management: Implement demand-side management strategies to manage the load on the grid. Encourage customers to charge their EVs during off-peak hours through incentives or dynamic pricing.

4. Advanced Monitoring and Analytics: Deploy advanced monitoring systems to continuously monitor the health and performance of the distribution network. Use analytics to predict increases in load and take preventive action.

5. Cost-Benefit Analysis: Conduct a comprehensive cost-benefit analysis to compare upgrade options. Consider factors like the cost of upgrades, operational costs, potential revenue from increased capacity, and the impact on service reliability and customer satisfaction.

6. Customer Engagement: Engage with customers to understand their needs and expectations. Provide clear communication about network upgrades and how they will enhance service reliability and meet the growing demand for EV charging.

7. Continuous Improvement: Continuously monitor and assess the performance of the distribution network. Gather feedback from customers and other stakeholders, and use this feedback to make further improvements and optimizations.

By following these steps, Power Charge Utilities can develop an effective optimization strategy to manage the increased load demand from EV charging stations, ensure the reliability and resilience of the distribution network, and meet the expectations of customers, while also optimizing costs and ensuring regulatory compliance.