# 'Living' Conclusion Gathering Space

## Conclusions Data Merge

- There seems to be no pattern in the null values in the target column. This indicates we can drop these rows.


_County_
- County 12 looks odd: it is named 'unknown' (no known location), is not included in the weather data, and therefore introduces NA values
- We could drop county 12, but this may lead to other problems
- The forums say counties 16 and 0 may also be unusual

_data_block_id_
- We could reduce NaN and NaT values by excluding data_block_id 0 and 1 (the first blocks in the data set)
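
A minimal sketch of that exclusion, using a small toy frame rather than the real merged data (the frame `toy` and its values are made up for illustration):

```python
import pandas as pd

# Toy stand-in for the merged frame; the values are invented for illustration.
toy = pd.DataFrame({
    "data_block_id": [0, 1, 2, 3],
    "target": [1.0, 2.0, 3.0, 4.0],
})

# Drop the first two blocks and keep everything from data_block_id 2 onwards.
filtered = toy[~toy["data_block_id"].isin([0, 1])].reset_index(drop=True)
print(filtered)
```

Whether dropping these blocks is acceptable depends on how much history the model needs, so this is only one possible option.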

_Modelling / Time Series_
- We are unsure about the modelling approach (is a time series model needed, maybe ARMA?). The forums suggest, for example, XGBoost.


## Conclusions EDA
- Consumption is noticeably affected by the winter holidays
- Consumption seems to be growing over time
- County 0 dominates; Tallinn is located there
- The temperature-to-production ratio seems to have changed in the last year
- product_type 2 attracts producers with small installed capacity and thus low production
- product_type 3 attracts the opposite cluster: producers with a lot of installed_capacity
- Surface solar radiation seems to correlate more strongly with target than direct_solar
- There seems to be a 'split' around 6000 (unit?) in the daily mean target
- We expected more businesses among the top producers
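
The solar radiation comparison above could be checked roughly as follows. This is only a sketch on invented numbers: the column names `surface_solar` and `direct_solar` are placeholders, not the actual column names in the merged data.

```python
import pandas as pd

# Invented toy data; column names are placeholders for the real weather columns.
toy = pd.DataFrame({
    "target": [0.0, 1.0, 2.0, 3.0, 4.0],
    "surface_solar": [0.0, 1.1, 1.9, 3.2, 3.9],
    "direct_solar": [0.0, 2.0, 1.0, 4.0, 2.5],
})

# Pearson correlation of each candidate feature with the target, strongest first.
corr_with_target = (
    toy.corr()["target"]
    .drop("target")
    .sort_values(ascending=False)
)
print(corr_with_target)
```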

# Preparing Data

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go

enefit_green = '#3f641a'


In [None]:
client = pd.read_csv('../data/client.csv')
client.head()

## Client Data

In [None]:
# Datatype conversion to datetime
client.date = pd.to_datetime(client.date)
client.info()

In [None]:
# Checking timeframe of the dataset
display(min(client.date))
display(max(client.date))
display(client.data_block_id.nunique())

First day is '2021-09-01 00:00:00', last day is '2023-05-29 00:00:00'. There are 636 unique days, and data_block_id corresponds to date. 
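
One way to verify that data_block_id and date map one-to-one is to count distinct values in both directions. The sketch below uses a toy frame with the same two column names; the values are invented.

```python
import pandas as pd

# Toy stand-in for the client table (values invented for illustration).
toy_client = pd.DataFrame({
    "date": pd.to_datetime(["2021-09-01", "2021-09-01", "2021-09-02", "2021-09-03"]),
    "data_block_id": [2, 2, 3, 4],
})

# Each block should cover exactly one date, and each date exactly one block.
dates_per_block = toy_client.groupby("data_block_id")["date"].nunique()
blocks_per_date = toy_client.groupby("date")["data_block_id"].nunique()
is_one_to_one = bool((dates_per_block == 1).all() and (blocks_per_date == 1).all())
print(is_one_to_one)
```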

## Electricity Prices Data

In [None]:
electricity_prices = pd.read_csv('../data/electricity_prices.csv')
electricity_prices.head()

In [None]:
electricity_prices.info()

In [None]:
# Datatype conversion to datetime
electricity_prices.forecast_date = pd.to_datetime(electricity_prices.forecast_date)
electricity_prices.origin_date = pd.to_datetime(electricity_prices.origin_date)


In [None]:
# Count unique values in each column
electricity_prices.nunique()

In [None]:
# List unique forecast dates
electricity_prices.forecast_date.unique()

Electricity price forecasts are available for each hour of the day over 637 days, one day more than the client data. We would expect 637 days * 24 hours = 15288 unique datetimes, but there are only 15286, so 2 hours are missing somewhere.
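
To locate such gaps, the observed timestamps can be compared against a complete hourly range. The sketch below runs on a few invented timestamps; with the real data, `observed` would be the forecast_date column instead.

```python
import pandas as pd

# Invented timestamps with one hour (02:00) deliberately left out.
observed = pd.to_datetime([
    "2021-09-01 00:00",
    "2021-09-01 01:00",
    "2021-09-01 03:00",
    "2021-09-01 04:00",
])

# Full hourly grid between the first and last observed timestamp.
expected = pd.date_range(observed.min(), observed.max(), freq=pd.Timedelta(hours=1))

# Timestamps on the grid that never appear in the data.
missing = expected.difference(observed)
print(missing)
```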


## Forecast Weather Data

In [None]:
forecast_weather = pd.read_csv('../data/forecast_weather.csv')
forecast_weather.head()

In [None]:
forecast_weather.info()

In [None]:
# Datatype conversion to datetime
forecast_weather.origin_datetime = pd.to_datetime(forecast_weather.origin_datetime)
forecast_weather.forecast_datetime = pd.to_datetime(forecast_weather.forecast_datetime)

forecast_weather.info()

In [None]:
# Count unique values in each column
forecast_weather.nunique()

In [None]:
# Check unique values by latitude and longitude
forecast_weather.groupby(['latitude', 'longitude']).nunique()

There are 112 unique combinations of lat and long (unique weather stations). 
So for each forecast_date, there are 112 observations (one from each station). 
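
A direct way to count the distinct coordinate pairs is to drop duplicate (latitude, longitude) rows. The following sketch uses an invented toy frame, not the real forecast data.

```python
import pandas as pd

# Toy forecast rows (values invented); two rows share the same coordinates.
toy_forecast = pd.DataFrame({
    "latitude": [57.6, 57.6, 57.6, 57.9],
    "longitude": [21.7, 22.2, 21.7, 21.7],
    "temperature": [10.0, 11.0, 12.0, 13.0],
})

# Number of distinct (latitude, longitude) pairs.
n_locations = len(toy_forecast[["latitude", "longitude"]].drop_duplicates())
print(n_locations)
```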


## Gas Prices Data

In [None]:
gas_prices = pd.read_csv('../data/gas_prices.csv')

gas_prices.head()

In [None]:
gas_prices.info()

In [None]:
# Datatype conversion to datetime
gas_prices.forecast_date = pd.to_datetime(gas_prices.forecast_date)
gas_prices.origin_date = pd.to_datetime(gas_prices.origin_date)

gas_prices.info()

## Historical Weather Data

In [None]:
historical_weather = pd.read_csv('../data/historical_weather.csv')

historical_weather.head()

In [None]:
# Datatype conversion to datetime
historical_weather.datetime = pd.to_datetime(historical_weather.datetime)

historical_weather.info()

## Train Data & Checking for NULL values

In [None]:
train = pd.read_csv('../data/train.csv')

train.head()

In [None]:
# Datatype conversion to datetime
train.datetime = pd.to_datetime(train.datetime, format='%Y-%m-%d %H:%M:%S')
train.info()

In [None]:
train.head()

In [None]:
# Checking null values
train.isnull().sum()

In [None]:
# Checking null values within target
train.groupby('is_consumption').agg({'target': lambda x: x.isnull().sum()})

In [None]:
train[train.target.isnull()]

There seems to be no pattern in the null values in the target column. This indicates we can drop these rows.
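
If we decide to drop them, a minimal sketch could look like this. It runs on a toy frame with invented values, and `train_clean` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Toy stand-in for train (values invented), with one missing target.
toy_train = pd.DataFrame({
    "is_consumption": [0, 1, 0, 1],
    "target": [0.5, np.nan, 1.5, 2.0],
})

# Count missing targets first, then drop only the rows where target is null.
n_missing = int(toy_train["target"].isna().sum())
train_clean = toy_train.dropna(subset=["target"]).reset_index(drop=True)
print(n_missing, len(train_clean))
```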

## Data Merging

We merge all other tables into the train dataset.

In [None]:
train.head()

### Merge Client

In [None]:
client.head()

In [None]:
print(client.size, train.size)

In [None]:
# append '_client' to merged columns
client.columns = [f"{column}_client" if column not in ['data_block_id', 'county', 'is_business', 'product_type'] else column for column in client.columns]

In [None]:
# merge train and client

merged_df = pd.merge(train, client, on=['data_block_id', 'county', 'is_business', 'product_type'], how='left')

merged_df.head()


There are null values after the merge, especially at the beginning and end of the period.


In [None]:
# how many rows with missing eic_count per data_block_id?
merged_df[merged_df.eic_count_client.isnull()].data_block_id.value_counts()

In [None]:
# do dates in train and client overlap? 
print(set(client.date_client.dt.date) ^ set(train.datetime.dt.date))
print(set(train.data_block_id) ^ set(client.data_block_id))

In [None]:
def print_block(dbid):
    display("TRAIN", train[train['data_block_id'] == dbid])
#     display("FORC WEATHER", forecast_weather[forecast_weather['data_block_id'] == dbid])
    display("CLIENT", client[client['data_block_id'] == dbid])
#     display("HIST WEATHER", historical_weather[historical_weather['data_block_id'] == dbid])
#     display("E PRICES", electricity_prices[electricity_prices['data_block_id'] == dbid])
#     display("G PRICES", gas_prices[gas_prices['data_block_id'] == dbid])

print_block(10)

There is a problem with NULL values after merging. One source is the start and end dates, but we don't know what is happening in between or whether it is problematic.
Maybe some client data is sporadically missing?
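
One way to investigate this is the `indicator=True` option of `pd.merge`, which adds a `_merge` column marking rows that found no match. The sketch below uses invented toy frames and a reduced set of key columns.

```python
import pandas as pd

# Toy frames (values invented); client rows exist only for data_block_id 2.
toy_train = pd.DataFrame({
    "data_block_id": [1, 2, 2, 3],
    "county": [0, 0, 1, 0],
    "target": [0.1, 0.2, 0.3, 0.4],
})
toy_client = pd.DataFrame({
    "data_block_id": [2, 2],
    "county": [0, 1],
    "eic_count": [10, 5],
})

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left merge.
checked = pd.merge(
    toy_train, toy_client,
    on=["data_block_id", "county"], how="left", indicator=True,
)

# Train rows without a client match, counted per data_block_id.
unmatched = checked[checked["_merge"] == "left_only"]
unmatched_per_block = unmatched["data_block_id"].value_counts().sort_index()
print(unmatched_per_block)
```

Blocks that show up here in the middle of the period, rather than only at the start or end, would point to sporadically missing client data.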

### Merge Gas Prices

In [None]:
gas_prices.head()

In [None]:
# append _gas_prices to columns
gas_prices.columns = [f"{column}_gas_prices" if column != 'data_block_id' else column for column in gas_prices.columns]

In [None]:
# merge gas_prices

merged_df = pd.merge(merged_df, gas_prices, on=['data_block_id'], how='left')

merged_df.tail()


### Merge Electricity Prices

In [None]:
# add time column for merging with electricity data
merged_df['time_of_day'] = merged_df['datetime'].dt.time
merged_df.head()

In [None]:
# Merge electricity prices
# the prices are available hourly -> create new column with time 

electricity_prices['time_of_day'] = electricity_prices.forecast_date.dt.time

# append electricity_prices to column names
electricity_prices.columns = [f"{column}_electricity_prices" if column not in ['time_of_day','data_block_id'] else column for column in electricity_prices.columns]



In [None]:
# merge electricity_prices

merged_df = pd.merge(merged_df, electricity_prices, on = ['data_block_id', 'time_of_day'], how='left')

merged_df.head()

### Merge Historical Weather

In [None]:
# historic weather

historical_weather.head()

In [None]:
# get county and county_name from weather_station_to_county_mapping (merge on latitude and longitude)
weather_station_to_county_mapping = pd.read_csv('../data/weather_station_to_county_mapping.csv')

# round lat and long to avoid mismatching due to different accuracy
historical_weather.latitude = historical_weather.latitude.round(1)
historical_weather.longitude = historical_weather.longitude.round(1)

weather_station_to_county_mapping.latitude = weather_station_to_county_mapping.latitude.round(1)
weather_station_to_county_mapping.longitude = weather_station_to_county_mapping.longitude.round(1)


In [None]:
# merge historical weather to get counties
merged_hist_weather = pd.merge(historical_weather, weather_station_to_county_mapping, on=['latitude', 'longitude'], how='left')
# get time of day
merged_hist_weather['time_of_day'] = merged_hist_weather['datetime'].dt.time

# aggregate by county and time (summarize weather stations for same county)
merged_hist_weather = merged_hist_weather.groupby(['county', 'time_of_day', 'datetime', 'data_block_id']).mean(numeric_only=True).reset_index()

# append _hist_weather to column names
merged_hist_weather.columns = [f"{column}_hist_weather" if column not in ['county', 'time_of_day','data_block_id'] else column for column in merged_hist_weather.columns]


merged_hist_weather.sample()


In [None]:
# merge to merged_df
merged_df = pd.merge(merged_df, merged_hist_weather, on=['data_block_id', 'time_of_day', 'county'], how='left')

### Merge Forecast Weather

In [None]:
forecast_weather.head()

In [None]:
# forecast weather

#round lat and long
forecast_weather.latitude = forecast_weather.latitude.round(1)
forecast_weather.longitude = forecast_weather.longitude.round(1)

# merge to get counties
merged_forecast_weather = pd.merge(forecast_weather, weather_station_to_county_mapping, on=['latitude', 'longitude'], how='left')
# merged_forecast_weather['time_of_day'] = merged_forecast_weather.

# aggregate weather stations that belong to the same county
merged_forecast_weather = merged_forecast_weather.groupby(['county', 'forecast_datetime', 'data_block_id']).mean(numeric_only=True).reset_index()

# append forecast_weather to column names
merged_forecast_weather.columns = [f"{column}_forecast_weather" if column not in ['county', 'forecast_datetime','data_block_id'] else column for column in merged_forecast_weather.columns]


merged_forecast_weather.describe()

In [None]:
# merge forecast_weather
merged_df = pd.merge(merged_df, merged_forecast_weather, left_on=['data_block_id', 'datetime', 'county'], right_on=['data_block_id', 'forecast_datetime', 'county'], how='left')

merged_df.describe()

In [None]:
merged_df.info()

In [None]:
merged_df.size

In [None]:
merged_df.columns

# EDA

## Correlation heatmap

In [None]:
numeric_columns = merged_df.select_dtypes(include=[np.number])

correlation_matrix = numeric_columns.corr()

sns.heatmap(correlation_matrix, annot=False, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()

In [None]:
# Create a 1x3 grid of subplots
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(15, 5))

# Plot the first graph
merged_df.groupby('county')['target'].mean().plot(kind='bar', ax=axes[0])
axes[0].set_title('Average Consumption per County')
axes[0].set_xlabel('County')
axes[0].set_ylabel('Average Consumption')

# Plot the second graph
merged_df.groupby('product_type')['target'].mean().plot(kind='bar', ax=axes[1])
axes[1].set_title('Average Consumption per Product type')
axes[1].set_xlabel('Product type')
axes[1].set_ylabel('Average Consumption')

# Plot the third graph
merged_df.groupby('is_business')['target'].mean().plot(kind='bar', ax=axes[2])
axes[2].set_title('Average Consumption per Business')
axes[2].set_xlabel('Business or not')
axes[2].set_ylabel('Average Consumption')

# Adjust layout to prevent clipping of titles
plt.tight_layout()

# Show the combined plot
plt.show()

In [None]:
px.line(merged_df, x='datetime', y='target', color='is_business', title='Energy Consumption Over Time')


In [None]:
avg_consumption_by_county = merged_df.groupby(['county', 'is_business'])['target'].mean().reset_index()
px.bar(avg_consumption_by_county, x='county', y='target', color='is_business', barmode='group', title='Average Energy Consumption by County and Business Type')


In [None]:
# Filter for consumption
consumption_df = merged_df[merged_df['is_consumption'] == True].copy()

# Group by datetime and calculate the total consumption
total_consumption = consumption_df.groupby('datetime')['target'].sum().reset_index()

# Create a line chart for total consumption
fig_consumption = px.line(total_consumption, x='datetime', y='target', title='Total Energy Consumption Over Time')

# Update the line color
fig_consumption.update_traces(line=dict(color='rgb(63, 100, 26)'))

# Show the chart
fig_consumption.show()


In [None]:
# Group by datetime and calculate the average consumption
average_consumption = consumption_df.groupby('datetime')['target'].mean().reset_index()

# Create a line chart for average consumption
fig_consumption = px.line(average_consumption, x='datetime', y='target', title='Average Energy Consumption Over Time')

# Update the line color
fig_consumption.update_traces(line=dict(color='rgb(63, 100, 26)'))

# Show the chart
fig_consumption.show()

In [None]:
# Ensure 'datetime' is in datetime format
merged_df['datetime'] = pd.to_datetime(merged_df['datetime'])

# Extract year and month for aggregation
merged_df['year_month'] = merged_df['datetime'].dt.to_period('M')

# Filter for consumption and calculate average consumption per month
consumption_df = merged_df[merged_df['is_consumption'] == True]
monthly_consumption = consumption_df.groupby('year_month')['target'].mean().reset_index()

# Filter for production and calculate average production per month
production_df = merged_df[merged_df['is_consumption'] == False]
monthly_production = production_df.groupby('year_month')['target'].mean().reset_index()

# Assuming you have a capacity column, calculate monthly capacity
monthly_capacity = merged_df.groupby('year_month')['installed_capacity_client'].mean().reset_index()

# Convert 'year_month' to datetime for plotting
monthly_consumption['year_month'] = monthly_consumption['year_month'].dt.to_timestamp()
monthly_production['year_month'] = monthly_production['year_month'].dt.to_timestamp()
monthly_capacity['year_month'] = monthly_capacity['year_month'].dt.to_timestamp()

# Create a figure with both consumption and production
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add consumption trace to the primary y-axis
fig.add_trace(go.Scatter(x=monthly_consumption['year_month'], y=monthly_consumption['target'], 
                         name='Consumption (mWh)', line=dict(color='rgb(63, 100, 26)', width=3)),
              secondary_y=False)

# Add production trace to the primary y-axis
fig.add_trace(go.Scatter(x=monthly_production['year_month'], y=monthly_production['target'], 
                         name='Production (mWh)', line=dict(color='rgb(143, 188, 143)', dash='dash', width=3)),
              secondary_y=False)

# Add capacity trace to the secondary y-axis
fig.add_trace(go.Scatter(x=monthly_capacity['year_month'], y=monthly_capacity['installed_capacity_client'], 
                         name='Capacity (kW)', line=dict(color='rgb(255, 0, 0)', width=3)),
              secondary_y=True)

# Update layout
fig.update_layout(
    title='Energy Consumption, Production, and Capacity (monthly average)',
    xaxis=dict(
        title='Month',
        showgrid=False  # Remove x-axis gridlines
    ),
    yaxis=dict(
        # Green title and ticks for the primary y-axis
        title=dict(text='Energy (consumption/production, mWh)', font=dict(color='rgb(0, 128, 0)')),
        tickfont=dict(color='rgb(0, 128, 0)'),
        showgrid=False  # Remove primary y-axis gridlines
    ),
    yaxis2=dict(
        # Red title and ticks for the secondary y-axis
        title=dict(text='Capacity (kW)', font=dict(color='rgb(255, 0, 0)')),
        tickfont=dict(color='rgb(255, 0, 0)'),
        showgrid=False  # Remove secondary y-axis gridlines
    ),
    plot_bgcolor='white'  # Set background color to white
)

# Show the figure
fig.show()

In [None]:
# Ensure 'datetime' is in datetime format
merged_df['datetime'] = pd.to_datetime(merged_df['datetime'])

# Extract different time units
merged_df['month'] = merged_df['datetime'].dt.month_name()
merged_df['day_of_week'] = merged_df['datetime'].dt.day_name()
# Ensure you have a 'time_of_day' column in merged_df
# merged_df['time_of_day'] = ...

# Define the correct order of months and days
month_order = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Prepare subplots
fig = make_subplots(rows=3, cols=1, vertical_spacing=0.08,
                    subplot_titles=("Average Energy Consumption and Production Trends by Month",
                                    "Average Energy Consumption and Production Trends by Day of the Week",
                                    "Average Energy Consumption and Production by Time of Day"))

# MONTHLY PLOT
monthly_consumption = merged_df[merged_df['is_consumption'] == True].groupby('month')['target'].mean().reindex(month_order).reset_index()
monthly_production = merged_df[merged_df['is_consumption'] == False].groupby('month')['target'].mean().reindex(month_order).reset_index()
fig.add_trace(go.Bar(x=monthly_consumption['month'], y=monthly_consumption['target'], name='Consumption (mWh)', marker_color='rgb(63, 100, 26)'), row=1, col=1)
fig.add_trace(go.Bar(x=monthly_production['month'], y=monthly_production['target'], name='Production (mWh)', marker_color='rgb(143, 188, 143)'), row=1, col=1)

# DAILY PLOT
daily_consumption = merged_df[merged_df['is_consumption'] == True].groupby('day_of_week')['target'].mean().reindex(day_order).reset_index()
daily_production = merged_df[merged_df['is_consumption'] == False].groupby('day_of_week')['target'].mean().reindex(day_order).reset_index()
fig.add_trace(go.Bar(x=daily_consumption['day_of_week'], y=daily_consumption['target'], name='Daily Consumption', marker_color='rgb(63, 100, 26)', showlegend=False), row=2, col=1)
fig.add_trace(go.Bar(x=daily_production['day_of_week'], y=daily_production['target'], name='Daily Production', marker_color='rgb(143, 188, 143)', showlegend=False), row=2, col=1)

# TIME OF DAY PLOT
time_of_day_consumption = merged_df[merged_df['is_consumption'] == True].groupby('time_of_day')['target'].mean().reset_index()
time_of_day_production = merged_df[merged_df['is_consumption'] == False].groupby('time_of_day')['target'].mean().reset_index()
fig.add_trace(go.Bar(x=time_of_day_consumption['time_of_day'], y=time_of_day_consumption['target'], name='Time of Day Consumption', marker_color='rgb(63, 100, 26)', showlegend=False), row=3, col=1)
fig.add_trace(go.Bar(x=time_of_day_production['time_of_day'], y=time_of_day_production['target'], name='Time of Day Production', marker_color='rgb(143, 188, 143)', showlegend=False), row=3, col=1)


fig.update_layout(
    height=1000,
    xaxis=dict(showgrid=False),  # Remove x-axis gridlines for the first subplot
    xaxis2=dict(showgrid=False), # Remove x-axis gridlines for the second subplot
    xaxis3=dict(showgrid=False), # Remove x-axis gridlines for the third subplot
    yaxis=dict(showgrid=False),  # Remove y-axis gridlines for the first subplot
    yaxis2=dict(showgrid=False), # Remove y-axis gridlines for the second subplot
    yaxis3=dict(showgrid=False), # Remove y-axis gridlines for the third subplot
    plot_bgcolor='white'  # Set background color to white
)

# Show the figure
fig.show()


In [None]:
# Ensure 'datetime' is in datetime format
merged_df['datetime'] = pd.to_datetime(merged_df['datetime'])

# Extract year and month for aggregation
merged_df['year_month'] = merged_df['datetime'].dt.to_period('M')

# Filter for consumption and production before grouping
consumption_df = merged_df[merged_df['is_consumption'] == True]
production_df = merged_df[merged_df['is_consumption'] == False]

# Group by year_month for the different data points
monthly_consumption = consumption_df.groupby('year_month')['target'].mean().reset_index()  # Unit: mWh
monthly_production = production_df.groupby('year_month')['target'].mean().reset_index()  # Unit: mWh
monthly_solar_radiation = merged_df.groupby('year_month')['direct_solar_radiation_hist_weather'].mean().reset_index()  # Unit: Wh/m²
monthly_cloud_cover = merged_df.groupby('year_month')['cloudcover_total_hist_weather'].mean().reset_index()  # Unit: %

# Convert 'year_month' to datetime for plotting
monthly_consumption['year_month'] = monthly_consumption['year_month'].dt.to_timestamp()
monthly_production['year_month'] = monthly_production['year_month'].dt.to_timestamp()
monthly_solar_radiation['year_month'] = monthly_solar_radiation['year_month'].dt.to_timestamp()
monthly_cloud_cover['year_month'] = monthly_cloud_cover['year_month'].dt.to_timestamp()

# Create subplots with 2 rows
fig = make_subplots(rows=2, cols=1, specs=[[{"secondary_y": True}], [{"secondary_y": True}]])

# First chart traces (Solar Radiation)
fig.add_trace(go.Scatter(x=monthly_consumption['year_month'], y=monthly_consumption['target'], 
                         name='Consumption (mWh)', line=dict(color='rgb(63, 100, 26)', width=3)),
              row=1, col=1, secondary_y=False)
fig.add_trace(go.Scatter(x=monthly_production['year_month'], y=monthly_production['target'], 
                         name='Production (mWh)', line=dict(color='rgb(143, 188, 143)', dash='dash', width=3)),
              row=1, col=1, secondary_y=False)
fig.add_trace(go.Scatter(x=monthly_solar_radiation['year_month'], y=monthly_solar_radiation['direct_solar_radiation_hist_weather'], 
                         name='Solar Radiation (Wh/m²)', line=dict(color='rgb(255, 215, 0)', width=3)), 
              row=1, col=1, secondary_y=True)

# Second chart traces (Cloud Cover)
fig.add_trace(go.Scatter(x=monthly_consumption['year_month'], y=monthly_consumption['target'], 
                         name='Consumption (mWh)', line=dict(color='rgb(63, 100, 26)', width=3), showlegend=False),
              row=2, col=1, secondary_y=False)
fig.add_trace(go.Scatter(x=monthly_production['year_month'], y=monthly_production['target'], 
                         name='Production (mWh)', line=dict(color='rgb(143, 188, 143)', dash='dash', width=3), showlegend=False),
              row=2, col=1, secondary_y=False)
fig.add_trace(go.Scatter(x=monthly_cloud_cover['year_month'], y=monthly_cloud_cover['cloudcover_total_hist_weather'], 
                         name='Cloud Cover (%)', line=dict(color='rgb(135, 206, 235)', width=3)),
              row=2, col=1, secondary_y=True)

# Update layout with separate y-axis titles for each subplot
fig.update_layout(
    title='Energy Consumption, Production, Solar Radiation, and Cloud Cover (monthly average)',
    xaxis=dict(
        title='Month',
        showgrid=False  # Remove x-axis gridlines
    ),
    yaxis=dict(
        title=dict(text='Energy (mWh)', font=dict(color='rgb(0, 128, 0)')),
        tickfont=dict(color='rgb(0, 128, 0)'),
        showgrid=False  # Remove y-axis gridlines for primary y-axis
    ),
    yaxis2=dict(
        title=dict(text='Solar Radiation (Wh/m²)', font=dict(color='rgb(255, 215, 0)')),
        tickfont=dict(color='rgb(255, 215, 0)'),
        showgrid=False  # Remove y-axis gridlines for secondary y-axis of first subplot
    ),
    yaxis3=dict(
        title=dict(text='Energy (mWh)', font=dict(color='rgb(0, 128, 0)')),
        tickfont=dict(color='rgb(0, 128, 0)'),
        showgrid=False  # Remove y-axis gridlines for primary y-axis of second subplot
    ),
    yaxis4=dict(
        title=dict(text='Cloud Cover (%)', font=dict(color='rgb(135, 206, 235)')),
        tickfont=dict(color='rgb(135, 206, 235)'),
        showgrid=False  # Remove y-axis gridlines for secondary y-axis of second subplot
    ),
    height=700,
    width=1000,
    plot_bgcolor='white'  # Set background color to white
)

# Show the figure
fig.show()


## Net Consumption


In [None]:
# Sum production and consumption per hour
production = merged_df.query('is_consumption == 0').groupby("datetime")["target"].sum().rename("production").reset_index()
consumption = merged_df.query('is_consumption == 1').groupby("datetime")["target"].sum().rename("consumption").reset_index()

# Align both series on datetime (not on row position) before taking the difference
net_consumption = pd.merge(production, consumption, on="datetime", how="outer")
net_consumption["net_consumption"] = net_consumption["consumption"] - net_consumption["production"]

In [None]:
net_consumption["date"] = net_consumption["datetime"].dt.date
net_consumption

In [None]:
net_consumption_per_day = net_consumption.groupby("date")[["production", "consumption", "net_consumption"]].sum().reset_index()

In [None]:
plt.figure(figsize=(12,8))
plt.plot(net_consumption["datetime"], net_consumption["net_consumption"], c = enefit_green )
plt.plot([net_consumption.datetime.min(), net_consumption.datetime.max()], [0,0], c = 'black')

plt.fill_between(net_consumption["datetime"], net_consumption.net_consumption.max(), facecolor='red', alpha=.5)
plt.fill_between(net_consumption["datetime"], net_consumption.net_consumption.min(), facecolor='green', alpha=.5)
plt.title('The Hourly Net Consumption (consumption - production)')
plt.ylabel("mwh")

In [None]:
last_date = pd.Timestamp(2022, 9, 30, 23)
first_date = pd.Timestamp(2022, 3, 1, 0)
net_consumption_summer = net_consumption[(first_date <= net_consumption["datetime"]) & (net_consumption["datetime"] <= last_date)]

plt.figure(figsize=(12,8))
plt.plot(net_consumption_summer["datetime"], net_consumption_summer["net_consumption"], c = enefit_green)
plt.plot([net_consumption_summer.datetime.min(), net_consumption_summer.datetime.max()], [0,0], c = 'black')
plt.fill_between(net_consumption_summer["datetime"], net_consumption_summer.net_consumption.max(), facecolor='red', alpha=.5)
plt.fill_between(net_consumption_summer["datetime"], net_consumption_summer.net_consumption.min(), facecolor='green', alpha=.5)
plt.title('The Hourly Net Consumption in the Sunny Period')
plt.ylabel("mwh")

In [None]:
plt.figure(figsize=(12,8))
plt.plot(net_consumption_per_day["date"], net_consumption_per_day["net_consumption"] , c=enefit_green)
plt.plot([net_consumption_per_day.date.min(), net_consumption_per_day.date.max()], [0,0])
plt.title('Daily Net Consumption (consumption - production)')