# **Focus topic project SoGuR** 
## *Selbstoptimierende Gebäude und urbane Räume (self-optimizing buildings and urban spaces)*

<div class="alert alert-block alert-info">
<b>AP2.3 - Definition of prototypical districts</b></div>

- **MS 2.3a**: Plausible combinations of the district types developed in AP 2.1 and 2.2 were identified
- **MS 2.3b**: Prototypical districts were defined and the results were documented in a short report


# Table of contents
1. [Settings](#settings)
2. [Download Data](#download)
    - 2.1 [Urban Energy Units](#ueu_units)
    - 2.2 [Electricity load profiles](#elec)
    - 2.3 [Time period](#time_period)
3. [Plot modelled typical UEU profiles](#plot)
    - 3.1 [Normalize the electricity load profiles per unit area and as a share of yearly consumption](#plot_e_profile)
    - 3.2 [Separate the UEU in different classes](#UEU_classes)
    - 3.3 [Arrange the data to be manipulated by securing the date-based index](#e_data_arrange)
    - 3.4 [Reset the date-based index of the dataframes](#e_Name_Dataframes)
    - 3.5 [Resample the electricity demand per day, week and month](#e_resample)
    - 3.6 [Plot minimum, maximum and mean electricity demand over a 24-hour period of a day in winter, spring, summer and autumn](#e_plot_wssa)
    - 3.7 [Plot only the mean daily profile for selected days of the year](#e_plot_avg_day)
    - 3.8 [Plot minimum, maximum and mean electricity demand within the months of January, April, July and October](#e_Plot_months)
    - 3.9 [Plot minimum, maximum and mean yearly electricity demand per UEU](#e_Plot_year)
    - 3.10 [Separate the hourly min, mean and max values over the year](#e_separate_values_hours_year)
    - 3.11 [Store the hourly min, mean and max dataframes in CSV files](#e_store_tables_hourly_year)
    - 3.12 [For normalized mean hourly electricity demand in a year, separate the dataframes to plot daily electricity changes](#e_separation_of_ueu_tables_hourly_year)
    - 3.13 [Process data](#e_process_ueu_tables_hourly_year_for_printing)
    - 3.14 [Filter data](#e_filter_ueu_tables_hourly_year_for_printing)
    - 3.15 [Plot normalized hourly electricity demand per day during a year](#e_printing_ueu_energy_demand_hourly_year)
    - 3.16 [Descriptive statistics](#e_descrptive_statistics)
    - 3.17 [Correlation matrix](#e_correlation_matrix)

<a id="settings"></a>

## 1. Settings

In [None]:
import os, time
import pickle
import numpy as np
import pandas as pd
import geopandas as gpd
from scripts.global_variables import database_path, root_path, input_path, output_path
import scripts.read as read
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter, FixedLocator
from scripts.create_plots import plot_elec_demand_hour as pedh
import seaborn as sns
from sklearn.metrics import mean_absolute_error
from scipy.stats import pearsonr



# # Set-up -------------------------------------------
Time_Start = time.time()
print('Start of execution: ' + time.asctime() + '.')
print("Database is found under: " + database_path)
print("Root directory: " + root_path)
print("input data: "+ input_path)
print("output data: "+ output_path)
#----------------------------------------------------
datetime_index = pd.date_range(start='2100-01-01 00:00:00', end='2100-12-31 23:00:00', freq='1h')

<a id=download></a>

## 2. Download Data

<a id=ueu_units></a>

#### 2.1 Urban Energy Units

Load the Urban Energy Units (UEUs) for the city of Oldenburg, previously defined in AP2.1 of the project.

In [None]:
# df_ueu = gpd.read_file(os.path.join(database_path, "ueu_oldenburg.gpkg")).drop(['index'], axis=1)
df_ueu = gpd.read_file(os.path.join(database_path, "ueu_oldenburg.gpkg"))
df_ueu.plot()
print(df_ueu[['unique_identifier', 'UEU', 'area_ha']])

<a id=elec></a>

#### 2.2 Electricity Load Profiles

Load the electricity load profiles for each UEU. The index is set to the datetime index defined above, and the columns are labelled with the unique identifiers of the UEUs.

In [9]:
# Load the data
df_ueu_elec = pd.read_pickle(os.path.join(input_path, "ueu_electricity_load_profiles.pkl"))
# Drop the unnecessary column
df_ueu_elec.drop(['Time (h)'], axis=1, inplace=True)

# Copy the UEU table and derive the 'fid' key (1-based row number as a string), which labels the columns of the electricity dataframe
df = df_ueu.copy()
df['fid'] = (df.index + 1).astype(str)

# Keep only residential UEUs with at least one apartment
df = df[(df['landuse'] == 'residential') & (df['number_of_apartments'] > 0)]

# Set index and select columns
df = df.set_index('fid')[['unique_identifier']]

# Map the corresponding unique_identifier to each load profile
df_ueu_elec.columns = df.loc[df_ueu_elec.columns, 'unique_identifier']
# Arrange the dataframe so that it is consistent with the heat load profiles
df_ueu_elec.rename_axis(None, axis=1, inplace=True)
df_ueu_elec.set_index(datetime_index, inplace=True)

# Divide all values by 60
df_ueu_elec = df_ueu_elec / 60
# Show the dataframe
# df_ueu_elec.head(5)
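
The relabelling step above can be illustrated on a toy example. This is only a sketch: the names `lookup` and `profiles` and all values are hypothetical and do not come from the project data. It shows that selecting with `.loc` and the column labels returns the identifiers in column order, even if the lookup table is sorted differently.

```python
import pandas as pd

# Hypothetical lookup table: fid (as string) -> unique identifier of the UEU
lookup = pd.DataFrame(
    {"unique_identifier": ["ueu_a", "ueu_b", "ueu_c"]},
    index=pd.Index(["1", "2", "3"], name="fid"),
)

# Hypothetical load profiles whose columns are labelled by fid, in a different order
profiles = pd.DataFrame({"3": [0.3, 0.6], "1": [0.1, 0.2]})

# .loc with the column labels returns the identifiers in column order
profiles.columns = lookup.loc[profiles.columns, "unique_identifier"].to_numpy()

# Replace the default integer index with an hourly datetime index of matching length
profiles.index = pd.date_range("2100-01-01 00:00:00", periods=len(profiles), freq="h")

print(profiles)
```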

<a id="time_period"></a>

#### 2.3 Time period



In [None]:
Time_End = time.time()
Time_Duration = Time_End - Time_Start
print('End of execution: ' + time.asctime() + '.')
print('Total processing time: %.1f seconds.' % Time_Duration)

<a id="plot"></a>

## 3. Plot modelled typical UEU profiles

<a id="plot_e_profile"></a>

#### 3.1 Normalize the electricity load profiles per unit area and as a share of yearly consumption

In [12]:
# Copy the UEU table, keeping only residential units with apartments so that it is compatible with the electricity dataframe
df = df_ueu.copy()
df['fid'] = (df.index + 1).astype(str)

df = df[(df['landuse'] == 'residential') & (df['number_of_apartments'] > 0)].set_index('fid')[['UEU', 'area_ha', 'unique_identifier']]

# Unique identifiers present in the columns of df_ueu_elec (as built in section 2.2)
common_identifiers = df_ueu_elec.columns
# Keep only the UEUs whose unique identifier is among them
df = df[df['unique_identifier'].isin(common_identifiers)]

# Reload the raw load profiles
df_ueu_elec = pd.read_pickle(os.path.join(input_path, "ueu_electricity_load_profiles.pkl"))
# Drop the unnecessary column
df_ueu_elec.drop(['Time (h)'], axis=1, inplace=True)

# Keep the original column labels (fid) so that the lookups below follow the column order
# instead of relying on the row order of df
fids = df_ueu_elec.columns

# Map the corresponding UEU class to each load profile
df_ueu_elec.columns = df.loc[fids, 'UEU']

# Set the defined index
df_ueu_elec.set_index(datetime_index, inplace=True)

# Append the UEU areas (ha) as a temporary last row
df_ueu_elec.loc['Area'] = df.loc[fids, 'area_ha'].values

# Normalize the modelled electricity demand by the area of each UEU
df_ueu_elec.iloc[:-1] = df_ueu_elec.iloc[:-1].div(df_ueu_elec.iloc[-1], axis=1)

# Drop the temporary 'Area' row
df_ueu_elec = df_ueu_elec.drop(df_ueu_elec.index[-1])

# Sum all values in each column
column_sums = df_ueu_elec.sum()

# Append the column sums as a temporary 'sum' row
df_ueu_elec.loc['sum'] = column_sums

# Divide the area-normalized demand by the column sum, so that each value is a share of the yearly total
df_ueu_elec.iloc[:-1] = df_ueu_elec.iloc[:-1].div(df_ueu_elec.iloc[-1], axis=1)

# Drop the temporary 'sum' row
df_ueu_elec = df_ueu_elec.drop(df_ueu_elec.index[-1])

# Round the results to 10 decimals
df_ueu_elec = np.round(df_ueu_elec, decimals=10)

# Append the unique identifiers of the UEUs as the last row
df_ueu_elec.loc['unique_identifier'] = df.loc[fids, 'unique_identifier'].values
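
The two normalization steps can be checked on a small example. The sketch below uses hypothetical values and names (`demand`, `area_ha`) and works without temporary rows; it is meant as an illustration of the arithmetic, not as a replacement for the cell above.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly demand of two units and their areas in hectares
demand = pd.DataFrame({"a": [2.0, 4.0, 6.0], "b": [1.0, 1.0, 2.0]})
area_ha = np.array([2.0, 0.5])

# Step 1: demand per hectare (the array is broadcast across the columns)
per_ha = demand / area_ha

# Step 2: share of each time step in the column total
share = per_ha / per_ha.sum()

print(share)
```

In this sketch every column of `share` sums to 1. The per-column area factor cancels in the second step, so `share` equals `demand / demand.sum()`; the area step only affects the intermediate per-hectare values.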

<a id="UEU_classes"></a>

#### 3.2 Separate the UEU in different classes

In [13]:
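# Create one DataFrame 'df_UEUi' per UEU class used here (i = 1-5 and 7-9), containing only the columns whose header contains 'UEUi'.
# Note: filter(like=...) matches substrings, so 'UEU1' would also select labels such as 'UEU10' if they were present.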
df_UEU1 = df_ueu_elec.filter(like='UEU1', axis=1)
df_UEU2 = df_ueu_elec.filter(like='UEU2', axis=1)
df_UEU3 = df_ueu_elec.filter(like='UEU3', axis=1)
df_UEU4 = df_ueu_elec.filter(like='UEU4', axis=1)
df_UEU5 = df_ueu_elec.filter(like='UEU5', axis=1)
df_UEU7 = df_ueu_elec.filter(like='UEU7', axis=1)
df_UEU8 = df_ueu_elec.filter(like='UEU8', axis=1)
df_UEU9 = df_ueu_elec.filter(like='UEU9', axis=1)
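
The difference between substring matching and exact matching of column labels can be seen in a minimal sketch. The label `UEU10` is hypothetical and is included only to show the caveat; whether such labels occur in the project data is not known from this notebook.

```python
import pandas as pd

# Hypothetical column labels; 'UEU10' is included only to illustrate the caveat
df = pd.DataFrame([[1, 2, 3]], columns=["UEU1", "UEU10", "UEU2"])

# Substring match: also selects 'UEU10'
by_like = df.filter(like="UEU1", axis=1)

# Exact match on the label via a boolean mask (also works with repeated labels)
by_exact = df.loc[:, df.columns == "UEU1"]

print(list(by_like.columns), list(by_exact.columns))
```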

<a id="e_data_arrange"></a>

#### 3.3 Arrange the data to be manipulated by securing the date-based-index 

In [14]:
from scripts.process_ueu_df import process_ueu as pad

df_UEU1_el = pad(df_UEU1)
df_UEU2_el = pad(df_UEU2)
df_UEU3_el = pad(df_UEU3)
df_UEU4_el = pad(df_UEU4)
df_UEU5_el = pad(df_UEU5)
df_UEU7_el = pad(df_UEU7)
df_UEU8_el = pad(df_UEU8)
df_UEU9_el = pad(df_UEU9)

<a id="e_Name_Dataframes"></a>

#### 3.4 Reset the date-based index of the dataframes

In [15]:
# Reset the index of each dataframe to the date-based index
for ueu_df in [df_UEU1_el, df_UEU2_el, df_UEU3_el, df_UEU4_el, df_UEU5_el, df_UEU7_el, df_UEU8_el, df_UEU9_el]:
    ueu_df.index = datetime_index[:len(ueu_df)]

<a id="e_resample"></a>

#### 3.5 Resample the electricity demand per day, week and month

In [16]:
from scripts.resampling_fn import resample_dataframes as rs

df_UEU1_el_day, df_UEU1_el_week, df_UEU1_el_monthly = rs(df_UEU1_el)
df_UEU2_el_day, df_UEU2_el_week, df_UEU2_el_monthly = rs(df_UEU2_el)
df_UEU3_el_day, df_UEU3_el_week, df_UEU3_el_monthly = rs(df_UEU3_el)
df_UEU4_el_day, df_UEU4_el_week, df_UEU4_el_monthly = rs(df_UEU4_el)
df_UEU5_el_day, df_UEU5_el_week, df_UEU5_el_monthly = rs(df_UEU5_el)
df_UEU7_el_day, df_UEU7_el_week, df_UEU7_el_monthly = rs(df_UEU7_el)
df_UEU8_el_day, df_UEU8_el_week, df_UEU8_el_monthly = rs(df_UEU8_el)
df_UEU9_el_day, df_UEU9_el_week, df_UEU9_el_monthly = rs(df_UEU9_el)
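
The implementation of `resample_dataframes` lives in `scripts.resampling_fn` and is not shown in this notebook. As a rough idea of what such a helper might do, the sketch below aggregates an hourly series to daily, weekly and monthly values with pandas `resample`. The function name `resample_sketch` and the choice of `sum` as the aggregation are assumptions.

```python
import numpy as np
import pandas as pd

def resample_sketch(df):
    """Hypothetical stand-in: aggregate hourly values to daily, weekly and monthly sums."""
    return df.resample("D").sum(), df.resample("W").sum(), df.resample("MS").sum()

# One year of hourly values, all equal to 1, on the same index as the notebook
index = pd.date_range("2100-01-01 00:00:00", "2100-12-31 23:00:00", freq="h")
hourly = pd.DataFrame({"UEU1": np.ones(len(index))}, index=index)

daily, weekly, monthly = resample_sketch(hourly)
print(len(daily), len(monthly))
```

With a constant hourly value of 1, each daily sum is 24 and the January sum is 744, which makes the aggregation easy to verify by hand.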

<a id="e_plot_wssa"></a>

#### 3.6 Plot minimum, maximum and mean electricity demand over a 24-hour period of a day in winter, spring, summer and autumn

In [None]:

data_frames = [df_UEU1_el, df_UEU2_el, df_UEU3_el, df_UEU4_el, df_UEU5_el, df_UEU7_el, df_UEU8_el, df_UEU9_el]
labels = ['UEU1', 'UEU2', 'UEU3', 'UEU4', 'UEU5', 'UEU7', 'UEU8', 'UEU9']
target_dates = pd.to_datetime(['2100-01-15', '2100-04-16', '2100-07-16', '2100-10-15'])
line_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']
face_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']

pedh(data_frames, labels, target_dates, line_colors, face_colors)

<a id="e_plot_avg_day"></a>

#### 3.7 Plot only the mean daily profile for selected days of the year

In [None]:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter, FixedLocator

def plot_elec_demand_hour(data_frames, labels, target_dates, line_colors, face_colors):
    # Ensure the target dates are datetime objects
    target_dates = pd.to_datetime(target_dates)

    # Create a figure with 4 subplots for each DataFrame
    num_data_frames = len(data_frames)
    num_target_dates = len(target_dates)
    fig, axs = plt.subplots(num_data_frames, num_target_dates, figsize=(25, 32), sharey=True)

    for i, df in enumerate(data_frames):
        # Process the DataFrame
        df = df.apply(pd.to_numeric, errors='coerce').dropna(axis=1, how='all')

        # Calculate min, max, and mean values for each row in the processed DataFrame
        #hourly_min = df.min(axis=1)
        #hourly_max = df.max(axis=1)
        hourly_mean = df.mean(axis=1)

        for j, target_date in enumerate(target_dates):
            start_time = target_date
            end_time = target_date + pd.DateOffset(hours=23)

           # filtered_min = hourly_min[start_time:end_time]
           # filtered_max = hourly_max[start_time:end_time]
            filtered_mean = hourly_mean[start_time:end_time]

            ax = axs[i, j]  # Get the current subplot

            # Plot min, mean, and max values on the current subplot with customizable colors
            line_color = line_colors[i % len(line_colors)]  # Cycle through colors
            #ax.plot(filtered_min.index, filtered_min, label='Min', linewidth=0.5, color=line_color)
            ax.plot(filtered_mean.index, filtered_mean, label='Mean', linewidth=2, color=line_color)
            #ax.plot(filtered_max.index, filtered_max, label='Max', linewidth=0.5, color=line_color)
            
           # face_color = face_colors[i % len(face_colors)]  # Cycle through colors
           # ax.fill_between(filtered_mean.index, filtered_min, filtered_mean, facecolor=face_color, alpha=0.2)
           # ax.fill_between(filtered_mean.index, filtered_mean, filtered_max, facecolor=face_color, alpha=0.2)

            ax.set_xticks(filtered_mean.index)

            # Customize subplot titles and legends using labels parameter
            if i == 0:
                ax.set_title(f'{target_date.strftime("%B-%d")}')
            
            # Add grid lines to the current subplot
            ax.grid(color='#FFFFFF', linestyle='dashed', linewidth=1)
            ax.set_xlim(start_time, end_time)

            # '{:.3%}' formats the y-axis labels as percentages with three decimal places,
            # e.g. a value of 0.00004 is shown as 0.004%.
            ax.yaxis.set_major_formatter(FuncFormatter(lambda y, _: '{:.3%}'.format(y)))

            # Check if we are in the last set of 4 plots (corresponding to the last dataframe)
            if i == len(data_frames) - 1:
                ax.tick_params(axis='x')  # Show x-axis labels for the last set
                ax.set_xlabel('Time (h)')
                ax.set_xticklabels([hour.strftime("%H") for hour in filtered_mean.index])
            else:
                ax.set_xticklabels([])  # Hide x-axis labels for other sets

        # Display the legend for the first subplot of each row
        axs[i, 0].legend(loc="upper left", title=labels[i])
        axs[i, 0].set_ylabel('Normalized electricity demand')
        
    # Display the subplots
    plt.tight_layout()
    plt.subplots_adjust(wspace=0.1, hspace=0.2)
    plt.show()

data_frames = [df_UEU1_el, df_UEU2_el, df_UEU3_el, df_UEU4_el, df_UEU5_el, df_UEU7_el, df_UEU8_el, df_UEU9_el]
labels = ['UEU1', 'UEU2', 'UEU3', 'UEU4', 'UEU5', 'UEU7', 'UEU8', 'UEU9']
target_dates = pd.to_datetime(['2100-01-15', '2100-04-16', '2100-07-16', '2100-10-15'])
line_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']
face_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']

plot_elec_demand_hour(data_frames, labels, target_dates, line_colors, face_colors)

<a id=e_Plot_months></a>

#### 3.8 Plot minimum, maximum and mean electricity demand within the months of January, April, July and October

In [None]:
from scripts.create_plots import plot_electricity_demand_month as pedm

data_frames = [df_UEU1_el_day, df_UEU2_el_day, df_UEU3_el_day, df_UEU4_el_day, df_UEU5_el_day, df_UEU7_el_day, df_UEU8_el_day, df_UEU9_el_day]
labels = ['UEU1', 'UEU2', 'UEU3', 'UEU4', 'UEU5', 'UEU7', 'UEU8', 'UEU9']
line_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']  # Customize line colors
face_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']  # Customize face colors

# Define date ranges
date_ranges = [
    ('2100-01-01', '2100-01-31'),
    ('2100-04-01', '2100-04-30'),
    ('2100-07-01', '2100-07-31'),
    ('2100-10-01', '2100-10-31')
]

pedm(data_frames, labels, line_colors, face_colors, date_ranges)

<a id=e_Plot_year></a>

#### 3.9 Plot minimum, maximum and mean yearly electricity demand per UEU

In [None]:
from scripts.create_plots import plot_electricity_demand_year as pedy

data_frames = [df_UEU1_el_day, df_UEU2_el_day, df_UEU3_el_day, df_UEU4_el_day, df_UEU5_el_day, df_UEU7_el_day, df_UEU8_el_day, df_UEU9_el_day]
labels = ['UEU1', 'UEU2', 'UEU3', 'UEU4', 'UEU5', 'UEU7', 'UEU8', 'UEU9']
line_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']  # Customize line colors
face_colors = ['red', 'blue', 'green', 'orange', 'purple', 'lightseagreen', 'magenta', 'steelblue']  # Customize face colors

start_date = '2100-01-01'
end_date = '2100-12-31'

pedy(data_frames, labels, start_date, end_date, line_colors, face_colors)

<a id=e_separate_values_hours_year></a>

#### 3.10 Separate the hourly min, mean and max values over the year

In [22]:
from scripts.tables import process_date_range as pdr

dataframes = [df_UEU1_el, df_UEU2_el, df_UEU3_el, df_UEU4_el, df_UEU5_el, df_UEU7_el, df_UEU8_el, df_UEU9_el]
labels = ['UEU1_el', 'UEU2_el', 'UEU3_el', 'UEU4_el', 'UEU5_el', 'UEU7_el', 'UEU8_el', 'UEU9_el']

# Define the target date range
start_date_1 = pd.to_datetime('2100-01-01 00:00:00')
end_date_1 = pd.to_datetime('2100-12-31 23:00:00')

e_min_h_year, e_max_h_year, e_mean_h_year = pdr(dataframes, labels, start_date_1, end_date_1)
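
The helper `process_date_range` is defined in `scripts.tables` and its implementation is not shown here. The sketch below illustrates one plausible reading of the step: for each timestamp inside a date range, take the minimum, maximum and mean across all columns of a dataframe. The function `row_stats` and the sample values are hypothetical.

```python
import pandas as pd

def row_stats(df, start, end):
    """Hypothetical helper: per-timestamp min, max and mean across all columns."""
    window = df.apply(pd.to_numeric, errors="coerce").loc[start:end]
    return window.min(axis=1), window.max(axis=1), window.mean(axis=1)

# Three hourly time steps for two hypothetical units
index = pd.date_range("2100-01-01 00:00:00", periods=3, freq="h")
units = pd.DataFrame({"u1": [1.0, 2.0, 3.0], "u2": [3.0, 2.0, 5.0]}, index=index)

# Label-based slicing on a DatetimeIndex includes both end points
row_min, row_max, row_mean = row_stats(
    units, pd.Timestamp("2100-01-01 00:00:00"), pd.Timestamp("2100-01-01 01:00:00")
)
print(row_min.tolist(), row_max.tolist(), row_mean.tolist())
```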

<a id=e_store_tables_hourly_year></a>

#### 3.11 Store the hourly min, mean and max dataframes in CSV files

In [23]:
e_min_h_year.to_csv(os.path.join(output_path, "e_min_h_year.csv"), index=False)
e_max_h_year.to_csv(os.path.join(output_path, "e_max_h_year.csv"), index=False)
e_mean_h_year.to_csv(os.path.join(output_path, "e_mean_h_year.csv"), index=False)

<a id=e_separation_of_ueu_tables_hourly_year></a>

#### 3.12 For normalized mean hourly electricity demand in a year, separate the dataframes to plot daily electricity changes

In [24]:
from scripts.create_plots import separate as sp

e_ueu1_copy = sp(e_mean_h_year, 0)
e_ueu2_copy = sp(e_mean_h_year, 1)
e_ueu3_copy = sp(e_mean_h_year, 2)
e_ueu4_copy = sp(e_mean_h_year, 3)
e_ueu5_copy = sp(e_mean_h_year, 4)
e_ueu7_copy = sp(e_mean_h_year, 5)
e_ueu8_copy = sp(e_mean_h_year, 6)
e_ueu9_copy = sp(e_mean_h_year, 7)

<a id=e_process_ueu_tables_hourly_year_for_printing></a>

#### 3.13 Process data

In [25]:
from scripts.create_plots import process_data

e_ueu1_copy = process_data(e_ueu1_copy, 'UEU1_el')
e_ueu2_copy = process_data(e_ueu2_copy, 'UEU2_el')
e_ueu3_copy = process_data(e_ueu3_copy, 'UEU3_el')
e_ueu4_copy = process_data(e_ueu4_copy, 'UEU4_el')
e_ueu5_copy = process_data(e_ueu5_copy, 'UEU5_el')
e_ueu7_copy = process_data(e_ueu7_copy, 'UEU7_el')
e_ueu8_copy = process_data(e_ueu8_copy, 'UEU8_el')
e_ueu9_copy = process_data(e_ueu9_copy, 'UEU9_el') 

<a id=e_filter_ueu_tables_hourly_year_for_printing></a>

#### 3.14 Filter data

In [26]:
from scripts.create_plots import filter_dataframe as fd

e_ueu1 = fd(e_ueu1_copy)
e_ueu2 = fd(e_ueu2_copy)
e_ueu3 = fd(e_ueu3_copy)
e_ueu4 = fd(e_ueu4_copy)
e_ueu5 = fd(e_ueu5_copy)
e_ueu7 = fd(e_ueu7_copy)
e_ueu8 = fd(e_ueu8_copy)
e_ueu9 = fd(e_ueu9_copy)

<a id=e_printing_ueu_energy_demand_hourly_year></a>

#### 3.15 Plot normalized hourly electricity demand per day during a year

In [None]:
from scripts.create_plots import plot_energy_demand_distribution as pedd

# List of dataframes and labels
data_frames = [e_ueu1, e_ueu2, e_ueu3, e_ueu4, e_ueu5, e_ueu7, e_ueu8, e_ueu9]
labels = ['UEU1', 'UEU2', 'UEU3', 'UEU4', 'UEU5', 'UEU7', 'UEU8', 'UEU9']

# Call the function to plot all dataframes
pedd(data_frames, labels)


<a id=e_descrptive_statistics></a>

#### 3.16 Descriptive statistics

In [32]:
# First ensure that all values are floats
e_min_h_year = e_min_h_year.astype(float)
e_mean_h_year = e_mean_h_year.astype(float)
e_max_h_year = e_max_h_year.astype(float)

# Calculate descriptive statistics
min_desc = e_min_h_year.describe()
mean_desc = e_mean_h_year.describe()
max_desc = e_max_h_year.describe()

# Path where the Excel file will be saved
file_path = os.path.join(output_path, 'descriptive_statistics.xlsx')

# Use ExcelWriter to write each DataFrame to a separate sheet
with pd.ExcelWriter(file_path, engine='openpyxl') as writer:
    min_desc.to_excel(writer, sheet_name='Min Descriptive Stats')
    mean_desc.to_excel(writer, sheet_name='Mean Descriptive Stats')
    max_desc.to_excel(writer, sheet_name='Max Descriptive Stats')

<a id=e_correlation_matrix></a>

#### 3.17 Correlation matrix


A correlation matrix is a table of the pairwise correlation coefficients between several variables: each cell holds the coefficient for one pair of variables.

A correlation coefficient quantifies the strength and direction of the linear relationship between two variables and ranges from -1 to 1:

- Close to 1: strong positive correlation; as one variable increases, the other tends to increase as well.
- Close to -1: strong negative correlation; as one variable increases, the other tends to decrease.
- Close to 0: little to no linear relationship between the variables.

In exploratory data analysis (EDA), the correlation matrix helps to identify patterns, dependencies and potential multicollinearity in a dataset. It also shows which variables are most strongly related to each other, which is useful for feature selection or further analysis.
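
The three cases can be reproduced with synthetic data. The sketch below uses made-up columns (`up`, `down`, `sym`) rather than project data and relies on `DataFrame.corr()`, which computes the Pearson coefficient by default.

```python
import numpy as np
import pandas as pd

x = np.arange(10, dtype=float)
data = pd.DataFrame({
    "x": x,
    "up": 2.0 * x + 1.0,    # increases linearly with x: coefficient +1
    "down": -0.5 * x,       # decreases linearly with x: coefficient -1
    "sym": (x - 4.5) ** 2,  # symmetric around the mean of x: coefficient 0
})

corr = data.corr()  # Pearson correlation is the pandas default
print(corr.round(2))
```

The `sym` column depends entirely on `x`, yet its coefficient with `x` is zero. A value near 0 therefore rules out a linear relationship only, not a dependence in general.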

In [None]:
# Specify the path to the CSV file
e_mean_h_year_path = os.path.join(output_path, 'e_mean_h_year.csv')

# Load the data into a DataFrame
e_mean_h_year = pd.read_csv(e_mean_h_year_path)

# Calculate correlation matrix
correlation_matrix = e_mean_h_year.corr()

# Print correlation matrix
print("Correlation Matrix:")
print(correlation_matrix)

# Plot the heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title("Correlation Matrix")
plt.show()
