# Gridded Data EMDNA
## Ensemble Meteorological Dataset for North America
### Using probabilistic methods to estimate the uncertainty in spatial fields
https://essd.copernicus.org/articles/13/3337/2021/

1.	**Where is the observed data from?** *From weather stations on the ground. Statistical methods interpolate the station records spatially to produce daily data on a grid.* https://uofc-my.sharepoint.com/personal/heba_abdelmoaty_ucalgary_ca/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fheba%5Fabdelmoaty%5Fucalgary%5Fca%2FDocuments%2FUnfinished%5Fwork%2FCapstone%2FData&ga=1
2.	**What are the X and Y?**  *They are the coordinates of the grid cells overlaying the catchment. Our project uses the cell at X = 245.25 and Y = 49.55, so only that one row is needed.*
3.	**What are the 4 scenarios?** *Each scenario pairs a projected socio-economic pathway (SSP) with the radiative forcing, in W/m², that the globe will reach by the end of the century.  **SSP1-2.6** = assumes reduced emissions, **SSP2-4.5** = assumes emissions continue the historical trend, **SSP3-7.0** = medium to high emissions scenario, **SSP5-8.5** = optimal for economic development but with high emissions.*
4.	**What is the Trange?** *Trange = | Tmax - Tmin |    (probably won't need this)*
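
A quick sketch for locating the cell on a map, assuming X is a longitude on the 0 to 360 degrees-east convention and Y is a latitude in degrees north (the notes above do not state this). The helper name `to_signed_longitude` is hypothetical. Under that assumption, X = 245.25 corresponds to 114.75° W.

```python
def to_signed_longitude(lon_east):
    """Convert a 0-360 degrees-east longitude to the -180..180 convention."""
    return ((lon_east + 180.0) % 360.0) - 180.0

# Assumption: X = 245.25 is a 0-360 longitude, Y = 49.55 is a latitude.
signed_x = to_signed_longitude(245.25)
print(signed_x, 49.55)
```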


![Catchment grid location on basemap](CatchmentGrid_location_basemap.jpg)


# Hard Coded Variables

In [1]:
tmean_path = "C:/Users/14037/OneDrive - University of Calgary/Documents/ENCI_570/TM_PHES_code/obs_tmean_final.csv"
precip_path = "C:/Users/14037/OneDrive - University of Calgary/Documents/ENCI_570/TM_PHES_code/obs_precip_final.csv"
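
The absolute paths above resolve on one machine only. A minimal sketch of building them from a single base folder with `pathlib`; `DATA_DIR` and its value are placeholders, not the project's actual layout.

```python
from pathlib import Path

# Hypothetical base folder: replace with the folder that holds the CSV files.
DATA_DIR = Path("TM_PHES_code")

tmean_path = DATA_DIR / "obs_tmean_final.csv"
precip_path = DATA_DIR / "obs_precip_final.csv"

# Report missing inputs up front instead of failing later inside pd.read_csv.
for path in (tmean_path, precip_path):
    if not path.exists():
        print(f"Missing input file: {path}")
```

`pd.read_csv` accepts `Path` objects, so the later cells would not need to change.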

# Import Libraries

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
from scipy.stats import linregress
from itertools import combinations
import math


# Utility Functions

In [3]:
def filter_dataframe(df, x_value=245.25, y_value=49.55):
    """
    Filter a DataFrame to keep only rows where 'x' equals x_value and 'y' equals y_value.

    Parameters:
    - df: pandas DataFrame with 'x' and 'y' coordinate columns
    - x_value: Value to match exactly in the 'x' column
    - y_value: Value to match exactly in the 'y' column

    Returns:
    - pandas DataFrame containing filtered rows
    """
    filtered_df = df[(df['x'] == x_value) & (df['y'] == y_value)]
    return filtered_df
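
Exact `==` comparison on floats works only when the coordinates in the file parse to exactly the same values as the defaults. A hedged alternative that matches within a tolerance using `np.isclose`; the function name `filter_dataframe_isclose`, the `atol` value, and the toy grid are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def filter_dataframe_isclose(df, x_value=245.25, y_value=49.55, atol=1e-6):
    """Keep rows whose 'x' and 'y' lie within atol of the target coordinates."""
    x_match = np.isclose(df['x'], x_value, rtol=0.0, atol=atol)
    y_match = np.isclose(df['y'], y_value, rtol=0.0, atol=atol)
    return df[x_match & y_match]

# Toy grid (made-up values) with a slightly perturbed y coordinate in row 0.
toy = pd.DataFrame({
    'x': [245.25, 245.25, 245.35],
    'y': [49.55 + 1e-9, 49.65, 49.55],
    '1979-01-01': [1.0, 2.0, 3.0],
})
selected = filter_dataframe_isclose(toy)
print(selected)
```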

In [4]:
def reshape_dataframe(df, value_column_name):
    """
    Reshape a DataFrame from wide to long format by melting the date columns.

    Parameters:
    - df: pandas DataFrame with 'x', 'y', and date columns
    - value_column_name: Desired name for the value column ('tmean' or 'precip')

    Returns:
    - Reshaped pandas DataFrame with columns 'Date' and value_column_name
    """
    # Melt the DataFrame, keeping 'x' and 'y' as identifiers
    melted_df = pd.melt(df, id_vars=['x', 'y'], var_name='Date', value_name=value_column_name)

    # Convert the 'Date' column to datetime format
    melted_df['Date'] = pd.to_datetime(melted_df['Date'])

    return melted_df[['Date', value_column_name]]
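
To illustrate the wide-to-long step, here is a small sketch that applies the same `pd.melt` call to a toy single-cell row. The two date columns and their values are made up for illustration.

```python
import pandas as pd

# Toy single-cell row in the wide layout: one column per date.
wide = pd.DataFrame({
    'x': [245.25],
    'y': [49.55],
    '1979-01-01': [-24.9],
    '1979-01-02': [-21.8],
})

# Same steps as reshape_dataframe: melt the date columns, then parse the dates.
long_df = pd.melt(wide, id_vars=['x', 'y'], var_name='Date', value_name='tmean')
long_df['Date'] = pd.to_datetime(long_df['Date'])
long_df = long_df[['Date', 'tmean']]
print(long_df)
```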

In [5]:
def merge_dataframes(df_tmean, df_precip):
    """
    Merge two DataFrames based on the 'Date' column.

    Parameters:
    - df_tmean: DataFrame with 'Date' and 'tmean' columns
    - df_precip: DataFrame with 'Date' and 'precip' columns

    Returns:
    - Merged DataFrame on 'Date' column
    """
    merged_df = pd.merge(df_tmean, df_precip, on='Date', how='outer')

    return merged_df
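
An outer merge keeps dates that appear in only one input and fills the other column with NaN. A sketch of how such rows could be surfaced with the `indicator` and `validate` options of `pd.merge`, using toy frames with made-up values:

```python
import pandas as pd

tmean = pd.DataFrame({
    'Date': pd.to_datetime(['1979-01-01', '1979-01-02']),
    'tmean': [-24.9, -21.8],
})
precip = pd.DataFrame({
    'Date': pd.to_datetime(['1979-01-02', '1979-01-03']),
    'precip': [0.7, 0.1],
})

# validate raises if a date is duplicated in either input;
# indicator adds a '_merge' column naming the source of each row.
merged = pd.merge(tmean, precip, on='Date', how='outer',
                  validate='one_to_one', indicator=True)
unmatched = merged[merged['_merge'] != 'both']
print(unmatched)
```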

# Data Engineering

In [6]:
temp_df = pd.read_csv(tmean_path)
temp_df = filter_dataframe(temp_df)
temp_df = reshape_dataframe(temp_df, value_column_name='tmean')
temp_df.head()

|  | Date | tmean |
|---|---|---|
| 0 | 1979-01-01 | -24.9175 |
| 1 | 1979-01-02 | -21.8485 |
| 2 | 1979-01-03 | -22.255 |
| 3 | 1979-01-04 | -23.5575 |
| 4 | 1979-01-05 | -21.4305 |


In [7]:
temp_df.describe()

|  | Date | tmean |
|---|---|---|
| count | 13149 | 13149.0 |
| mean | 1996-12-31 00:00:00 | 2.163725 |
| min | 1979-01-01 00:00:00 | -33.621 |
| 25% | 1988-01-01 00:00:00 | -3.800295 |
| 50% | 1996-12-31 00:00:00 | 2.401555 |
| 75% | 2005-12-31 00:00:00 | 9.62755 |
| max | 2014-12-31 00:00:00 | 22.321 |
| std |  | 9.32836 |


In [8]:
precip_df = pd.read_csv(precip_path)
precip_df = filter_dataframe(precip_df)
precip_df = reshape_dataframe(precip_df, value_column_name='precip')
precip_df.head()

|  | Date | precip |
|---|---|---|
| 0 | 1979-01-01 | 2.7711 |
| 1 | 1979-01-02 | 0.70797 |
| 2 | 1979-01-03 | 0.11758 |
| 3 | 1979-01-04 | 0.0 |
| 4 | 1979-01-05 | 0.000486 |


In [9]:
precip_df.describe()

|  | Date | precip |
|---|---|---|
| count | 13149 | 13149.0 |
| mean | 1996-12-31 00:00:00 | 3.147572 |
| min | 1979-01-01 00:00:00 | 0.0 |
| 25% | 1988-01-01 00:00:00 | 0.1917 |
| 50% | 1996-12-31 00:00:00 | 1.1147 |
| 75% | 2005-12-31 00:00:00 | 4.0277 |
| max | 2014-12-31 00:00:00 | 74.461 |
| std |  | 5.130141 |


In [10]:
temp_precip_df = merge_dataframes(temp_df, precip_df)
temp_precip_df.head()

|  | Date | tmean | precip |
|---|---|---|---|
| 0 | 1979-01-01 | -24.9175 | 2.7711 |
| 1 | 1979-01-02 | -21.8485 | 0.70797 |
| 2 | 1979-01-03 | -22.255 | 0.11758 |
| 3 | 1979-01-04 | -23.5575 | 0.0 |
| 4 | 1979-01-05 | -21.4305 | 0.000486 |


In [11]:
temp_precip_df.describe()

|  | Date | tmean | precip |
|---|---|---|---|
| count | 13149 | 13149.0 | 13149.0 |
| mean | 1996-12-31 00:00:00 | 2.163725 | 3.147572 |
| min | 1979-01-01 00:00:00 | -33.621 | 0.0 |
| 25% | 1988-01-01 00:00:00 | -3.800295 | 0.1917 |
| 50% | 1996-12-31 00:00:00 | 2.401555 | 1.1147 |
| 75% | 2005-12-31 00:00:00 | 9.62755 | 4.0277 |
| max | 2014-12-31 00:00:00 | 22.321 | 74.461 |
| std |  | 9.32836 | 5.130141 |
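
The count of 13149 equals the number of calendar days from 1979-01-01 to 2014-12-31 (36 years including 9 leap days), which is consistent with a gap-free daily record. A small sketch of an explicit check; the helper name `missing_dates` and the toy frame are illustrative assumptions.

```python
import pandas as pd

# Number of calendar days in the period reported by describe().
expected_dates = pd.date_range('1979-01-01', '2014-12-31', freq='D')
print(len(expected_dates))

def missing_dates(df, start='1979-01-01', end='2014-12-31'):
    """Return the calendar days in [start, end] that are absent from df['Date']."""
    expected = pd.date_range(start, end, freq='D')
    return expected.difference(pd.DatetimeIndex(df['Date']))

# Toy frame with 1979-01-02 deliberately left out.
toy = pd.DataFrame({'Date': pd.to_datetime(['1979-01-01', '1979-01-03'])})
gaps = missing_dates(toy, start='1979-01-01', end='1979-01-03')
print(gaps)
```

Calling `missing_dates(temp_precip_df)` on the merged frame should return an empty index if no day is missing.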


In [12]:
# save dataframe as csv
output_file_path = 'tmean_precip_gridded_data_245.25_49.55.csv'
temp_precip_df.to_csv(output_file_path, index=False)
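
When the saved file is read back, the `Date` column comes in as plain strings unless it is parsed explicitly. A minimal round-trip sketch using an in-memory buffer in place of the real file, with made-up rows:

```python
import io
import pandas as pd

sample = pd.DataFrame({
    'Date': pd.to_datetime(['1979-01-01', '1979-01-02']),
    'tmean': [-24.9, -21.8],
    'precip': [2.7, 0.7],
})

# Write to an in-memory buffer (stand-in for the CSV file on disk).
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates restores the datetime dtype of the 'Date' column.
reloaded = pd.read_csv(buffer, parse_dates=['Date'])
print(reloaded.dtypes)
```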