## NASA Center Analysis
<details>
<summary>IMPORTANT (FEB 2025)</summary>
<br>
NEX GDDP currently has a bug that affects tmin (minimum temperature) and tas (average temperature) for some Centers, but only for SSP 1-2.6 and 3-7.0. The affected Centers are:

- AMES
- LARC
- GISS
- JPL
- JSC
- KSC
- WFF

As a result, any variable derived from tmin or tas is affected for SSP 1-2.6 and 3-7.0, and we do not plan to use those variables for now. The affected variables include:
Max DTR, Tmin ≥20°C, Tmin ≤0°C, Coldest Tmin of the Year, Annual Average Tmin, and any of the humid heat diagnostics (e.g., heat index, WBGT).
</details>

<details>
<summary>VARIABLES</summary>

| Variable Name       | Long Name                                          | Variable Category | Units     | Description                                                                                                                                                          |
| ------------------- | -------------------------------------------------- | ----------------- | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| tmax_days_35C       | \# days Tmax ≥35°C                                 | extreme index     | \# days   | Number of days, per year, with Tmax >=35C                                                                                                                            |
| tmax_days_90th      | \# days Tmax ≥90th Percentile                      | extreme index     | \# days   | Number of days, per year, with Tmax >=90th percentile. 90th percentile calculated using all daily tmax values from 1995-2014.                                        |
| tmax_days_95th      | \# days Tmax ≥95th Percentile                      | extreme index     | \# days   | Number of days, per year, with Tmax >=95th percentile. 95th percentile calculated using all daily tmax values from 1995-2014.                                        |
| tmax_days_99th      | \# days Tmax ≥99th Percentile                      | extreme index     | \# days   | Number of days, per year, with Tmax >=99th percentile. 99th percentile calculated using all daily tmax values from 1995-2014.                                        |
| Hottest_Tmax        | Hottest Tmax of the Year (°C)                      | extreme index     | degrees C | Hottest Tmax value every year                                                                                                                                        |
| Max_DTR             | Largest Diurnal Temperature Range of the Year (°C) | extreme index     | degrees C | largest diurnal temperature range (tmax minus tmin) each year                                                                                                        |
| tmin_tropnights_20C | \# days Tmin ≥20°C                                 | extreme index     | \# days   | Number of days, per year with Tmin >=20C                                                                                                                             |
| tmin_frostdays_0C   | \# days Tmin ≤0°C                                  | extreme index     | \# days   | Number of days per year with Tmin <=0C                                                                                                                               |
| Coldest_Tmin        | Coldest Tmin of the Year (°C)                      | extreme index     | degrees C | Coldest minimum temperature each year                                                                                                                                |
| prec_days_dry       | \# days with precipitation ≤0.001 in               | extreme index     | \# days   | Number of days, per year, where precipitation <=1e-3 inches                                                                                                          |
| prec_days_oneinch   | \# days with precipitation ≥1 in                   | extreme index     | \# days   | Number of days, per year, where precipitation >=1 inch                                                                                                               |
| prec_days_90th      | \# days with precipitation ≥90th Percentile        | extreme index     | \# days   | Number of days, per year, where precipitation >=90th percentile. 90th percentile calculated using all daily precipitation values (dry days EXCLUDED) from 1995-2014  |
| prec_days_95th      | \# days with precipitation ≥95th Percentile        | extreme index     | \# days   | Number of days, per year, where precipitation >=95th percentile. 95th percentile calculated using all daily precipitation values (dry days EXCLUDED) from 1995-2014  |
| prec_days_99th      | \# days with precipitation ≥99th Percentile        | extreme index     | \# days   | Number of days, per year, where precipitation >=99th percentile. 99th percentile calculated using all daily precipitation values (dry days EXCLUDED) from 1995-2014  |
| tmax_annave         | Annual Average Tmax (°C)                           | annual average    | degrees C | Annual average maximum daily temperature                                                                                                                             |
| tmin_annave         | Annual Average Tmin (°C)                           | annual average    | degrees C | Annual average minimum daily temperature                                                                                                                             |
| prec_annave         | Annual Total Precipitation (mm)                    | annual sum        | mm        | Annual sum of precipitation                                                                                                                                          |

</details>

In [1]:
# Imports
import os
import warnings
from IPython.display import display
import ipywidgets as widgets
import pandas as pd
import pandasql as psql
import seaborn as sns
from matplotlib import pyplot as plt

# Suppress warnings
warnings.filterwarnings('ignore')

# Configure matplotlib
plt.rcParams['figure.figsize'] = (6, 4)
plt.rcParams['figure.dpi'] = 400
plt.rcParams['font.size'] = 8
plt.rcParams['figure.titlesize'] = 15
plt.rcParams['axes.linewidth'] = 0.1
plt.rcParams['patch.linewidth'] = 0

## Initialization

In [2]:
# Data directory
path = 'updated_extremes'
# NASA center to analyze
center = 'LARC'.upper()
# Flag to use only 2020-2099
only_future = True 

In [3]:
# DO NOT CHANGE THIS CELL
##################################
##          File names          ##
##  variable_CENTER_ssp###.csv  ##
##################################
# NASA Centers
centers = ['AMES', 'GSFC', 'JPL', 'KSC', 'MSFC', 'MAF', 'GISS',
           'LARC', 'SSC', 'GRC', 'WFF', 'JSC', 'WSTF', 'AFRC']

# Temperature variables (number of days)
temp_d = ['tmax_days_35C', 'tmax_days_90th', 'tmax_days_95th', 'tmax_days_99th',
          'tmin_tropnights_20C', 'tmin_frostdays_0C', 'HI_days_100F']

# Temperature variables (degrees Celsius)
temp_c = ['Coldest_Tmin', 'Hottest_Tmax', 'Max_DTR']

# Precipitation variables (number of days)
prec_d = ['prec_days_dry', 'prec_days_oneinch', 'prec_days_90th',
          'prec_days_95th', 'prec_days_99th']

# All variables
variables = sorted(temp_d + temp_c + prec_d)

# Not present in folder
# annave = ['tmax_annave', 'tmin_annave', 'prec_annave']

# Shared Socioeconomic Pathways (SSPs) climate scenarios
ssp = sorted(list({f.split('_')[-1][:-4] for f in os.listdir(path) 
                   if 'ssp' in f.split('_')[-1]}))
print(f'Available: {ssp}')

##################################
##         Column names         ##
##################################

# Climate Models
models = ['ACCESS-CM2', 'ACCESS-ESM1-5', 'BCC-CSM2-MR', 'CESM2', 'CMCC-ESM2',
          'CNRM-CM6-1', 'CNRM-ESM2-1', 'EC-Earth3', 'FGOALS-g3', 'GFDL-ESM4',
          'GISS-E2-1-G', 'INM-CM4-8', 'INM-CM5-0', 'IPSL-CM6A-LR',
          'KACE-1-0-G', 'MIROC6', 'MIROC-ES2L', 'MPI-ESM1-2-HR',
          'MPI-ESM1-2-LR', 'MRI-ESM2-0', 'NorESM2-LM', 'NorESM2-MM']

# Multi-model statistics
mme = ['mme-mean', 'mme-median', 'mme-pct25', 'mme-pct75']

##################################
##         Time Periods         ##
##################################
# 10 years before+after a decade
time_periods = {'short': (2020, 2049),  # 2030's: 2020-2029, 2030-2039, 2040-2049
                'mid':   (2040, 2069),  # 2050's: 2040-2049, 2050-2059, 2060-2069
                'long':  (2070, 2099),  # 2080's: 2070-2079, 2080-2089, 2090-2099
                }

Available: ['ssp126', 'ssp245', 'ssp370', 'ssp585']
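
The file-name convention noted in the cell above (`variable_CENTER_ssp###.csv`) can be unpacked as in the following minimal sketch. The helper `parse_name` is hypothetical and not part of the notebook; it assumes that neither the center nor the scenario contains an underscore, so the last two underscore-separated tokens are always the center and the scenario.

```python
import os


def parse_name(filename):
    """Hypothetical helper: split 'variable_CENTER_ssp###.csv' into its parts."""
    # Drop the directory and the '.csv' extension
    stem = os.path.splitext(os.path.basename(filename))[0]
    # Split from the right so underscores inside the variable name survive
    variable, center, scenario = stem.rsplit('_', 2)
    return variable, center, scenario


print(parse_name('updated_extremes/tmax_days_35C_LARC_ssp245.csv'))
```

Splitting from the right matters because variable names such as `tmax_days_35C` contain underscores themselves.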


# Get Files/Data

In [4]:
def get_files(path: str, center: str, variables: list = None):
    '''
    Returns list of filenames in a directory that contain a specific string.

    Args:
        path: the path to the data directory to search
        center: the NASA center name in the filename
        variables: optional list of variables to filter files (default: None)
    '''
    # Check if the provided center is valid
    if center not in centers:
        raise ValueError(f'{center} not in {centers}')

    # Get all CSV files in the directory that contain the center name (ssp585 files are excluded)
    files = [os.path.join(path, f) for f in os.listdir(path) 
             if center in f and 'ssp585' not in f and f.endswith('.csv')]

    # If provided, filter files that contain any of the variables
    return files if not variables else [f for f in files if any(v in f for v in variables)]

def check_df_consistency(df_list: list[pd.DataFrame]):
    '''
    Returns True if all DataFrames in the list share the same shape and columns; False otherwise or if the list is empty.
    
    Args:
        df_list: List of pandas DataFrames to check
    '''
    if not df_list:
        return False
    
    # Get reference shape and columns from the first dataframe
    ref_shape, ref_cols = df_list[0].shape, list(df_list[0].columns)
    
    # Check if all other dataframes have the same shape and columns
    return all(df.shape == ref_shape and list(df.columns) == ref_cols for df in df_list[1:])


def label_term(year: int):
    '''
    Returns list of time period labels for the given year
    
    Args:
        year: The year to label
    
    '''
    return [t for t, (s, e) in time_periods.items() if s <= year <= e]


def preprocess(filename: str, only_future: bool = True):
    '''
    Returns preprocessed pandas DataFrame based on CSV file
    
    Args:
        filename: Path to the CSV file
        only_future: If True, drop rows with NaN terms (assumed to be past data)
    '''
    df = pd.read_csv(filename)
    
    # Extract variable name from the filename
    name = os.path.basename(filename)[:-4].split('_')
    var = '_'.join(name[:-2])
    
    # Add new columns: term, scenario, and variable
    df.insert(0, 'term', df.years.apply(label_term))
    df.insert(0, 'scenario', name[-1])
    df.insert(0, 'variable', var + ('_c' if var in temp_c else '_d'))
    
    # Explode the 'term' column (in case a year belongs to multiple terms)
    df = df.explode('term')

    # Remove rows with NaN terms if only_future is True, otherwise return all rows
    return df.dropna(subset=['term']) if only_future else df
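
Because the time periods overlap (2040-2049 falls in both `short` and `mid`), a single year can receive two labels, and a year before 2020 receives none. The toy example below is a sketch of how `label_term`, `explode`, and `dropna` interact; the `toy` frame and its years are made up for illustration, while `time_periods` and `label_term` mirror the definitions in the notebook.

```python
import pandas as pd

# Same period boundaries as the notebook's time_periods dictionary
time_periods = {'short': (2020, 2049), 'mid': (2040, 2069), 'long': (2070, 2099)}


def label_term(year):
    # A year may match zero, one, or two periods
    return [t for t, (s, e) in time_periods.items() if s <= year <= e]


# Illustrative data only
toy = pd.DataFrame({'years': [2015, 2025, 2045, 2075]})
toy['term'] = toy['years'].apply(label_term)

# One row per (year, term) pair; an empty list becomes NaN
exploded = toy.explode('term')
# Rows without a term (here 2015) are dropped, as with only_future=True
future = exploded.dropna(subset=['term'])
print(future)
```

The duplicated 2045 row is what makes each term contain exactly 30 years per (variable, scenario) group.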



In [5]:
# Get files for a center
files = sorted(get_files(path, center))

# Number of files found
print(f'{len(files)} {center} files')
# First 5 files
print(files[:5])

# Preprocess all files
df = [preprocess(f, only_future) for f in files]

# Check if all dataframes have the same format
if not check_df_consistency(df):
    raise ValueError('DataFrames are inconsistent')

# Combine all dataframes into one
df = pd.concat(df).reset_index(drop=True)

# Check the number of years per time period (expected is 30)
years_per_term = df.groupby(['variable', 'scenario', 'term']).size().unique()
if len(years_per_term) != 1 or years_per_term[0] != 30:
    raise ValueError(
        f'# of years per time period is incorrect: {years_per_term}')


# REMOVE THIS PART ONCE THE ERROR IS FIXED
# For: AMES LARC GISS JPL JSC KSC WFF
# Erroneous: Max DTR, Tmin >20C, Tmin < 0C, Coldest Tmin of the Year, Annual Average Tmin, any of the humid heat diagnostics (e.g., heat index, WBGT, etc.) in ssp126 and ssp370
# NOTE: these patterns do not match HI_days_100F, so heat-index rows are kept
errors = ['Max_DTR', 'tmin', 'Coldest_Tmin']
if center in ['AMES', 'LARC', 'GISS', 'JPL', 'JSC', 'KSC', 'WFF']:
    # Remove rows with erroneous variables for specific scenarios
    df = df[~(df.variable.str.contains('|'.join(errors), na=False, regex=True) &
              df.scenario.isin(['ssp126', 'ssp370']))]

df.info()

45 LARC files
['updated_extremes/Coldest_Tmin_LARC_ssp126.csv', 'updated_extremes/Coldest_Tmin_LARC_ssp245.csv', 'updated_extremes/Coldest_Tmin_LARC_ssp370.csv', 'updated_extremes/HI_days_100F_LARC_ssp126.csv', 'updated_extremes/HI_days_100F_LARC_ssp245.csv']
<class 'pandas.core.frame.DataFrame'>
Index: 3330 entries, 90 to 3959
Data columns (total 30 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   variable       3330 non-null   object 
 1   scenario       3330 non-null   object 
 2   term           3330 non-null   object 
 3   years          3330 non-null   float64
 4   ACCESS-CM2     3330 non-null   float64
 5   ACCESS-ESM1-5  3330 non-null   float64
 6   BCC-CSM2-MR    3330 non-null   float64
 7   CESM2          1620 non-null   float64
 8   CMCC-ESM2      3330 non-null   float64
 9   CNRM-CM6-1     3330 non-null   float64
 10  CNRM-ESM2-1    3330 non-null   float64
 11  EC-Earth3      3330 non-null   float64
 12  FGOALS-g3      3
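
The row filter in the cell above combines a regex match on the variable name with a scenario check. A minimal sketch on made-up rows (the `toy` frame is illustrative, not notebook data) shows which combinations are dropped. Note that `str.contains` is case-sensitive by default, which is why `'tmin'` alone does not cover `Coldest_Tmin`.

```python
import pandas as pd

# Illustrative rows only
toy = pd.DataFrame({
    'variable': ['Max_DTR_c', 'tmin_frostdays_0C_d', 'Hottest_Tmax_c', 'Coldest_Tmin_c'],
    'scenario': ['ssp126', 'ssp245', 'ssp126', 'ssp370'],
})

errors = ['Max_DTR', 'tmin', 'Coldest_Tmin']
# True where the variable name matches any affected pattern
bad_var = toy['variable'].str.contains('|'.join(errors), regex=True, na=False)
# True where the scenario is one of the two affected SSPs
bad_ssp = toy['scenario'].isin(['ssp126', 'ssp370'])

# Drop a row only when BOTH conditions hold
kept = toy[~(bad_var & bad_ssp)]
print(kept)
```

Affected variables survive under ssp245, and unaffected variables survive under every scenario.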

# Calculations

In [6]:
# Calculate average for each term
cols = ['variable', 'scenario', 'term']
# term_mme = df.groupby(cols)[mme].mean().reset_index(
# ).sort_values(cols, ascending=[1, 1, 0], ignore_index=True)

# Calculate term-wise statistics
# For variables ending with '_d', use median
# For variables ending with '_c', use mean
term_mme = pd.concat([df[df.variable.str.endswith('_d')].groupby(cols)[mme].median(),
                      df[df.variable.str.endswith('_c')].groupby(cols)[mme].mean()
                     ]).reset_index().sort_values(cols, ascending=[1, 1, 0],
                                                  ignore_index=True)
display(term_mme.head())

Unnamed: 0,variable,scenario,term,mme-mean,mme-median,mme-pct25,mme-pct75
0,Coldest_Tmin_c,ssp245,short,-11.268358,-10.740965,-13.503494,-8.679038
1,Coldest_Tmin_c,ssp245,mid,-10.260196,-9.92698,-12.113423,-7.821688
2,Coldest_Tmin_c,ssp245,long,-8.900938,-8.415058,-10.516808,-6.578249
3,HI_days_100F_d,ssp126,short,13.25,11.5,5.25,19.0
4,HI_days_100F_d,ssp126,mid,16.159091,14.5,7.625,22.75
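
The split aggregation above (median for day-count variables ending in `_d`, mean for temperature variables ending in `_c`) can be seen on a small made-up frame. This is only a sketch: the variable names `x_d` and `y_c` and all values are hypothetical.

```python
import pandas as pd

# Illustrative data: one day-count variable and one temperature variable
toy = pd.DataFrame({
    'variable': ['x_d'] * 3 + ['y_c'] * 3,
    'term': ['short'] * 6,
    'mme-mean': [1.0, 2.0, 9.0, 10.0, 20.0, 60.0],
})

cols = ['variable', 'term']
is_days = toy['variable'].str.endswith('_d')

summary = pd.concat([
    toy[is_days].groupby(cols)[['mme-mean']].median(),   # day counts: median
    toy[~is_days].groupby(cols)[['mme-mean']].mean(),    # temperatures: mean
]).reset_index()
print(summary)
```

For `x_d` the median (2.0) is not pulled up by the outlying 9.0, whereas the mean of `y_c` (30.0) reflects all three values.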


In [7]:
# Recalculate mme columns
df['calc_mean'] = df[models].mean(axis=1)
df['calc_median'] = df[models].median(axis=1)
df['calc_pct25'] = df[models].quantile(0.25, axis=1)
df['calc_pct75'] = df[models].quantile(0.75, axis=1)
calc = ['calc_mean', 'calc_median', 'calc_pct25', 'calc_pct75']
# term_calc = df.groupby(cols)[calc].mean(
# ).reset_index().sort_values(cols, ascending=[1, 1, 0], ignore_index=True)

# Calculate term-wise statistics
# For variables ending with '_d', use median
# For variables ending with '_c', use mean
term_calc = pd.concat([df[df.variable.str.endswith('_d')].groupby(cols)[calc].median(),
                      df[df.variable.str.endswith('_c')].groupby(cols)[calc].mean()
                     ]).reset_index().sort_values(cols, ascending=[1, 1, 0],
                                                  ignore_index=True)
# Rename columns to match term_mme
term_calc.columns = term_mme.columns

display(term_calc.head())

Unnamed: 0,variable,scenario,term,mme-mean,mme-median,mme-pct25,mme-pct75
0,Coldest_Tmin_c,ssp245,short,-11.268358,-10.740965,-13.503494,-8.679038
1,Coldest_Tmin_c,ssp245,mid,-10.260196,-9.92698,-12.113423,-7.821688
2,Coldest_Tmin_c,ssp245,long,-8.900938,-8.415058,-10.516808,-6.578249
3,HI_days_100F_d,ssp126,short,13.25,11.5,5.25,19.0
4,HI_days_100F_d,ssp126,mid,16.159091,14.5,7.625,22.75


In [8]:
# Check if original mme and recalculated values are equal
display((term_mme == term_calc).sum())
cc = term_mme.compare(term_calc, result_names=('term_mme', 'term_calc'))
display(cc)

variable      111
scenario      111
term          111
mme-mean      104
mme-median    111
mme-pct25     111
mme-pct75     111
dtype: int64

Unnamed: 0,mme-mean (term_mme),mme-mean (term_calc)
33,11.681818,11.681818
35,12.022727,12.022727
39,11.545455,11.545455
40,12.181818,12.181818
97,9.785714,9.785714
98,9.309524,9.309524
109,94.214286,94.214286
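
In the comparison above, the seven `mme-mean` rows that fail the `==` test show identical values at display precision. That suggests, though the output alone does not prove it, that the differences come from floating-point rounding rather than from a real disagreement. A tolerance-based comparison is a common alternative; the sketch below uses `numpy.isclose` with its default tolerances on made-up numbers.

```python
import numpy as np

# Illustrative values: the first pair differs only by rounding error
a = np.array([0.1 + 0.2, 11.681818])
b = np.array([0.3, 11.681818])

exact = a == b            # strict equality
close = np.isclose(a, b)  # defaults: rtol=1e-05, atol=1e-08

print(exact)
print(close)
```

Applied to the notebook, something like `np.isclose(term_mme[mme], term_calc[mme]).all()` would be one way to run the same check; whether the default tolerances are appropriate is an assumption to review.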


## Calculate Change Per (variable, scenario)
Each change is the later term minus the earlier term:
- short-mid: mid minus short
- short-long: long minus short
- mid-long: long minus mid

In [9]:
query = """
SELECT
    a.variable,
    a.scenario,
    CASE
        WHEN a.term = 'short' AND b.term = 'mid' THEN 'short-mid'
        WHEN a.term = 'short' AND b.term = 'long' THEN 'short-long'
        WHEN a.term = 'mid' AND b.term = 'long' THEN 'mid-long'
    END AS term_diff,
    b.'mme-mean' - a.'mme-mean' AS 'mme-mean',
    b.'mme-median' - a.'mme-median' AS 'mme-median',
    b.'mme-pct25' - a.'mme-pct25' AS 'mme-pct25',
    b.'mme-pct75' - a.'mme-pct75' AS 'mme-pct75'
FROM term_mme a
JOIN term_mme b
    ON a.variable = b.variable
    AND a.scenario = b.scenario
    AND (
        (a.term = 'short' AND b.term = 'mid') OR
        (a.term = 'short' AND b.term = 'long') OR
        (a.term = 'mid' AND b.term = 'long')
    )
ORDER BY 1, 2, 3 DESC
"""

change = psql.sqldf(query, locals())

display(change.head())

Unnamed: 0,variable,scenario,term_diff,mme-mean,mme-median,mme-pct25,mme-pct75
0,Coldest_Tmin_c,ssp245,short-mid,1.008162,0.813985,1.390072,0.857351
1,Coldest_Tmin_c,ssp245,short-long,2.36742,2.325906,2.986686,2.100789
2,Coldest_Tmin_c,ssp245,mid-long,1.359258,1.511921,1.596615,1.243439
3,HI_days_100F_d,ssp126,short-mid,2.909091,3.0,2.375,3.75
4,HI_days_100F_d,ssp126,short-long,2.863636,3.25,2.25,3.5
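
The SQL self-join above can also be expressed directly in pandas. The following is a sketch of an equivalent approach, not the notebook's method: the `toy` frame, its values, and the `order` mapping are assumptions for illustration, and only one statistic column is shown.

```python
import pandas as pd

# Illustrative term averages for a single (variable, scenario) pair
toy = pd.DataFrame({
    'variable': ['v'] * 3,
    'scenario': ['ssp245'] * 3,
    'term': ['short', 'mid', 'long'],
    'mme-mean': [1.0, 2.5, 4.0],
})

# Hypothetical ordering used to keep only (earlier, later) pairs
order = {'short': 0, 'mid': 1, 'long': 2}

# Self-join on variable and scenario, then keep pairs where a precedes b
pairs = toy.merge(toy, on=['variable', 'scenario'], suffixes=('_a', '_b'))
pairs = pairs[pairs['term_a'].map(order) < pairs['term_b'].map(order)].copy()

# Label the pair and compute later minus earlier
pairs['term_diff'] = pairs['term_a'] + '-' + pairs['term_b']
pairs['mme-mean'] = pairs['mme-mean_b'] - pairs['mme-mean_a']

result = pairs[['variable', 'scenario', 'term_diff', 'mme-mean']].reset_index(drop=True)
print(result)
```

This avoids the `pandasql` dependency, at the cost of spelling out the term ordering explicitly.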


# Results

In [10]:
# Create the filter dropdown
filter_dropdown = widgets.SelectMultiple(
    options=['All'] + term_mme.variable.unique().tolist(), 
    value=('All',), 
    description='Filter:'
)

# Create the sort dropdown
sort_dropdown = widgets.Dropdown(
    options=['Default'] + mme,
    value='Default',
    description='Sort by:'
)

def update_display(filter_selection, sort_by):
    def process_dataframe(df):
        if 'All' not in filter_selection:
            df = df[df.variable.isin(filter_selection)]
        
        if sort_by != 'Default':
            df = df.sort_values(by=sort_by)
        
        display(df)
        col = [c for c in df.columns if 'term' in c][0]
        print('Average of Table Above')
        display(df.groupby(col)[mme].mean().reset_index()
            .sort_values(col, ascending=False))
        
    
    print('Short/Mid/Long Term Averages')
    process_dataframe(term_mme)
    print('Short/Mid/Long Term Changes')
    process_dataframe(change)


# Create the interactive widget
interactive_widget = widgets.interactive(
    update_display, 
    filter_selection=filter_dropdown,
    sort_by=sort_dropdown
)

# Display the interactive widget
display(interactive_widget)


## Variables

In [11]:
print('variables:')
display({i: variables[i] for i in range(len(variables))})
print('\nssp scenario:')
display({i: ssp[i] for i in range(len(ssp))})

variables:


{0: 'Coldest_Tmin',
 1: 'HI_days_100F',
 2: 'Hottest_Tmax',
 3: 'Max_DTR',
 4: 'prec_days_90th',
 5: 'prec_days_95th',
 6: 'prec_days_99th',
 7: 'prec_days_dry',
 8: 'prec_days_oneinch',
 9: 'tmax_days_35C',
 10: 'tmax_days_90th',
 11: 'tmax_days_95th',
 12: 'tmax_days_99th',
 13: 'tmin_frostdays_0C',
 14: 'tmin_tropnights_20C'}


ssp scenario:


{0: 'ssp126', 1: 'ssp245', 2: 'ssp370', 3: 'ssp585'}

## Short/Mid/Long Term Averages/Median

In [12]:
# Change the number/index in square brackets
# to select a variable/ssp name according to the output above
var = variables[2]
s = ssp[0]

print(f'Results for {var} and maybe {s}')
print('Original Table')
temp = term_mme[term_mme.variable.str.contains(var)]#[term_mme.scenario==s]
display(temp)

print('Overall Statistics')
display(temp.describe())

# Group by 'term' and apply aggregation
print('Short/Mid/Long Term Statistics')
display(temp[['term'] + mme].groupby('term')
        .agg({col: ['min', 'median', 'max'] for col in mme}))

Results for Hottest_Tmax and maybe ssp126
Original Table


Unnamed: 0,variable,scenario,term,mme-mean,mme-median,mme-pct25,mme-pct75
12,Hottest_Tmax_c,ssp126,short,39.446548,39.302368,37.945853,40.872249
13,Hottest_Tmax_c,ssp126,mid,39.947213,39.703359,38.368416,41.43042
14,Hottest_Tmax_c,ssp126,long,39.878945,39.65352,38.303831,41.313862
15,Hottest_Tmax_c,ssp245,short,39.078562,38.982095,37.72787,40.399634
16,Hottest_Tmax_c,ssp245,mid,39.55323,39.477477,38.136423,40.921464
17,Hottest_Tmax_c,ssp245,long,40.11275,40.004212,38.667057,41.446354
18,Hottest_Tmax_c,ssp370,short,39.775092,39.792292,38.341625,41.182024
19,Hottest_Tmax_c,ssp370,mid,40.45273,40.523494,38.920641,41.810544
20,Hottest_Tmax_c,ssp370,long,41.974741,41.83387,40.486604,43.474324


Overall Statistics


Unnamed: 0,mme-mean,mme-median,mme-pct25,mme-pct75
count,9.0,9.0,9.0,9.0
mean,40.024423,39.919188,38.544258,41.427875
std,0.831782,0.838556,0.810104,0.868233
min,39.078562,38.982095,37.72787,40.399634
25%,39.55323,39.477477,38.136423,40.921464
50%,39.878945,39.703359,38.341625,41.313862
75%,40.11275,40.004212,38.667057,41.446354
max,41.974741,41.83387,40.486604,43.474324


Short/Mid/Long Term Statistics


term,mme-mean min,mme-mean median,mme-mean max,mme-median min,mme-median median,mme-median max,mme-pct25 min,mme-pct25 median,mme-pct25 max,mme-pct75 min,mme-pct75 median,mme-pct75 max
long,39.878945,40.11275,41.974741,39.65352,40.004212,41.83387,38.303831,38.667057,40.486604,41.313862,41.446354,43.474324
mid,39.55323,39.947213,40.45273,39.477477,39.703359,40.523494,38.136423,38.368416,38.920641,40.921464,41.43042,41.810544
short,39.078562,39.446548,39.775092,38.982095,39.302368,39.792292,37.72787,37.945853,38.341625,40.399634,40.872249,41.182024


## Short/Mid/Long Term Averages/Median Changes

In [13]:
# Change the number/index in square brackets
# to select a variable/ssp name according to the output above
var = variables[0]
s = ssp[0]

print(f'Results for {var} and maybe {s}')
print('Change Table')
temp = change[change.variable.str.contains(var)]#[change.scenario==s]
display(temp)

print('Overall Statistics')
display(temp.describe())

# Group by 'term' and apply aggregation
print('Short/Mid/Long Term Statistics')
display(temp[['term_diff'] + mme].groupby('term_diff')
        .agg({col: ['min', 'median', 'max'] for col in mme}))

Results for Coldest_Tmin and maybe ssp126
Change Table


Unnamed: 0,variable,scenario,term_diff,mme-mean,mme-median,mme-pct25,mme-pct75
0,Coldest_Tmin_c,ssp245,short-mid,1.008162,0.813985,1.390072,0.857351
1,Coldest_Tmin_c,ssp245,short-long,2.36742,2.325906,2.986686,2.100789
2,Coldest_Tmin_c,ssp245,mid-long,1.359258,1.511921,1.596615,1.243439


Overall Statistics


Unnamed: 0,mme-mean,mme-median,mme-pct25,mme-pct75
count,3.0,3.0,3.0,3.0
mean,1.57828,1.550604,1.991124,1.400526
std,0.705602,0.756703,0.868345,0.636429
min,1.008162,0.813985,1.390072,0.857351
25%,1.18371,1.162953,1.493343,1.050395
50%,1.359258,1.511921,1.596615,1.243439
75%,1.863339,1.918914,2.29165,1.672114
max,2.36742,2.325906,2.986686,2.100789


Short/Mid/Long Term Statistics


term_diff,mme-mean min,mme-mean median,mme-mean max,mme-median min,mme-median median,mme-median max,mme-pct25 min,mme-pct25 median,mme-pct25 max,mme-pct75 min,mme-pct75 median,mme-pct75 max
mid-long,1.359258,1.359258,1.359258,1.511921,1.511921,1.511921,1.596615,1.596615,1.596615,1.243439,1.243439,1.243439
short-long,2.36742,2.36742,2.36742,2.325906,2.325906,2.325906,2.986686,2.986686,2.986686,2.100789,2.100789,2.100789
short-mid,1.008162,1.008162,1.008162,0.813985,0.813985,0.813985,1.390072,1.390072,1.390072,0.857351,0.857351,0.857351


# Quality check Plots - uncomment to run

In [14]:
# unit_filter = ['_c', '_d'][1]

# dropdown = widgets.SelectMultiple(
#     options=['All'] +(term_mme.variable[term_mme.variable.str.endswith(unit_filter)]
#                       .unique().tolist()),
#     value=('All',), description='Filter:'
# )

# def update_plot(selection):
#     filter_condition = (term_mme.variable.str.endswith(unit_filter) 
#                         if 'All' in selection 
#                         else term_mme.variable.isin(selection))
#     term_mme_f, change_f = term_mme[filter_condition], change[filter_condition]
    
#     unit_display = '# days' if unit_filter == '_d' else 'degree C'
    
#     for data, title in [(term_mme_f, 'Term MME'), (change_f, 'Change')]:
#         pp = sns.pairplot(data=data, vars=mme, hue='variable', 
#                           corner=True, plot_kws={'alpha': 0.5})
#         pp.fig.suptitle(f'{title}: unit({unit_display})')
    
#     plt.show()

# widgets.interactive(update_plot, selection=dropdown)


## Time Series: check for sudden spikes/drops

In [15]:
# temp = df[df.variable.str.contains(temp_d[0]) & (df.scenario == ssp[0])]

# for m in models:
#     sns.lineplot(temp, x='years', y=m, label=m, linewidth=0.1)
# plt.show()

# for m in mme:
#     sns.lineplot(temp, x='years', y=m, label=m, linewidth=0.2)
# plt.show()