# mcdc_analysis_d22a

## Purpose
Analyse data produced by [data_d22a.ipynb](https://github.com/grandey/d22a-mcdc/blob/main/data_d22a.ipynb) using Monte Carlo Drift Correction (MCDC), and produce figures and tables.

## Input data requirements
NetCDF files in [data/](https://github.com/grandey/d22a-mcdc/tree/main/data/) (produced by [data_d22a.ipynb](https://github.com/grandey/d22a-mcdc/blob/main/data_d22a.ipynb)), each containing a global mean time series for a given variable, AOGCM variant, and CMIP6 experiment.

## Output files written
Figures (TODO) and tables (TODO).

## History
BSG, 2022.

In [1]:
! date

Mon Aug 15 17:15:56 +08 2022


In [2]:
from functools import cache
import itertools
import math
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pathlib
from scipy import stats
import statsmodels.api as sm
import xarray as xr

In [3]:
# Matplotlib settings
%matplotlib inline
plt.rcParams['savefig.dpi'] = 300

In [4]:
# Package versions
for p in [xr, np, pd, sm]:
    print(f'{p.__name__}: {p.__version__}')

xarray: 2022.6.0
numpy: 1.23.1
pandas: 1.4.3
statsmodels.api: 0.13.2


In [5]:
# Random number generator
rng = np.random.default_rng(12345)
rng

Generator(PCG64) at 0x17089B220

## Identify AOGCM variants (source-member pairs)
Note: the AOGCM variants identified should match those identified by data_d22a.ipynb.

In [6]:
# Location of data produced by data_d22a.ipynb
in_base = pathlib.Path.cwd() / 'data' / 'regrid_missto0_yearmean_fldmean_mergetime'

# Core variables required
core_var_list = ['rsdt', 'rsut', 'rlut',  # R = rsdt-rsut-rlut
                 'hfds',  # H (without flux correction)
                 'zostoga']  # Z

# Experiments required (with corresponding names, used for figs later)
exp_dict = {'piControl': 'Control', 'historical': 'Historical',
            'ssp126': 'SSP1-2.6', 'ssp245': 'SSP2-4.5',
            'ssp370': 'SSP3-7.0', 'ssp585': 'SSP5-8.5'}

# Identify source-member pairs to use
source_member_list = sorted([d.name for d in in_base.glob('rsdt/[!.]*_*')])  # this list will be reduced
for source_member in source_member_list.copy():  # loop over copy of source-member pairs to check data availability
    for var in core_var_list:  # loop over required variables
        for exp in exp_dict.keys():  # loop over experiments
            #in_fns = sorted(in_base.glob(f'{var}/{source_member}/{var}_{source_member}_{exp}.mergetime.nc'))
            in_fn = in_base.joinpath(f'{var}/{source_member}/{var}_{source_member}_{exp}.mergetime.nc')
            if not in_fn.is_file():  # if input file for this experiment does not exist...
                try:
                    source_member_list.remove(source_member)  # ... do not use this source-member pair
                except ValueError:  # when source-member pair has previously been removed
                    pass

print(f'{len(source_member_list)} source-member pairs identified.')
source_member_list

20 source-member pairs identified.


['ACCESS-CM2_r1i1p1f1',
 'ACCESS-ESM1-5_r1i1p1f1',
 'CMCC-CM2-SR5_r1i1p1f1',
 'CMCC-ESM2_r1i1p1f1',
 'CNRM-CM6-1_r1i1p1f2',
 'CNRM-ESM2-1_r1i1p1f2',
 'CanESM5_r1i1p1f1',
 'EC-Earth3-Veg-LR_r1i1p1f1',
 'EC-Earth3-Veg_r1i1p1f1',
 'EC-Earth3_r1i1p1f1',
 'GISS-E2-1-G_r1i1p5f1',
 'GISS-E2-1-H_r1i1p1f2',
 'IPSL-CM6A-LR_r1i1p1f1',
 'MIROC6_r1i1p1f1',
 'MPI-ESM1-2-HR_r1i1p1f1',
 'MPI-ESM1-2-LR_r1i1p1f1',
 'MRI-ESM2-0_r1i1p1f1',
 'NorESM2-LM_r1i1p1f1',
 'NorESM2-MM_r1i1p1f1',
 'UKESM1-0-LL_r1i1p1f2']
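
The availability check above can also be expressed as a single predicate per source-member pair. The following is a minimal sketch, not the notebook's own method: `has_all_files` is a hypothetical helper, and the model names, variables, and directory tree are synthetic stand-ins created in a temporary directory so that the example runs without the real data.

```python
import pathlib
import tempfile


def has_all_files(in_base, source_member, var_list, exp_list):
    """Return True if every (var, exp) file exists for this source-member pair."""
    return all(
        (in_base / var / source_member / f'{var}_{source_member}_{exp}.mergetime.nc').is_file()
        for var in var_list
        for exp in exp_list
    )


# Build a small synthetic directory tree (model names are made up)
with tempfile.TemporaryDirectory() as tmp:
    demo_base = pathlib.Path(tmp)
    demo_vars = ['rsdt', 'zostoga']
    demo_exps = ['piControl', 'historical']
    for source_member in ['ModelA_r1i1p1f1', 'ModelB_r1i1p1f1']:
        for var in demo_vars:
            for exp in demo_exps:
                # ModelB deliberately lacks one file, so it should be excluded
                if (source_member, var, exp) == ('ModelB_r1i1p1f1', 'zostoga', 'historical'):
                    continue
                fn = demo_base / var / source_member / f'{var}_{source_member}_{exp}.mergetime.nc'
                fn.parent.mkdir(parents=True, exist_ok=True)
                fn.touch()
    # Candidates come from the rsdt directory, as in the cell above
    candidates = sorted(d.name for d in demo_base.glob('rsdt/[!.]*_*'))
    complete = [pair for pair in candidates
                if has_all_files(demo_base, pair, demo_vars, demo_exps)]

print(complete)
```

Filtering into a new list avoids removing items from a list while looping over its copy, and makes the `try`/`except ValueError` guard unnecessary.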

## Read input data

In [7]:
%%time
# Dictionary to hold input DataArrays
in_da_dict = {}  # keys will be tuples of (source_member, exp, var)

# List of input data variables 
in_var_list = ['rsdt', 'rsut', 'rlut',  # R = rsdt-rsut-rlut
               'hfds',  # H (without flux correction)
               'hfcorr',  # flux correction, available for very few source-member pairs
               'zostoga']  # Z

# Loop over source-member pairs, experiments, and variables
for source_member in source_member_list:
    for exp in exp_dict.keys():
        for var in in_var_list:
            # Read input data (if they exist)
            in_fn = in_base.joinpath(f'{var}/{source_member}/{var}_{source_member}_{exp}.mergetime.nc')
            try:
                in_ds = xr.open_dataset(in_fn)  # Dataset
                in_da = in_ds[var]  # DataArray
                # Remove degenerate lon and lat dimensions
                in_da = in_da.squeeze()
                # Convert time units to year
                in_da['time'] = (in_da['time'] // 1e4).astype(int)
                in_da['time'].attrs['units'] = 'a'
                # Convert zostoga units to mm
                if var == 'zostoga':
                    in_da.data = in_da.data * 1e3
                    in_da.attrs['units'] = 'mm'
                # Check: do data have non-zero values?
                if (in_da**2).sum() == 0:
                    print(f'Skipping {source_member} {exp} {var} (no non-zero values)')
                else:
                    # Save to dictionary
                    in_da_dict[(source_member, exp, var)] = in_da
            except FileNotFoundError:
                pass

print(f'in_da_dict contains {len(in_da_dict)} DataArrays')

in_da_dict contains 606 DataArrays
CPU times: user 1.51 s, sys: 48.8 ms, total: 1.56 s
Wall time: 1.67 s
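
Two details of the cell above may be easier to follow with a toy example: the year conversion and the `(source_member, exp, var)` keys. The sketch below uses plain Python lists in place of DataArrays. It assumes that the raw time values are encoded as YYYYMMDD with an optional fraction, which is consistent with the `// 1e4` step but is not confirmed by the notebook. `to_year` and `net_toa_flux` are hypothetical helpers, and all numbers are synthetic.

```python
def to_year(time_value):
    """Extract the year, assuming the value is encoded as YYYYMMDD(.fraction)."""
    return int(time_value // 1e4)


def net_toa_flux(da_dict, source_member, exp):
    """Combine components as R = rsdt - rsut - rlut, element by element."""
    rsdt = da_dict[(source_member, exp, 'rsdt')]
    rsut = da_dict[(source_member, exp, 'rsut')]
    rlut = da_dict[(source_member, exp, 'rlut')]
    return [down - up_sw - up_lw for down, up_sw, up_lw in zip(rsdt, rsut, rlut)]


# Synthetic stand-in values (W m-2), not real model output
demo_dict = {
    ('ModelA_r1i1p1f1', 'piControl', 'rsdt'): [340.0, 340.5],
    ('ModelA_r1i1p1f1', 'piControl', 'rsut'): [100.0, 100.25],
    ('ModelA_r1i1p1f1', 'piControl', 'rlut'): [239.5, 240.0],
}

demo_years = [to_year(t) for t in (18500701.5, 18510701.5)]
demo_r = net_toa_flux(demo_dict, 'ModelA_r1i1p1f1', 'piControl')
print(demo_years, demo_r)
```

Keying the dictionary by tuple keeps each lookup explicit about model variant, experiment, and variable, which matters because `hfcorr` exists for only a few source-member pairs.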


In [8]:
! date

Mon Aug 15 17:15:59 +08 2022
