<font size="8"> **Identifying areas with long-term pack ice presence** </font>  
Long-term presence of pack ice has been found to have a high correlation with crabeater seal (*Lobodon carcinophagus*) distribution.  
  
We will calculate the proportion of time that a grid cell has a sea ice concentration (SIC) of 85\% or more. This is similar to the definition given by [Oosthuizen et al. 2021](https://doi.org/10.3354/meps13787), but we use a different reference period: the 7 years prior to each observation, rather than a fixed time frame (Jan 2003 to Dec 2010) as they did.
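
Written as a formula, and assuming the 7-year window covers the $N = 84$ monthly fields immediately before the month of observation $t$ (the symbols $P$, $x$, $t$, $k$ and $N$ are introduced here for illustration only), the proportion for a grid cell $x$ would be:

$$
P(x, t) = \frac{1}{N} \sum_{k=1}^{N} \mathbf{1}\left[\mathrm{SIC}(x, t-k) \geq 0.85\right], \qquad N = 7 \times 12 = 84
$$

Here $\mathbf{1}[\cdot]$ equals 1 when the condition holds and 0 otherwise, so $P$ ranges from 0 (pack ice in no month of the window) to 1 (pack ice in every month of the window).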

We will use monthly sea ice concentration (`aice_m`) outputs to identify these areas with long-term pack ice presence.

# Setting working directory
To ensure these notebooks work correctly, we will set the working directory. We assume that you have saved a copy of this repository in your home directory (represented by `~` in the code chunk below). If you have saved this repository elsewhere on your machine, update this line with the filepath where you saved these notebooks.

In [1]:
import os
os.chdir(os.path.expanduser('~/Chapter2_Crabeaters/Scripts'))
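
If the repository lives somewhere else, checking the path before changing directory can give a clearer error than `os.chdir` alone. The sketch below is only illustrative: the helper name `resolve_scripts_dir` is hypothetical and is not part of this repository.

```python
import os

def resolve_scripts_dir(path='~/Chapter2_Crabeaters/Scripts'):
    """Expand `~` and return an absolute path, failing early if the folder is missing."""
    full_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(full_path):
        raise FileNotFoundError(
            f'Update the path to the notebooks: {full_path} does not exist')
    return full_path

# Usage (uncomment once the path matches your machine):
# os.chdir(resolve_scripts_dir())
```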

# Loading modules

In [2]:
#Accessing model data
import cosima_cookbook as cc
#Dealing with data
import xarray as xr
import numpy as np
import pandas as pd
#Data visualisation
import matplotlib.pyplot as plt
#Collection of useful functions developed for this project
import UsefulFunctions as uf
#Parallelising work
from dask.distributed import Client

# Defining dictionary of useful variables
In this dictionary we will define variables that are used multiple times throughout this notebook, to avoid repetition. It mostly contains paths to folders where intermediate or final outputs will be stored.

In [3]:
varDict = {'var_mod': 'aice_m',
           'model': 'ACCESS-OM2-01',
           'exp': '01deg_jra55v140_iaf_cycle4',
           'exp_ext': '01deg_jra55v140_iaf_cycle4_jra55v150_extension',
           'freq': '1 monthly',
           'base_folder': '/g/data/v45/la6889/Chapter2_Crabeaters/SeaIce/LongTerm_PackIce/'}

# Creating a session in the COSIMA cookbook

In [4]:
session = cc.database.create_session()

# Accessing ACCESS-OM2-01 data
First, we will start a cluster with multiple cores to make analysis faster. Remember the number of CPUs cannot exceed the CPUs you have access to.

In [5]:
client = Client()
client

Client output (summary): `distributed.LocalCluster` with 4 workers, 16 threads in total (4 per worker) and 64.00 GiB of memory in total (16.00 GiB per worker). Dashboard: /proxy/35217/status

The fourth run of the ACCESS-OM2-01 model has sea ice concentration (`aice_m`) outputs available from 1958 to 2022. However, these outputs are available through two different experiments: `01deg_jra55v140_iaf_cycle4` and `01deg_jra55v140_iaf_cycle4_jra55v150_extension`. Below, we are accessing the sea ice data for these experiments and merging into a single dataset to calculate the long-term presence of pack ice.

In [6]:
#Loading data from fourth cycle (available 1958 to 2018) - We load from 1971 to cover the 7 years before our study period begins (1978)
var_ice = uf.getACCESSdata_SO(varDict['var_mod'], '1971-01', '2019-01', 
                              freq = varDict['freq'], ses = session, minlat = -80,
                              exp = varDict['exp'], ice_data = True)

#Loading data from fourth cycle extension (2019 to 2022)
var_ice_ext = uf.getACCESSdata_SO(varDict['var_mod'], '2019-01', '2023-01', 
                              freq = varDict['freq'], ses = session, minlat = -80,
                              exp = varDict['exp_ext'], ice_data = True)

## Creating a single dataset for our study period
We need to merge both datasets so we can calculate the long-term presence of pack ice for our entire study period (1978 to 2022).

In [7]:
#Concatenating both data arrays into one
var_ice = xr.concat([var_ice, var_ice_ext], dim = 'time')
var_ice = uf.corrlong(var_ice)

#Removing duplicate variable
del var_ice_ext
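
When two experiments are joined end to end, it can be worth confirming that no month appears twice along the time axis. The sketch below uses small synthetic arrays (values and dates are made up) that deliberately share one timestamp; it is not a statement about the actual model output.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two synthetic monthly series sharing one timestamp (2019-01), standing in
# for the two experiments
t1 = pd.date_range('2018-11-01', periods=3, freq='MS')
t2 = pd.date_range('2019-01-01', periods=3, freq='MS')
da1 = xr.DataArray(np.array([0.1, 0.2, 0.3]), coords={'time': t1}, dims='time')
da2 = xr.DataArray(np.array([0.9, 0.8, 0.7]), coords={'time': t2}, dims='time')

merged = xr.concat([da1, da2], dim='time')

# Flag repeated timestamps and keep only the first occurrence of each
is_duplicate = merged.indexes['time'].duplicated()
n_duplicates = int(is_duplicate.sum())
merged_unique = merged.isel(time=np.flatnonzero(~is_duplicate))
```

Running the same check on `var_ice` (via `var_ice.indexes['time']`) would show whether the merged dataset needs this step at all.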

# Long-term pack ice presence calculation
This calculation will require the following steps:
1. Identify grid cells where sea ice concentration (SIC) was 85\% or higher: We will assign a value of `1` to any grid cells that meet our condition, otherwise a value of `0` will be assigned.
2. For each timestep (month) within our period of interest (1978 to 2022), calculate the proportion of time a grid cell meets our SIC condition: We add all timesteps within a 7-year period and divide by the total number of months in 7 years.
3. Create a new data array with proportion calculations.
4. Save results to local disk: Yearly files are saved due to limitations with saving very large files.
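
Steps 1 and 2 can be sketched on a tiny synthetic array (three months, four grid cells, with NaN standing in for land). The values are made up for illustration. The sketch also shows a behaviour of xarray that is worth keeping in mind: `sum` skips NaN by default, so an all-NaN cell sums to 0 unless `min_count` is set.

```python
import numpy as np
import xarray as xr

# Synthetic SIC: 3 months x 4 grid cells, NaN marks land
sic = xr.DataArray(
    np.array([[0.90, 0.50, np.nan, 0.85],
              [0.95, 0.86, np.nan, 0.10],
              [0.20, 0.99, np.nan, 0.84]]),
    dims=('time', 'x'))

# Step 1: 1 where SIC >= 0.85, 0 elsewhere, NaN kept over land
flag = xr.where(sic >= 0.85, 1, 0).where(sic.notnull())

# Step 2: proportion of months meeting the condition
n_months = sic.sizes['time']
proportion = flag.sum('time') / n_months

# Variant that keeps land as NaN by requiring at least one valid value
proportion_masked = flag.sum('time', min_count=1) / n_months
```

In `proportion` the land cell has a value of 0, while in `proportion_masked` it stays NaN; which one is preferable depends on how land is handled downstream.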

In [8]:
#Assigning a value of 1 when SIC condition is met
pack_ice = xr.where(var_ice >= 0.85, 1, 0).where(~np.isnan(var_ice))
#Checking results
pack_ice

| | Array | Chunk |
|---|---|---|
| Bytes | 11.95 GiB | 759.38 kiB |
| Shape | (625, 713, 3600) | (1, 270, 360) |
| Dask graph | 20625 chunks in 1262 graph layers | |
| Data type | float64 numpy.ndarray | |


## Subsetting data every 7 years

In [9]:
#Defining months in 7 years
months_in_7_yrs = 7*12
#Creating a list of timesteps within our study period
times_interest = pd.period_range('1978-01', '2022-12', freq = 'M')
#Identifying the date when the 7 year period begins
times_begin = [(t-pd.offsets.MonthEnd(months_in_7_yrs)).to_timestamp() for t in times_interest]
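
How many months fall inside each window depends on where the model timestamps sit within the month, because `slice` in `.sel` includes both end points. The sketch below illustrates this with hypothetical time axes (mid-month and start-of-month); the actual timestamps returned by `uf.getACCESSdata_SO` should be checked rather than assumed.

```python
import pandas as pd

months_in_7_yrs = 7 * 12
t = pd.Period('1978-01', freq='M')

# Subtracting an integer from a monthly Period steps back that many months
start = (t - months_in_7_yrs).to_timestamp()
end = t.to_timestamp()

# Hypothetical time axes covering 1970 to 1979
month_start_times = pd.date_range('1970-01-01', '1979-12-01', freq='MS')
mid_month_times = month_start_times + pd.Timedelta(days=14)

def months_in_window(times):
    """Count timestamps between start and end, both inclusive (as in a label slice)."""
    return int(((times >= start) & (times <= end)).sum())

n_mid = months_in_window(mid_month_times)
n_start = months_in_window(month_start_times)
```

With mid-month timestamps the window holds 84 months (1971-01 to 1977-12), which matches the divisor. With start-of-month timestamps it would hold 85, and proportions could exceed 1.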

In [10]:
#Creating empty list to save results
long_term_pack_ice = []

#Loop through each timestep of our interest
for i, t in enumerate(times_interest):
    #Select 7-year periods and calculate proportion of time a grid cell was covered by at least 85% SIC
    da = pack_ice.sel(time = slice(times_begin[i], t.to_timestamp())).sum('time')/months_in_7_yrs
    #Assign a date to each timestep - Here we assign the end date of the 7 year period
    da['time'] = t.to_timestamp()
    #Add results to list
    long_term_pack_ice.append(da)
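
As an alternative to the explicit loop, a rolling sum could produce the same windows in one expression. This is a different technique from the one used in this notebook, shown on synthetic data for a single grid cell, and it assumes each window covers the 84 months before the month of interest.

```python
import numpy as np
import xarray as xr

window = 7 * 12
rng = np.random.default_rng(0)
# Synthetic 0/1 pack ice flags for a single grid cell over 10 years
flag = xr.DataArray(rng.integers(0, 2, size=120).astype(float), dims='time')

# Rolling sum over 84 months ending at each step, shifted by one month so the
# window covers only the months before the month of interest
rolled = flag.rolling(time=window).sum().shift(time=1) / window

# Explicit loop over the same windows, for comparison
looped = np.array([flag.values[i - window:i].sum() / window
                   for i in range(window, flag.sizes['time'])])
```

The first 84 time steps of `rolled` are NaN because they do not have a full window behind them, which mirrors dropping the initial seven years.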

In [11]:
#Concatenate results into a single file
long_term_pack_ice = xr.concat(long_term_pack_ice, dim = 'time')
#Checking results - Note there are fewer time steps than in the original data, as we do not need the initial seven years
long_term_pack_ice

| | Array | Chunk |
|---|---|---|
| Bytes | 10.33 GiB | 759.38 kiB |
| Shape | (540, 713, 3600) | (1, 270, 360) |
| Dask graph | 17820 chunks in 6664 graph layers | |
| Data type | float64 numpy.ndarray | |



# Saving outputs to local machine
Data are saved as yearly files due to limitations in storing a single large file.

In [12]:
#Ensuring output directory exists
os.makedirs(varDict['base_folder'], exist_ok = True)

In [13]:
#Grouping data by year
for yr, da in long_term_pack_ice.groupby('time.year'):
    #Creating name for yearly output file
    file_out = os.path.join(varDict['base_folder'], f'LongTerm_PackIce_Monthly_Jan-Dec_{yr}.nc')
    #Saving yearly output file
    da.to_netcdf(file_out)
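
After saving, one way to confirm that every year was written is to rebuild the expected file names and look for gaps. The helpers `expected_files` and `missing_files` below are hypothetical, and the default years assume the 1978 to 2022 study period.

```python
import os

def expected_files(base_folder, first_year=1978, last_year=2022):
    """File names following the pattern used when saving, one per year."""
    return [os.path.join(base_folder, f'LongTerm_PackIce_Monthly_Jan-Dec_{yr}.nc')
            for yr in range(first_year, last_year + 1)]

def missing_files(base_folder, first_year=1978, last_year=2022):
    """Expected yearly files that are not present on disk."""
    return [f for f in expected_files(base_folder, first_year, last_year)
            if not os.path.isfile(f)]

# Usage (not run here): missing_files(varDict['base_folder'])
```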