# Limiting factors (non-border) analysis

_Dataset:_ Supplementary data for Megill and Grewe (2024): "Investigating the limiting aircraft design-dependent and environmental factors of persistent contrail formation".

_Authors:_

- Liam Megill (1, 2), https://orcid.org/0000-0002-4199-6962   
- Volker Grewe (1, 2), https://orcid.org/0000-0002-8012-6783  

_Affiliation (1)_: Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany

_Affiliation (2)_: Delft University of Technology (TU Delft), Faculty of Aerospace Engineering, Section Aircraft Noise and Climate Effects (ANCE), Delft, The Netherlands

_Corresponding author_: Liam Megill, liam.megill@dlr.de

_doi_: https://doi.org/10.5194/egusphere-2024-3398

---


### Summary
This notebook analyses the limiting factors of persistent contrail formation within each grid cell (not along the border). The goal is to understand how these factors depend on aircraft design, altitude, latitude and season.

### Inputs
- `data/aircraft_specs_v2.nc`: Aircraft specifications created with `02-lm-create_aircraft_specs.ipynb`.
- `data/processed/limfac/areas_grib.pickle`: Grid cell areas
- ERA5 GRIB data: If the study is not performed on DKRZ Levante, the ERA5 GRIB data must be saved locally and `dir_path` updated. We recommend placing the ERA5 files in `data/raw/`. Ensure that the file naming matches that of `t_file_path` and `r_file_path`.

### Outputs
- `data/processed/limfac/all/nonborder_limfac_r1M_{season_year}-{i_mon}_ERA5_GRIB_{cor_savename_ext}.nc`: Limiting factors results for season-year `season_year` (e.g. `2010DJF`), month index `i_mon` within that season and RHi enhancement `cor_savename_ext`.

---

### Copyright

Copyright © 2024 Liam Megill

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

To start, set the top-level directory path `project_dir`. Then, select the starting and ending seasons. Unlike the boundary limiting factors analyses, this analysis considers all aircraft designs simultaneously. In the linked paper, a `random_seed` of 42 and a `num_h_per_s` (number of hours selected per season) of 2160 were used. The RHi enhancement ("correction") can be selected by modifying the variables `rhi_cor` (multiplier) and `cor_savename_ext` (extension that gets added to the end of the save name).

In [None]:
# load modules
import numpy as np
import pandas as pd
import xarray as xr
import warnings
import cf2cdm
from collections import defaultdict
import datetime
import pickle
import time

# options
random_seed = 42  # seed for np.random
num_h_per_s = 2160  # total number of random hours per season (DJF, MAM, JJA, SON)
starting_season = "2013DJF"
ending_season = "2014SON"  # inclusive!

# define directories
project_dir = ""  # set top-level directory path
processed_data_dir = project_dir + "data/processed/limfac/"

# correction
rhi_cor = 0.98  # correction to RHi
cor_savename_ext = "cor-98p"  # this gets added to the end of the savename

The first step is to select the random hours within the 2010 decade that will be analysed.
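
The implementation of `select_random_hours` lives in `helper.py` and is not shown in this notebook. As a rough illustration only, the sketch below shows one way such a selection could work, assuming the hours are drawn without replacement from all hours of 2010 to 2019 in each season. The function name `sketch_select_random_hours`, the `SEASON_MONTHS` mapping and the use of `np.random.default_rng` are assumptions made for this example, not the helper's actual code.

```python
import numpy as np
import pandas as pd

# Assumed mapping from season label to calendar months (not taken from helper.py)
SEASON_MONTHS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5), "JJA": (6, 7, 8), "SON": (9, 10, 11)}


def sketch_select_random_hours(num_h_per_s, seed, start="2010-01-01", end="2019-12-31 23:00"):
    """Hypothetical stand-in for helper.select_random_hours.

    Draws num_h_per_s distinct hours per season from all hours between
    start and end (inclusive) and returns one sorted array per season.
    """
    rng = np.random.default_rng(seed)
    all_hours = pd.date_range(start, end, freq=pd.Timedelta(hours=1))
    selected = []
    for months in SEASON_MONTHS.values():
        # keep only the hours that fall in this season's months
        season_hours = all_hours[all_hours.month.isin(months)]
        # sample positions without replacement so that no hour is picked twice
        picks = rng.choice(len(season_hours), size=num_h_per_s, replace=False)
        selected.append(season_hours[np.sort(picks)].to_numpy())
    return selected


hours_by_season = sketch_select_random_hours(num_h_per_s=2160, seed=42)
print([len(h) for h in hours_by_season])
```

Fixing the seed makes the selection reproducible, which is why `random_seed` is also stored in the attributes of the output files.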

In [None]:
from helper import select_random_hours, get_season_year, generate_season_years

# select, sort and group dates by season-year
selected_hours = select_random_hours(num_h_per_s, seed=random_seed) 
all_selected_hours = np.concatenate(selected_hours)  # flatten across seasons
all_selected_hours.sort()  # sort by datetime
all_selected_hours = pd.to_datetime(all_selected_hours)  # convert to pandas for easier manipulation
season_year_to_dates = defaultdict(list)
for hour in all_selected_hours:  # group by season-year
    season_year = get_season_year(hour)
    season_year_to_dates[season_year].append(hour)

# create full list of season-years
all_season_years = generate_season_years(2010, 2019)
starting_season_idx = all_season_years.index(starting_season)
ending_season_idx = all_season_years.index(ending_season)
season_years = all_season_years[starting_season_idx:ending_season_idx+1]
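
The grouping above relies on season-year labels such as `2013DJF`. The helpers `get_season_year` and `generate_season_years` are not shown here, so the following is only a minimal sketch of how such labels could be built. It assumes that December is counted towards the DJF of the following year, which may differ from the convention used in `helper.py`. The names prefixed with `sketch_` are hypothetical.

```python
import datetime

# Assumed month-to-season mapping (illustrative, not taken from helper.py)
SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def sketch_get_season_year(dt):
    """Hypothetical stand-in for helper.get_season_year.

    Assumption: December belongs to the DJF season of the following year.
    """
    year = dt.year + 1 if dt.month == 12 else dt.year
    return f"{year}{SEASON_OF_MONTH[dt.month]}"


def sketch_generate_season_years(first_year, last_year):
    """Hypothetical stand-in for helper.generate_season_years (both years inclusive)."""
    return [f"{year}{season}"
            for year in range(first_year, last_year + 1)
            for season in ("DJF", "MAM", "JJA", "SON")]


labels = sketch_generate_season_years(2010, 2019)
# inclusive slice between the starting and ending season, as in the cell above
subset = labels[labels.index("2013DJF"):labels.index("2014SON") + 1]
print(subset)
print(sketch_get_season_year(datetime.datetime(2013, 12, 15)))
```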

The next step is to calculate the limiting factors. We start by loading `areas`, which are required by the main function. Next, we run through each day within each `season_year`, load the ERA5 data from file, select the relevant hours and then run the limiting factors function `calc_limfacs_nonborder`. We concatenate the results of each day into `dsg_mon`, normalise the results by the number of selected hours in that month and then save the resulting `dsg_sum` to file. To save memory, we save the results per month (rather than per season) and remove `dsg_mon` and `dsg_sum` after each loop.
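
The normalisation step can be illustrated with a small NumPy example. It assumes that each daily result holds, per grid cell, the number of selected hours on that day in which a given condition was met. This is suggested by the division by the number of selected hours in the code, but remains an assumption about `calc_limfacs_nonborder`. The grid, the counts and the variable names below are invented for illustration.

```python
import numpy as np

# Hypothetical daily results on a 2 x 3 grid: number of selected hours per
# grid cell in which some condition was met (values are made up)
daily_counts = [
    np.array([[2, 0, 1], [1, 1, 0]]),  # day 1, 2 selected hours
    np.array([[1, 1, 1], [0, 1, 0]]),  # day 2, 1 selected hour
    np.array([[3, 0, 2], [1, 2, 0]]),  # day 3, 3 selected hours
]
n_hours_per_day = [2, 1, 3]

# stack along a new leading "time" axis (analogous to xr.concat over "time")
stacked = np.stack(daily_counts, axis=0)

# sum over time and divide by the total number of selected hours in the month
n_hours = sum(n_hours_per_day)
monthly_fraction = stacked.sum(axis=0) / n_hours
print(monthly_fraction)
```

Dividing by the number of selected hours rather than the number of days keeps the result a fraction of hours between 0 and 1, even though the number of selected hours varies from day to day.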

In [None]:
from helper import calc_limfacs_nonborder

# load areas
with open(processed_data_dir+"areas_grib.pickle", "rb") as f:
    areas = pickle.load(f)

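ac_ids = ["AC8", "AC7", "AC3", "AC0", "AC1", "AC4"]
ac_full = xr.load_dataset(project_dir+"data/aircraft_specs_v2.nc")

# months of each season, indexed by i_mon in the loop below
# (assumed ordering; `seasons` is not defined elsewhere in this notebook)
seasons = {"DJF": [12, 1, 2], "MAM": [3, 4, 5], "JJA": [6, 7, 8], "SON": [9, 10, 11]}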

for season_year in season_years:
    start_time = time.time()
    if season_year != season_years[0]:
        time_info = f"Time of last season-year calculation: {d_time:.0f} s"
    else:
        time_info = ""
    print(f"Processing season-year: {season_year}. {time_info}")
    datetimes = season_year_to_dates[season_year]
    datetimes = pd.to_datetime(datetimes)
    unique_dates = pd.Series(datetimes).dt.date.unique()

    # calculate and save results per month to limit RAM use
    for i_mon in range(3):
        print(f"Month index {i_mon}")
        mon_num = seasons[season_year[-3:]][i_mon]
        date_mon_idxs = [d.month == mon_num for d in unique_dates]
        mon_unique_dates = unique_dates[date_mon_idxs]

        for idx, date in enumerate(mon_unique_dates):
            date_str = date.strftime("%Y-%m-%d")
            dir_path = "/pool/data/ERA5/E5/pl/an/1H/" 
            t_file_path = f"{dir_path}130/E5pl00_1H_{date_str}_130.grb"  # temperature file (130)
            r_file_path = f"{dir_path}157/E5pl00_1H_{date_str}_157.grb"  # relative humidity file (157)
            
            # load and merge dataset dsg
            dsg_t = xr.open_dataset(t_file_path, engine="cfgrib", backend_kwargs={"indexpath": None})
            dsg_r = xr.open_dataset(r_file_path, engine="cfgrib", backend_kwargs={"indexpath": None})
            dsg = xr.merge([dsg_t, dsg_r])
            with warnings.catch_warnings():  # ignoring UserWarning from cf2cdm when converting coordinate time -> time
                warnings.simplefilter('ignore')
                dsg = cf2cdm.translate_coords(dsg, cf2cdm.ECMWF)  # convert to ECMWF coordinates
            dsg = dsg.isel(level=[18, 19, 20, 21, 22, 23, 24])  # selecting only the levels that are interesting
            
            # extract relevant hours for the current date and filter dsg
            hours_for_date = [hour for hour in datetimes if hour.date() == date]
            relevant_times = pd.to_datetime(hours_for_date)
            dsg = dsg.sel(time=relevant_times)
            
            # calculate limiting factors and concatenate across the month
            res = calc_limfacs_nonborder(dsg, ac_full, ac_ids, rhi_cor)
            res = res.assign_coords(time=date)
            if idx == 0:
                dsg_mon = res
            else:
                dsg_mon = xr.concat([dsg_mon, res], dim="time")


        # sum over days, normalise by the number of selected hours and save dataset
        mon_len_datetimes = sum(datetimes.month == mon_num)
        dsg_sum = dsg_mon.sum(dim="time") / mon_len_datetimes
        dsg_sum = dsg_sum.assign(n_time=mon_len_datetimes)
        dsg_sum.attrs.update({"author": "Liam Megill",
                              "institution": "Deutsches Zentrum für Luft- und Raumfahrt, Institute of Atmospheric Physics",
                              "description": "Monthly non-border contrail limiting factors, calculated using random hours within the 2010 decade of ERA5 GRIB data stored on DKRZ Levante",
                              "seed": random_seed,
                              "aircraft_ids": ac_ids,
                              "timespan": f"{season_year}",
                              "n_time": mon_len_datetimes,
                              "created": "{} CET".format(datetime.datetime.today().strftime("%Y-%m-%d %H:%M:%S")),
                              "corrections": f"rhi_cor = {rhi_cor}"})
    
        savename = f"{processed_data_dir}all/nonborder_limfac_r1M_{season_year}-{i_mon}_ERA5_GRIB_{cor_savename_ext}.nc"
        dsg_sum.to_netcdf(savename)

        # save memory
        del dsg_sum
        del dsg_mon

    end_time = time.time()
    d_time = end_time - start_time
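
Since the loop writes one file per season-year and month index, it can be useful to list the expected output files and check which ones already exist, for example before restarting an interrupted run. The sketch below rebuilds the save names from the pattern used above. The short `season_years` list and the `missing` check are illustrative additions, not part of the original workflow.

```python
import os

# values mirroring the options at the top of the notebook
processed_data_dir = "data/processed/limfac/"
cor_savename_ext = "cor-98p"
season_years = ["2013DJF", "2013MAM"]  # shortened list for illustration

# one file per season-year and month index (0, 1, 2), following the savename pattern
expected_files = [
    f"{processed_data_dir}all/nonborder_limfac_r1M_{season_year}-{i_mon}_ERA5_GRIB_{cor_savename_ext}.nc"
    for season_year in season_years
    for i_mon in range(3)
]

# files that have not been written yet
missing = [path for path in expected_files if not os.path.exists(path)]
print(f"{len(missing)} of {len(expected_files)} files still missing")
```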