# Calculate histograms of the maximum mixing line slope

_Dataset:_ Supplementary data for Megill and Grewe (2024): "Investigating the limiting aircraft design-dependent and environmental factors of persistent contrail formation".

_Authors:_

- Liam Megill (1, 2), https://orcid.org/0000-0002-4199-6962   
- Volker Grewe (1, 2), https://orcid.org/0000-0002-8012-6783  

_Affiliation (1)_: Deutsches Zentrum für Luft- und Raumfahrt (DLR), Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany

_Affiliation (2)_: Delft University of Technology (TU Delft), Faculty of Aerospace Engineering, Section Aircraft Noise and Climate Effects (ANCE), Delft, The Netherlands

_Corresponding author_: Liam Megill, liam.megill@dlr.de

_doi_: https://doi.org/10.5194/egusphere-2024-3398

---


### Summary
This notebook calculates histograms of the maximum mixing line slope $G_{max}$ using ERA5 data between `start_date` and `end_date`. $G_{max}$ is the steepest mixing line slope that an aircraft can have under given ambient conditions before a persistent contrail begins to form (see the second figure created in `11-lm-supporting_graphs.ipynb`). Cumulatively summing the histograms gives the potential persistent contrail formation as a function of the mixing line slope, which is done in `17-lm-analyse_Gmax.ipynb`. One month takes around 40 minutes to calculate on DKRZ Levante and requires approximately 15 GB of RAM.
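
The cumulative summation mentioned above can be illustrated with a minimal sketch. The counts, the bin edges and the total `n_total` are made-up values for illustration only, not ERA5 results, and the normalisation actually used in `17-lm-analyse_Gmax.ipynb` is not shown in this notebook.

```python
import numpy as np

# Synthetic values for illustration only (not ERA5 results)
bin_edges = np.linspace(0.0, 1.0, 6)       # 6 edges -> 5 bins
counts = np.array([10, 20, 30, 25, 15])    # non-density histogram of G_max
n_total = 200                              # hypothetical total number of samples

# Fraction of all samples whose G_max lies at or below each bin's upper edge
cum_fraction = np.cumsum(counts) / n_total

for upper_edge, frac in zip(bin_edges[1:], cum_fraction):
    print(f"G_max <= {upper_edge:.1f}: {frac:.3f}")
```

Because the histograms are stored as raw counts rather than densities, the cumulative sum can be normalised by any total chosen at the analysis stage.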

### Inputs
- ERA5 GRIB data: If the study is not performed on DKRZ Levante, the ERA5 GRIB data must be saved locally and `dir_path` updated. We recommend placing the ERA5 files in `data/raw/`. Ensure that the file naming matches that of `t_file_path` and `r_file_path`.
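
When working with a local copy, it may help to check that all expected files are present before starting a long run. The following is a minimal sketch that rebuilds the file names from the patterns used for `t_file_path` and `r_file_path` in this notebook. The function names `era5_paths` and `missing_files` are hypothetical and not part of `helper`, and `data/raw/` is assumed as the local directory.

```python
from pathlib import Path


def era5_paths(dir_path, date_str):
    """Return the (temperature, relative humidity) GRIB paths for one day."""
    base = Path(dir_path)
    t_path = base / "130" / f"E5pl00_1H_{date_str}_130.grb"  # temperature (130)
    r_path = base / "157" / f"E5pl00_1H_{date_str}_157.grb"  # relative humidity (157)
    return t_path, r_path


def missing_files(dir_path, date_strs):
    """List every expected GRIB file that does not exist on disk."""
    return [p for d in date_strs for p in era5_paths(dir_path, d) if not p.exists()]


# Example: check two days in the assumed local directory
for path in missing_files("data/raw/", ["2010-01-01", "2010-01-02"]):
    print("missing:", path)
```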

### Outputs
- `data/processed/ppcf/ppcfhist_1M_{YYYY-MM}_ERA5_GRIB_{cor_savename_ext}.nc`: Histogram for a given combination of year, month and RHi enhancement ("correction"). 

---

### Copyright

Copyright © 2024 Liam Megill

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

In [None]:
import xarray as xr
import numpy as np
import datetime
import cf2cdm
import warnings

# define directories
project_dir = ""  # set top-level directory path
processed_data_dir = project_dir + "data/processed/ppcf/"

# dates
start_date = datetime.date(2010, 1, 1)
end_date = datetime.date(2010, 12, 1)

# other options
test_savename_ext = False  # if True, prefixes the savename with "test_"
rhi_cor = 1.0  # correction to RHi
cor_savename_ext = "uncor"  # this gets added to the end of the savename

The first step is to format the dates that will be analysed.
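
The `find_month_boundaries` helper is imported from `helper` and its implementation is not shown in this notebook. As a rough illustration only, the sketch below assumes that it returns the first and last calendar day of every month touched by the date range. The name `month_boundaries_sketch` is hypothetical and the real helper may behave differently.

```python
import calendar
import datetime


def month_boundaries_sketch(start, end):
    """Return [(first_day, last_day), ...] for each calendar month in [start, end]."""
    bounds = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        n_days = calendar.monthrange(year, month)[1]  # number of days in this month
        bounds.append((datetime.date(year, month, 1),
                       datetime.date(year, month, n_days)))
        month += 1
        if month > 12:  # roll over to January of the next year
            year, month = year + 1, 1
    return bounds


bounds = month_boundaries_sketch(datetime.date(2010, 1, 1), datetime.date(2010, 12, 1))
print(len(bounds), bounds[0], bounds[-1])
```

Under this assumption, a date range ending on 1 December would cover only one of the 31 days of December, so a check for complete months would skip that month.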

In [None]:
from helper import find_month_boundaries

# create dates
date_arr = [start_date + datetime.timedelta(days=x) for x in range((end_date - start_date).days + 1)]
formatted_date_arr = [date.strftime("%Y-%m-%d") for date in date_arr]
date_dt64 = np.array(date_arr, dtype="datetime64[ns]")

# find month start and end dates
month_dates = find_month_boundaries(start_date, end_date)

The next step is to calculate the histograms. We start by defining the histogram, which in the linked study has a `bin_size` of 0.2 with `bin_limits` of [0.0, 4.6]. We then loop through each month, ensuring that only full months are run. For each day in the month, we load the ERA5 data from file, select the relevant pressure levels and calculate the histogram using the `calc_hist_arr` function. We save the results on a monthly basis.
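
The accumulation pattern described above can be sketched with synthetic data. The implementation of `calc_hist_arr` is not shown in this notebook, so the example below is only an assumption about the general approach: it uses `np.histogram` with fixed bin edges on random values, with NaN standing in for grid points without a value. The array sizes, the random data and the use of `np.linspace` to fix the number of edges are all illustrative choices.

```python
import numpy as np

# 24 edges from 0.0 to 4.6 give 23 bins of width 0.2
bin_edges = np.linspace(0.0, 4.6, 24)
bin_centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])

rng = np.random.default_rng(0)
n_levels, n_days, n_points = 3, 5, 1000  # hypothetical sizes for illustration
hist_total = np.zeros((n_levels, len(bin_centres)), dtype=np.int64)
n_valid = 0  # running count of non-NaN samples

for day in range(n_days):
    # Synthetic stand-in for one day of G_max values per pressure level
    g_max = rng.uniform(0.0, 4.6, size=(n_levels, n_points))
    g_max[rng.random(g_max.shape) < 0.3] = np.nan  # mark some points as missing
    n_valid += np.count_nonzero(~np.isnan(g_max))

    for lev in range(n_levels):
        vals = g_max[lev]
        counts, _edges = np.histogram(vals[~np.isnan(vals)], bins=bin_edges)
        hist_total[lev] += counts  # raw counts with shared edges are additive

print(hist_total.shape, hist_total.sum(), n_valid)
```

Storing non-density counts with identical bin edges is what allows daily results to be summed into monthly histograms, and monthly files to be combined later, without reweighting.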

In [None]:
from helper import calc_hist_arr

# define histogram
bin_size = 0.2
bin_limits = [0., 4.6]
bin_edges = np.arange(bin_limits[0], bin_limits[1] + bin_size, bin_size)
bin_centres = bin_edges[:-1] + 0.5 * np.diff(bin_edges)

# do loop over months
for i_mon, (mon_start, mon_end) in enumerate(month_dates):
    date_idxs = np.where((np.array(date_arr) >= mon_start) & (np.array(date_arr) <= mon_end))[0]  # np.where returns a tuple of arrays; [0] extracts the index array
    
    # check that all days of the month are included; if not, skip the month (prevents saving incomplete monthly data)
    if len(date_idxs) == mon_end.day-mon_start.day+1:  
        print(mon_start)
    
        # do loop over days in month
        for idx in date_idxs: 
            date = formatted_date_arr[idx]
            
            # load GRIB files
            dir_path = "/pool/data/ERA5/E5/pl/an/1H/"
            t_file_path = "130/E5pl00_1H_{}_130.grb".format(date)  # temperature file (130)
            r_file_path = "157/E5pl00_1H_{}_157.grb".format(date)  # relative humidity file (157)
            dsg_t = xr.open_dataset(dir_path+t_file_path, engine="cfgrib", backend_kwargs={"indexpath":None})
            dsg_r = xr.open_dataset(dir_path+r_file_path, engine="cfgrib", backend_kwargs={"indexpath":None})
            dsg = xr.merge([dsg_t, dsg_r])
            with warnings.catch_warnings():  # ignoring UserWarning from cf2cdm when converting coordinate time -> time
                warnings.simplefilter('ignore')
                dsg = cf2cdm.translate_coords(dsg, cf2cdm.ECMWF)  # convert to ECMWF coordinates
            dsg = dsg.isel(level=[18, 19, 20, 21, 22, 23, 24])  # selecting only the levels that are interesting

            # calculate daily histogram
            res = calc_hist_arr(dsg, bin_edges, bin_centres, rhi_cor)
            
            # if first day of the month, initialise full array
            if idx == date_idxs[0]:
                hist_arr = res
            else:
                hist_arr += res
        
        # define and save monthly histogram as dataset
        ds_hist = xr.Dataset({"tot_hist": (["level", "bin_centre"], hist_arr[:, 0, :]),
                  "xtropN_hist": (["level", "bin_centre"], hist_arr[:, 1, :]),
                  "trop_hist": (["level", "bin_centre"], hist_arr[:, 2, :]),
                  "xtropS_hist": (["level", "bin_centre"], hist_arr[:, 3, :])},
                 coords={"level": dsg.level.values, "bin_centre": bin_centres})
        ds_hist.tot_hist.attrs.update({"units": "-", "long_name": "tot_hist",
                                       "description": "Non-density histogram of all G_max"})
        ds_hist.xtropN_hist.attrs.update({"units": "-", "long_name": "xtropN_hist",
                                          "description": "Non-density histogram of G_max in the Northern extratropics (>30deg lat)"})
        ds_hist.trop_hist.attrs.update({"units": "-", "long_name": "trop_hist",
                                        "description": "Non-density histogram of G_max in the tropics (-30deg <= lat <= 30deg)"})
        ds_hist.xtropS_hist.attrs.update({"units": "-", "long_name": "xtropS_hist",
                                          "description": "Non-density histogram of G_max in the Southern extratropics (<-30deg lat)"})
        ds_hist.attrs.update({"author": "Liam Megill",
                              "institution": "Deutsches Zentrum für Luft- und Raumfahrt, Institute of Atmospheric Physics",
                              "description": "Monthly non-density histogram of G_max, calculated using ERA5 GRIB data stored on DKRZ Levante",
                              "bin_definition": f"[{bin_limits[0]}:{bin_limits[1]}] with bin size {bin_size} [Pa/K]",
                              "num_vals": f"{len(date_idxs) * 24 * len(dsg.latitude)}",
                              "timespan": mon_start.strftime("%b %Y"),
                              "created": "{} CET".format(datetime.datetime.today().strftime("%Y-%m-%d %H:%M:%S")),
                              "corrections": cor_savename_ext})
        savename = f"{'test_' if test_savename_ext else ''}ppcfhist_1M_{mon_start.strftime('%Y-%m')}_ERA5_GRIB_{cor_savename_ext}.nc"
        ds_hist.to_netcdf(processed_data_dir+savename)
