# Example for Pairing AEROMMA data with UFS-AQM

This example demonstrates how to use MELODIES MONET to pair aircraft observations from AEROMMA (https://csl.noaa.gov/projects/aeromma/) with model output from the UFS-AQM (dyn*.nc and phy*.nc output files) and save the paired data for each flight as a netCDF file. Users can then read these files back into MELODIES MONET to create plots or calculate statistics, or use the paired output files for their own analysis.

Pairing aircraft data takes a while, so we recommend pairing the data first and then producing the plots and statistics. That way, you are not re-pairing the data every time you want to change something small during your analysis.

This example resamples the data to '600S' to reduce memory use, so that this Jupyter notebook can run easily as a test. For examples of how to submit a job to process more flight days with a shorter resampling interval, see the end of this notebook.

### First we import the loop_pairing function from melodies_monet.util.tools.

In [1]:
from melodies_monet.util.tools import loop_pairing

### Second, we read in a control file that explains all the pairing parameters.

In [2]:
control_fn='/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/jupyter_notebooks/control_aircraft_looping_AEROMMA_UFSAQM.yaml'

### There are two options for providing the model and observation data for pairing

**Option 1)** Provide the information in a dictionary like the one below, then pair the data.

In [3]:
file_pairs = {'0627_L1':{'model':{'ufsaqm':'/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/UFS-AQM/cmaq54_OriRave1/aqm.20230627/12/*dyn**.nc'},
                      'obs':{'aeromma':'/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs_short/short_AEROMMA-Merge_20230627_L1_20240410_1459.csv'}},
            '0627_L2':{'model':{'ufsaqm':'/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/UFS-AQM/cmaq54_OriRave1/aqm.20230627/12/*dyn**.nc'},
                      'obs':{'aeromma':'/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs_short/short_AEROMMA-Merge_20230627_L2_20240410_1502.csv'}}
            }
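
For a campaign with many flights, typing this dictionary by hand becomes error-prone. A minimal sketch of building the same nested structure in a loop is shown here. The helper `build_file_pairs` and the `flights` mapping are hypothetical names introduced for illustration and are not part of MELODIES MONET; the labels `ufsaqm` and `aeromma` and the paths are the ones used in the dictionary above.

```python
def build_file_pairs(flights, model_label="ufsaqm", obs_label="aeromma"):
    """Return a file_pairs dictionary with the same layout as the one above.

    flights maps a flight key (e.g. '0627_L1') to a (model_glob, obs_file) tuple.
    build_file_pairs is a hypothetical helper, not a MELODIES MONET function.
    """
    return {
        key: {"model": {model_label: model_glob}, "obs": {obs_label: obs_file}}
        for key, (model_glob, obs_file) in flights.items()
    }


# Paths copied from the dictionary above.
model_glob = ("/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/UFS-AQM/"
              "cmaq54_OriRave1/aqm.20230627/12/*dyn**.nc")
obs_dir = "/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs_short/"

flights = {
    "0627_L1": (model_glob, obs_dir + "short_AEROMMA-Merge_20230627_L1_20240410_1459.csv"),
    "0627_L2": (model_glob, obs_dir + "short_AEROMMA-Merge_20230627_L2_20240410_1502.csv"),
}

file_pairs_built = build_file_pairs(flights)
```

The resulting `file_pairs_built` has the same keys and nesting as the hand-written `file_pairs`, so it could be passed to `loop_pairing` in the same way.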

In [4]:
loop_pairing(control=control_fn,file_pairs=file_pairs)

rrfs
/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/UFS-AQM/cmaq54_OriRave1/aqm.20230627/12/*dyn**.nc
**** Reading RRFS-CMAQ model output...
1, in pair data
After pairing:         CO_LGR  pressure_obs    temp_obs                time   latitude  \
0  143.363846  88948.552632  292.387105 2023-06-27 16:00:00  34.631892   
1  106.124233  71204.356667  283.205000 2023-06-27 16:10:00  34.496007   
2  127.060810  80613.510000  287.579717 2023-06-27 16:20:00  33.853575   
3  136.830183  89684.255000  289.194500 2023-06-27 16:30:00  33.794704   
4  106.514583  83029.950000  289.219417 2023-06-27 16:40:00  33.847016   
5  150.625667  88200.826667  290.883117 2023-06-27 16:50:00  33.759474   
6  188.170900  90548.173333  290.358600 2023-06-27 17:00:00  34.088597   
7  166.075117  88358.660000  288.824700 2023-06-27 17:10:00  34.159883   
8  209.496875  87127.443333  288.734483 2023-06-27 17:20:00  34.235454   
9  129.856222  82075.160131  290.465229 2023-06-27 17:30:00  34.531718   

    longitude 

**Option 2)** Provide the information in a supplementary yaml file, then pair the data. This option is especially useful when submitting a job for the analysis rather than using a Jupyter notebook.

In [5]:
loop_pairing(control=control_fn,
             file_pairs_yaml='/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/jupyter_notebooks/supplementary_aircraft_looping_file_pairs_AEROMMA_UFSAQM.yaml')

rrfs
/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/UFS-AQM/cmaq54_OriRave1/aqm.20230627/12/*dyn**.nc
**** Reading RRFS-CMAQ model output...
1, in pair data
After pairing:         CO_LGR  pressure_obs    temp_obs                time   latitude  \
0  143.363846  88948.552632  292.387105 2023-06-27 16:00:00  34.631892   
1  106.124233  71204.356667  283.205000 2023-06-27 16:10:00  34.496007   
2  127.060810  80613.510000  287.579717 2023-06-27 16:20:00  33.853575   
3  136.830183  89684.255000  289.194500 2023-06-27 16:30:00  33.794704   
4  106.514583  83029.950000  289.219417 2023-06-27 16:40:00  33.847016   
5  150.625667  88200.826667  290.883117 2023-06-27 16:50:00  33.759474   
6  188.170900  90548.173333  290.358600 2023-06-27 17:00:00  34.088597   
7  166.075117  88358.660000  288.824700 2023-06-27 17:10:00  34.159883   
8  209.496875  87127.443333  288.734483 2023-06-27 17:20:00  34.235454   
9  129.856222  82075.160131  290.465229 2023-06-27 17:30:00  34.531718   

    longitude 

Both of these options produce the same results. The supplementary yaml file is the preferred method for pairing data for many days over a large campaign.
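
If the supplementary yaml file mirrors the dictionary layout from Option 1 (an assumption here; the supplementary file in the examples folder is the authoritative reference for the layout), it can be generated from a dictionary rather than written by hand. The sketch below uses only the Python standard library. `to_yaml_lines` and `quote` are hypothetical helpers, and the paths are placeholders rather than real files.

```python
def quote(text):
    """Single-quote a YAML scalar so characters such as '*' are kept literally."""
    return "'" + str(text).replace("'", "''") + "'"


def to_yaml_lines(mapping, indent=0):
    """Render a nested dict of strings as block-style YAML lines (hypothetical helper)."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Nested mapping: emit the key, then its children indented by two spaces.
            lines.append(f"{pad}{quote(key)}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        else:
            lines.append(f"{pad}{quote(key)}: {quote(value)}")
    return lines


example_pairs = {
    "0627_L1": {
        "model": {"ufsaqm": "/path/to/model/*dyn**.nc"},   # placeholder path
        "obs": {"aeromma": "/path/to/obs/flight_L1.csv"},  # placeholder path
    }
}

yaml_text = "\n".join(to_yaml_lines(example_pairs)) + "\n"
print(yaml_text)
```

Writing `yaml_text` to a file gives something that could be passed as `file_pairs_yaml`, provided the layout assumption above holds.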

### Finding time bounds of observation files

To help build the dictionary or supplementary yaml file, we have also created a function that finds the time bounds of each observation file. To use it, first import the find_obs_time_bounds function from melodies_monet.util.tools.

In [6]:
from melodies_monet.util.tools import find_obs_time_bounds

Then specify the observation files and the time variable name, and call the find_obs_time_bounds function, which prints the bounds for each file.

In [7]:
files = ['/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230627_L1_20240410_1459.csv']
bounds = find_obs_time_bounds(files=files,time_var ='Time_Start')

For /scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230627_L1_20240410_1459.csv, time bounds are, Min: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-27T16:09:08.000000000', dtype='datetime64[ns]'), Max: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-27T17:35:14.000000000', dtype='datetime64[ns]')


In [8]:
files = ['/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230627_L2_20240410_1502.csv']
bounds = find_obs_time_bounds(files=files,time_var ='Time_Start')

For /scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230627_L2_20240410_1502.csv, time bounds are, Min: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-27T21:16:06.000000000', dtype='datetime64[ns]'), Max: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-27T23:44:46.000000000', dtype='datetime64[ns]')


In [9]:
files = ['/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230628_L1_20240410_1504.csv']
bounds = find_obs_time_bounds(files=files,time_var ='Time_Start')

For /scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230628_L1_20240410_1504.csv, time bounds are, Min: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-28T16:26:51.000000000', dtype='datetime64[ns]'), Max: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-28T19:55:15.000000000', dtype='datetime64[ns]')


In [10]:
files = ['/scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230628_L2_20240410_1506.csv']
bounds = find_obs_time_bounds(files=files,time_var ='Time_Start')

For /scratch1/BMC/rcm2/rhs/monet_example/AEROMMA/obs/AEROMMA-Merge_20230628_L2_20240410_1506.csv, time bounds are, Min: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-28T22:28:06.000000000', dtype='datetime64[ns]'), Max: <xarray.DataArray 'time' ()> Size: 8B
array('2023-06-29T02:01:02.000000000', dtype='datetime64[ns]')
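
One use of these bounds is to check which model output covers each flight before building the file pairs. Note that the second flight on 28 June ends at 02:01 on 29 June, so it crosses midnight. The sketch below uses only the standard library. The `covers_flight` helper, the 12:00 cycle start (inferred from the `aqm.20230627/12` directory used earlier), the 24-hour window, and the `0628_*` keys are assumptions for illustration, not values taken from MELODIES MONET or from the UFS-AQM configuration.

```python
from datetime import datetime, timedelta


def covers_flight(flight_min, flight_max, cycle_start, window_hours):
    """Return True if [flight_min, flight_max] lies inside the model output window."""
    cycle_end = cycle_start + timedelta(hours=window_hours)
    return cycle_start <= flight_min and flight_max <= cycle_end


# Bounds copied from the find_obs_time_bounds output above.
flight_bounds = {
    "0627_L1": (datetime(2023, 6, 27, 16, 9, 8), datetime(2023, 6, 27, 17, 35, 14)),
    "0627_L2": (datetime(2023, 6, 27, 21, 16, 6), datetime(2023, 6, 27, 23, 44, 46)),
    "0628_L1": (datetime(2023, 6, 28, 16, 26, 51), datetime(2023, 6, 28, 19, 55, 15)),
    "0628_L2": (datetime(2023, 6, 28, 22, 28, 6), datetime(2023, 6, 29, 2, 1, 2)),
}

cycle_start = datetime(2023, 6, 27, 12)  # assumed from the 'aqm.20230627/12' directory
window_hours = 24                        # hypothetical window length, for illustration only

coverage = {
    key: covers_flight(tmin, tmax, cycle_start, window_hours)
    for key, (tmin, tmax) in flight_bounds.items()
}
for key, ok in coverage.items():
    print(key, "covered" if ok else "not covered")
```

Under these assumptions, both 27 June flights fall inside the window and both 28 June flights fall outside it, so the latter would need to be paired with a different set of model files.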


### Submit a job to reduce the resampling time or increase the number of flights

We have also created examples for submitting a job. Submitting a job on Hera is much faster, so you can use a shorter resampling interval (e.g., 30s) and include more flights.

These are uploaded to the examples folder on the MELODIES MONET GitHub page:
* supplementary_aircraft_looping_file_pairs_AEROMMA_UFSAQM_submit.yaml - supplementary yaml file
* control_aircraft_looping_AEROMMA_UFSAQM-submit.yaml - control.yaml file for this analysis
* run_aircraft_pairing_loop.py - python script using the loop_pairing from melodies_monet.util.tools
* submit_hera.job - bash script to submit a job on Hera to run the run_aircraft_pairing_loop.py script
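
For orientation, a driver script of this kind could look roughly like the sketch below. It is not the run_aircraft_pairing_loop.py file from the examples folder. The command-line options and the `parse_args` and `main` functions are hypothetical, and only the `loop_pairing(control=..., file_pairs_yaml=...)` call is taken from this notebook.

```python
import argparse


def parse_args(argv=None):
    """Parse the two file paths needed for pairing (hypothetical options)."""
    parser = argparse.ArgumentParser(
        description="Pair aircraft observations with model output."
    )
    parser.add_argument("--control", required=True,
                        help="path to the control yaml file")
    parser.add_argument("--file-pairs-yaml", required=True,
                        help="path to the supplementary yaml file")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported here so that argument errors are reported before the heavier import.
    from melodies_monet.util.tools import loop_pairing
    loop_pairing(control=args.control, file_pairs_yaml=args.file_pairs_yaml)


# When saved as a script, call main() here so the job script can run it, e.g.
#   python my_pairing_script.py --control control.yaml --file-pairs-yaml pairs.yaml
```

Keeping the paths as arguments means the same script can be reused by the job script for different control and supplementary files.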
