## Processing Scripts and Descriptions for Input Files from Python

This notebook contains information about the data held within the *data* folder of MimiFAIRv2.

To replicate how the inputs were gathered and post-processed, both for Python replication testing and for running MimiFAIRv2, we carry out the following steps:

1. Start with the FAIR repository at https://github.com/njleach/FAIR/tree/47c6eec031d2edcf09424394dbb86581a1b246ba, noting the specific commit.
2. Download the FAIR repository from GitHub, place the *mimifairv2_python_outputs.ipynb* notebook in the *notebooks* folder, and run it.
3. Copy all output files from (2) into the *data/python_replication/raw_data_from_python* folder.
4. Use the scripts in this notebook to post-process the data from (3) and place input files into their respective locations both in the main *data* folder as inputs to MimiFAIRv2 and the *python_replication* folder for replication testing.
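
Step 3 above can be scripted rather than done by hand. The following is a minimal sketch, not part of the original workflow: the helper name `copy_python_outputs` is hypothetical, and the source and destination paths in the commented usage line are taken from the text in this notebook and assume both repositories sit side by side.

```python
import shutil
from pathlib import Path


def copy_python_outputs(src_dir, dst_dir, pattern="*.csv"):
    """Copy every file matching `pattern` from src_dir into dst_dir.

    Hypothetical helper: returns the sorted list of copied file names.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    copied = []
    for path in sorted(src.glob(pattern)):
        shutil.copy2(path, dst / path.name)  # copy2 also preserves timestamps
        copied.append(path.name)
    return copied


# Example call (paths are assumptions about the local checkout layout):
# copy_python_outputs(
#     "FAIR/notebooks/fairv2_python_replication_data",
#     "MimiFAIRv2/data/python_replication/raw_data_from_python",
# )
```

Returning the copied file names makes it easy to confirm that all five SSP emissions files and both parameter files were picked up.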

#### Emissions Data
- data/python_replication/raw_data_from_python/rcmip_sspxx_emissions_1750_to_2500_python.csv (raw files from Python scripts)
- data/rcmip_sspxx_emissions_1750_to_2500.csv (inputs to MimiFAIRv2)

**Note: redoing this work on June 6, 2023 yielded small changes for some ssp585 gases relative to the previous files. The gases involved were `ch2cl, chcl3, methyl_bromine, methyl_chlorine, so2, nox, co, nnvoc, bc, nh3, oc`.**
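
A comparison like the one in the note above can be reproduced with a column-by-column check of an old and a new emissions file. This is only a sketch of one possible approach, not necessarily how the changes were originally detected. The function name `changed_columns` and the tolerance values are assumptions, and both tables are assumed to have the same number of rows.

```python
import numpy as np
import pandas as pd


def changed_columns(old, new, rtol=1e-9, atol=0.0):
    """Return the shared columns whose values differ beyond the tolerance.

    Hypothetical helper; NaNs in the same position are treated as equal.
    """
    changed = []
    for col in old.columns:
        if col not in new.columns:
            continue  # only compare columns present in both tables
        a = old[col].to_numpy(dtype=float)
        b = new[col].to_numpy(dtype=float)
        if not np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=True):
            changed.append(col)
    return changed


# Example usage (file names are placeholders):
# old = pd.read_csv("old_ssp585.csv", index_col=0)
# new = pd.read_csv("new_ssp585.csv", index_col=0)
# print(changed_columns(old, new))
```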

We use the *data_retrieval.py* script to obtain emissions data with the Python code snippet below, which is fully described, including setup, in `mimifairv2_python_outputs.ipynb`.

```python
for ssp in ["ssp119", "ssp126", "ssp245", "ssp370", "ssp585"]:
    df = data_retrieval.RCMIP_to_FaIR_input_emms(ssp)
    filename = "rcmip_" + ssp + "_emissions_1750_to_2500_python.csv"
    df.to_csv("notebooks/fairv2_python_replication_data/" + filename)
```

We then copy the outputs into *data/python_replication/raw_data_from_python*.

Next, we post-process these files in Julia, linearly interpolating between the decadal future years as indicated by the FAIR scripts.

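```julia
using DataFrames, CSVFiles, Query, Interpolations

for ssp in ["ssp119", "ssp126", "ssp245", "ssp370", "ssp585"]
    # Load the raw Python output for this scenario.
    df = load(joinpath(@__DIR__, "..", "raw_data_from_python", "rcmip_$(ssp)_emissions_1750_to_2500_python.csv")) |> DataFrame

    # Rename the first column to "year".
    titles = names(df)
    titles[1] = "year"
    rename!(df, titles)

    # Fill missing values in each gas column by linear interpolation over year.
    for col in names(df)[2:end]
        idxs = (!ismissing).(df[:, col])
        itp = LinearInterpolation(df[:, :year][idxs], df[:, col][idxs])
        # Evaluate with call syntax; indexing with itp[...] is not supported in recent Interpolations.jl.
        df[:, col] .= itp.(df.year)
    end

    # Save as a MimiFAIRv2 input in the main data folder.
    df |> save(joinpath(@__DIR__, "..", "..", "rcmip_$(ssp)_emissions_1750_to_2500.csv"))
end
```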
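
As a cross-check on the Julia step above, the same year-based linear interpolation can be sketched in Python with pandas. This is an illustrative sketch only: the function name `interpolate_by_year` and the toy values are assumptions, not part of the FAIR or MimiFAIRv2 code. Unlike the Julia loop, which would throw for years outside the range of known values, this version leaves leading and trailing gaps as NaN.

```python
import pandas as pd


def interpolate_by_year(df, year_col="year"):
    """Linearly interpolate missing values in every column against the year.

    Hypothetical helper: method="index" weights by the actual year values,
    so unevenly spaced years are handled correctly.
    """
    out = df.set_index(year_col).sort_index()
    # limit_area="inside" fills only gaps bounded by known values on both sides.
    out = out.interpolate(method="index", limit_area="inside")
    return out.reset_index()


# Toy example (made-up numbers): one gas reported in 2100 and 2110 only.
toy = pd.DataFrame({"year": [2100, 2105, 2110], "co2": [10.0, None, 20.0]})
filled = interpolate_by_year(toy)
```

Comparing `filled` against the Julia output for a few columns is a quick way to confirm that both tools agree on the interpolated years.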

Finally, we add a header to each *rcmip_$(ssp)_emissions_1750_to_2500.csv* file in the main *data* folder with the following metadata (shown here for SSP119). These files are direct inputs to MimiFAIRv2.

```julia
# File Description: RCMIP SSP119 emissions scenario from 1750-2500.
# "Code Source: Extracted using default Python model version of FAIR2.0, available at https://github.com/njleach/
# FAIR/tree/47c6eec031d2edcf09424394dbb86581a1b246ba"
# "Paper Reference: Leach et al. 2021. ""FaIRv2.0.0: a generalized impulse response model for climate
# uncertainty and future scenario exploration,"" Geoscientific Model Development. https://doi.org/10.5194/gmd-14-3007-2021"
```
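
Adding such a header can be automated. The sketch below is a hedged illustration rather than the method actually used: the helper name `prepend_header` is hypothetical, and it writes each metadata line as a plain `# `-prefixed comment, without reproducing the exact quoting shown in the header above.

```python
from pathlib import Path


def prepend_header(csv_path, header_lines):
    """Prepend '# '-prefixed comment lines to an existing CSV file.

    Hypothetical helper: rewrites the file in place, with the header
    lines first and the original contents unchanged after them.
    """
    path = Path(csv_path)
    body = path.read_text(encoding="utf-8")
    header = "".join(f"# {line}\n" for line in header_lines)
    path.write_text(header + body, encoding="utf-8")


# Example usage (the path is an assumption about the repository layout):
# prepend_header(
#     "data/rcmip_ssp119_emissions_1750_to_2500.csv",
#     ["File Description: RCMIP SSP119 emissions scenario from 1750-2500."],
# )
```

Note that running the helper twice on the same file would add the header twice, so it is best applied only to freshly generated files.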

#### Default Gas Cycle Parameters
- data/python_replication/raw_data_from_python/default_gas_cycle_parameters_python.csv
- data/python_replication/default_gas_cycle_parameters_python.csv
- data/default_gas_cycle_parameters.csv

We first obtain raw data using the Python code snippet below, fully described with setup in `mimifairv2_python_outputs.ipynb`.

```python
gas_parameters = get_gas_parameter_defaults()
gas_parameters.to_csv("notebooks/fairv2_python_replication_data/default_gas_cycle_parameters_python.csv")
```

We copy this file directly into the *python_replication_data* folder as *default_gas_parameters.csv*.  

We then post-process this file into a MimiFAIRv2-compatible input by (1) transposing rows and columns, (2) selecting only the gases of interest (those with `|` in the title are not needed), and (3) adding a gas group label, which produces *default_gas_cycle_parameters.csv*.
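
The three post-processing steps can be sketched with pandas as follows. This is a minimal illustration under stated assumptions: the raw table is assumed to have parameters as rows and gases as columns, and the function name, the output column names `gas_name` and `gas_group`, the group labels, and the toy values are all hypothetical.

```python
import pandas as pd


def postprocess_gas_parameters(raw, gas_groups):
    """Transpose, drop '|' columns, and add a gas group label.

    Hypothetical helper. `raw` has parameters as rows and gases as columns;
    `gas_groups` maps each gas name to its group label.
    """
    # (2) Keep only gases of interest: drop any column with "|" in its title.
    keep = [col for col in raw.columns if "|" not in col]
    # (1) Transpose so that each row is a gas and each column a parameter.
    out = raw[keep].T
    out.index.name = "gas_name"
    out = out.reset_index()
    # (3) Add the gas group label next to the gas name.
    out.insert(1, "gas_group", out["gas_name"].map(gas_groups))
    return out


# Toy input with made-up values and illustrative names.
raw = pd.DataFrame(
    {"carbon_dioxide": [1.0, 2.0], "methane": [3.0, 4.0], "methane|strat_h2o": [5.0, 6.0]},
    index=["a1", "tau1"],
)
groups = {"carbon_dioxide": "group_a", "methane": "group_b"}
params = postprocess_gas_parameters(raw, groups)
```

Any gas missing from the `gas_groups` mapping ends up with a NaN label, which makes omissions easy to spot before saving the file.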

#### Default Thermal Parameters
- data/python_replication/raw_data_from_python/default_thermal_parameters_python.csv
- data/python_replication/default_thermal_parameters.csv

We first obtain raw data using the Python code snippet below, fully described with setup in `mimifairv2_python_outputs.ipynb`.

```python
thermal_parameters = get_thermal_parameter_defaults()
thermal_parameters.to_csv("notebooks/fairv2_python_replication_data/default_thermal_parameters_python.csv")
```

We copy this file directly into the *python_replication_data* folder as *default_thermal_parameters.csv*.  