# Silicone
### Erica Simon, 02/13/24
## Purpose: use the Silicone infilling tool to fill missing emissions data for certain species
Credit: 
- Lamboll, R. D., Nicholls, Z. R. J., Kikstra, J. S., Meinshausen, M., & Rogelj, J. (2020). Silicone v1.0.0: an open-source Python package for inferring missing emissions data for climate change research. *Geoscientific Model Development, 13(11),* 5259–5275. https://doi.org/10.5194/gmd-13-5259-2020
- Meinshausen, M., Lewis, J., McGlade, C. et al. Realization of Paris Agreement pledges may limit warming just below 2 °C. *Nature* 604, 304–309 (2022). https://doi.org/10.1038/s41586-022-04553-z
    - code at https://github.com/climate-resource/ndc-realisations-2021



## Reason for Archiving 
Silicone was ultimately not used for the project. Instead, missing F-gas species were infilled using the trend of the provided F-gases basket, and missing non-F-gas species were infilled using the CO2 trend.
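The fallback described above can be pictured as scaling the last known value of a missing species by the relative change in a reference trajectory (the F-gases basket or CO2). The sketch below is only an assumption about what "using trends" means here, not the project's actual code; `infill_from_trend` and all numbers are hypothetical.

```python
def infill_from_trend(last_value, reference, base_year):
    """Scale `last_value` by the reference trajectory relative to `base_year`.

    reference: dict mapping year -> reference emissions (e.g. the F-gases basket).
    Returns a dict mapping year -> infilled emissions for years >= base_year.
    """
    base = reference[base_year]
    if base == 0:
        raise ValueError("reference is zero in the base year; ratio is undefined")
    return {
        year: last_value * value / base
        for year, value in sorted(reference.items())
        if year >= base_year
    }


# Hypothetical numbers, for illustration only.
reference = {2022: 100.0, 2030: 80.0, 2050: 50.0}
infilled = infill_from_trend(last_value=0.4, reference=reference, base_year=2022)
```

With this approach the infilled species keeps its own base-year level and inherits only the shape of the reference trajectory.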

In [5]:
import pandas as pd
import numpy as np
import pyam

import silicone.multiple_infillers as mi
import silicone.database_crunchers as cr

In [6]:
import silicone.database_crunchers
from silicone.time_projectors import ExtendLatestTimeQuantile
import scmdata
import scmdata.database
import matplotlib.pyplot as plt

from tqdm.autonotebook import tqdm

In [7]:
future_df = pd.read_csv('../outputs/GCAM_cleaned.csv')
hist_df = pd.read_csv('../outputs/hist_emis_cleaned.csv')

df_to_infill = pyam.IamDataFrame(future_df)
df = pyam.IamDataFrame(hist_df)



In [8]:
missing_vars = np.setdiff1d(hist_df['Variable'].unique(), future_df['Variable'].unique())

In [9]:
lead = ['Emissions|CO2 FFI']
variables_of_interest = ['Emissions|C3F8']
years_list = list(range(2022, 2101))

In [10]:
unavailable_variables = [
    variab for variab in variables_of_interest if variab not in df.variable
]


In [11]:
unavailable_variables

[]

In [12]:
df

<class 'pyam.core.IamDataFrame'>
Index:
 * model    : Historical (1)
 * scenario : GCP+CEDS+PRIMAP+GFED (1)
Timeseries data coordinates:
   region   : World (1)
   variable : Emissions|BC, Emissions|C2F6, Emissions|C3F8, ... Emissions|c-C4F8 (51)
   unit     : Gt CO2/yr, Mt BC/yr, Mt CH4/yr, Mt CO/yr, Mt N2O/yr, ... kt cC4F8/yr (50)
   year     : 1750, 1751, 1752, 1753, 1754, 1755, 1756, 1757, ... 2022 (273)

In [13]:
df_to_infill

<class 'pyam.core.IamDataFrame'>
Index:
 * model    : GCAM 6.0 NGFS (1)
 * scenario : Below 2 C, Current Policies, Delayed transition, ... Net Zero 2050 (7)
Timeseries data coordinates:
   region   : World (1)
   variable : Emissions|BC, Emissions|C2F6, Emissions|C3F8, ... Emissions|c-C4F8 (51)
   unit     : Gt CO2/yr, Mt BC/yr, Mt CH4/yr, Mt CO/yr, Mt N2O/yr, ... kt cC4F8/yr (50)
   year     : 2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, ... 2100 (81)

In [14]:
df_infilled = mi.infill_all_required_variables(
    df_to_infill,
    df,
    variable_leaders=lead,
    required_variables_list=variables_of_interest, # If None, would infill a default list
    cruncher=cr.QuantileRollingWindows,
    output_timesteps=years_list,
    infilled_data_prefix=None,
    to_fill_old_prefix=None,
    check_data_returned=False,
)

Filling required variables: 100%|██████████| 1/1 [00:00<00:00, 10.07it/s]


In [15]:
df_infilled.filter(variable=variables_of_interest[0]).timeseries()

model,scenario,region,variable,unit,2022,2023,2024,2025,2026,2027,2028,2029,2030,2031,...,2091,2092,2093,2094,2095,2096,2097,2098,2099,2100
GCAM 6.0 NGFS,Below 2 C,World,Emissions|C3F8,kt C3F8/yr,0.392255,0.398346,0.404437,0.410527,0.412879,0.41523,0.417581,0.419933,0.422284,0.416373,...,0.17065,0.169835,0.16902,0.168204,0.167389,0.166423,0.165456,0.16449,0.163523,0.162557
GCAM 6.0 NGFS,Current Policies,World,Emissions|C3F8,kt C3F8/yr,0.398477,0.407678,0.41688,0.426082,0.441407,0.456732,0.472057,0.487382,0.502707,0.507078,...,0.450423,0.45303,0.455638,0.458245,0.460852,0.465679,0.470506,0.475332,0.480159,0.484985
GCAM 6.0 NGFS,Delayed transition,World,Emissions|C3F8,kt C3F8/yr,0.398477,0.407678,0.41688,0.426081,0.441406,0.456731,0.472055,0.48738,0.502705,0.49004,...,0.169338,0.16845,0.167563,0.166676,0.165789,0.164893,0.163997,0.163101,0.162205,0.161309
GCAM 6.0 NGFS,Fragmented World,World,Emissions|C3F8,kt C3F8/yr,0.398477,0.407679,0.41688,0.426082,0.441407,0.456732,0.472057,0.487382,0.502707,0.496745,...,0.384929,0.388431,0.391933,0.395435,0.398937,0.403076,0.407215,0.411354,0.415493,0.419632
GCAM 6.0 NGFS,Low demand,World,Emissions|C3F8,kt C3F8/yr,0.386953,0.390393,0.393833,0.397273,0.392539,0.387805,0.383071,0.378337,0.373603,0.365792,...,0.143717,0.142892,0.142066,0.141241,0.140415,0.139576,0.138736,0.137897,0.137058,0.136219
GCAM 6.0 NGFS,NDCs,World,Emissions|C3F8,kt C3F8/yr,0.398573,0.407823,0.417073,0.426323,0.439007,0.451692,0.464376,0.47706,0.489745,0.492259,...,0.255665,0.251056,0.246447,0.241839,0.23723,0.234484,0.231739,0.228993,0.226247,0.223502
GCAM 6.0 NGFS,Net Zero 2050,World,Emissions|C3F8,kt C3F8/yr,0.38946,0.394153,0.398846,0.403539,0.399256,0.394972,0.390689,0.386406,0.382122,0.376629,...,0.152784,0.151896,0.151008,0.150121,0.149233,0.148364,0.147495,0.146625,0.145756,0.144887


In [16]:
df_to_infill.filter(variable=variables_of_interest[0]).timeseries().head()

model,scenario,region,variable,unit,2020,2021,2022,2023,2024,2025,2026,2027,2028,2029,...,2091,2092,2093,2094,2095,2096,2097,2098,2099,2100
GCAM 6.0 NGFS,Below 2 C,World,Emissions|C3F8,kt C3F8/yr,0.380074,0.386165,0.392255,0.398346,0.404437,0.410527,0.412879,0.41523,0.417581,0.419933,...,0.17065,0.169835,0.16902,0.168204,0.167389,0.166423,0.165456,0.16449,0.163523,0.162557
GCAM 6.0 NGFS,Current Policies,World,Emissions|C3F8,kt C3F8/yr,0.380074,0.389275,0.398477,0.407678,0.41688,0.426082,0.441407,0.456732,0.472057,0.487382,...,0.450423,0.45303,0.455638,0.458245,0.460852,0.465679,0.470506,0.475332,0.480159,0.484985
GCAM 6.0 NGFS,Delayed transition,World,Emissions|C3F8,kt C3F8/yr,0.380074,0.389275,0.398477,0.407678,0.41688,0.426081,0.441406,0.456731,0.472055,0.48738,...,0.169338,0.16845,0.167563,0.166676,0.165789,0.164893,0.163997,0.163101,0.162205,0.161309
GCAM 6.0 NGFS,Fragmented World,World,Emissions|C3F8,kt C3F8/yr,0.380074,0.389275,0.398477,0.407679,0.41688,0.426082,0.441407,0.456732,0.472057,0.487382,...,0.384929,0.388431,0.391933,0.395435,0.398937,0.403076,0.407215,0.411354,0.415493,0.419632
GCAM 6.0 NGFS,Low demand,World,Emissions|C3F8,kt C3F8/yr,0.380074,0.383514,0.386953,0.390393,0.393833,0.397273,0.392539,0.387805,0.383071,0.378337,...,0.143717,0.142892,0.142066,0.141241,0.140415,0.139576,0.138736,0.137897,0.137058,0.136219
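Comparing the two tables suggests that the infilling call left C3F8 untouched: the variable was already present in `df_to_infill`, and the values for the overlapping years coincide. A minimal check, using the first four `Below 2 C` values copied from the In [15] and In [16] outputs (the variable names here are illustrative and do not appear in the notebook):

```python
years = [2022, 2023, 2024, 2025]

# "Below 2 C" values copied from the In [15] (infilled) and In [16] (original) outputs.
infilled_row = dict(zip(years, [0.392255, 0.398346, 0.404437, 0.410527]))
original_row = dict(zip(years, [0.392255, 0.398346, 0.404437, 0.410527]))

# True when every overlapping year carries the same value in both tables.
unchanged = all(infilled_row[y] == original_row[y] for y in years)
print(unchanged)
```

In the notebook itself, the same comparison could be run on the full frames, for example by restricting both `timeseries()` outputs to 2022-2100 and comparing them with pandas' `DataFrame.equals`.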


In [17]:
def extend_timeseries(infilling_database, scenario, lead='Emissions|CO2 FFI', smoothing=0):
    """Extend `scenario` in time with Silicone's ExtendLatestTimeQuantile projector."""
    # The infilling database must extend in time past the end of `scenario`;
    # otherwise Silicone raises a ValueError (see the output of In [19]).
    cruncher = ExtendLatestTimeQuantile(
        # infilling_database.filter(year=range(2022, 2101, 5)).to_iamdataframe()
        infilling_database.filter(year=range(2022, 2101, 5))
    )

    filler = cruncher.derive_relationship(lead, smoothing=smoothing)

    # scenario["variable"] = lead
    # extended_scenario = filler(scenario.to_iamdataframe())
    extended_scenario = filler(scenario)
    # Combine the original and extended data, then resample to year-start steps.
    extended_scenario = scmdata.ScmRun(scenario.append(extended_scenario)).resample(
        "AS"
    )
    extended_scenario["stage"] = "extended"
    return extended_scenario


In [18]:
selected_scenarios = df_to_infill.filter(
    year=range(2022, 2100 + 1)
)

In [19]:
extended_scenario_all = scmdata.run_append(
    [
        extend_timeseries(df_to_infill, selected_scenarios)
        # for p in tqdm(pathways)
    ]
)
# extended_scenario_2050 = extended_scenario_all.filter(pathway_id=SELECTED_PATHWAY)

ValueError: The infiller database does not extend in time past the target database, so no infilling can occur.
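
The error is consistent with how the function was called: `df_to_infill` was passed as the infilling database and a subset of the same data as the target, so the target already runs to 2100 and the database has nothing later to extend towards. The sketch below mimics that precondition in plain Python. `can_extend` is a hypothetical helper, not part of Silicone, and the actual check inside the library may differ.

```python
def can_extend(infiller_years, target_years):
    """Return True if the infiller database covers times after the target ends."""
    return max(infiller_years) > max(target_years)


infiller_years = range(2022, 2101, 5)  # as filtered inside extend_timeseries
target_years = range(2022, 2101)       # as in selected_scenarios

print(can_extend(infiller_years, target_years))
```

Under this reading, the call could only succeed if the target scenarios ended before the infilling database does, for instance a target truncated at 2050 against a database running to 2100.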