# Welcome to SImulation of SEctoral Pathways with Uncertainty Exploration for DEcarbonization (SISEPUEDE)


https://sisepuede.readthedocs.io/en/latest/



In [2]:
## load packages
import numpy as np
import os, os.path
import pandas as pd
import pathlib
import sys
import warnings
warnings.filterwarnings("ignore") # note: comment this line out if you want to see warnings

# add SISEPUEDE to path
dir_py = pathlib.Path(os.path.realpath(".")).parents[0].joinpath("python")
if str(dir_py) not in sys.path:
    sys.path.append(str(dir_py))

# import the file structure
import sisepuede_file_structure as sfs
    


# First things first: let's access the file structure
The `sisepuede_file_structure.SISEPUEDEFileStructure` object stores relevant paths and the general file structure of the entire system. It's a key piece of the `SISEPUEDE` object; however, we can use it to look at the most important object in the ecosystem, the `ModelAttributes` object.

In [3]:
# initialize a file structure
file_struct = sfs.SISEPUEDEFileStructure()

MISSIONSEARCHNOTE: As of 2023-10-06, there is a temporary solution implemeted in ModelAttributes.get_variable_to_simplex_group_dictionary() to ensure that transition probability rows are enforced on a simplex.

FIX THIS ASAP TO DERIVE PROPERLY.


# SISEPUEDE's entire model variable and unit ecosystem is managed by attribute tables
SISEPUEDE is driven primarily by a collection of CSV files located in the path stored at `file_struct.dir_attribute_tables`. This directory contains:
- Sector and Subsector attribute tables
- Category attribute tables
- Variable attribute/definition tables for each subsector
- Unit attribute/definition tables
- Other attribute tables used throughout the analysis

##  The `model_attributes.ModelAttributes` object organizes these

The `model_attributes.ModelAttributes` object stores model variables (as `model_variable.ModelVariable` objects), unit conversion objects, methods for converting units between variables, methods for extracting variables from dataframes, methods for writing variables to dataframes, scenario dimension management, and more.

In [4]:
# here's where attribute tables are stored
file_struct.dir_attribute_tables

'/Users/jsyme/Documents/Projects/git_jbus/sisepuede/docs/source/csvs'

In [262]:
matt = file_struct.model_attributes

# Let's look at some of the functionality of the `ModelAttributes` object

First, let's look at the fundamental building blocks of data within the SISEPUEDE architecture: attribute tables. We can start with the *sector* and *subsector* attribute tables.
- Attribute tables are stored in the `attribute_table.AttributeTable` object, which includes some methods for mapping keys, dictionaries, etc.
    - The underlying dataframe is referenced as `AttributeTable.table`
- Displaying it in a Jupyter notebook shows the underlying dataframe
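
To make the key-mapping idea concrete, here is a minimal sketch of a keyed table built with only the standard library. The `TinyAttributeTable` class and its `field_map` method are hypothetical stand-ins for illustration, not the `attribute_table.AttributeTable` API; the rows are copied from the sector table displayed in this notebook.

```python
import csv
import io

# Rows copied from the sector attribute table displayed in this notebook
CSV_TEXT = """sector,sector_name,abbreviation_sector
AFOLU,"Agriculture, Forestry, and Land Use",af
Circular Economy,Circular Economy,ce
Energy,Energy,en
IPPU,Industrial Processes and Product Use,ip
Socioeconomic,Socioeconomic,se
"""


class TinyAttributeTable:
    """Hypothetical, simplified stand-in for an attribute table keyed on one field."""

    def __init__(self, csv_text: str, key: str):
        self.key = key
        self.rows = list(csv.DictReader(io.StringIO(csv_text)))
        self.key_values = [row[key] for row in self.rows]

    def field_map(self, field_from: str, field_to: str) -> dict:
        """Build a dictionary mapping one column to another."""
        return {row[field_from]: row[field_to] for row in self.rows}


attr = TinyAttributeTable(CSV_TEXT, key="sector")
sector_to_abbreviation = attr.field_map("sector", "abbreviation_sector")
print(sector_to_abbreviation["AFOLU"])
```

The same pattern works in reverse (abbreviation to full name), which is the kind of lookup an attribute table makes convenient.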

The **Sector** attribute table stores basic information about the 5 modeling sectors, including the shorthand id, the full name, an abbreviation, and a brief description.

In [263]:
attr_sector = matt.get_sector_attribute_table()
print(type(attr_sector))
attr_sector



<class 'attribute_table.AttributeTable'>


Unnamed: 0,sector,sector_name,abbreviation_sector,description
0,AFOLU,"Agriculture, Forestry, and Land Use",af,Agricultural activity ....
1,Circular Economy,Circular Economy,ce,Activity (demand and emission) associated with...
2,Energy,Energy,en,Activity (demand and emission) associated with...
3,IPPU,Industrial Processes and Product Use,ip,Activity (demand and emission) associated with...
4,Socioeconomic,Socioeconomic,se,Socioeconomic and demographic activity affecti...


The **Subsector** attribute table stores basic information about the 21 modeling subsectors, including the shorthand id, the full name, an abbreviation, the **primary category**, a description, whether or not it is an emission subsector, and more.
- The **Primary Category** is a variable schema element marker used to tell variables which category space they are defined within.
- As with other variable schema elements, it is delimited by dollar signs and wrapped in double backticks (\`\`) in the attribute table, as in the `primary_category` column below
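
As a rough illustration of how such a marker could be read, the sketch below strips the backticks and pulls out the dollar-delimited element. The regular expression and both helper functions are assumptions made for this example; they are not part of the SISEPUEDE API.

```python
import re

# Assumption: a schema element is an upper-case token delimited by dollar signs
REGEX_ELEMENT = re.compile(r"\$([A-Z0-9\-]+)\$")


def parse_schema_elements(raw: str) -> list:
    """Strip backticks and return the schema elements found in the string."""
    return REGEX_ELEMENT.findall(raw.replace("`", ""))


def element_to_field(element: str) -> str:
    """Convert an element such as CAT-LANDUSE to a field-style name (cat_landuse)."""
    return element.lower().replace("-", "_")


elements = parse_schema_elements("``$CAT-MANURE-MANAGEMENT$``")
print(elements)
print([element_to_field(e) for e in elements])
```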

In [12]:
attr_subsector = matt.get_subsector_attribute_table()
attr_subsector.table.head()


Unnamed: 0,sector,subsector,abbreviation_subsector,primary_category,description,model_language,emission_subsector,primary_category_py
0,AFOLU,Agriculture,agrc,``$CAT-AGRICULTURE$``,Agricultural activity ....,Python,1,agriculture
1,AFOLU,Forest,frst,``$CAT-FOREST$``,"Forest activity, including sequestration and",Python,1,forest
2,AFOLU,Land Use,lndu,``$CAT-LANDUSE$``,"Land use activity, including deforestation and...",Python,1,landuse
3,AFOLU,Livestock Manure Management,lsmm,``$CAT-MANURE-MANAGEMENT$``,Pathways for treatment and use of manure. Incl...,Python,1,manure_management
4,AFOLU,Livestock,lvst,``$CAT-LIVESTOCK$``,Livestock activity and emissions.,Python,1,livestock


##  Each subsector is associated with 2 attribute tables

1. **Variable Definition** attribute table, which contains definitions of all variables within that subsector
1. **Primary Category** attribute table. The same primary category table can be associated with multiple subsectors (e.g., IPPU and Industrial Energy share the ``$CAT-INDUSTRY$`` category attribute table)




###  The **Variable Definition** attribute table defines all variables within a subsector

- Variables are defined in the table as a *schema*
    - Includes any primary categories, units, and any gasses
- Note: subsector names are stored as attributes in ``ModelAttributes`` as ``ModelAttributes.subsec_name_ABBREVIATION``, e.g., ``ModelAttributes.subsec_name_lndu`` for land use

###  Use the ``ModelAttributes.get_attribute_table`` method to retrieve attribute tables
- By default, the ``ModelAttributes.get_attribute_table`` method returns the category attribute table

In [264]:
attr_lndu = matt.get_attribute_table(matt.subsec_name_lndu)
attr_lndu

Unnamed: 0,category_name,cat_landuse,definition,data_source,hyperlink,notes,cat_forest,crop_category,other_category,pasture_category,settlements_category,wetlands_category,mineralization_in_land_use_conversion_to_managed,reallocation_transition_probability_exhaustion_category
0,Croplands,croplands,Area of land devoted to growing crops for cons...,,,,none,1,0,0,0,0,0,0
1,Forests - Mangroves,forests_mangroves,Area with mangroves,,,,``mangroves``,0,0,0,0,0,1,0
2,Forests - Primary,forests_primary,"Area of land covered in primary forest (e.g., ...",,,,``primary``,0,0,0,0,0,1,1
3,Forests - Secondary,forests_secondary,"Area of land covered in secondary forest, incl...",,,,``secondary``,0,0,0,0,0,1,0
4,Grasslands,grasslands,"Area of grasslands, including native and pastu...",,,,none,0,0,1,0,0,1,0
5,Other,other,"Other land use categories (including savannah,...",,,,none,0,1,0,0,0,0,0
6,Settlements,settlements,Area of land devoted to urban/suburban develop...,,,,none,0,0,0,1,0,0,0
7,Wetlands,wetlands,"Wetlands, which emit :math:`\text{CH}_4`",,,,none,0,0,0,0,1,0,0


###  Use the ``ModelAttributes.get_attribute_table`` method to retrieve variable definition tables
- In ``ModelAttributes.get_attribute_table``, set ``table_type = "variable_definitions"``

In [20]:
# to return the variable definitions, set table_type = "variable_definitions"
attr_lndu = matt.get_attribute_table(matt.subsec_name_lndu, table_type = "variable_definitions")
attr_lndu

Unnamed: 0,variable_type,variable,information,variable_schema,categories,reference,default_value,default_lhs_scalar_minimum_at_final_time_period,default_lhs_scalar_maximum_at_final_time_period,simplex_group,emissions_total_by_gas_component,cat_soil_management,notes
0,Input,:math:\text{CO}_2 Land Use Conversion Emission...,Annual :math:`\text{CO}_2` emission factor for...,``ef_lndu_conv_$CAT-LANDUSE-DIM1$_to_$CAT-LAND...,all,,0,1.0,1.0,,0,none,
1,Input,Fraction of Increasing Net Exports Met,| Fractional of changes to net exports to mee...,``frac_lndu_increasing_net_exports_met_$CAT-LA...,``croplands``|``grasslands``,,0,1.0,1.0,,0,none,
2,Input,Fraction of Increasing Net Imports Met,| Fractional of changes to net imports to mee...,``frac_lndu_increasing_net_imports_met_$CAT-LA...,``croplands``|``grasslands``,,0,1.0,1.0,,0,none,
3,Input,Fraction of Pastures Improved,Fraction of pasture that is improved using sus...,``frac_lndu_improved_$CAT-LANDUSE$``,``grasslands``,,0,1.0,1.0,,0,none,
4,Input,Fraction of Soils Mineral,See ` Giulia Conchedda and Francesco N. Tubiel...,``frac_lndu_soil_mineral_$CAT-LANDUSE$``,``croplands``|``grasslands``|``forests_mangrov...,,0,1.0,1.0,,0,none,
5,Input,Initial Land Use Area Proportion,Proportion of total **country land** area (%/1...,``frac_lndu_initial_$CAT-LANDUSE$``,all,,0,1.0,1.0,1.0,0,none,
6,Input,Land Use BOC :math:\text{CH}_4 Emission Factor,Annual average :math:`\text{CH}_4` emitted per...,``ef_lndu_boc_$CAT-LANDUSE$_$UNIT-MASS$_$EMISS...,``wetlands``,,0,1.0,1.0,,0,none,
7,Input,Land Use Fraction Dry,| Used to calculate :math:`\text{N}_2\text{O}...,``frac_lndu_$CAT-LANDUSE$_cl2_dry``,``grasslands``|``other``|``settlements``,,0,1.0,1.0,3.0,0,``dry_climate``,
8,Input,Land Use Fraction Fertilized,Fraction of the land use category that receive...,``frac_lndu_receiving_fertilizer_$CAT-LANDUSE$``,``croplands``|``grasslands``,,0,1.0,1.0,,0,none,
9,Input,Land Use Fraction Temperate,| Used to calculate :math:`\text{N}_2\text{O}...,``frac_lndu_$CAT-LANDUSE$_cl1_temperate``,``grasslands``,,0,1.0,1.0,4.0,0,``temperate_crop_grass``,


##  Now, let's grab a variable and look at it

SISEPUEDE distinguishes between *Variables* and *Variable Fields*.
- **Variables** are abstract objects that are associated with some number $\geq 0$ of categories within a subsector. All categories within the variable face the same underlying mathematical model when they are called in the variable system; however, different categories may be associated with different values.
- **Variable Fields** are fields associated with a variable. These fields are defined in the relevant sectoral variable definition attribute table.
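
The relationship between a variable and its fields can be sketched as a simple string expansion: one field per category. The `build_fields` helper below is hypothetical and only illustrates the idea; the schema and category come from the land use variable definition table above, and the category-free field name is one that appears later in this notebook.

```python
def build_fields(schema: str, element: str, categories: list) -> list:
    """Replace a category element in a schema with each category to form fields."""
    if element not in schema:
        # a variable with no categories maps to a single field
        return [schema]
    return [schema.replace(element, cat) for cat in categories]


# Schema and category taken from the land use variable definition table
fields = build_fields("frac_lndu_improved_$CAT-LANDUSE$", "$CAT-LANDUSE$", ["grasslands"])
print(fields)

# A schema without a category element yields exactly one field
print(build_fields("scalar_lndu_vegetarian_dietary_exchange", "$CAT-LANDUSE$", []))
```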

In [45]:
modvar = matt.get_variable("Area of Land Use Converted to Type")
print(type(modvar))

<class 'model_variable.ModelVariable'>


# `ModelVariable` objects contain most of the methods required to manipulate variables
##  (but `ModelAttributes` has wrappers that are most often called)

- e.g., `modvar.fields` gives the fields associated with the variable
- note that some variables are associated with `all` categories--such as the **Area of Land Use Converted to Type** variable we selected
- others, like **Area of Improved Land**, are associated with a subset of categories
- and others still--such as **Vegetarian Diet Exchange Scalar**--are associated with no category at all

In [46]:
modvar.fields

['area_lndu_conversion_to_croplands',
 'area_lndu_conversion_to_forests_mangroves',
 'area_lndu_conversion_to_forests_primary',
 'area_lndu_conversion_to_forests_secondary',
 'area_lndu_conversion_to_grasslands',
 'area_lndu_conversion_to_other',
 'area_lndu_conversion_to_settlements',
 'area_lndu_conversion_to_wetlands']

In [93]:
# properties from the attribute table can be retrieved--e.g., any field in the attribute table
modvar.get_property("variable_type")

'Output'

In [61]:
# model variable with a reduced category space
modvar_reduced = matt.get_variable("Area of Improved Land")
print(modvar_reduced.fields)
print(modvar_reduced.dict_category_keys) # dictionary mapping a primary category to valid categories

print("\n")

# model variable with no categories: a single field and an empty category dictionary
modvar_nocat = matt.get_variable("Vegetarian Diet Exchange Scalar")
print(modvar_nocat.fields)
print(modvar_nocat.dict_category_keys)

['area_lndu_improved_croplands', 'area_lndu_improved_grasslands']
{'cat_landuse': ['croplands', 'grasslands']}


['scalar_lndu_vegetarian_dietary_exchange']
{}


##  `ModelVariable` objects also store units and gasses

- we can see information about:
    - the gas it is associated with
    - units, like mass and area

- if any of these are not applicable, then usually `None` is returned

In [265]:
modvar_ef = matt.get_variable(":math:\\text{CO}_2 Land Use Conversion Emission Factor")

In [266]:
# the emission factor is for co2
modvar_ef.attribute("emission_gas")

'co2'

In [267]:
# the emission factor is expressed in units of mass
modvar_ef.attribute("unit_mass")

'gg'

In [268]:
# the emission factor is per unit area
modvar_ef.attribute("unit_area")

'ha'

In [269]:
# power is not associated with the model variable, so nothing shows up
modvar_ef.attribute("unit_power")

##  The `ModelVariable` is defined using the `variable_schema` field in its applicable attribute table

In [227]:
# here's the general form of the schema after attributes have been defined 
modvar_ef.schema.schema

'ef_lndu_conv_$CAT-LANDUSE-DIM1$_to_$CAT-LANDUSE-DIM2$_gg_co2_ha'

In [228]:
# here's how it was defined in the original variable table
modvar_ef.dict_varinfo.get("variable_schema")

'``ef_lndu_conv_$CAT-LANDUSE-DIM1$_to_$CAT-LANDUSE-DIM2$_$UNIT-MASS$_$EMISSION-GAS$_$UNIT-AREA$`` (``$UNIT-MASS$ = gg``, ``$EMISSION-GAS$ = co2``, ``$UNIT-AREA$ = ha``)'

In [253]:
# of course, we can back this out and see it in the original attribute table

# get the original attribute table
tab = matt.get_attribute_table(
    modvar_ef.dict_varinfo.get("subsector"),
    table_type = "variable_definitions",
)

# filter it
if tab is not None:
    tab = tab.table
    tab = tab[tab["variable"] == modvar_ef.name]

tab["variable_schema"].iloc[0]

'``ef_lndu_conv_$CAT-LANDUSE-DIM1$_to_$CAT-LANDUSE-DIM2$_$UNIT-MASS$_$EMISSION-GAS$_$UNIT-AREA$`` (``$UNIT-MASS$ = gg``, ``$EMISSION-GAS$ = co2``, ``$UNIT-AREA$ = ha``)'
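
The step from the raw definition to the resolved schema can be sketched with a few string operations. The `resolve_schema` function below is a hypothetical illustration of that substitution, assuming the raw string always has the form shown above (a backticked schema followed by backticked `element = value` assignments); it is not how `ModelVariable` is necessarily implemented.

```python
import re

# Raw definition copied from the variable definition table output above
RAW = (
    "``ef_lndu_conv_$CAT-LANDUSE-DIM1$_to_$CAT-LANDUSE-DIM2$"
    "_$UNIT-MASS$_$EMISSION-GAS$_$UNIT-AREA$`` "
    "(``$UNIT-MASS$ = gg``, ``$EMISSION-GAS$ = co2``, ``$UNIT-AREA$ = ha``)"
)


def resolve_schema(raw: str) -> tuple:
    """Split a raw definition into (resolved schema, attribute dictionary)."""
    # every double-backticked chunk; the first is the schema, the rest are assignments
    chunks = re.findall(r"``(.+?)``", raw)
    schema, assignments = chunks[0], chunks[1:]

    attributes = {}
    for assignment in assignments:
        element, value = [x.strip() for x in assignment.split("=")]
        schema = schema.replace(element, value)
        # e.g. "$UNIT-MASS$" -> "unit_mass"
        attributes[element.strip("$").lower().replace("-", "_")] = value

    return schema, attributes


schema, attributes = resolve_schema(RAW)
print(schema)
print(attributes)
print(attributes.get("unit_power"))  # no power unit is assigned, so the lookup is empty
```

Category elements such as `$CAT-LANDUSE-DIM1$` are left untouched here because they are expanded into fields separately.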

# Gasses and Units are passed to `ModelVariables` in the schema; they're managed by `ModelAttributes`
- the gas attribute table includes information on the gasses that are accounted for, along with global warming potential information, the source, and any fluorinated compound designation

In [257]:
# look at the gas attribute table
attr_gas = matt.get_other_attribute_table("emission_gas")
attr_gas.attribute_table

Unnamed: 0,gas,emission_gas,name,global_warming_potential_20,global_warming_potential_100,global_warming_potential_500,source,flourinated_compound_designation
0,:math:`\text{CH}_4`,ch4,Methane,81.2,27.9,7.95,IPCC AR6,none
1,:math:`\text{CO}_2`,co2,Carbon Dioxide,1.0,1.0,1.0,IPCC AR6,none
2,:math:`\text{N}_2\text{O}`,n2o,Nitrus Oxide,273.0,273.0,130.0,IPCC AR6,none
3,:math:`\text{NF}_3`,nf3,Nitrogen Trifluoride,13400.0,17400.0,18200.0,IPCC AR6,Other FC
4,:math:`\text{SF}_6`,sf6,Sulfur Hexflouride,18300.0,25200.0,34100.0,IPCC AR6,Other FC
5,Dodecafluoropentane,c5f12,Dodecafluoropentane,6510.0,9160.0,13300.0,| `Ivy et al. (2012) <https://doi.org/10.5194...,Other FC
6,HCFC-141b,ch3cci2f,"1,1-Dichloro-1-fluoroethane (HCFC-141b)",2710.0,860.0,246.0,IPCC AR6,Other FC
7,HCFC-142b,ch3ccif2,"1-Chloro-1,1-difluoroethane (HCFC-142b)",5510.0,2300.0,658.0,IPCC AR6,Other FC
8,Hexadecafluoroheptane,c7f16,Hexadecafluoroheptane,5630.0,7930.0,11300.0,`Ivy et al. (2012) <https://doi.org/10.5194/ac...,Other FC
9,HFC-125,c2hf5,Pentafluoroethane (HFC-125),6740.0,3740.0,1110.0,IPCC AR6,HFC
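
One way a table like this can be used is to weight emissions of different gasses into a single CO2-equivalent total. The sketch below does that with the 100-year values from the table above; the `to_co2e` function and the emission masses are hypothetical, and how SISEPUEDE applies these values internally is not shown here.

```python
# 100-year global warming potentials copied from the gas attribute table above
GWP_100 = {"ch4": 27.9, "co2": 1.0, "n2o": 273.0}


def to_co2e(emissions_by_gas: dict, gwp: dict = GWP_100) -> float:
    """Sum emissions (same mass unit for every gas) weighted by GWP."""
    return sum(mass * gwp[gas] for gas, mass in emissions_by_gas.items())


# hypothetical emission masses, all expressed in the same mass unit
total = to_co2e({"co2": 10.0, "ch4": 2.0, "n2o": 0.1})
print(total)
```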


# Variable units often need to be converted, both in the model and in data construction, so each unit has a `Units` class to help with that

-  Use `ModelAttributes.all_units` to see all valid units

In [73]:
matt.all_units

['area', 'energy', 'length', 'mass', 'monetary', 'power', 'volume']

##  Use the `ModelAttributes.get_unit()` method to access unit attribute tables
- any unit in `ModelAttributes.all_units` is valid

In [164]:
unit_energy = matt.get_unit("energy")
unit_energy.attribute_table

Unnamed: 0,energy,unit_energy,hourly_unit_power_equivalent,annualized_unit_power_equivalent,name,energy_equivalent_kj,energy_equivalent_btu,energy_equivalent_mj,energy_equivalent_mbtu,energy_equivalent_kwh,energy_equivalent_gj,energy_equivalent_mmbtu,energy_equivalent_mwh,energy_equivalent_tj,energy_equivalent_mwy,energy_equivalent_pj,energy_equivalent_gwy
0,kJ,kj,none,none,Kilojoule,1.0,0.9478134,0.001,0.00095,0.000277778,1e-06,9.48e-07,2.78e-07,1e-09,3.17e-11,1e-12,3.17e-14
1,BTU,btu,none,none,British Thermal Unit,1.05506,1.0,0.00105506,0.001,0.000293071,1.06e-06,1e-06,2.93071e-07,1.06e-09,3.34e-11,1.06e-12,3.34e-14
2,MJ,mj,none,none,Megajoule,1000.0,947.8134,1.0,0.95,0.2777778,0.001,0.000947813,0.000278,1e-06,3.17e-08,1e-09,3.17e-11
3,MBTU,mbtu,none,none,Thousand British Thermal Unit,1055.06,1000.0,1.05506,1.0,0.2930722,0.00105506,0.001,0.000293,1.06e-06,3.34e-08,1.06e-09,3.34e-11
4,kWh,kwh,``kw``,none,Kilowatt-hour,3600.0,3412.128,3.6,3.41,1.0,0.0036,0.003412128,0.001,3.6e-06,1.14e-07,3.6e-09,1.14e-10
5,GJ,gj,none,none,Gigajoule,1000000.0,947813.4,1000.0,947.81,277.7778,1.0,0.9478134,0.278,0.001,3.17e-05,1e-06,3.17e-08
6,MMBTU,mmbtu,none,none,Million British Thermal Unit,1055060.0,1000000.0,1055.06,1000.0,293.0722,1.05506,1.0,0.293,0.00105506,3.34e-05,1.06e-06,3.34e-08
7,MWh,mwh,``mw``,none,Megawatt-hour,3600000.0,3412128.0,3600.0,3412.13,1000.0,3.6,3.412128,1.0,0.0036,0.000114077,3.6e-06,1.14e-07
8,TJ,tj,none,none,Terajoule,1000000000.0,947813400.0,1000000.0,947813.4,277777.8,1000.0,947.8134,278.0,1.0,0.03168809,0.001,3.17e-05
9,MWy,mwy,none,``mw``,Megawatt-year,31557600000.0,29910720000.0,31600000.0,29910720.0,8766000.0,31557.6,29910.72,8770.0,31.5576,1.0,0.0315576,0.001


##  Use the `Units` class to convert units when known
- we use energy as an example, but the same principle applies for all units

In [174]:
# can accept values from the columns `energy`, `unit_energy` or `name` 
print(unit_energy.convert("pj", "gwy"))
print(unit_energy.convert("pj", "GWy"))
print(unit_energy.convert("pj", "Gigawatt-year"))


0.031688088
0.031688088
0.031688088


In [175]:
# invalid units will return 1, though you can set the return value using the `missing_return_val` keyword arg
print(unit_energy.convert("pj", "gwY"))

1
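
The lookup behaviour can be mimicked with a nested dictionary. The sketch below is a simplified, hypothetical stand-in for `Units.convert`: it uses two rows of the energy table above, a hand-written alias dictionary, and the same default of returning 1 for an unrecognized unit.

```python
# Conversion factors copied from the energy unit table above (row unit -> column unit)
ENERGY_EQUIVALENTS = {
    "kwh": {"kwh": 1.0, "gj": 0.0036, "mwh": 0.001},
    "mwh": {"kwh": 1000.0, "gj": 3.6, "mwh": 1.0},
}

# Hypothetical alias map so display names resolve to the same key
ALIASES = {
    "kWh": "kwh", "Kilowatt-hour": "kwh",
    "MWh": "mwh", "Megawatt-hour": "mwh",
    "GJ": "gj", "Gigajoule": "gj",
}


def convert(unit_in: str, unit_out: str, missing_return_val=1):
    """Return the factor that converts unit_in to unit_out, or a fallback value."""
    unit_in = ALIASES.get(unit_in, unit_in)
    unit_out = ALIASES.get(unit_out, unit_out)
    return ENERGY_EQUIVALENTS.get(unit_in, {}).get(unit_out, missing_return_val)


print(convert("kwh", "gj"))
print(convert("MWh", "Gigajoule"))
print(convert("kwh", "gJ"))                           # unrecognized unit: falls back to 1
print(convert("kwh", "gJ", missing_return_val=None))  # explicit sentinel instead
```

Because a fallback of 1 is indistinguishable from a valid identity conversion, passing a sentinel such as `None` makes a misspelled unit easier to catch.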


##  SISEPUEDE lets you convert between variable units reliably and easily
- you don't have to know the units that it's in; this is convenient if a user has adjusted input units in the attribute table


In [260]:
modvar_1 = matt.get_variable("Electrical Vehicle Efficiency")
modvar_2 = matt.get_variable("Gravimetric Energy Density")

# scalar to write energy units of modvar_1 in terms of energy units of modvar_2
# i.e., scalar*units(energy_1) -> units(energy_2)
    
scalar = matt.get_variable_unit_conversion_factor(
    modvar_1,
    modvar_2,
    "energy"
)

units_1 = modvar_1.attribute("unit_energy")
units_2 = modvar_2.attribute("unit_energy")
print(f"units modvar_1: {units_1}")
print(f"units modvar_2: {units_2}")
print(f"\nScalar applied to modvar_1 to convert energy units of modvar_1 to energy units of modvar_2:\n\t{scalar}")


units modvar_1: kwh
units modvar_2: gj

Scalar applied to modvar_1 to convert energy units of modvar_1 to energy units of modvar_2:
	0.0036


# `ModelAttributes` is also used to manipulate data in the SISEPUEDE framework
- Build fields associated with variables
- Extract data from dataframes
- Build dataframes from arrays

This is very convenient for building input data using data from other sources

In [82]:
# read in some example data
df_example = pd.read_csv(
    pathlib.Path(file_struct.dir_ref).joinpath("fake_data", "fake_data_complete.csv")
)

In [84]:
df_example.shape

(36, 2206)

In [85]:
df_example.head()

Unnamed: 0,time_period,area_gnrl_country_ha,avgload_trns_freight_tonne_per_vehicle_aviation,avgload_trns_freight_tonne_per_vehicle_rail_freight,avgload_trns_freight_tonne_per_vehicle_road_heavy_freight,avgload_trns_freight_tonne_per_vehicle_water_borne,avgmass_lvst_animal_buffalo_kg,avgmass_lvst_animal_cattle_dairy_kg,avgmass_lvst_animal_cattle_nondairy_kg,avgmass_lvst_animal_chickens_kg,...,yf_agrc_other_woody_perennial_tonne_ha,yf_agrc_pulses_tonne_ha,yf_agrc_rice_tonne_ha,yf_agrc_sugar_cane_tonne_ha,yf_agrc_tubers_tonne_ha,yf_agrc_vegetables_and_vines_tonne_ha,elasticity_lvst_goats_demand_to_gdppc,elasticity_lvst_horses_demand_to_gdppc,elasticity_lvst_mules_demand_to_gdppc,elasticity_lvst_buffalo_demand_to_gdppc
0,0,5113100.0,70.0,2000.0,20.0,2000.0,315.0,508.0,303.0,1.1,...,20.0,0.62,2.9,1.62,20.0,2.8,0.42,0.42,0.42,0.22
1,1,5113100.0,70.0,2000.0,20.0,2000.0,315.0,508.0,303.0,1.1,...,20.0,0.62,2.9,1.62,20.0,2.8,0.415,0.415,0.415,0.217
2,2,5113100.0,70.0,2000.0,20.0,2000.0,315.0,508.0,303.0,1.1,...,20.0,0.62,2.9,1.62,20.0,2.8,0.41,0.41,0.41,0.214
3,3,5113100.0,70.0,2000.0,20.0,2000.0,315.0,508.0,303.0,1.1,...,20.0,0.62,2.9,1.62,20.0,2.8,0.405,0.405,0.405,0.211
4,4,5113100.0,70.0,2000.0,20.0,2000.0,315.0,508.0,303.0,1.1,...,20.0,0.62,2.9,1.62,20.0,2.8,0.4,0.4,0.4,0.208


In [94]:
# the conversion emission factor is an input variable
modvar_ef.get_property("variable_type")

'Input'

###  We can extract the data in one of two ways: directly, using the `ModelVariable.get_from_dataframe` method

In [100]:
modvar_ef.get_from_dataframe(df_example).head()

Unnamed: 0,ef_lndu_conv_croplands_to_croplands_gg_co2_ha,ef_lndu_conv_croplands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_croplands_to_forests_primary_gg_co2_ha,ef_lndu_conv_croplands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_croplands_to_grasslands_gg_co2_ha,ef_lndu_conv_croplands_to_other_gg_co2_ha,ef_lndu_conv_croplands_to_settlements_gg_co2_ha,ef_lndu_conv_croplands_to_wetlands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_croplands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_forests_mangroves_gg_co2_ha,...,ef_lndu_conv_settlements_to_settlements_gg_co2_ha,ef_lndu_conv_settlements_to_wetlands_gg_co2_ha,ef_lndu_conv_wetlands_to_croplands_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_primary_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_wetlands_to_grasslands_gg_co2_ha,ef_lndu_conv_wetlands_to_other_gg_co2_ha,ef_lndu_conv_wetlands_to_settlements_gg_co2_ha,ef_lndu_conv_wetlands_to_wetlands_gg_co2_ha
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


###  Or, using the `ModelAttributes.extract_model_variable` method

In [112]:
matt.extract_model_variable(
    df_example,
    modvar_ef,
).head()

Unnamed: 0,ef_lndu_conv_croplands_to_croplands_gg_co2_ha,ef_lndu_conv_croplands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_croplands_to_forests_primary_gg_co2_ha,ef_lndu_conv_croplands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_croplands_to_grasslands_gg_co2_ha,ef_lndu_conv_croplands_to_other_gg_co2_ha,ef_lndu_conv_croplands_to_settlements_gg_co2_ha,ef_lndu_conv_croplands_to_wetlands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_croplands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_forests_mangroves_gg_co2_ha,...,ef_lndu_conv_settlements_to_settlements_gg_co2_ha,ef_lndu_conv_settlements_to_wetlands_gg_co2_ha,ef_lndu_conv_wetlands_to_croplands_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_primary_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_wetlands_to_grasslands_gg_co2_ha,ef_lndu_conv_wetlands_to_other_gg_co2_ha,ef_lndu_conv_wetlands_to_settlements_gg_co2_ha,ef_lndu_conv_wetlands_to_wetlands_gg_co2_ha
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


# Most `ModelAttributes` variable manipulation methods accept `ModelVariable` objects **OR** variable names as modvar arguments

In [114]:

matt.extract_model_variable(
    df_example,
    ":math:\\text{CO}_2 Land Use Conversion Emission Factor",
).head()

Unnamed: 0,ef_lndu_conv_croplands_to_croplands_gg_co2_ha,ef_lndu_conv_croplands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_croplands_to_forests_primary_gg_co2_ha,ef_lndu_conv_croplands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_croplands_to_grasslands_gg_co2_ha,ef_lndu_conv_croplands_to_other_gg_co2_ha,ef_lndu_conv_croplands_to_settlements_gg_co2_ha,ef_lndu_conv_croplands_to_wetlands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_croplands_gg_co2_ha,ef_lndu_conv_forests_mangroves_to_forests_mangroves_gg_co2_ha,...,ef_lndu_conv_settlements_to_settlements_gg_co2_ha,ef_lndu_conv_settlements_to_wetlands_gg_co2_ha,ef_lndu_conv_wetlands_to_croplands_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_mangroves_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_primary_gg_co2_ha,ef_lndu_conv_wetlands_to_forests_secondary_gg_co2_ha,ef_lndu_conv_wetlands_to_grasslands_gg_co2_ha,ef_lndu_conv_wetlands_to_other_gg_co2_ha,ef_lndu_conv_wetlands_to_settlements_gg_co2_ha,ef_lndu_conv_wetlands_to_wetlands_gg_co2_ha
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1e-05,0.000771,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Model variables that are defined for only a subset of categories can be returned either with their own categories or expanded to the full category set. Expanding is useful for algebraic operations on arrays because the columns are then always ordered consistently.

In [135]:
# without merging to full category
matt.extract_model_variable(
    df_example,
    "Unimproved Soil Carbon Land Management Factor",
).head()

Unnamed: 0,factor_lndu_soil_management_unimproved_croplands,factor_lndu_soil_management_unimproved_grasslands
0,1.0,0.9
1,1.0,0.9
2,1.0,0.9
3,1.0,0.9
4,1.0,0.9


In [134]:
# with merging--note that, if a dataframe is returned here, fields are given as categories. 
# This is because the variable is not defined for other categories.
matt.extract_model_variable(
    df_example,
    "Unimproved Soil Carbon Land Management Factor",
    expand_to_all_cats = True,
).head()

Unnamed: 0,croplands,forests_mangroves,forests_primary,forests_secondary,grasslands,other,settlements,wetlands
0,1.0,0.0,0.0,0.0,0.9,0.0,0.0,0.0
1,1.0,0.0,0.0,0.0,0.9,0.0,0.0,0.0
2,1.0,0.0,0.0,0.0,0.9,0.0,0.0,0.0
3,1.0,0.0,0.0,0.0,0.9,0.0,0.0,0.0
4,1.0,0.0,0.0,0.0,0.9,0.0,0.0,0.0
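
Conceptually, the expansion places each defined column at the position of its category in the full category list and fills the rest with zeros. The following is a minimal pure-Python sketch of that idea; `expand_to_all_categories` is a hypothetical helper, not the SISEPUEDE implementation.

```python
# Category order copied from the land use category table in this notebook
ALL_CATEGORIES = [
    "croplands", "forests_mangroves", "forests_primary", "forests_secondary",
    "grasslands", "other", "settlements", "wetlands",
]


def expand_to_all_categories(rows, categories, all_categories=ALL_CATEGORIES, fill=0.0):
    """Place each partial row into a full-width row ordered like all_categories."""
    index = {cat: i for i, cat in enumerate(all_categories)}
    expanded = []
    for row in rows:
        full = [fill] * len(all_categories)
        for cat, value in zip(categories, row):
            full[index[cat]] = value
        expanded.append(full)
    return expanded


partial = [[1.0, 0.9], [1.0, 0.9]]  # columns: croplands, grasslands
full = expand_to_all_categories(partial, ["croplands", "grasslands"])
print(full[0])
```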


In [133]:
# you can also just return an ordered numpy array
arr_unimproved = matt.extract_model_variable(
    df_example,
    "Unimproved Soil Carbon Land Management Factor",
    expand_to_all_cats = True,
    return_type = "array_base",
)
arr_unimproved

array([[1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0. , 0. , 0.9, 0. , 0. , 0. ],
       [1. , 0. , 0.

In [130]:
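# you could modify this (or combine arrays associated with the same subsector)
# note: `*=` scales the array in place, so re-running this cell compounds the factor;
# the output shown below (8 and 7.2) is consistent with three executions (2**3 = 8)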
arr_unimproved *= 2
arr_unimproved

array([[8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0. , 0. , 7.2, 0. , 0. , 0. ],
       [8. , 0. , 0.

###  Then, create a data frame again; use the `reduce_from_all_cats_to_specified_cats` keyword to reduce it back to the correct set of fields

In [136]:
modvar_sc = "Unimproved Soil Carbon Land Management Factor"
df_arr_new = matt.array_to_df(
    arr_unimproved,
    modvar_sc,
    reduce_from_all_cats_to_specified_cats = True,
)
df_arr_new.head()

Unnamed: 0,factor_lndu_soil_management_unimproved_croplands,factor_lndu_soil_management_unimproved_grasslands
0,1.0,0.9
1,1.0,0.9
2,1.0,0.9
3,1.0,0.9
4,1.0,0.9


In [138]:
# the variable object also works here
modvar_sc = matt.get_variable("Unimproved Soil Carbon Land Management Factor")
df_arr_new = matt.array_to_df(
    arr_unimproved,
    modvar_sc,
    reduce_from_all_cats_to_specified_cats = True,
)
df_arr_new.head()

Unnamed: 0,factor_lndu_soil_management_unimproved_croplands,factor_lndu_soil_management_unimproved_grasslands
0,1.0,0.9
1,1.0,0.9
2,1.0,0.9
3,1.0,0.9
4,1.0,0.9
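
The reduction step is the inverse of the expansion: keep only the columns whose categories the variable is defined for, and name them with the variable's field names. The sketch below illustrates this in plain Python; `reduce_to_specified_categories` and the field prefix argument are assumptions made for the example.

```python
# Category order copied from the land use category table in this notebook
ALL_CATEGORIES = [
    "croplands", "forests_mangroves", "forests_primary", "forests_secondary",
    "grasslands", "other", "settlements", "wetlands",
]


def reduce_to_specified_categories(rows, categories, prefix):
    """Keep the columns for the given categories and build matching field names."""
    keep = [(i, cat) for i, cat in enumerate(ALL_CATEGORIES) if cat in categories]
    fields = [f"{prefix}_{cat}" for _, cat in keep]
    reduced = [[row[i] for i, _ in keep] for row in rows]
    return fields, reduced


full_rows = [[1.0, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0] for _ in range(2)]
fields, reduced = reduce_to_specified_categories(
    full_rows,
    ["croplands", "grasslands"],
    "factor_lndu_soil_management_unimproved",
)
print(fields)
print(reduced)
```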


##  Finally, the `ModelAttributes.build_variable_fields()` method can be used to build fields associated with a variable for either default categories or any valid subset

In [185]:
# if you specify a single string as the category, it will return a string field
modvar_ail = "Area of Land Use Converted to Type"
matt.build_variable_fields(
    modvar_ail,
    restrict_to_category_values = "croplands"
)

'area_lndu_conversion_to_croplands'

In [186]:
# if you specify a list, it will return the corresponding list
matt.build_variable_fields(
    modvar_ail,
    restrict_to_category_values = ["croplands"]
)

['area_lndu_conversion_to_croplands']

In [188]:
# a list with several categories returns one field per category
matt.build_variable_fields(
    modvar_ail,
    restrict_to_category_values = ["croplands", "forests_primary", "settlements"]
)

['area_lndu_conversion_to_croplands',
 'area_lndu_conversion_to_forests_primary',
 'area_lndu_conversion_to_settlements']

In [189]:
# it ignores invalid values by default
matt.build_variable_fields(
    modvar_ail,
    restrict_to_category_values = ["croplands", "forests_primary", "settlements", "public_private"]
)

['area_lndu_conversion_to_croplands',
 'area_lndu_conversion_to_forests_primary',
 'area_lndu_conversion_to_settlements']

# Some variables have multiple dimensions of categories, which require dictionaries to restrict each dimension separately
- E.g., transition probabilities and land use conversion biomass emission factors

In [193]:
# if you specify as a list, all dimensions receive the same restriction
modvar_pij = "Unadjusted Land Use Transition Probability"
matt.build_variable_fields(
    modvar_pij,
    restrict_to_category_values = ["croplands", "forests_primary", "settlements"]
)


['pij_lndu_croplands_to_croplands',
 'pij_lndu_croplands_to_forests_primary',
 'pij_lndu_croplands_to_settlements',
 'pij_lndu_forests_primary_to_croplands',
 'pij_lndu_forests_primary_to_forests_primary',
 'pij_lndu_forests_primary_to_settlements',
 'pij_lndu_settlements_to_croplands',
 'pij_lndu_settlements_to_forests_primary',
 'pij_lndu_settlements_to_settlements']

In [204]:
# if you specify as a dictionary, by default, only those with keys are reduced (only the first dimension is restricted here)
#
# NOTE: variables defined across multiple dimensions of a single primary categorization use `dimN` 
# to specify how they show up in schema
matt.build_variable_fields(
    modvar_pij,
    restrict_to_category_values = {
        "cat_landuse_dim1": ["croplands", "forests_primary", "settlements"]
    }
)


['pij_lndu_croplands_to_croplands',
 'pij_lndu_croplands_to_forests_mangroves',
 'pij_lndu_croplands_to_forests_primary',
 'pij_lndu_croplands_to_forests_secondary',
 'pij_lndu_croplands_to_grasslands',
 'pij_lndu_croplands_to_other',
 'pij_lndu_croplands_to_settlements',
 'pij_lndu_croplands_to_wetlands',
 'pij_lndu_forests_primary_to_croplands',
 'pij_lndu_forests_primary_to_forests_mangroves',
 'pij_lndu_forests_primary_to_forests_primary',
 'pij_lndu_forests_primary_to_forests_secondary',
 'pij_lndu_forests_primary_to_grasslands',
 'pij_lndu_forests_primary_to_other',
 'pij_lndu_forests_primary_to_settlements',
 'pij_lndu_forests_primary_to_wetlands',
 'pij_lndu_settlements_to_croplands',
 'pij_lndu_settlements_to_forests_mangroves',
 'pij_lndu_settlements_to_forests_primary',
 'pij_lndu_settlements_to_forests_secondary',
 'pij_lndu_settlements_to_grasslands',
 'pij_lndu_settlements_to_other',
 'pij_lndu_settlements_to_settlements',
 'pij_lndu_settlements_to_wetlands']

In [203]:
# reduce the second dimension too
matt.build_variable_fields(
    modvar_pij,
    restrict_to_category_values = {
        "cat_landuse_dim1": ["croplands", "forests_primary", "settlements"],
        "cat_landuse_dim2": ["forests_mangroves", "forests_primary", "other"]
    }
)


['pij_lndu_croplands_to_forests_mangroves',
 'pij_lndu_croplands_to_forests_primary',
 'pij_lndu_croplands_to_other',
 'pij_lndu_forests_primary_to_forests_mangroves',
 'pij_lndu_forests_primary_to_forests_primary',
 'pij_lndu_forests_primary_to_other',
 'pij_lndu_settlements_to_forests_mangroves',
 'pij_lndu_settlements_to_forests_primary',
 'pij_lndu_settlements_to_other']
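
The dictionary-based restriction amounts to a Cartesian product over the allowed categories in each dimension, where a dimension without a key keeps every category. Below is a minimal sketch of that logic; the `build_fields_2d` helper and the schema string (written by analogy with the emission factor schema shown earlier) are assumptions, not the library's implementation.

```python
from itertools import product

# Category order copied from the land use category table in this notebook
ALL_CATEGORIES = [
    "croplands", "forests_mangroves", "forests_primary", "forests_secondary",
    "grasslands", "other", "settlements", "wetlands",
]

# Restriction key -> schema element (assumed naming, following the `dimN` convention)
ELEMENTS = {
    "cat_landuse_dim1": "$CAT-LANDUSE-DIM1$",
    "cat_landuse_dim2": "$CAT-LANDUSE-DIM2$",
}


def build_fields_2d(schema, restrictions=None):
    """Build fields for a two-dimensional variable with optional per-dimension limits."""
    restrictions = restrictions or {}
    dims = []
    for key in ELEMENTS:
        allowed = restrictions.get(key)
        # unrestricted dimensions keep every category; invalid values drop out
        dims.append([c for c in ALL_CATEGORIES if allowed is None or c in allowed])

    fields = []
    for combo in product(*dims):
        field = schema
        for element, cat in zip(ELEMENTS.values(), combo):
            field = field.replace(element, cat)
        fields.append(field)
    return fields


schema_pij = "pij_lndu_$CAT-LANDUSE-DIM1$_to_$CAT-LANDUSE-DIM2$"  # assumed schema
first_only = build_fields_2d(
    schema_pij,
    {"cat_landuse_dim1": ["croplands", "forests_primary", "settlements"]},
)
both = build_fields_2d(
    schema_pij,
    {
        "cat_landuse_dim1": ["croplands", "forests_primary", "settlements"],
        "cat_landuse_dim2": ["forests_mangroves", "forests_primary", "other"],
    },
)
print(len(first_only), len(both))
print(both[0])
```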