<img src="../img/banner.png" alt="Powered by MSD-LIVE" width = 600 style="display: block;margin-left: auto;margin-right: auto;width: 50%;" />


# CSDMS 2021 Annual Meeting

### CLINIC: Thursday, May 20, 11:00am - 1:00pm MST

## GCAM and Demeter:  A global, integrated human-Earth systems perspective to modeling land projections

### Pralit Patel, pralit.patel<span></span>@pnnl.gov and Chris Vernon, chris.vernon<span></span>@pnnl.gov

Researchers and decision makers are increasingly interested in understanding the many ways in which human and Earth systems interact with one another, at scales from local (e.g., a city) to regional to global. For example, how might changes in population, income, or technology development alter crop production, energy demand, or water withdrawals? How do changes in one region's demand for energy affect energy, water, and land in other regions? This session will focus on two models – GCAM and Demeter – that provide capabilities to address these types of questions.


GCAM is an open-source, global, market equilibrium model that represents the linkages between energy, water, land, climate, and economic systems. A strength of GCAM is that it can be used to quickly explore, and quantify the uncertainty in, a large number of alternate future scenarios while accounting for multi-sector, human-Earth system dynamics. One of GCAM’s many outputs is projected land cover/use by subregion. Subregional projections provide context and can be used to understand regional land dynamics; however, Earth System Models (ESMs) generally require gridded representations of land at finer scales. Demeter, a land use and land cover disaggregation model, was created to provide this service. Demeter directly ingests land projections from GCAM and creates gridded products that match the desired resolution and land class requirements of the user.

This clinic will introduce both GCAM and Demeter at a high level. We will also provide a hands-on walkthrough of a reference case so attendees can become familiar with configuring and running these two models. Our goal is for attendees to leave the clinic with 1) an understanding of the value of capturing a global perspective when informing subregional and local analysis, 2) an understanding of the possibilities for scenario exploration experiments that capture multi-sector and multi-scale dynamics, and 3) hands-on experience with GCAM and Demeter.

## Software Availability

Demeter: https://github.com/JGCRI/demeter  
GCAM: https://github.com/JGCRI/gcam-core  
gcamwrapper: https://github.com/JGCRI/gcamwrapper  
im3vis: https://github.com/IMMM-SFA/im3vis


# Value of this demonstration

This demonstration shows how to integrate GCAM, using its new Python wrapper, into existing workflows that research energy, water, land, climate, and socioeconomic dynamics, in order to provide a global, human-Earth perspective.

## Overview of the experiment to demonstrate capabilities

For this demonstration we will attempt to incorporate climate-change-induced yield impacts in GCAM. To do so we will use an archive of global gridded crop model results published as part of the [ISIMIP](https://doi.org/10.1073/pnas.1222463110) project. This exercise has been conducted before (see [Snyder 2020](https://doi.org/10.1371/journal.pone.0237918) and [Calvin 2020](https://doi.org/10.1142/S2010007820500050)), but a shortcoming in the design of those experiments is that they assume a fixed map of crop production, which is used to weight the gridded yield impacts. Similar studies such as [Fujimori 2018](https://doi.org/10.3390/su10103673) have indicated that missing this dynamic can create significant bias.

We can use the dynamic GCAM-Demeter coupling to address this shortcoming by updating the map of where GCAM+Demeter projects crops to be grown, which is what we attempt in this demonstration. One caveat: quite a number of simplifying assumptions were made to ensure computational tractability.
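
To make the weighting issue concrete, here is a minimal sketch with made-up numbers. It computes a regional yield impact as a crop-area-weighted mean of per-cell impacts and shows how the result shifts when the crop map changes. The function `weighted_impact` and every value below are illustrative assumptions; none of this is GCAM or Demeter code.

```python
import numpy as np

def weighted_impact(yield_change, crop_area):
    """Crop-area-weighted mean of per-cell yield multipliers (hypothetical helper)."""
    yield_change = np.asarray(yield_change, dtype=float)
    crop_area = np.asarray(crop_area, dtype=float)
    return float(np.sum(yield_change * crop_area) / np.sum(crop_area))

# Made-up yield multipliers for four grid cells in one region
yield_change = [0.80, 0.90, 1.00, 1.10]

# A fixed historical crop map versus an updated map in which
# production has shifted toward the cells with better yields
fixed_area = [40.0, 30.0, 20.0, 10.0]
updated_area = [10.0, 20.0, 30.0, 40.0]

fixed = weighted_impact(yield_change, fixed_area)
updated = weighted_impact(yield_change, updated_area)
print(f"fixed map: {fixed:.2f}, updated map: {updated:.2f}")
```

With these toy numbers the same gridded impacts give a 10% regional yield loss under the fixed map and no net change under the updated map, which is the kind of bias a static crop map can introduce.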

### Outline

- Load and initialize GCAM through its calibration years
  - Visualizations of GCAM land use results
- Load Demeter and downscale GCAM land use decisions
- Run and query the first projected GCAM results _without_ climate impacts
  - Query results for comparison
- Load ISIMIP yields and weight by Demeter crop maps
- Apply yield impacts and re-run GCAM
- Compare results with and without climate impacts

# Import packages

In [None]:
# general purpose packages
import time
import pkg_resources

import pandas as pd
import geopandas as gpd

# load demeter and GCAM via gcamwrapper
import demeter
from demeter import Model
import gcamwrapper as gw

# some utilities to help keep the notebook tidy
import im3vis
import demo_utils
import gcam_demeter_clinic


# Using GCAM via `gcamwrapper`

#### Load GCAM specific files

In [None]:
# path to the exe directory where gcam-core is installed
core_exe_path = 'gcam-core/exe'

# path to the xml configuration file you want to use
config_xml_file = 'configuration_reduced.xml'


#### Instantiate GCAM

In [None]:
%%time

g = gw.Gcam(config_xml_file, core_exe_path)


#### Run the model through its calibration years, up to and including the base year of 2015

In [None]:
%%time

final_cal_period = g.convert_year_to_period(2015)
g.run_to_period(final_cal_period)


#### Query land allocation data from the current GCAM run to produce projected land allocation by land region

In [None]:
# load query from query library
query_string = gw.get_query('land', 'land_allocation')

# query specific information
query_params = {'region': ['*'], 'leaf': ['*'], 'year': ['<=', g.get_current_year()]}

# create an output data frame containing land data
land_df = g.get_data(query_string, query_params)

land_df.head()
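
The `query_params` dictionary pairs each query dimension with an operator and, where needed, a value to compare against. As a rough mental model only, the sketch below applies the same style of filter specification to a plain pandas DataFrame. The helper `apply_filters` and the toy data are hypothetical, and this is not a description of how `gcamwrapper` evaluates queries internally.

```python
import operator
import pandas as pd

# Map operator strings to comparison functions (illustrative subset)
OPS = {'=': operator.eq, '<': operator.lt, '<=': operator.le,
       '>': operator.gt, '>=': operator.ge}

def apply_filters(df, params):
    """Keep rows of df that satisfy every [operator, value] spec in params."""
    mask = pd.Series(True, index=df.index)
    for column, spec in params.items():
        if spec[0] == '*':
            # wildcard: accept every value in this dimension
            continue
        op, value = spec
        mask &= OPS[op](df[column], value)
    return df[mask]

toy = pd.DataFrame({
    'region': ['USA', 'USA', 'China', 'China'],
    'year': [2010, 2020, 2015, 2020],
    'value': [1.0, 2.0, 3.0, 4.0],
})

subset = apply_filters(toy, {'region': ['*'], 'year': ['<=', 2015]})
print(subset)
```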


## Explore GCAM land allocation outputs for year 2015

#### Load into a demeter formatted dataframe

In [None]:
gcam_df = demeter.format_gcam_data(df=land_df, start_year=2015, through_year=2015)

gcam_df.head()


#### GCAM total land allocation by region for all land classes

In [None]:
reg_ax = im3vis.gcam_demeter_region(gcam_df, target_year='2015',
                                    label='land allocation', units=land_df.attrs['units'])


#### GCAM total land allocation by region for combined Corn

In [None]:
reg_ax = im3vis.gcam_demeter_region(gcam_df, 
                                    target_year='2015', 
                                    landclass_list=['corn_irr', 'corn_rfd'],
                                    label='land allocation', units=land_df.attrs['units'])
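
Passing `landclass_list=['corn_irr', 'corn_rfd']` asks for irrigated and rainfed corn to be treated together. A minimal pandas sketch of that kind of aggregation is shown below; the table layout and column names are assumptions for illustration and may not match the actual Demeter-formatted DataFrame.

```python
import pandas as pd

# Toy table; column names and values are assumptions, not Demeter's schema
toy = pd.DataFrame({
    'region': ['USA', 'USA', 'USA', 'Brazil', 'Brazil'],
    'landclass': ['corn_irr', 'corn_rfd', 'forest', 'corn_irr', 'corn_rfd'],
    '2015': [10.0, 30.0, 200.0, 2.0, 18.0],
})

corn_classes = ['corn_irr', 'corn_rfd']

# Keep only the corn classes, then sum irrigated and rainfed area per region
corn_by_region = (
    toy[toy['landclass'].isin(corn_classes)]
    .groupby('region')['2015']
    .sum()
)
print(corn_by_region)
```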


#### GCAM `corn` allocation for year 2015 for the CONUS

In [None]:
agg_df = im3vis.plot_gcam_basin(gcam_df,
                                target_year='2015',
                                landclass_list=['corn_irr', 'corn_rfd'],
                                setting='crop_yield',
                                scope='conus',
                                label='land allocation', units=land_df.attrs['units'])


#### GCAM `corn` allocation for year 2015 for global basins

In [None]:
agg_df = im3vis.plot_gcam_basin(gcam_df,
                                target_year='2015',
                                landclass_list=['corn_irr', 'corn_rfd'],
                                setting='crop_yield',
                                scope='global',
                                label='land allocation', units=land_df.attrs['units'])


## Run Demeter

#### Instantiate Demeter

In [None]:
# instantiate demeter model
model = Model(config_file=gcam_demeter_clinic.get_config_file(), 
              gcamwrapper_df=land_df,
              write_outputs=False,
              write_logfile=False)


#### Load model data and prepare run

In [None]:
model.initialize()


#### Process the target time step from the GCAM output DataFrame for land allocation

In [None]:
%%time

# process first year
demeter_2015 = model.process_step()


## Explore Demeter's outputs

#### Build a geodataframe from the outputs

In [None]:
# a geopandas data frame of demeter's output land allocation data with geometry
demeter_gdf = im3vis.build_geodataframe(demeter_2015)


#### Plot Demeter `corn` output for year 2015 for the CONUS

In [None]:
r = im3vis.plot_demeter_raster(demeter_gdf=demeter_gdf, 
                               landclass_list=['crop2_irr', 'crop2_rfd'],
                               target_year='2015', 
                               scope='conus',
                               resolution='0.5')


#### Plot Demeter `forest` global output for 2015

In [None]:
r = im3vis.plot_demeter_raster(demeter_gdf=demeter_gdf, 
                               landclass_list=['unmanagedforest', 'forest'],
                               target_year='2015', 
                               scope='global',
                               resolution='0.5')


#### Clean up the logger

In [None]:
model.cleanup()


## Run the next model period (2020) in GCAM _without_ yield impacts

In [None]:
%%time

g.run_to_period()


## Query and visualize model results _without_ accounting for yield impact for comparison purposes

In [None]:
# load query for land allocation
query_string = gw.get_query('land', 'land_allocation')
query_params = {'region': ['*'], 'leaf': ['*'], 'year': ['<=', g.get_current_year()]}

# run the query
land_df = g.get_data(query_string, query_params)

# get Corn producer prices
query_string = gw.get_query('ag', 'prices')
query_params = {'region': ['*'], 
                'sector': ['=', 'Corn'], 
                'year': ['=', g.get_current_year()]}

prod_prices = g.get_data(query_string, query_params)

# get domestic Corn prices (accounting for trade)
query_params['sector'] = ["=", 'regional corn']

dom_prices = g.get_data(query_string, query_params)

# get crop production data
query_string = gw.get_query('ag', 'production')
query_params = {'region': ['*'],
                'sector': ['*'],
                'year': ['=', g.get_current_year()]}
production = g.get_data(query_string, query_params)

# finally get the actual yields
query_string = gw.get_query('ag', 'yield')
query_params = {'region': ['*'], 
                'sector': ['*'], 
                'tech': ['*'], 
                'year': ['=', g.get_current_year()]}

yields = g.get_data(query_string, query_params)


## Generate yield impacts from ISIMIP PDSST data using downscaled land from Demeter to weight and map them to GCAM land regions

In [None]:
yield_scaler_df = demo_utils.get_yield_scalers(demeter_2015, g.get_current_year())
yield_scaler_df.head()
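
The source of `demo_utils.get_yield_scalers` is not shown in this notebook. As a hedged sketch of the general idea, the block below builds a table of regional yield scalers from per-cell yield multipliers weighted by downscaled crop area. All column names except the merge keys used later (`region`, `sector`, `technology`, `year`, `yield_scaler`) are assumptions, and the numbers are invented.

```python
import pandas as pd

# Toy grid: one row per grid cell; values are illustrative only
grid = pd.DataFrame({
    'region': ['USA', 'USA', 'USA', 'Brazil', 'Brazil'],
    'sector': ['Corn'] * 5,
    'technology': ['Corn_RFD'] * 5,
    'crop_area': [50.0, 30.0, 20.0, 10.0, 30.0],     # downscaled crop area per cell
    'yield_change': [0.90, 1.00, 1.10, 0.80, 1.00],  # gridded yield multiplier
})

keys = ['region', 'sector', 'technology']

# Area-weighted mean of the multipliers within each region/sector/technology
grid['weighted'] = grid['crop_area'] * grid['yield_change']
sums = grid.groupby(keys)[['weighted', 'crop_area']].sum()
scalers = (sums['weighted'] / sums['crop_area']).rename('yield_scaler').reset_index()

# Tag the scalers with the model year they apply to
scalers['year'] = 2020
print(scalers)
```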


In [None]:
yields_new = yields.merge(yield_scaler_df, on=["region", "sector", "technology", "year"], how="inner")
yields_new['yield'] *= yields_new['yield_scaler']
yields_new = yields_new.filter(["region", "sector", "technology", "year", "yield"])

yields_new.head()
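
Because the merge above uses `how="inner"`, any yield row without a matching scaler is silently dropped from `yields_new`. The toy example below, using invented data, shows one way to count such rows with the `indicator` option of `DataFrame.merge` before applying the same merge-and-scale pattern.

```python
import pandas as pd

keys = ['region', 'sector', 'technology', 'year']

# Invented yields: note the Wheat row has no scaler below
toy_yields = pd.DataFrame({
    'region': ['USA', 'USA', 'Brazil'],
    'sector': ['Corn', 'Wheat', 'Corn'],
    'technology': ['Corn_RFD', 'Wheat_RFD', 'Corn_RFD'],
    'year': [2020, 2020, 2020],
    'yield': [10.0, 3.0, 5.0],
})
toy_scalers = pd.DataFrame({
    'region': ['USA', 'Brazil'],
    'sector': ['Corn', 'Corn'],
    'technology': ['Corn_RFD', 'Corn_RFD'],
    'year': [2020, 2020],
    'yield_scaler': [0.97, 0.95],
})

# indicator=True adds a _merge column recording whether each row found a match
check = toy_yields.merge(toy_scalers, on=keys, how='left', indicator=True)
unmatched = check[check['_merge'] == 'left_only']
print(f"{len(unmatched)} row(s) have no scaler and would be dropped by an inner join")

# Same pattern as the notebook cell: inner merge, scale, keep key and yield columns
scaled = toy_yields.merge(toy_scalers, on=keys, how='inner')
scaled['yield'] *= scaled['yield_scaler']
scaled = scaled.filter(keys + ['yield'])
print(scaled)
```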


## Update yields in GCAM for year 2020 and re-run

In [None]:
# get the same query string
query_string = gw.get_query('ag', 'yield')

# however, the query params syntax is slightly different for set_data: we must
# explicitly flag, with the '+' argument, which filters to match on, and we do not
# supply values to compare against because those come from the DataFrame
query_params = {'region': ['+', '='], 
                'sector': ['+', '='], 
                'tech': ['+', '='], 
                'year': ['+', '=']}

g.set_data(yields_new, query_string, query_params)

# re-run 2020
g.run_to_period(g.get_current_period())


## Get the results _with_ impacts for comparison

In [None]:
# load query for land allocation
query_string = gw.get_query('land', 'land_allocation')
query_params = {'region': ['*'], 
                'leaf': ['*'], 
                'year': ['<=', g.get_current_year()]}

# run the query
land_df_new = g.get_data(query_string, query_params)

# get Corn producer prices
query_string = gw.get_query('ag', 'prices')
query_params = {'region': ['*'], 
                'sector': ['=', 'Corn'], 
                'year': ['=', g.get_current_year()]}

prod_prices_new = g.get_data(query_string, query_params)

# get domestic Corn prices (accounting for trade)
query_params['sector'] = ["=", 'regional corn']

dom_prices_new = g.get_data(query_string, query_params)

# get crop production data
query_string = gw.get_query('ag', 'production')
query_params = {'region': ['*'],
                'sector': ['*'],
                'year': ['=', g.get_current_year()]}
production_new = g.get_data(query_string, query_params)

# finally get the actual yields
query_string = gw.get_query('ag', 'yield')
query_params = {'region': ['*'], 
                'sector': ['*'], 
                'tech': ['*'], 
                'year': ['=', g.get_current_year()]}

yields_new = g.get_data(query_string, query_params)


## Visualizations comparing 2020 with and without crop yield updates

In [None]:
prod_prices_diff = demo_utils.calc_diff(prod_prices_new, prod_prices)
reg_ax = im3vis.gcam_demeter_region(prod_prices_diff, target_year='diff_rel', metric_id_col=None,
                                    label='Corn Producer Prices', units=prod_prices.attrs['units'])
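
The source of `demo_utils.calc_diff` is not shown in this notebook. Since the plots read a `diff_rel` column, one plausible reading is a relative difference between the runs with and without impacts. The sketch below is written under that assumption; the function `relative_diff`, the `value` column, and the `diff_abs` column are hypothetical names, and the data are invented.

```python
import pandas as pd

def relative_diff(new, old, keys, value_col='value'):
    """Join two result tables on keys and add absolute and relative differences."""
    merged = new.merge(old, on=keys, suffixes=('_new', '_old'))
    merged['diff_abs'] = merged[f'{value_col}_new'] - merged[f'{value_col}_old']
    # relative change with respect to the run without impacts
    merged['diff_rel'] = merged['diff_abs'] / merged[f'{value_col}_old']
    return merged

old = pd.DataFrame({'region': ['USA', 'Brazil'], 'year': [2020, 2020], 'value': [2.0, 4.0]})
new = pd.DataFrame({'region': ['USA', 'Brazil'], 'year': [2020, 2020], 'value': [2.2, 3.8]})

diff = relative_diff(new, old, keys=['region', 'year'])
print(diff[['region', 'diff_rel']])
```

A relative difference keeps regions with very different price or production levels comparable on a single color scale.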

In [None]:
dom_prices_diff = demo_utils.calc_diff(dom_prices_new, dom_prices)
reg_ax = im3vis.gcam_demeter_region(dom_prices_diff, target_year='diff_rel', metric_id_col=None,
                                    label='Corn Domestic Prices', units=dom_prices.attrs['units'])

In [None]:
yields_diff = demo_utils.calc_diff(yields_new, yields)
yields_diff = yields_diff.query("sector == 'Corn'")
yields_plotdf = demeter.format_gcam_data(df=yields_diff,
                                         start_year=2020,
                                         through_year=2020,
                                         gcam_landalloc_field="diff_rel",
                                         gcam_nodes_field="technology")
yields_plotdf_irr = yields_plotdf.query('landclass == "Corn_IRR"').copy()
agg_df = im3vis.plot_gcam_basin(yields_plotdf_irr,
                                target_year='2020',
                                landclass_list=['corn_irr'],
                                setting='crop_yield',
                                scope='global',
                                label="Yield Relative Diff",
                                units=yields.attrs['units'])
yields_plotdf_rfd = yields_plotdf.query('landclass == "Corn_RFD"').copy()
agg_df = im3vis.plot_gcam_basin(yields_plotdf_rfd,
                                target_year='2020',
                                landclass_list=['corn_rfd'],
                                setting='crop_yield',
                                scope='global',
                                label="Yield Relative Diff",
                                units=yields.attrs['units'])

In [None]:
production_diff = demo_utils.calc_diff(production_new, production)
production_diff = production_diff.query("sector == 'Corn'")
production_plotdf = demeter.format_gcam_data(df=production_diff,
                                             start_year=2020,
                                             through_year=2020,
                                             gcam_landalloc_field="diff_rel",
                                             gcam_nodes_field="technology")
production_plotdf_irr = production_plotdf.query('landclass == "Corn_IRR"').copy()
agg_df = im3vis.plot_gcam_basin(production_plotdf_irr,
                                target_year='2020',
                                landclass_list=['corn_irr'],
                                setting='crop_yield',
                                scope='global',
                                label="Production Relative Diff",
                                units=production.attrs['units'])
production_plotdf_rfd = production_plotdf.query('landclass == "Corn_RFD"').copy()
agg_df = im3vis.plot_gcam_basin(production_plotdf_rfd,
                                target_year='2020',
                                landclass_list=['corn_rfd'],
                                setting='crop_yield',
                                scope='global',
                                label="Production Relative Diff",
                                units=production.attrs['units'])

## Thank you for attending our clinic!

We hope this clinic provided a high-level demonstration of how to integrate GCAM, using its new Python wrapper, into existing workflows that research energy, water, land, climate, and socioeconomic dynamics, in order to provide a global, human-Earth perspective.

Both `demeter` and `gcamwrapper` will be getting a Basic Model Interface (BMI) soon!


## Please feel free to contact us with any further questions at:


#### pralit.patel@pnnl.gov
#### chris.vernon@pnnl.gov