### Data Processing for MimiCIAM - Country-specific Inputs

This notebook documents a subset of the data processing steps used for MimiCIAM. Specifically, it describes the data sources for the country-specific inputs, which were augmented in 06/2022.

Lisa Rennels (UC Berkeley) 06/2022 with contributions by Prof. Delavane Diaz

#### 1. Construction Cost Index (input/cci.csv)

Use the GAMS code below to fill in missing countries and limit the range of the index. The source data file is 2011WB_ICP.xls (cci tab).

        * Construction cost indices
        $CALL gdxxrw.exe construction_country_indices\2011WB_ICP.xls par=cci rng=cci!h8 rdim=1 cdim=0
        $gdxin 2011WB_ICP
        $load cci
        * correct for missing countries by assuming 1 and restrict factor to 0.5 to 2.5
        loop(country,
                if(cci(country)=0, cci(country)=1; );
                cci(country)=max(0.5,min(2.5,cci(country)));
        );       

_NOTE: These files were pulled directly from the GAMS CIAM code, and augmented slightly when segment countries were updated. Thus this repository does not currently contain the full replication code in a runnable format._
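
The fill-and-bound step in the GAMS loop can be illustrated with a minimal Julia sketch. GAMS treats unassigned parameter entries as 0, which is why the loop tests for 0. The country codes, the index values, and the helper name `bound_cci` are hypothetical and are not part of the CIAM code.

```julia
# Hypothetical index values; 0 stands for a country with no source data,
# matching the GAMS check cci(country)=0.
cci = Dict("AAA" => 0.0, "BBB" => 0.3, "CCC" => 1.2, "DDD" => 4.0)

# Replace missing (zero) entries with 1, then restrict the value to [lo, hi].
function bound_cci(x; lo = 0.5, hi = 2.5)
    filled = x == 0 ? 1.0 : x
    return clamp(filled, lo, hi)
end

cci_bounded = Dict(country => bound_cci(value) for (country, value) in cci)
```

With these inputs, the missing country ends up at 1, the low value is raised to 0.5, the in-range value is unchanged, and the high value is capped at 2.5.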

#### 2. GTAP Land Value (input/gtapland.csv)

The `gtapland` parameter is read from cells E4:F191 of the Agland tab in the source data file GTAPagrent.xls. Note the dollar-year adjustment in the code below, which converts the values from 2007 to 2010 dollars.

        * Land values from GTAP
        $onecho > inputlist.txt
        par=gtapland rng=Agland!e4:f191 rdim=1 cdim=0
        par=countryarea rng=area!a2:b166 rdim=1  cdim=0
        $offecho
        $CALL gdxxrw.exe GTAPagrent.xls @inputlist.txt
        $gdxin GTAPagrent
        $load gtapland, countryarea
        * $2007M per sq km - I convert to $2010
        gtapland(country)=gtapland(country)/0.962;

_NOTE: These files were pulled directly from the GAMS CIAM code, and augmented slightly when segment countries were updated. Thus this repository does not currently contain the full replication code in a runnable format._
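
As a minimal sketch of the dollar-year adjustment, the Julia snippet below applies the same division by 0.962 that appears in the GAMS code. The country codes, the input values, and the function name `to_2010_dollars` are hypothetical.

```julia
# The factor 0.962 is taken from the GAMS snippet above: dividing by it
# moves a value from 2007 dollars to 2010 dollars.
to_2010_dollars(value_2007; deflator = 0.962) = value_2007 / deflator

# Hypothetical land values in 2007 dollars (millions per sq km)
gtapland_2007 = Dict("AAA" => 0.962, "BBB" => 1.924)

gtapland_2010 = Dict(k => to_2010_dollars(v) for (k, v) in gtapland_2007)
```

Keeping the factor as a keyword argument makes it explicit in one place if a different base year is ever needed.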

#### 3. Population and GDP (input/pop.csv and input/ypcc.csv)

The initial CIAM model relied on demographic inputs from MERGE (Model for Estimating the Regional and Global Effects of Greenhouse Gas Reductions), EPRI’s global IAM. Minor adjustments were made for use in CIAM (aligning time indices, extrapolating GDP growth after 2100 at a nominal rate, and converting to $2010), and the results were exported to GDX and then to the file `data_processing/MERGEdata.xls`.

This Excel spreadsheet was reconfigured to create the input files `input/pop.csv` and `input/ypcc.csv`.

NOTE that the MERGE model does not have data for `PSE`. For replication purposes we substitute SSP2 data from the OECDEnv-Growth model, drawn from the sources discussed in Section 4. This is the only model that provides both population and GDP for `PSE`.

These two files matter only for replication: they are used ONLY in replication exercises of CIAM (Diaz et al., 2016), because `MimiCIAM` runs by default with the updated SSP data instead of the original CIAM MERGE data.

_NOTE: These files were pulled directly from the GAMS CIAM code, and modified slightly when segment countries were updated. Thus this repository does not currently contain the full replication code in a runnable format._

#### 4. SSP Data (ssp/pop_IIASAGDP_SSPX_v9_130219 and ssp/ypcc_IIASAGDP_SSPX_v9_130219)

The SSP Data were downloaded from the [IIASA SSP Database](https://secure.iiasa.ac.at/web-apps/ene/SspDb/dsd?Action=htmlpage&page=about) to the file `data_processing/SspDb_country_data_2013-06-12.csv`, which was then filtered for two variables: Population and GDP|PPP. 

Starting with `input/pop.csv` and `input/ypcc.csv`, any country available in the IIASA GDP model is replaced with data from the respective SSP/variable combination.

This process is done using an R script `data_processing/PrepSSP.R`.
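
The replacement rule described above can be sketched in Julia as follows. This is not a port of `PrepSSP.R`; it only illustrates the overlay logic under the assumption that each country maps to a single time series. The country codes, the values, and the function name `overlay` are hypothetical.

```julia
# Hypothetical baseline series (in the spirit of input/pop.csv), keyed by country.
baseline = Dict("AAA" => [1.0, 1.1, 1.2], "BBB" => [5.0, 5.5, 6.0])

# Hypothetical SSP series for one SSP/variable combination; only some
# countries are covered.
ssp = Dict("BBB" => [5.2, 5.9, 6.4])

# Countries present in the SSP data take the SSP series; all other
# countries keep their baseline series.
overlay(base, update) = Dict(c => get(update, c, series) for (c, series) in base)

updated = overlay(baseline, ssp)
```

In this sketch, countries that appear only in the SSP data are not added to the result, since the baseline files define the country list. That is an assumption of the example.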

#### 5. Reference Population Density (input/refpopdens.csv)

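The reference population density of a region is the weighted average of the population density of its segments, weighted by `area1`, i.e., the segment area between 0 and 1 meter. The code below also applies a lower bound of 1 to the result.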
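
Written as a formula, this is a restatement of the Julia code in this section, where $s$ indexes the segments of region $r$. The symbols are notation introduced here for illustration.

$$
\text{refpopdens}_r = \max\left(1,\ \frac{\sum_{s \in r} \text{area1}_s \cdot \text{popdens}_s}{\sum_{s \in r} \text{area1}_s}\right)
$$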

        using DataFrames
        using CSVFiles
        using Query

        # xsc.csv lists segments (seg) with their region (rgn);
        # data.csv holds the per-segment inputs
        xsc = load(joinpath(@__DIR__, "../input/xsc.csv")) |> DataFrame
        data = load(joinpath(@__DIR__, "../input/data.csv")) |> DataFrame

        regions = load(joinpath(@__DIR__, "../meta/rgnIDmap.csv")) |> DataFrame
        df = DataFrame()
        for r in regions.rgn
            # Names of the segments that belong to region r
            segNames = (xsc |> @filter(_.rgn == r) |> DataFrame).seg

            # Rows of data.csv for those segments, matched on the NA column
            segData = data |> @filter(_.NA in segNames) |> DataFrame
            select!(segData, ["NA", "area1", "popdens"])

            # Area-weighted mean of population density, with a lower bound of 1
            insertcols!(segData, :weight => (segData.area1) / sum(segData.area1))
            value = max(1, sum(segData.popdens .* segData.weight))

            append!(df, DataFrame(:country => r, :refpopdens => value))
        end

        df |> save(joinpath(@__DIR__, "../input/refpopdens.csv"))
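
To make the weighting concrete, here is a small worked example in plain Julia with no packages. The segment areas and densities are hypothetical values chosen for easy arithmetic and do not come from `input/data.csv`.

```julia
# Hypothetical segments of a single region
area1 = [10.0, 30.0]      # segment area between 0 and 1 meter
popdens = [100.0, 20.0]   # segment population density

# Each segment's share of the region's total area1
weights = area1 ./ sum(area1)                    # 0.25 and 0.75

# Weighted mean: 0.25 * 100 + 0.75 * 20 = 40
refpopdens = max(1, sum(popdens .* weights))

# A sparsely populated region is raised to the lower bound of 1
sparse_refpopdens = max(1, sum([0.2, 0.4] .* weights))
```

The larger segment dominates the average, so the result sits closer to its density of 20 than to the smaller segment's 100.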