Uploading scenario data to IIASA database

Oliver Richters, 24 October 2022

Many projects have to upload their scenario data to the database provided by IIASA.

Step 1: model registration

At the beginning of the project, there should be a process for getting access to the project-internal Scenario Explorer. In case of problems, contact Daniel Huppmann.

The model, the scenarios, and the project variables should be registered in the IIASA database. Often, the variable list is based on the AR6 template originally generated for the IPCC Sixth Assessment Report, or on the newer template for the NAVIGATE project.

A template file containing the list of variables and associated units may be provided as a yaml or xlsx file. It can be used to check the variable names and units of your submission. Save it in your REMIND repository; the suggested place is ./output/export/.
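
If you want a quick look at the registered variables and units before submitting, you can load the template in R. This is only a hedged sketch: the file name is a placeholder, and the exact structure (fields or columns) depends on the project template.

```r
# Peek at a project template saved under ./output/export/
# (file name is a placeholder; the structure is project-specific)
template <- "output/export/project_template.yaml"
if (grepl("\\.ya?ml$", template)) {
  vars <- yaml::read_yaml(template)      # nested list, structure depends on the project
  str(vars, max.level = 2, list.len = 5)
} else {
  vars <- readxl::read_excel(template)   # xlsx template read as a data frame
  print(head(vars))
}
```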

Step 2: generate file to upload

You can generate the file to be uploaded either by calling piamInterfaces::generateIIASASubmission or by using a wrapper based on output.R.

generateIIASASubmission requires the following inputs:

  • mifs: vector of .mif files or directories that contain the .mif files
  • model: model name as registered in the database, such as "REMIND-MAgPIE 3.2-4.6"
  • iiasatemplate: optional path to the xlsx or yaml file obtained in the project with the variables and units that are accepted in the database
  • addToScen: optional string added in front of all scenario names
  • removeFromScen: optional regular expression of parts to be deleted from the scenario names, such as "C_|_bIT|_bit|_bIt"
  • mapping: vector of mappings from this directory (such as c("AR6", "AR6_NGFS") or c("NAVIGATE", "SHAPE")) or a local file with identical structure.

Usually, you will find the result in the output subdirectory, but you can adapt this; see the function documentation.
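
A minimal example call, using only the arguments described above (the paths, model name, and scenario strings are placeholders for your project):

```r
library(piamInterfaces)

generateIIASASubmission(
  mifs = "output",                                # directory containing the .mif files
  model = "REMIND-MAgPIE 3.2-4.6",                # model name as registered in the database
  mapping = c("AR6", "AR6_NGFS"),                 # mappings from this repository
  iiasatemplate = "output/export/template.yaml",  # optional project template with variables and units
  addToScen = "MyProject_",                       # optional prefix for all scenario names
  removeFromScen = "C_|_bIT|_bit|_bIt"            # optional regex of parts to strip from scenario names
)
```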

From your REMIND directory, you can start this process by running ./output.R, selecting export and then xlsx_IIASA. Then choose the directories of the runs you would like to use. This also works for coupled runs, as the REMIND_generic_*.mif files contain the MAgPIE output since October 4, 2022.

The script requires the same inputs as above, except that it lets you select the mifs from a list, and it provides additional options:

  • mapping: either the path to a mapping or a vector of mapping names such as c("NAVIGATE", "SHAPE") referring to the last part of the file names in this piamInterfaces directory
  • filename_prefix: optional prefix of the resulting outputFile, such as your project name

You can specify the information above in two ways. The first is to edit xlsx_IIASA.R and add a project in a similar way to NGFS or ENGAGE; you can then start the script with:

Rscript output.R comp=export output=xlsx_IIASA project=NGFS

You do not need to specify comp and output on the command line; you can simply wait to be asked for them. The alternative is to specify everything individually as command-line arguments:

Rscript output.R comp=export output=xlsx_IIASA model="REMIND 3.2" mapping=AR6,AR6_NGFS addToScen=whatever removeFromScen=C_ filename_prefix=test

All information printed during the run is also written to a logfile; its path is shown at the end.

Step 3: check submission

Check the logfile carefully for variables that were omitted, failing summation checks, etc. If you need information on a specific variable such as "Emi|CO2", you can run piamInterfaces::variableInfo("Emi|CO2") and it will provide a human-readable summary of the places where this variable shows up in mappings and summation checks. Running piamInterfaces::variableInfo("Emi|CO2", mapping = c("AR6", "mapping.csv")) allows you to compare your local mapping with the AR6 mapping with respect to this variable.
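
For easy copying, the two calls from the paragraph above as a small script (here "mapping.csv" stands for your local mapping file):

```r
library(piamInterfaces)

# Summary of where "Emi|CO2" appears in mappings and summation checks
variableInfo("Emi|CO2")

# Compare your local mapping ("mapping.csv" is a placeholder) with the AR6 mapping
variableInfo("Emi|CO2", mapping = c("AR6", "mapping.csv"))
```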

If you specify iiasatemplate, the scripts will delete all variables that are not in the template. This can cause summation checks to fail simply because some of the variables reported by REMIND were omitted.

Additionally, unit mismatches can cause the script to fail. In the past, IIASA has sometimes changed unit names to correct spelling mistakes or to harmonize them. If a unit mismatch concerns units that are identical, just spelled differently, you can add them to the named vector identicalUnits in piamInterfaces::checkFixUnits. For example, if the project template expects Mt/yr, but our mappings export it as Mt/year, add c("Mt/yr", "Mt/year") to the vector; future runs will then not fail on this mismatch but correct the unit to what is required for the submission. Never use this mechanism if the units are not actually identical in their meaning.

Step 4: upload file

Go to the project-internal Scenario Explorer, click on your login name, then on "uploads" and the "plus" in the upper right corner, and submit your xlsx file. Do not expect it to work flawlessly on the first try. You will receive an email with a log and may at some point need the help of the IIASA administrators of your project.

Step 5: Analyse the snapshots

To compare your submission with those of other groups, you can generate snapshots in the database. You receive a zip file containing large csv files. You can try to read the full file into R using read.snapshot:

quitte::read.snapshot("snapshot.csv")

But loading the full file might exceed available memory. You can prefilter the data with:

quitte::read.snapshot("snapshot.csv", list(variable = c("GDP|PPP", "GDP|MER"), region = "World", period = 2030))

You can also use more sophisticated filtering by passing a filter.function (see the read.quitte documentation), or even combine both approaches:

library(tidyverse)
yourfilter <- function(x) {
  filter(x, grepl("^Final Energy", .data$variable),
            between(.data$period, 2030, 2050))
}
d <- quitte::read.snapshot("snapshot.csv", list(region = "World"), filter.function = yourfilter)

If your computer supports the system commands grep, head and tail (as the PIK cluster does), the list-based filtering reduces loading times, as the file size is reduced before the data is read into R.
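
Once the snapshot is loaded, standard tidyverse tooling is enough for quick cross-model comparisons. A hedged sketch, assuming the long format returned by read.snapshot (columns model, scenario, region, variable, unit, period, value) and reusing the data frame d from the example above:

```r
library(tidyverse)  # provides dplyr and ggplot2 (already loaded above)

d %>%
  filter(variable == "Final Energy") %>%
  ggplot(aes(x = period, y = value, color = model)) +
  geom_line() +
  facet_wrap(~ scenario) +
  labs(title = "Final Energy, World")
```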

Functions from piamInterfaces, such as variableInfo shown above, might be helpful for further analysis.

Further Information

Please refer to this repository for a showcase of the tools and best practices for working with data from the IIASA database, including:

  • how to download data from the IIASA database
  • how to read in and validate data in R
  • how to create plots from the data in R