Skip to content

Latest commit

 

History

History
463 lines (326 loc) · 18.3 KB

gcamtool.rst

File metadata and controls

463 lines (326 loc) · 18.3 KB

GCAM tool (gt)

The gt script unifies GCAM workflow managment functionality into a single script with sub-commands. Generic sub-commands are implemented directly by the pygcam library. Project-specific features can be added via plugins <plugins-label>.

Note

Quick links to sub-commands: building <building>, chart <chart>, config <config>, diff <diff>, gcam <gcam>, gui <gui>, init <init>, mcs <mcs>, mi <mi>, new <new>, protect <protect>, query <query>, run <run>, setup <setup>, sandbox <sandbox>, transport <transport>

Sub-commands <mcs/commands> supporting Monte Carlo simulation

The sub-commands support all the major workflow setups, including

  • Modify XML files and configuration.xml to set up a modeling experiment (See the setup <setup> sub-command and setup for more information.)
  • Run GCAM in an automatically-created workspace, allowing multiple instances of GCAM to run simultaneously, e.g., on parallel computing systems (See the gcam <gcam> sub-command.)
  • Execute batch queries against the XML database to extract GCAM results, with on-the-fly regionalization based on a simple region-mapping file. (See the query <query> sub-command.)
  • Compute differences between policy and baseline scenarios, including linear annualization of values between time-steps, and (See the diff <diff> sub-command.)
  • Plot results, with flexible control of figure features including title, axis labels, scale, and so on. (See the chart <chart> sub-command.)
  • Manage (create, delete, rename, run commands in) automatically-created workspaces. (See the sandbox <sandbox> sub-command.)

In addition, the run <run> sub-command allows workflow steps to be defined in an XML file so that individual or groups of steps can be executed for one or more scenarios. The run sub-command supports direct invocation of other workflow steps as well as running arbitrary programs of the user's choosing.

Finally, gt allows all project steps to be run on a compute node in a High-Performance Computing environment by specifying +b or --batch on the command-line. (Note that this is not available on Mac OS X or Windows.)

For example, the command:

gt +b +P MyProject run -S MyScenario

runs all steps for scenario MyScenario in the project MyProject by queuing a batch job on the default queue. Arguments to gt allow the user to set various resource requirements and to select the queue to use.

The command to run to queue the batch job is taken from the configuration file parameter GCAM.BatchCommand. Example batch commands for the SLURM and PBS job management systems are provided in variables GCAM.QueueSLURM and GCAM.QueuePBS, respectively.

Command-line usage is described below. Note that some command-line (e.g., batch-related) options must precede the sub-command, whereas sub-command specific options must follow it.

Note

Note that arguments that pertain regardless of the sub-command (e.g., +P to identify the project name) are specified prior to the sub-command, and use + rather than -. This is to avoid conflicts between these "main" arguments and sub-command arguments. (An exception is gt -h, which retains the -.) Long-form argument names use two hyphens, as in --projectName.)

Usage

run : @replace

This sub-command reads instructions from the file project-xml, the location of which is taken from the user's ~/.pygcam.cfg <pygcam-cfg> file. The workflow steps indicated in the XML file and command-line determine which commands to run.

Examples:

Run all steps for the default scenario group for project 'Foo':

gt +P Foo run

Run all steps for scenario group 'test' for project 'Foo', but only for scenarios 'baseline' and 'policy-1':

gt +P Foo run -g test -S baseline,policy1

or, equivalently:

gt +P Foo run --group test --scenario baseline --step policy1

Run only the 'setup' and 'gcam' steps for scenario 'baseline' in the default scenario group:

gt +P Foo run -s setup,gcam -S baseline,policy-1

Same as above, but queue a batch job to run these commands on the queue 'short':

gt +b +q short +P Foo run -s setup,gcam -S baseline,policy-1

Note that the command above will run the two scenarios ('baseline' and 'policy-1') in a single batch job. To run scenarios in separate batch jobs, use the -D or --distribute option to the run sub-commmand:

gt +q short +P Foo run -D -S baseline,policy-1

The "distribute" option knows that various project steps for non-baseline scenarios may depend on baseline scenarios, so the baseline is always run first, with the non-baseline scenarios queued as dependent on the successful completion of the baseline. If no scenarios are explicitly named, all scenarios in the group are run, as usual.

The -n flag displays the commands that would be executed for a command, but doesn't run them:

gt +P Foo run -s setup,gcam -S baseline,policy-1 -n
building : @replace

Generates a template CSV file which can be edited to set building energy efficiency improvements that are applied by the callable function "buildingTechEfficiency", which must be called in your scenarios.xml file.

chart : @replace

The chart sub-command generates plots from GCAM-style ".csv" files. Two types of plots are currently supported: (i) stacked bar plots based on summing values over all years (with optional interpolation of annual values), by the given 'indexCol' (default is 'region'), and (ii) stacked bar plots by year for some data column, where the data are grouped by and summed across elements with the indicated 'indexCol'. The first option is indicated by using the -S (--sumYears) option. Numerous options allow the appearance to be customized.

You can perform on-the-fly unit conversions using the -m / --multiplier or -V / --divisor arguments, which cause all values in "year columns" to be multiplied or divided, respectively, by the values provided. Values can be specified as numeric constants or using symbolic constants defined in the pygcam.units module.

config : @replace

The config command list the values of configuration variables from ~/.pygcam.cfg. With no arguments, it displays the values of all variables for the default project. Use the -d flag to show only values from the [DEFAULT] section.

If an argument name is provided, it is treated as a substring pattern, unless the -x flag is given (see below). All configuration variables containing the give name are displayed with their values. The match is case-insensitive.

If the -x or --exact flag is specified, the argument is treated as an exact variable name (case-sensitive) and only the value is printed. This is useful mainly for scripting. For general use the substring matching is more convenient.

Examples:

$ gt config project
[MyProject]
GCAM.DefaultProject = MyProject
GCAM.ProjectRoot = /Users/rjp/bitbucket/myProject
GCAM.ProjectXmlFile = /Users/rjp/bitbucket/myProject/etc/project.xml

$ gt config -x GCAM.DefaultProject
MyProject

$ gt config sand
MyProject]
GCAM.SandboxRoot = /Users/rjp/ws/myProject

$ gt config sand -d
[DEFAULT]
GCAM.SandboxRoot = /Users/rjp/ws
diff : @replace

The diff sub-command script computes the differences between results from two or more CSV files generated from batch queries run on a GCAM database, saving the results in either a CSV or XLSX file, according to the extension given to the output file. If not provided, the output filename defaults to differences.csv.

If multiple otherFiles are given (i.e., the referenceFile plus 2 or more other files named on the command-line), the resulting CSV file will contain one difference matrix for each otherFile, with a label indicating which pair of files were used to produce each result.

When the output file is in XLSX format, each result is written to a separate worksheet. If the -c flag is specified, no differences are computed; rather, the .csv file contents are combined into a single .xlsx file.

gcam : @replace

The gcam sub-command runs the GCAM executable on the designated configuration file, scenario, or workspace. Typical use (e.g., from a project.xml file) would be to run GCAM by referencing a directory named the same as a scenario, holding a file called config.xml, as is generated by the setup sub-command. (See setup.)

If a workspace is specified on the command-line, it is used. Otherwise, if a scenario is specified, the workspace defined by {GCAM.SandboxDir}/{scenario} is used. If neither workspace nor scenario are defined, the value of config variable GCAM.RefWorkspace is used, i.e., GCAM is run in the reference workspace.

If the workspace doesn't exist, it is created based on the reference GCAM workspace, defined by the configuration variable GCAM.RefWorkspace. By default, read-only directories (e.g., input and libs) are symbolically linked from the new workspace to the reference one. (See the new <new> sub-command for more information on the creation of workspaces.)

Directories into which GCAM writes results (e.g., output and exe) are created in the new workspace, but read-only files within exe (e.g., the GCAM executable) are symbolically linked (with the same caveat for Windows users.)

Usage example:

gt gcam -S ~/MyProject/scenarios -s MyScenario -w ~/sandboxes/MyProject/MyScenario

would run the scenario MyScenario in the newly created sandbox (workspace) ~/sandboxes/MyProject/MyScenario using the configuration file ~/MyProject/scenarios/MyScenario/config.xml.

gui : @replace

Run the Graphical User Interface (GUI) generated from the command-line interface in a local web server available at http://127.0.0.1:8050.

init : @replace

Create the configuration file ~/.pygcam.cfg and initialize key variables, based on command-line arguments, or interactive prompts. See initialize for details.

mcs : @replace

Enable or disable Monte Carlo Simulation (MCS) mode, or check whether MCS mode is currently enabled or disabled.

mi : @replace

Invoke ModelInterface from the command-line after changing directory to the value of config variable GCAM.QueryDir. If the file model_interface.properties is found, it is used as is, unless the -u/--updateProperties flag is specified, in which case the file is modified so that the queryFile entry refers to the value of GCAM.MI.QueryFile, if this refer to an existing file, otherwise, by variable the GCAM.MI.RefQueryFile.

If the file model_interface.properties is not found, it is created automatically before invoking ModelInterface.

If the -d/--useDefault flag is given, the model_interface.properties file is modified to refer to the GCAM reference Main_Queries.xml file.

If you have a customized queries XML file, set the config variable GCAM.MI.QueryFile to the path to this file and it will be loaded into ModelInterface via this command.

new : @replace

Create the directory structure and basic files required for a new pygcam project. If a directory is specified with the -r flag, the project is created with the given name in that directory; otherwise the project is created in the directory identified by the config variable GCAM.ProjectRoot.

This sub-command creates examples of xmlsrc/scenarios.py, etc/protection.xml, etc/project.xml, etc/rewriteSets.xml, and etc/scenarios.xml that can be edited to fit the needs of your project. The file etc/Instructions.txt is also created to provide further information.

If the -c flag is given, a basic entry for the new project is added to the users configuration file, $HOME/.pygcam.cfg. Before modifying the config file, a backup is created in $HOME/.pygcam.cfg~. For example, the command

gt new -c foo

generates and entry like this:

[foo]
# Added by "new" sub-command Thu Sep 22 14:30:29 2016
GCAM.ProjectDir        = %(GCAM.ProjectRoot)s/foo
GCAM.ScenarioSetupFile = %(GCAM.ProjectDir)s/etc/scenarios.xml
GCAM.RewriteSetsFile   = %(GCAM.ProjectDir)s/etc/rewriteSets.xml

The example project defines two scenario groups, consisting of a baseline and 4 carbon tax scenarios. In one group, 90% of unmanaged land is protected (i.e., removed from consideration), as in the reference GCAM scenario. In the other scenario group, this protection is not performed, so all land is considered available for use.

protect : @replace

Generate versions of GCAM's land_input XML files that protect a given fraction of land of the given land types in the given regions by subtracting the required land area from the "managed" land classes, thereby removing them from consideration in land allocations.

Simple protection scenarios can be specified on the command-line. More complex scenarios can be specified in an XML file, landProtection.xml <protect-xml>.

Examples:

# Create and modify copies of the reference land files, renaming them with
# "prot\_" prefix. Protect 80% of the "UnmanagedForest" and "UnmanagedPasture"
# land classes in the specified regions only.

CLASSES=UnmanagedForest,UnmanagedPasture
REGIONS='Australia_NZ,Canada,EU-12,EU-15,Japan,Middle East,Taiwan,USA'
OUTDIR="$HOME/tmp/xml"

gt protect -f 0.8 "$INFILES" -l "$CLASSES" -r "$REGIONS" -o "$OUTDIR" -t 'prot_{filename}'
# Run the land protection scenario "s1", described in the file ``$HOME/protect.xml``,
# placing the results in the directory ``$HOME/ws/workspace1``

gt protect -s s1 -S "$HOME/protect.xml" -w "$HOME/ws/workspace1"
query : @replace

Run one or more GCAM database queries by generating and running the named XML queries. The results are placed in a file in the specified output directory with a name composed of the basename of the XML query file plus the scenario name. For example,

gt query -o. -s MyReference,MyPolicyCase liquids-by-region

would run the liquids-by-region query on two scenarios, MyReference and MyPolicyCase. Query results will be stored in the files ./liquids-by-region-MyReference.csv and ./liquids-by-region-MyPolicyCase.csv.

The named queries are located using the value of config variable GCAM.QueryPath, which can be overridden with the -Q argument. The QueryPath consists of one or more colon-delimited (on Unix) or semicolon-delimited (on Windows) elements that can identify directories or XML files. The elements of QueryPath are searched in order until the named query is found. If a path element is a directory, the filename composed of the query + '.xml' is sought in that directory. If the path element is an XML file, a query with a title matching the query name (first literally, then by replacing '_' and '-' characters with spaces) is sought. Note that query names are case-sensitive.

sandbox : @replace

The sandbox sub-command allows you to create, delete, show the path of, or run a shell command in a workspace. If the --scenario argument is given, the operation is performed on a scenario-specific workspace within a project directory. If --scenario is not specified, the operation is performed on the project directory that contains individual scenario workspaces. Note that the gcam <gcam> sub-command automatically creates workspaces as needed.

N.B. You can run sandbox with the --path option before performing any operations to be sure of the directory that will be operated on, or use the --noExecute option to show the command that would be executed by --run.

setup : @replace

The setup sub-command automates modification to copies of GCAM's input XML files and construction of a corresponding configuration XML file. See setup for a detailed description.

transport : @replace

Generates a template CSV file which can be edited to set transport energy efficiency improvements that are applied by the callable function "transportTechEfficiency", which must be called in your scenarios.xml file.

Extending gt using plug-ins

The gt script will load any python files whose name ends in _plugin.py, found in any of the directories indicated in the config file variable GCAM.PluginPath. The value of GCAM.PluginPath must be a sequence of directory names separated by colons (:) on Unix-like systems or by semi-colons (;) on Windows.

See pygcam.subcommand for documentation of the plug-in API.