The gt script unifies GCAM workflow management functionality into a single script with sub-commands. Generic sub-commands are implemented directly by the pygcam library. Project-specific features can be added via plugins <plugins-label>.
Note
Quick links to sub-commands: building <building>, chart <chart>, config <config>, diff <diff>, gcam <gcam>, gui <gui>, init <init>, mcs <mcs>, mi <mi>, new <new>, protect <protect>, query <query>, run <run>, setup <setup>, sandbox <sandbox>, transport <transport>
Sub-commands <mcs/commands> supporting Monte Carlo simulation
The sub-commands support all the major workflow setups, including:
- Modify XML files and configuration.xml to set up a modeling experiment. (See the setup <setup> sub-command and setup for more information.)
- Run GCAM in an automatically-created workspace, allowing multiple instances of GCAM to run simultaneously, e.g., on parallel computing systems. (See the gcam <gcam> sub-command.)
- Execute batch queries against the XML database to extract GCAM results, with on-the-fly regionalization based on a simple region-mapping file. (See the query <query> sub-command.)
- Compute differences between policy and baseline scenarios, including linear annualization of values between time-steps. (See the diff <diff> sub-command.)
- Plot results, with flexible control of figure features including title, axis labels, scale, and so on. (See the chart <chart> sub-command.)
- Manage (create, delete, rename, run commands in) automatically-created workspaces. (See the sandbox <sandbox> sub-command.)
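The "linear annualization of values between time-steps" mentioned in the list above can be illustrated with a minimal sketch. This is not pygcam's implementation; the function names `annualize` and `annual_difference` and the sample numbers are hypothetical, and the sketch assumes differences are computed as policy minus baseline.

```python
def annualize(values_by_year):
    """Linearly interpolate values between consecutive time-step years.

    values_by_year maps a time-step year (int) to a value (float).
    Returns a dict with one entry for every year in the covered range.
    """
    years = sorted(values_by_year)
    annual = {}
    for start, end in zip(years, years[1:]):
        v0, v1 = values_by_year[start], values_by_year[end]
        for year in range(start, end):
            fraction = (year - start) / (end - start)
            annual[year] = v0 + fraction * (v1 - v0)
    if years:
        # The final time-step is copied as-is (nothing to interpolate toward).
        annual[years[-1]] = values_by_year[years[-1]]
    return annual


def annual_difference(policy, baseline):
    """Annualize both series, then subtract baseline from policy year by year."""
    p, b = annualize(policy), annualize(baseline)
    return {year: p[year] - b[year] for year in p if year in b}


# Hypothetical 5-year time-steps.
baseline = {2010: 10.0, 2015: 20.0}
policy = {2010: 10.0, 2015: 15.0}
print(annual_difference(policy, baseline))
```

Annualizing before differencing means that values summed over all years reflect every year in the range rather than only the model's time-step years.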
In addition, the run <run> sub-command allows workflow steps to be defined in an XML file so that individual steps or groups of steps can be executed for one or more scenarios. The run sub-command supports direct invocation of other workflow steps as well as running arbitrary programs of the user's choosing.
Finally, gt allows all project steps to be run on a compute node in a High-Performance Computing environment by specifying +b or --batch on the command-line. (Note that this is not available on Mac OS X or Windows.)
For example, the command:
gt +b +P MyProject run -S MyScenario
runs all steps for scenario MyScenario in the project MyProject by queuing a batch job on the default queue. Arguments to gt allow the user to set various resource requirements and to select the queue to use.
The command used to queue the batch job is taken from the configuration file parameter GCAM.BatchCommand. Example batch commands for the SLURM and PBS job management systems are provided in variables GCAM.QueueSLURM and GCAM.QueuePBS, respectively.
Command-line usage is described below. Note that some command-line (e.g., batch-related) options must precede the sub-command, whereas sub-command specific options must follow it.
Note
Arguments that apply regardless of the sub-command (e.g., +P to identify the project name) are specified before the sub-command and use + rather than -. This avoids conflicts between these "main" arguments and sub-command arguments. (An exception is gt -h, which retains the -.) Long-form argument names use two hyphens, as in --projectName.
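The separation between "main" arguments and sub-command arguments described in the note above can be sketched with Python's standard argparse module, whose `prefix_chars` parameter allows `+` as an option prefix. This is an illustrative sketch only, not pygcam's actual parser; only the option names that appear in this document are taken from it, and the `dest` names are assumptions.

```python
import argparse


def build_parser():
    # The main parser accepts both '+' and '-' as option prefixes, so that
    # +P and +b cannot collide with a sub-command's -P or -b.
    parser = argparse.ArgumentParser(prog="gt", prefix_chars="-+")
    parser.add_argument("+P", "--projectName", dest="projectName", default=None)
    parser.add_argument("+b", "--batch", action="store_true")

    subparsers = parser.add_subparsers(dest="subcommand")

    # Sub-command parsers use the default '-' prefix only.
    run = subparsers.add_parser("run")
    run.add_argument("-S", "--scenario", default="")
    run.add_argument("-s", "--step", default="")
    return parser


# Main arguments come before the sub-command, sub-command options after it.
args = build_parser().parse_args(
    ["+b", "+P", "MyProject", "run", "-S", "MyScenario"]
)
print(args.projectName, args.subcommand, args.scenario, args.batch)
```

Because everything after the sub-command name is handed to the sub-command's own parser, the same single-letter flag can mean different things before and after `run`.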
- run : @replace
This sub-command reads instructions from the file project-xml, the location of which is taken from the user's ~/.pygcam.cfg <pygcam-cfg> file. The workflow steps indicated in the XML file and command-line determine which commands to run.
Examples:
Run all steps for the default scenario group for project 'Foo':
gt +P Foo run
Run all steps for scenario group 'test' for project 'Foo', but only for scenarios 'baseline' and 'policy-1':
gt +P Foo run -g test -S baseline,policy-1
or, equivalently:
gt +P Foo run --group test --scenario baseline,policy-1
Run only the 'setup' and 'gcam' steps for scenarios 'baseline' and 'policy-1' in the default scenario group:
gt +P Foo run -s setup,gcam -S baseline,policy-1
Same as above, but queue a batch job to run these commands on the queue 'short':
gt +b +q short +P Foo run -s setup,gcam -S baseline,policy-1
Note that the command above will run the two scenarios ('baseline' and 'policy-1') in a single batch job. To run scenarios in separate batch jobs, use the -D or --distribute option to the run sub-command:
gt +q short +P Foo run -D -S baseline,policy-1
The "distribute" option knows that various project steps for non-baseline scenarios may depend on baseline scenarios, so the baseline is always run first, with the non-baseline scenarios queued as dependent on the successful completion of the baseline. If no scenarios are explicitly named, all scenarios in the group are run, as usual.
The -n flag displays the commands that would be executed for a command, but doesn't run them:
gt +P Foo run -s setup,gcam -S baseline,policy-1 -n
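The baseline-first ordering used by the "distribute" option can be pictured with a small sketch. It is a simplified illustration under stated assumptions, not pygcam's code: the function `plan_batch_jobs` is hypothetical, and it only computes the ordering and dependencies without submitting anything to a job scheduler.

```python
def plan_batch_jobs(scenarios, baseline):
    """Return a list of (scenario, depends_on) pairs, one per batch job.

    The baseline is queued first with no dependency; every other scenario
    is queued as dependent on the baseline's successful completion.
    """
    jobs = []
    dependency = None
    if baseline in scenarios:
        jobs.append((baseline, None))
        dependency = baseline
    for name in scenarios:
        if name != baseline:
            jobs.append((name, dependency))
    return jobs


# The order given on the command line does not matter: baseline goes first.
for scenario, depends_on in plan_batch_jobs(["policy-1", "baseline"], "baseline"):
    print(scenario, "after", depends_on)
```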
- building : @replace
Generates a template CSV file which can be edited to set building energy efficiency improvements that are applied by the callable function "buildingTechEfficiency", which must be called in your scenarios.xml file.
- chart : @replace
The chart sub-command generates plots from GCAM-style ".csv" files. Two types of plots are currently supported: (i) stacked bar plots based on summing values over all years (with optional interpolation of annual values), by the given 'indexCol' (default is 'region'), and (ii) stacked bar plots by year for some data column, where the data are grouped by and summed across elements with the indicated 'indexCol'. The first option is indicated by using the -S (--sumYears) option. Numerous options allow the appearance to be customized.
You can perform on-the-fly unit conversions using the -m / --multiplier or -V / --divisor arguments, which cause all values in "year columns" to be multiplied or divided, respectively, by the values provided. Values can be specified as numeric constants or using symbolic constants defined in the pygcam.units module.
- config : @replace
The config command lists the values of configuration variables from ~/.pygcam.cfg. With no arguments, it displays the values of all variables for the default project. Use the -d flag to show only values from the [DEFAULT] section.
If an argument name is provided, it is treated as a substring pattern, unless the -x flag is given (see below). All configuration variables containing the given name are displayed with their values. The match is case-insensitive.
If the -x or --exact flag is specified, the argument is treated as an exact variable name (case-sensitive) and only the value is printed. This is useful mainly for scripting. For general use, substring matching is more convenient.
Examples:
$ gt config project
[MyProject]
GCAM.DefaultProject = MyProject
GCAM.ProjectRoot = /Users/rjp/bitbucket/myProject
GCAM.ProjectXmlFile = /Users/rjp/bitbucket/myProject/etc/project.xml
$ gt config -x GCAM.DefaultProject
MyProject
$ gt config sand
[MyProject]
GCAM.SandboxRoot = /Users/rjp/ws/myProject
$ gt config sand -d
[DEFAULT]
GCAM.SandboxRoot = /Users/rjp/ws
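The two matching modes described above (case-insensitive substring matching by default, case-sensitive exact matching with -x) can be summarized in a short sketch. The function `find_variables` is hypothetical and is not part of pygcam; the variable names and values are those shown in the examples above.

```python
def find_variables(variables, name, exact=False):
    """Select configuration variables by name.

    exact=False: case-insensitive substring match (the default behavior).
    exact=True:  case-sensitive match on the full variable name (like -x).
    """
    if exact:
        return {name: variables[name]} if name in variables else {}
    pattern = name.lower()
    return {key: value for key, value in variables.items()
            if pattern in key.lower()}


settings = {
    "GCAM.DefaultProject": "MyProject",
    "GCAM.ProjectRoot": "/Users/rjp/bitbucket/myProject",
    "GCAM.SandboxRoot": "/Users/rjp/ws/myProject",
}

print(find_variables(settings, "sand"))
print(find_variables(settings, "GCAM.DefaultProject", exact=True))
```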
- diff : @replace
The diff sub-command computes the differences between results from two or more CSV files generated from batch queries run on a GCAM database, saving the results in either a CSV or XLSX file, according to the extension given to the output file. If not provided, the output filename defaults to differences.csv.
If multiple otherFiles are given (i.e., the referenceFile plus 2 or more other files named on the command-line), the resulting CSV file will contain one difference matrix for each otherFile, with a label indicating which pair of files was used to produce each result.
When the output file is in XLSX format, each result is written to a separate worksheet. If the -c flag is specified, no differences are computed; rather, the .csv file contents are combined into a single .xlsx file.
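As a rough illustration of "one difference matrix for each otherFile", the sketch below works on in-memory tables rather than CSV files. It is not pygcam's implementation: the function names, the label format, and the sample data are hypothetical, and it assumes each difference is computed as the other file minus the reference file.

```python
def difference(reference, other):
    """Subtract reference from other, cell by cell.

    Each table maps a row key (e.g., a region) to a dict of year -> value.
    Only rows and years present in both tables are compared.
    """
    result = {}
    for key, ref_row in reference.items():
        if key in other:
            result[key] = {year: other[key][year] - ref_row[year]
                           for year in ref_row if year in other[key]}
    return result


def difference_all(reference, others):
    """Return one labeled difference matrix per 'other' table."""
    return {f"{label} minus reference": difference(reference, table)
            for label, table in others.items()}


reference = {"USA": {2020: 5.0, 2025: 6.0}}
others = {
    "policy-1": {"USA": {2020: 5.0, 2025: 5.5}},
    "policy-2": {"USA": {2020: 4.0, 2025: 4.5}},
}
for label, matrix in difference_all(reference, others).items():
    print(label, matrix)
```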
- gcam : @replace
The gcam sub-command runs the GCAM executable on the designated configuration file, scenario, or workspace. Typical use (e.g., from a project.xml file) would be to run GCAM by referencing a directory named the same as a scenario, holding a file called config.xml, as is generated by the setup sub-command. (See setup.)
If a workspace is specified on the command-line, it is used. Otherwise, if a scenario is specified, the workspace defined by {GCAM.SandboxDir}/{scenario} is used. If neither workspace nor scenario is defined, the value of config variable GCAM.RefWorkspace is used, i.e., GCAM is run in the reference workspace.
If the workspace doesn't exist, it is created based on the reference GCAM workspace, defined by the configuration variable GCAM.RefWorkspace. By default, read-only directories (e.g., input and libs) are symbolically linked from the new workspace to the reference one. (See the new <new> sub-command for more information on the creation of workspaces.)
Directories into which GCAM writes results (e.g., output and exe) are created in the new workspace, but read-only files within exe (e.g., the GCAM executable) are symbolically linked (with the same caveat for Windows users).
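The order in which the workspace is chosen (explicit workspace, then scenario, then reference workspace) can be written out as a small sketch. The function `resolve_workspace` and the paths are hypothetical; only the config variable names GCAM.SandboxDir and GCAM.RefWorkspace come from the description above.

```python
import os


def resolve_workspace(config, workspace=None, scenario=None):
    """Pick the directory in which GCAM would be run.

    1. An explicitly given workspace always wins.
    2. Otherwise a scenario maps to {GCAM.SandboxDir}/{scenario}.
    3. With neither, fall back to the reference workspace.
    """
    if workspace:
        return workspace
    if scenario:
        return os.path.join(config["GCAM.SandboxDir"], scenario)
    return config["GCAM.RefWorkspace"]


# Hypothetical configuration values.
config = {
    "GCAM.SandboxDir": os.path.join("sandboxes", "MyProject"),
    "GCAM.RefWorkspace": os.path.join("gcam", "reference"),
}
print(resolve_workspace(config, scenario="MyScenario"))
print(resolve_workspace(config))
```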
Usage example:
gt gcam -S ~/MyProject/scenarios -s MyScenario -w ~/sandboxes/MyProject/MyScenario
would run the scenario MyScenario in the newly created sandbox (workspace) ~/sandboxes/MyProject/MyScenario using the configuration file ~/MyProject/scenarios/MyScenario/config.xml.
- gui : @replace
Run the Graphical User Interface (GUI) generated from the command-line interface in a local web server available at http://127.0.0.1:8050.
- init : @replace
Create the configuration file ~/.pygcam.cfg and initialize key variables, based on command-line arguments or interactive prompts. See initialize for details.
- mcs : @replace
Enable or disable Monte Carlo Simulation (MCS) mode, or check whether MCS mode is currently enabled or disabled.
- mi : @replace
Invoke ModelInterface from the command-line after changing directory to the value of config variable GCAM.QueryDir. If the file model_interface.properties is found, it is used as is, unless the -u/--updateProperties flag is specified, in which case the file is modified so that the queryFile entry refers to the value of GCAM.MI.QueryFile, if this refers to an existing file, or otherwise to the value of GCAM.MI.RefQueryFile.
If the file model_interface.properties is not found, it is created automatically before invoking ModelInterface.
If the -d/--useDefault flag is given, the model_interface.properties file is modified to refer to the GCAM reference Main_Queries.xml file.
If you have a customized queries XML file, set the config variable GCAM.MI.QueryFile to the path to this file and it will be loaded into ModelInterface via this command.
- new : @replace
Create the directory structure and basic files required for a new pygcam project. If a directory is specified with the -r flag, the project is created with the given name in that directory; otherwise the project is created in the directory identified by the config variable GCAM.ProjectRoot.
This sub-command creates examples of xmlsrc/scenarios.py, etc/protection.xml, etc/project.xml, etc/rewriteSets.xml, and etc/scenarios.xml that can be edited to fit the needs of your project. The file etc/Instructions.txt is also created to provide further information.
If the -c flag is given, a basic entry for the new project is added to the user's configuration file, $HOME/.pygcam.cfg. Before modifying the config file, a backup is created in $HOME/.pygcam.cfg~. For example, the command
gt new -c foo
generates an entry like this:
[foo]
# Added by "new" sub-command Thu Sep 22 14:30:29 2016
GCAM.ProjectDir = %(GCAM.ProjectRoot)s/foo
GCAM.ScenarioSetupFile = %(GCAM.ProjectDir)s/etc/scenarios.xml
GCAM.RewriteSetsFile = %(GCAM.ProjectDir)s/etc/rewriteSets.xml
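The %(name)s references in the entry above follow the interpolation syntax of Python's standard configparser module. The sketch below shows how such references expand; it does not use pygcam's own configuration code, and the value given for GCAM.ProjectRoot is a hypothetical placeholder.

```python
import configparser

# GCAM.ProjectRoot is set to a made-up path in [DEFAULT] for this illustration.
CONFIG_TEXT = """
[DEFAULT]
GCAM.ProjectRoot = /home/user/projects

[foo]
# Added by "new" sub-command
GCAM.ProjectDir = %(GCAM.ProjectRoot)s/foo
GCAM.ScenarioSetupFile = %(GCAM.ProjectDir)s/etc/scenarios.xml
GCAM.RewriteSetsFile = %(GCAM.ProjectDir)s/etc/rewriteSets.xml
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# References are expanded when a value is read, so GCAM.ProjectDir is
# resolved first and then substituted into the file paths.
print(parser.get("foo", "GCAM.ProjectDir"))
print(parser.get("foo", "GCAM.ScenarioSetupFile"))
```

Because values in [DEFAULT] are visible from every section, a project section only needs to define what differs from the defaults.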
The example project defines two scenario groups, consisting of a baseline and 4 carbon tax scenarios. In one group, 90% of unmanaged land is protected (i.e., removed from consideration), as in the reference GCAM scenario. In the other scenario group, this protection is not performed, so all land is considered available for use.
- protect : @replace
Generate versions of GCAM's land_input XML files that protect a given fraction of land of the given land types in the given regions by subtracting the required land area from the "managed" land classes, thereby removing them from consideration in land allocations.
Simple protection scenarios can be specified on the command-line. More complex scenarios can be specified in an XML file, landProtection.xml <protect-xml>.
Examples:
# Create and modify copies of the reference land files, renaming them with
# "prot_" prefix. Protect 80% of the "UnmanagedForest" and "UnmanagedPasture"
# land classes in the specified regions only.
CLASSES=UnmanagedForest,UnmanagedPasture
REGIONS='Australia_NZ,Canada,EU-12,EU-15,Japan,Middle East,Taiwan,USA'
OUTDIR="$HOME/tmp/xml"
gt protect -f 0.8 "$INFILES" -l "$CLASSES" -r "$REGIONS" -o "$OUTDIR" -t 'prot_{filename}'
# Run the land protection scenario "s1", described in the file $HOME/protect.xml,
# placing the results in the directory $HOME/ws/workspace1
gt protect -s s1 -S "$HOME/protect.xml" -w "$HOME/ws/workspace1"
- query : @replace
Run one or more GCAM database queries by generating and running the named XML queries. The results are placed in a file in the specified output directory with a name composed of the basename of the XML query file plus the scenario name. For example,
gt query -o. -s MyReference,MyPolicyCase liquids-by-region
would run the liquids-by-region query on two scenarios, MyReference and MyPolicyCase. Query results will be stored in the files ./liquids-by-region-MyReference.csv and ./liquids-by-region-MyPolicyCase.csv.
The named queries are located using the value of config variable GCAM.QueryPath, which can be overridden with the -Q argument. The QueryPath consists of one or more colon-delimited (on Unix) or semicolon-delimited (on Windows) elements that can identify directories or XML files. The elements of QueryPath are searched in order until the named query is found. If a path element is a directory, the filename composed of the query + '.xml' is sought in that directory. If the path element is an XML file, a query with a title matching the query name (first literally, then by replacing '_' and '-' characters with spaces) is sought. Note that query names are case-sensitive.
- sandbox : @replace
The sandbox sub-command allows you to create, delete, show the path of, or run a shell command in a workspace. If the --scenario argument is given, the operation is performed on a scenario-specific workspace within a project directory. If --scenario is not specified, the operation is performed on the project directory that contains individual scenario workspaces. Note that the gcam <gcam> sub-command automatically creates workspaces as needed.
N.B. You can run sandbox with the --path option before performing any operations to be sure of the directory that will be operated on, or use the --noExecute option to show the command that would be executed by --run.
- setup : @replace
The setup sub-command automates modification to copies of GCAM's input XML files and construction of a corresponding configuration XML file. See setup for a detailed description.
- transport : @replace
Generates a template CSV file which can be edited to set transport energy efficiency improvements that are applied by the callable function "transportTechEfficiency", which must be called in your scenarios.xml file.
The gt script will load any Python files whose names end in _plugin.py, found in any of the directories indicated in the config file variable GCAM.PluginPath. The value of GCAM.PluginPath must be a sequence of directory names separated by colons (:) on Unix-like systems or by semi-colons (;) on Windows.
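The plugin search described above can be sketched as follows. This is a simplified illustration, not the loader pygcam uses: the function `find_plugin_files` is hypothetical, and it only locates candidate files without importing them. Python's `os.pathsep` is ":" on Unix-like systems and ";" on Windows, matching the separators named above.

```python
import glob
import os


def find_plugin_files(plugin_path, pathsep=os.pathsep):
    """Return all files named *_plugin.py found in the listed directories.

    plugin_path is a string of directory names joined by the platform's
    path separator; directories are searched in the order given.
    """
    files = []
    for directory in plugin_path.split(pathsep):
        if directory:  # skip empty elements, e.g. from a trailing separator
            pattern = os.path.join(directory, "*_plugin.py")
            files.extend(sorted(glob.glob(pattern)))
    return files


# Hypothetical plugin directories joined with the platform separator.
example_path = os.pathsep.join(["plugins", os.path.join("project", "plugins")])
print(find_plugin_files(example_path))
```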
See pygcam.subcommand for documentation of the plug-in API.