Configuration System

The pygcam scripts and libraries rely on a configuration file to:

  • define the location of essential and optional files,
  • allow the user to set defaults for many command-line arguments to scripts, and
  • define both global default and project-specific values for all parameters.

The configuration file and variables are described below.

Use the gt init command to initialize your configuration file.

Usage of the config sub-command is described on the gt config page. See pygcam.config for documentation of the API to the configuration system.

pygcam uses the Python ConfigParser module (named configparser in Python 3). See that module's documentation for more details.

Configuration file sections

The configuration file is divided into sections, each indicated by a name within square brackets. All variable declarations following a section declaration, up to the next section declaration (if any), belong to the declared section.

Default section

Default values are defined in the [DEFAULT] section. When pygcam requests the value of a variable from a project section (see below), the default value is returned if the variable is not defined in the project section. Variables whose values apply to multiple projects can be defined conveniently in the [DEFAULT] section.

All pre-defined pygcam variables are defined in the [DEFAULT] section, allowing them to be overridden on a project-by-project basis.

Project sections

Each project must have its own section. For example, to set up a project called, say, "myproj", I would create the section [myproj]. Following this, I would define variables particular to this project, e.g., where to find the files defining scenarios, queries, and so on.

Note that the new sub-command will set up the structure for a new project and (optionally) add a section to the user's config file for the named project.

The configuration files

There are up to 4 configuration files read, two of which are user-modifiable:

  1. First, pygcam/etc/system.cfg is read from within the pygcam package. This defines all known config variables and provides their default values as described below. The values in this file are the appropriate values for Linux and similar systems. This file should not be modified by the user.
  2. Next, a platform-specific file is read, if it exists. Currently, the only such files are pygcam/etc/Windows.cfg and pygcam/etc/Darwin.cfg, read on Windows and Macintosh systems, respectively. (N.B. "Darwin" is the official platform name for the Macintosh operating system.) These files should not be modified by the user.
  3. Next, if the environment variable PYGCAM_SITE_CONFIG is defined, it should refer to a configuration file that defines site-specific settings. This file is optional; it allows an administrator to consolidate site-specific values to simplify configuration for users.
  4. Finally, the user's configuration file is read if it exists; otherwise the file is created with the initial contents being a commented-out version of pygcam/etc/system.cfg. This provides a handy reference to the available parameters and their default values.
    • On Linux and OS X, the user's configuration file is found in $HOME/.pygcam.cfg
    • On Windows, the file .pygcam.cfg will be stored in the directory identified by the first of the following environment variables defined to have a non-empty value: PYGCAM_HOME, HOMESHARE, and HOMEPATH. The first variable, PYGCAM_HOME, is known only to pygcam, while at least one of the other two should be set by Windows.
    • In all cases, the directory in which the configuration file is located is assigned to the pygcam configuration variable Home.
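The Windows lookup order described above can be sketched as follows. This is a minimal illustration, not pygcam's actual code: the function name find_windows_home is hypothetical, and the environment is passed in as a plain mapping so the example does not depend on the machine it runs on.

```python
import os


def find_windows_home(env=None):
    """Return the first non-empty value among the candidate variables.

    Hypothetical helper for illustration; pygcam's own lookup may differ.
    """
    env = os.environ if env is None else env
    for name in ("PYGCAM_HOME", "HOMESHARE", "HOMEPATH"):
        value = env.get(name)
        if value:  # skips variables that are unset or set to ""
            return value
    return None


# PYGCAM_HOME is defined but empty, so HOMESHARE is used.
example_env = {"PYGCAM_HOME": "", "HOMESHARE": "H:/share", "HOMEPATH": "/Users/me"}
home = find_windows_home(example_env)
print(home)
```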

The values in each successive configuration file override default values for variables of the same name that are set in files read earlier. Values can also be set in project-specific sections whose names should match project names defined in the project XML file. Thus when a user specifies a project to operate on, either on the command line to gcamtool or as the value of GCAM.DefaultProject in $HOME/.pygcam.cfg, the project-specific values override any values set in [DEFAULT] sections.
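The layering of files can be sketched with the standard configparser module, whose read() method accepts a list of files, skips any that do not exist, and lets later files override earlier ones. This is only a sketch under assumptions: the temporary file names and the WARNING default are illustrative, and the helper read_layered is hypothetical rather than part of pygcam.

```python
import tempfile
from configparser import ConfigParser
from pathlib import Path


def read_layered(paths):
    """Read config files in order; values in later files win."""
    parser = ConfigParser()
    parser.optionxform = str  # assumption: keep variable names case-sensitive
    parser.read([str(p) for p in paths])  # missing files are silently skipped
    return parser


with tempfile.TemporaryDirectory() as tmp:
    system_cfg = Path(tmp) / "system.cfg"
    site_cfg = Path(tmp) / "site.cfg"  # never created: stands in for an absent optional file
    user_cfg = Path(tmp) / "user.cfg"

    # Illustrative values only.
    system_cfg.write_text("[DEFAULT]\nGCAM.LogConsole = True\nGCAM.LogLevel = WARNING\n")
    user_cfg.write_text("[DEFAULT]\nGCAM.LogLevel = DEBUG\n")

    parser = read_layered([system_cfg, site_cfg, user_cfg])
    log_level = parser.get("DEFAULT", "GCAM.LogLevel")      # overridden by the user file
    log_console = parser.get("DEFAULT", "GCAM.LogConsole")  # inherited from the system file

print(log_level, log_console)
```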

For example, consider the following values in $HOME/.pygcam.cfg:

[DEFAULT]
GCAM.RefWorkspace = %(Home)s/GCAM/gcam-v4.3

[Project1]
GCAM.RefWorkspace = /other/location/GCAM/gcam-v4.4

[OtherProject]
# no value set here for GCAM.RefWorkspace

The default value for GCAM.RefWorkspace is %(Home)s/GCAM/gcam-v4.3. This value is used for the project OtherProject since no project-specific value is defined, but the project Project1 overrides this with the value /other/location/GCAM/gcam-v4.4.
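The same lookup can be reproduced with the standard configparser module. The sketch below assumes a value for Home (which pygcam sets itself) and sets optionxform so that names stay case-sensitive; both are assumptions made so the example is self-contained.

```python
from configparser import ConfigParser

CONFIG_TEXT = """
[DEFAULT]
# Assumed value; pygcam assigns Home automatically.
Home = /home/user
GCAM.RefWorkspace = %(Home)s/GCAM/gcam-v4.3

[Project1]
GCAM.RefWorkspace = /other/location/GCAM/gcam-v4.4

[OtherProject]
# no value set here for GCAM.RefWorkspace
"""

parser = ConfigParser()
parser.optionxform = str  # keep variable names case-sensitive
parser.read_string(CONFIG_TEXT)

project1_ws = parser.get("Project1", "GCAM.RefWorkspace")
other_ws = parser.get("OtherProject", "GCAM.RefWorkspace")

print(project1_ws)
print(other_ws)
```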

The available parameters and their default values are described below.

Editing the user configuration file

You can edit the configuration file, $HOME/.pygcam.cfg, with any editor capable of working with plain text, i.e., not a word-processor such as Word. Use the command gt config -e to invoke an editor on the configuration file.

The command invoked by gt config -e to edit the config file is the value of the configuration parameter GCAM.TextEditor, which defaults to a system-appropriate value shown in the table below. Set this value in the configuration file to invoke your preferred editor.

For example, if you prefer the emacs editor on a Mac, you can add this line to ~/.pygcam.cfg to cause the Finder to open the file using the emacs application:

GCAM.TextEditor = open -a emacs

Or, to edit the config file using the PyCharm app, use this:

GCAM.TextEditor = open -a PyCharm

If the editor command is not found on your execution PATH, you can specify the full pathname. Use quotes around the path if it includes spaces, as in the examples below.

To use Notepad++ on Windows, use the following (adjusted as necessary for your installation location):

GCAM.TextEditor = "C:/Program Files/Notepad++/notepad++.exe"

To use PyCharm, use the following -- again, adjusted to match your installation location:

GCAM.TextEditor = "C:/Program Files/JetBrains/PyCharm 2018.1.4/bin/pycharm64.exe"

Invoking the command:

gt config -e

will cause the editor to be invoked on your configuration file.
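To see why quoting matters for paths with spaces, here is one way an editor command string could be turned into an argument list. This is a sketch of the general technique using the standard shlex module; how pygcam itself launches the editor is not described here, and build_editor_command is a hypothetical name.

```python
import shlex


def build_editor_command(editor, config_path):
    """Split the editor setting into arguments and append the file to edit."""
    return shlex.split(editor) + [config_path]


mac_cmd = build_editor_command("open -a emacs", "/Users/me/.pygcam.cfg")
win_cmd = build_editor_command(
    '"C:/Program Files/Notepad++/notepad++.exe"', "C:/Users/me/.pygcam.cfg"
)

print(mac_cmd)
print(win_cmd)
```

Without the quotes, the Windows path would be split at the space in "Program Files" and treated as two separate arguments.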

Referencing configuration variables

A powerful feature of the configuration system is that variables can be defined in terms of other variables. The syntax for referencing the value of a variable is to precede the variable name with %( and follow it with )s. Thus to reference variable GCAM.QueryDir, you would write %(GCAM.QueryDir)s.

Note

When referencing a variable in the config file, you must include the trailing s after the closing parenthesis, or a Python exception will be raised.

Also note that variable names are case-sensitive.
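Both points in the note can be seen with the standard configparser module. In this sketch the variables Good and Bad are invented for illustration, and optionxform is set explicitly on the assumption that this mirrors pygcam's case-sensitive behavior.

```python
from configparser import ConfigParser, InterpolationSyntaxError

parser = ConfigParser()
parser.optionxform = str  # assumption: makes variable names case-sensitive
parser.read_string(
    "[DEFAULT]\n"
    "GCAM.QueryDir = /queries\n"
    "Good = %(GCAM.QueryDir)s/a\n"
    "Bad = %(GCAM.QueryDir)/a\n"  # trailing 's' is missing
)

good = parser.get("DEFAULT", "Good")

try:
    parser.get("DEFAULT", "Bad")
    error = None
except InterpolationSyntaxError as exc:
    error = exc  # raised when the value is requested, not when it is read

# Names differing only in case are different variables.
lowercase_exists = parser.has_option("DEFAULT", "good")

print(good, type(error).__name__, lowercase_exists)
```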

Variable values are substituted when a variable's value is requested, not when the configuration file is read. The difference is that if variable A is defined in terms of variable B (e.g., A = %(B)s/something/else), you can subsequently change B and the value of A will reflect this when A is accessed by pygcam.

All known variables are given default values in the pygcam system files. Users can create variables in any of the user-controlled config files, if desired.
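This late substitution is standard configparser behavior and can be demonstrated directly. The variables A and B below are the placeholders from the paragraph above; the path values are invented for illustration.

```python
from configparser import ConfigParser

parser = ConfigParser()
parser.optionxform = str
parser.read_string("[DEFAULT]\nB = /first\nA = %(B)s/something/else\n")

before = parser.get("DEFAULT", "A")

# Change B after the file has been read; A is re-evaluated on the next access.
parser.set("DEFAULT", "B", "/second")
after = parser.get("DEFAULT", "A")

print(before)
print(after)
```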

Environment variables

All defined environment variables are loaded into the config parameter space before reading any configuration files, and are accessible with a prefix of $, as in a UNIX shell. For example, to reference the environment variable SCRATCH, you can use %($SCRATCH)s.
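One way to get this behavior with the standard configparser module is to store each environment variable as a default named "$" plus the variable name before reading any files. The sketch below is an assumption about the technique, not pygcam's actual code: load_environment is a hypothetical helper, the environment is a stand-in dictionary, and the GCAM.TempDir definition is illustrative.

```python
from configparser import ConfigParser


def load_environment(parser, env):
    """Store each environment variable as a default named '$NAME'."""
    for name, value in env.items():
        # Literal '%' must be doubled so it is not treated as a reference.
        parser.set("DEFAULT", "$" + name, value.replace("%", "%%"))


fake_env = {"SCRATCH": "/pic/scratch/user", "PROGRESS": "100%"}

parser = ConfigParser()
parser.optionxform = str
load_environment(parser, fake_env)  # environment first, then config files

parser.read_string("[DEFAULT]\nGCAM.TempDir = %($SCRATCH)s/tmp\n")

temp_dir = parser.get("DEFAULT", "GCAM.TempDir")
progress = parser.get("DEFAULT", "$PROGRESS")

print(temp_dir, progress)
```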

Validating configuration settings

pygcam requires that certain configuration variables be set. The table below shows key variables, indicating whether they are required or optional, and whether their value must be a file or directory.

Variable name          Required   Type
GCAM.SandboxRoot       yes        directory
GCAM.ProjectRoot       yes        directory
GCAM.QueryDir          yes        directory
GCAM.MI.Dir            yes        directory
GCAM.RefWorkspace      yes        directory
GCAM.TempDir           yes        directory
GCAM.ProjectXmlFile    yes        file
GCAM.RefConfigFile     yes        file
GCAM.MI.JarFile        yes        file
GCAM.UserTempDir       no         directory
GCAM.RegionMapFile     no         file
GCAM.RewriteSetsFile   no         file

The config sub-command provides a limited amount of validation by checking that all required and optional variables are set to reasonable values. You can do a basic (not foolproof) check that the required files and directories exist using the command:

gt config -t

which will print out a listing of files and their status.

You can also specify a project to check that project's variables. For example, I can test the values set for project myproj with the following command, shown with command output:

$ gt +P myproj config -t

OK: GCAM.SandboxRoot = /people/plev920/ws
OK: GCAM.SandboxDir = /people/plev920/ws/myproj/
OK: GCAM.ProjectRoot = /people/plev920/bitbucket
OK: GCAM.ProjectDir = /people/plev920/bitbucket/myproj
OK: GCAM.QueryDir = /people/plev920/bitbucket/myproj/queries
OK: GCAM.MI.Dir = /people/plev920/GCAM/current/input/gcam-data-system/_common/ModelInterface/src
OK: GCAM.RefWorkspace = /people/plev920/GCAM/current
OK: GCAM.TempDir = /pic/scratch/plev920/tmp
OK: GCAM.UserTempDir = /people/plev920/tmp
OK: GCAM.ProjectXmlFile = /people/plev920/bitbucket/myproj/etc/project.xml
OK: GCAM.RefConfigFile = /people/plev920/GCAM/current/exe/configuration_ref.xml
OK: GCAM.MI.JarFile = /people/plev920/GCAM/current/input/gcam-data-system/_common/ModelInterface/src/ModelInterface.jar
OK: GCAM.RewriteSetsFile = /people/plev920/bitbucket/myproj/etc/rewriteSets.xml
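A check of this kind can be approximated in a few lines. The sketch below is not the implementation behind gt config -t: check_paths is a hypothetical helper, only three of the variables from the table are included, and every status label other than OK is invented for illustration.

```python
import os
import tempfile

# (required, kind) for a few of the variables listed in the table above.
SPEC = {
    "GCAM.SandboxRoot": (True, "directory"),
    "GCAM.ProjectXmlFile": (True, "file"),
    "GCAM.RegionMapFile": (False, "file"),
}


def check_paths(values, spec=SPEC):
    """Return (status, name, path) for each variable in spec."""
    results = []
    for name, (required, kind) in spec.items():
        path = values.get(name)
        if not path:
            # Hypothetical labels for unset variables.
            status = "MISSING" if required else "UNSET"
        else:
            exists = os.path.isdir(path) if kind == "directory" else os.path.isfile(path)
            status = "OK" if exists else "NOT FOUND"
        results.append((status, name, path))
    return results


with tempfile.TemporaryDirectory() as tmp:
    xml_file = os.path.join(tmp, "project.xml")
    open(xml_file, "w").close()  # create an empty file to check against

    results = check_paths({"GCAM.SandboxRoot": tmp, "GCAM.ProjectXmlFile": xml_file})
    for status, name, path in results:
        print(f"{status}: {name} = {path}")
```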

Location of GCAM program and data files

The configuration variable GCAM.RefWorkspace must point to a directory structured like the standard GCAM workspace, with sub-directories for input, output, libs, and exe. These files are the reference files used by gcamtool to set up "sandbox" workspaces in which to run GCAM.

pygcam does not run GCAM in the reference workspace's exe directory; it uses the files there to create new workspaces as required. Creating separate workspaces for each scenario allows multiple scenarios to be run simultaneously without contention for the XML database which is created at the end of the model run. This is essential when running on a computing cluster.

The variable GCAM.MI.Dir should point to a directory holding the ModelInterface program. This is used to execute batch queries to extract results from GCAM. By default, this location is computed from the GCAM.RefWorkspace, but you can change it if necessary, e.g., if you're using a customized version of ModelInterface.

Default configuration variables and values

The system default values are provided in the pygcam package in the file pygcam/etc/system.cfg, which is listed below. In addition to these values, several values are read from platform-specific files, as noted above. These values are shown below.

For Windows:

../../pygcam/etc/Windows.cfg

For MacOS:

../../pygcam/etc/Darwin.cfg

Default configuration variable dependencies

The following figure shows variable dependencies according to the default definitions. Variables lower in the figure depend on those above them. Thus, if you change a variable with "descendants", you affect the definition of everything below it in the figure.

[Figure: dependency graph of the default configuration variables]

The system defaults file

../../pygcam/etc/system.cfg

Configuring the logging system

Setting logging verbosity

When gcamtool runs, or when pygcam functions are called from your own code, diagnostic and informational messages are printed. You can control the level of output by setting GCAM.LogLevel in your .pygcam.cfg file. (See the Python logging module documentation for further details.)

The simplest setting is just one of the following values, in order of decreasing verbosity: DEBUG, INFO, WARNING, ERROR, and FATAL. This will apply to all pygcam modules.

You can also specify verbosity by module, by specifying a module name and the level for that module as a comma-separated list of "module:level" strings, e.g.,:

GCAM.LogLevel = WARN, .utils:INFO, .scenarioSetup:DEBUG, CI_plugin:INFO, \
  .mcs.worker:DEBUG, myProj.writeFuncs:DEBUG

In this example, the default level is set to WARN, and three pygcam modules have their levels set: pygcam.utils is set to INFO, pygcam.scenarioSetup is set to DEBUG, and pygcam.mcs.worker is set to DEBUG. A user's plugin can also use the logging system. This example sets logging levels for the user's CI_plugin and myProj.writeFuncs modules.
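The "module:level" format can be parsed along these lines. This is a sketch of the format as described above, not pygcam's own parser: parse_log_levels is a hypothetical name, and expanding a leading "." to the pygcam package follows the module names given in the explanation.

```python
def parse_log_levels(spec, package="pygcam"):
    """Split a GCAM.LogLevel-style string into (default_level, {module: level})."""
    default = None
    modules = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:  # tolerate empty entries, e.g. from a doubled comma
            continue
        if ":" in item:
            module, level = (part.strip() for part in item.split(":", 1))
            if module.startswith("."):
                module = package + module  # ".utils" -> "pygcam.utils"
            modules[module] = level.upper()
        else:
            default = item.upper()
    return default, modules


default_level, module_levels = parse_log_levels(
    "WARN, .utils:INFO, .scenarioSetup:DEBUG, CI_plugin:INFO, "
    ".mcs.worker:DEBUG, myProj.writeFuncs:DEBUG"
)

print(default_level)
print(module_levels)
```

Each resulting pair could then be applied with the standard library call logging.getLogger(name).setLevel(level), which accepts these level names as strings.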

Console / file logs and message formatting

Note that the module name is shown in the console log messages. Setting GCAM.LogLevel to DEBUG produces the maximum number of log messages; setting it to FATAL minimizes message verbosity.

Other relevant variables are shown here with their default values:

# If set, application logger messages are written here. Note that
# this is different than the GCAM.BatchLogFile for batch job output.
GCAM.LogFile = %(GCAM.SandboxRoot)s/log/gt.log

# Show log messages on the console (terminal)
GCAM.LogConsole = True

# Format strings for log files and console messages. Note doubled
# '%%' required here around logging parameters to avoid attempted
# variable substitution within the config system.
GCAM.LogFileFormat    = %%(asctime)s %%(levelname)s %%(name)s:%%(lineno)d %%(message)s
GCAM.LogConsoleFormat = %%(levelname)s %%(name)s: %%(message)s
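The effect of the doubled '%%' can be checked with the standard configparser and logging modules: the config system collapses each '%%' to a single '%', leaving a format string that logging.Formatter understands. The log record below is constructed by hand purely for illustration.

```python
import logging
from configparser import ConfigParser

parser = ConfigParser()
parser.optionxform = str
parser.read_string(
    "[DEFAULT]\n"
    "GCAM.LogConsoleFormat = %%(levelname)s %%(name)s: %%(message)s\n"
)

# '%%' is unescaped to '%' when the value is requested.
console_format = parser.get("DEFAULT", "GCAM.LogConsoleFormat")

formatter = logging.Formatter(console_format)
record = logging.LogRecord(
    "pygcam.utils", logging.INFO, "example.py", 1, "hello", None, None
)
line = formatter.format(record)

print(console_format)
print(line)
```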