API reference

genno

Top-level classes and functions

configure Computer Key Quantity

configure

genno.Computer

A Computer is used to describe (add and related methods) and then execute (get and related methods) tasks stored in a graph. Advanced users may manipulate the graph directly; but common reporting tasks can be handled by using Computer methods.

Instance attributes:

default_key graph keys modules unit_registry

General-purpose methods for describing tasks and preparing computations:

add add_queue add_single apply cache describe visualize

Helper methods to simplify adding specific computations:

add_file add_product aggregate convert_pyam disaggregate

Executing tasks:

get write

Utility and configuration methods:

check_keys configure full_key get_comp infer_keys require_compat

graph

Dictionary keys are either Key, str, or any other hashable value.

Dictionary values are computations, one of:

  1. Any other, existing key in the Computer. This functions as an alias.
  2. Any other literal value or constant, to be returned directly.
  3. A task tuple: a callable (e.g. function), followed by zero or more computations, e.g. keys for other tasks.
  4. A list containing zero or more of (1), (2), and/or (3).
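The four kinds of computation can be illustrated with a small, standalone resolver. This is only a sketch of the idea in plain Python, not genno's or dask's implementation; the function name evaluate and the example graph are hypothetical.

```python
def evaluate(graph, computation):
    """Resolve one computation against `graph` (illustrative only)."""
    # (4) A list: resolve each element in turn
    if isinstance(computation, list):
        return [evaluate(graph, item) for item in computation]
    # (3) A task tuple: a callable first, then its inputs
    if isinstance(computation, tuple) and computation and callable(computation[0]):
        func, *inputs = computation
        return func(*(evaluate(graph, item) for item in inputs))
    # (1) An existing key: follow it to the computation stored there
    try:
        if computation in graph:
            return evaluate(graph, graph[computation])
    except TypeError:
        pass  # Unhashable literal, e.g. a dict
    # (2) Anything else is a literal value, returned directly
    return computation


graph = {
    "a": 2,                                    # (2) literal
    "b": 3,                                    # (2) literal
    "total": (lambda x, y: x + y, "a", "b"),   # (3) task tuple
    "alias": "total",                          # (1) alias for another key
    "both": ["a", "alias"],                    # (4) list of computations
}

result_alias = evaluate(graph, "alias")
result_both = evaluate(graph, "both")
print(result_alias, result_both)
```

Note that in this sketch any str equal to an existing key is treated as a reference to that key, which is the source of the ambiguity between literals and keys.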

genno reserves some keys for special usage:

"config"

A dict storing configuration settings. See config. Because this information is stored in the graph, it can be used as one input to other computations.

Some inputs to tasks may be confused for (1) or (4), above. The recommended way to protect these is:

  • Literal str inputs to tasks: use functools.partial on the function that is the first element of the task tuple.
  • list of str: use dask.core.quote to wrap the list.
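The first recommendation can be sketched with the standard library alone. The function label and the plain dict standing in for the graph are hypothetical, and the last two lines resolve the task by hand rather than through a Computer.

```python
from functools import partial


def label(quantity, prefix):
    """Hypothetical computation: attach a text prefix to a value."""
    return f"{prefix}: {quantity}"


# A plain dict standing in for the graph; it happens to contain a key "kg"
graph = {"x": 42, "kg": "value stored under the key 'kg'"}

# Ambiguous: "kg" is meant as a literal, but it matches a key in the graph,
# so an executor could substitute the value stored under that key.
ambiguous_task = (label, "x", "kg")

# Protected: the literal is bound with partial(), so the task tuple
# references only the key "x".
protected_task = (partial(label, prefix="kg"), "x")

# Resolve the protected task by hand
func, key = protected_task
result = func(graph[key])
print(result)
```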

add

The data argument may be:

list

A list of computations, like [(list(args1), dict(kwargs1)), (list(args2), dict(kwargs2)), ...] → passed to add_queue.

str naming a computation

e.g. "select", retrievable with get_comp. add_single is called with (key=args[0], data, *args[1], **kwargs), i.e. applying the named computation to the other parameters.

str naming another Computer method

e.g. add_file → the named method is called with the args and kwargs.

Key or other str

Passed to add_single.

add may be used to:

  • Provide an alias from one key to another:

    >>> from genno import Computer
    >>> rep = Computer()  # Create a new Computer object
    >>> rep.add('aliased name', 'original name')

  • Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:

    >>> def my_report(scenario):
    ...     # many lines of code
    ...     return 'foo'
    >>> rep.add('my report', (my_report, 'scenario'))
    >>> rep.finalize(scenario)
    >>> rep.get('my report')
    foo

apply

The generator may have a type annotation for Computer on its first positional argument. In this case, a reference to the Computer is supplied, and generator can use the Computer methods to add many keys and computations:

import genno

def my_gen0(c: genno.Computer, **kwargs):
    # add_file is the Computer helper method; load_file is the computation it adds
    c.add_file("file0.txt", **kwargs)
    c.add_file("file1.txt", **kwargs)

# Use the generator to add several computations
rep.apply(my_gen0, units="kg")

Or, generator may yield a sequence (0 or more) of (key, computation), which are added to the graph:

from functools import partial

from genno import computations

def my_gen1(**kwargs):
    op = partial(computations.load_file, **kwargs)
    yield from ((f"file:{i}", (op, f"file{i}.txt")) for i in range(2))

rep.apply(my_gen1, units="kg")

cache

Use this function to decorate another function to be added as the computation/callable in a task:

from pathlib import Path

from genno import Computer

c = Computer(cache_path=Path("/some/directory"))

@c.cache
def myfunction(*args, **kwargs):
    # Expensive operations, e.g. loading large files
    return data

c.add("myvar", (myfunction,))

# Data is cached in /some/directory/myfunction-*.pkl

On the first call of get that invokes func, the requested data is returned and also cached in the cache directory (see Configuration → Caching).

On subsequent calls, if the cache exists, it is used instead of calling the (possibly slow) func.

If the "cache_skip" configuration option is True, func is always called.
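The behaviour described above follows a common pattern that can be sketched with the standard library. This is not genno's implementation: the names make_cache and slow_square, the SHA-1 based file name, and passing cache_skip as a function argument (rather than a configuration option) are all assumptions made for illustration.

```python
import hashlib
import pickle
import tempfile
from functools import wraps
from pathlib import Path


def make_cache(cache_path, cache_skip=False):
    """Return a decorator that pickles results under `cache_path`."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a file name from the function name and its arguments
            arg_repr = repr((args, sorted(kwargs.items())))
            digest = hashlib.sha1(arg_repr.encode()).hexdigest()
            path = Path(cache_path) / f"{func.__name__}-{digest}.pkl"
            if path.exists() and not cache_skip:
                return pickle.loads(path.read_bytes())  # Cache hit
            result = func(*args, **kwargs)  # Cache miss, or cache skipped
            path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls = []  # Records each real invocation of the slow function
cache_dir = tempfile.mkdtemp()


@make_cache(cache_dir)
def slow_square(x):
    calls.append(x)
    return x * x


first = slow_square(4)   # Computed, then written to disk
second = slow_square(4)  # Read back from the pickle file
print(first, second, len(calls))
```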

convert_pyam

The pyam IAMC data format includes columns named 'Model', 'Scenario', 'Region', 'Variable', 'Unit'; one of 'Year' or 'Time'; and 'value'.

Using convert_pyam:

  • 'Model' and 'Scenario' are populated from the attributes of the object returned by the Computer key scenario;
  • 'Variable' contains the name(s) of the quantities;
  • 'Unit' contains the units associated with the quantities; and
  • 'Year' or 'Time' is created according to year_time_dim.

A callback function (collapse) can be supplied that modifies the data before it is converted to a pyam.IamDataFrame; for instance, to concatenate extra dimensions into the 'Variable' column. Other dimensions can simply be dropped (with drop). Dimensions that are not collapsed or dropped will appear as additional columns in the resulting pyam.IamDataFrame; this is valid, but non-standard IAMC data.

For example, here the values for the MESSAGEix technology and mode dimensions are appended to the 'Variable' column:

def m_t(df):
    """Callback for collapsing ACT columns."""
    # Build 'variable' from the technology (t) and mode (m) labels; the t and m
    # columns themselves are removed by drop=['t', 'm'] in the call below
    df['variable'] = 'Activity|' + df['t'] + '|' + df['m']
    return df

ACT = rep.full_key('ACT')
keys = rep.convert_pyam(ACT, 'ya', collapse=m_t, drop=['t', 'm'])

genno.Key

Quantities are indexed by 0 or more dimensions. A Key refers to a quantity using three components:

  1. a string name,
  2. zero or more ordered dims, and
  3. an optional tag.

For example, consider a quantity "foo" with three dimensions:

# FIXME

>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])

Key allows a specific, explicit reference to various forms of “foo”:

  • in its full resolution, i.e. indexed by a, b, and c:

    >>> k1 = Key('foo', ['a', 'b', 'c'])
    >>> k1
    <foo:a-b-c>

  • in a partial sum over one dimension, e.g. summed across dimension c, with remaining dimensions a and b:

    >>> k2 = k1.drop('c')
    >>> k2 == 'foo:a-b'
    True

  • in a partial sum over multiple dimensions, etc.:

    >>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
    True

  • after it has been manipulated by other computations, e.g.

    >>> k3 = k1.add_tag('normalized')
    >>> k3
    <foo:a-b-c:normalized>
    >>> k4 = k3.add_tag('rescaled')
    >>> k4
    <foo:a-b-c:normalized+rescaled>

Notes:

A Key has the same hash as, and compares equal to, its str representation. repr(key) prints the Key in angle brackets ('<>') to signify that it is a Key object.

>>> str(k1)
'foo:a-b-c'
>>> repr(k1)
'<foo:a-b-c>'
>>> hash(k1) == hash('foo:a-b-c')
True

Keys are immutable: the properties name, dims, and tag are read-only, and the methods append, drop, and add_tag return new Key objects.
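The notes above imply that a plain str can be used to look up a graph entry stored under a Key. The following standalone sketch shows how matching hash and equality make that work; MiniKey is a hypothetical stand-in written for this illustration, not genno's Key class.

```python
class MiniKey:
    """Hypothetical stand-in for a key with a name, dims and an optional tag."""

    def __init__(self, name, dims=(), tag=None):
        self._name, self._dims, self._tag = name, tuple(dims), tag

    def __str__(self):
        parts = [self._name, "-".join(self._dims)]
        if self._tag:
            parts.append(self._tag)
        return ":".join(parts)

    def __repr__(self):
        return f"<{self}>"

    def __hash__(self):
        return hash(str(self))  # Same hash as the str form

    def __eq__(self, other):
        return str(self) == str(other)  # Compares equal to the str form

    def drop(self, *dims):
        # Return a new object; self is left unchanged
        kept = [d for d in self._dims if d not in dims]
        return MiniKey(self._name, kept, self._tag)


k1 = MiniKey("foo", ["a", "b", "c"])
graph = {k1: "full-resolution data"}

found = graph["foo:a-b-c"]  # A plain str finds the entry stored under k1
k2 = k1.drop("c")           # New object; k1 still has all three dims
print(repr(k1), repr(k2), found)
```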

Keys may be generated concisely by defining a convenience method:

>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
<foo:a-b-c>

genno.Quantity

The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:

import pandas as pd
from genno import Quantity

# Existing data
data = pd.Series(...)

# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)

Common genno usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); the default, 'dense' storage format of xarray.DataArray would be too large for available memory.

  • Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like an xarray.DataArray.
  • In the future, genno will use SparseDataArray, and eventually xarray.DataArray backed by sparse data, directly.

The goal is that all genno-based code, including built-in and user computations, can treat quantity arguments as if they were xarray.DataArray.
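To make the memory argument above concrete, the following back-of-the-envelope calculation compares dense and sparse storage. Every figure is an assumption chosen for illustration (2 billion possible elements, exactly 1% populated, 8 bytes per value and per index label, 5 dimensions); none is a measurement of genno or xarray.

```python
# Assumed figures, for illustration only
possible_elements = 2_000_000_000  # "billions of possible elements"
bytes_per_value = 8                # one 64-bit float
bytes_per_label = 8                # assumed cost of one index label
n_dims = 5                         # assumed number of dimensions

# Dense storage allocates a slot for every possible element
dense_bytes = possible_elements * bytes_per_value

# Sparse storage keeps only populated values (here 1%) plus their coordinates
populated = possible_elements // 100
sparse_bytes = populated * (bytes_per_value + n_dims * bytes_per_label)

dense_gb = dense_bytes / 1e9
sparse_gb = sparse_bytes / 1e9
print(f"dense: {dense_gb:.1f} GB, sparse: {sparse_gb:.2f} GB")
```

Under these assumptions the dense layout needs 16 GB for a single quantity, while the sparse layout needs just under 1 GB.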

Computations

genno.computations

Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.

Genno's compatibility modules each provide additional computations.

Calculations:

add aggregate apply_units broadcast_map combine disaggregate_shares group_sum pow product ratio select sum

Input and output:

load_file write_report

Data manipulation:

concat

Internal format for quantities

genno.core.quantity

genno.core.attrseries

genno.core.sparsedataarray

Utilities

genno.util

genno.caching