genno
configure Computer Key Quantity
configure
genno.Computer
A Computer is used to describe (add and related methods) and then execute (get and related methods) tasks stored in a graph. Advanced users may manipulate the graph directly, but common reporting tasks can be handled by using Computer methods.
Instance attributes:
default_key graph keys modules unit_registry
General-purpose methods for describing tasks and preparing computations:
add add_queue add_single apply cache describe visualize
Helper methods to simplify adding specific computations:
add_file add_product aggregate convert_pyam disaggregate
Executing tasks:
get write
Utility and configuration methods:
check_keys configure full_key get_comp infer_keys require_compat
graph
Dictionary keys are either Key, str, or any other hashable value.
Dictionary values are computations, one of:
1. Any other, existing key in the Computer. This functions as an alias.
2. Any other literal value or constant, to be returned directly.
3. A task tuple: a callable (e.g. function), followed by zero or more computations, e.g. keys for other tasks.
4. A list containing zero or more of (1), (2), and/or (3).
genno reserves some keys for special usage:
"config"
A dict storing configuration settings. See config. Because this information is stored in the graph, it can be used as one input to other computations.
Some inputs to tasks may be confused for (1) or (4), above. The recommended way to protect these is:
- Literal str inputs to tasks: use functools.partial on the function that is the first element of the task tuple.
- list of str: use dask.core.quote to wrap the list.
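The ambiguity described above, and the functools.partial remedy, can be seen in a toy evaluator. This is a minimal sketch and not the actual genno or dask implementation: resolve, label, and the example graph are hypothetical names, and the evaluator only handles the four kinds of computation listed earlier. The list-of-str case protected by dask.core.quote is not shown.

```python
from functools import partial


def resolve(graph, comp):
    """Evaluate `comp` against `graph` (toy evaluator; hypothetical helper)."""
    if isinstance(comp, list):
        # (4) a list of computations: resolve each element
        return [resolve(graph, c) for c in comp]
    if isinstance(comp, tuple) and comp and callable(comp[0]):
        # (3) a task tuple: call the function on its resolved arguments
        func, *args = comp
        return func(*(resolve(graph, a) for a in args))
    try:
        if comp in graph:
            # (1) an existing key: follow the alias
            return resolve(graph, graph[comp])
    except TypeError:
        pass  # unhashable value, so it cannot be a key
    # (2) any other literal value: returned directly
    return comp


def label(data, name):
    """Hypothetical computation that tags data with a name."""
    return f"{name}={data}"


graph = {
    "x": 42,
    "alias": "x",
    # Problem: the literal "x" is also a key, so it is replaced by 42
    "confused": (label, "x", "x"),
    # Remedy: bind the literal with partial so the evaluator never sees it
    "protected": (partial(label, name="x"), "x"),
}

print(resolve(graph, "alias"))      # 42
print(resolve(graph, "confused"))   # 42=42
print(resolve(graph, "protected"))  # x=42
```

Binding the string with partial keeps it out of the task tuple entirely, so no lookup can rewrite it.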
add
The data argument may be:
- list: a list of computations, like [(list(args1), dict(kwargs1)), (list(args2), dict(kwargs2)), ...] → passed to add_queue.
- str naming a computation, e.g. "select", retrievable with get_comp: add_single is called with (key=args[0], data, *args[1], **kwargs), i.e. applying the named computation to the other parameters.
- str naming another Computer method, e.g. add_file → the named method is called with the args and kwargs.
- Key or other str: passed to add_single.
add may be used to:
- Provide an alias from one key to another:
>>> from genno import Computer
>>> rep = Computer()  # Create a new Computer object
>>> rep.add('aliased name', 'original name')
- Define an arbitrarily complex computation in a Python function that operates directly on the ixmp.Scenario:
>>> def my_report(scenario):
...     # many lines of code
...     return 'foo'
>>> rep.add('my report', (my_report, 'scenario'))
>>> rep.finalize(scenario)
>>> rep.get('my report')
foo
apply
The generator may have a type annotation for Computer on its first positional argument. In this case, a reference to the Computer is supplied, and generator can use the Computer methods to add many keys and computations:
def my_gen0(c: genno.Computer, **kwargs):
    c.load_file("file0.txt", **kwargs)
    c.load_file("file1.txt", **kwargs)

# Use the generator to add several computations
rep.apply(my_gen0, units="kg")
Or, generator may yield a sequence (0 or more) of (key, computation), which are added to the graph:
from functools import partial
from genno import computations

def my_gen1(**kwargs):
    op = partial(computations.load_file, **kwargs)
    # The generator expression needs its own parentheses
    yield from ((f"file:{i}", (op, f"file{i}.txt")) for i in range(2))

rep.apply(my_gen1, units="kg")
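The two generator styles can be told apart by inspecting the generator's signature. The following is a minimal sketch of that idea using only the standard library; MiniComputer, load, and both generators are hypothetical stand-ins, and genno's own apply may differ in its details.

```python
import inspect


class MiniComputer:
    """Toy stand-in for a Computer; only the graph and apply() are modelled."""

    def __init__(self):
        self.graph = {}

    def apply(self, generator, **kwargs):
        params = list(inspect.signature(generator).parameters.values())
        if params and params[0].annotation is MiniComputer:
            # Style 1: the generator receives the computer and adds keys itself
            generator(self, **kwargs)
        else:
            # Style 2: the generator yields (key, computation) pairs
            for key, computation in generator(**kwargs):
                self.graph[key] = computation


def load(path, units):
    """Hypothetical computation; returns a description instead of reading a file."""
    return f"{path} [{units}]"


def gen_with_reference(c: MiniComputer, **kwargs):
    c.graph["a"] = (load, "file0.txt", kwargs["units"])


def gen_yielding(**kwargs):
    yield from (
        (f"file:{i}", (load, f"file{i}.txt", kwargs["units"])) for i in range(2)
    )


c = MiniComputer()
c.apply(gen_with_reference, units="kg")
c.apply(gen_yielding, units="kg")
print(sorted(c.graph))  # ['a', 'file:0', 'file:1']
```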
cache
Use this function to decorate another function to be added as the computation/callable in a task:
from pathlib import Path
from genno import Computer

c = Computer(cache_path=Path("/some/directory"))

@c.cache
def myfunction(*args, **kwargs):
    # Expensive operations, e.g. loading large files
    return data

c.add("myvar", (myfunction,))
# Data is cached in /some/directory/myfunction-*.pkl
On the first call of get that invokes func, the data requested is returned, but also cached in the cache directory (see Configuration → Caching).
On subsequent calls, if the cache exists, it is used instead of calling the (possibly slow) func.
If the "cache_skip" configuration option is True, func is always called.
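The caching behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not genno's implementation: make_cache and slow_square are hypothetical names, and the file naming (function name plus a SHA-1 digest of the arguments) is assumed only to mirror the myfunction-*.pkl pattern.

```python
import hashlib
import pickle
import tempfile
from functools import wraps
from pathlib import Path


def make_cache(cache_path, cache_skip=False):
    """Return a decorator that caches results as pickle files (toy version)."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # One file per function name and argument combination
            text = repr((args, sorted(kwargs.items())))
            digest = hashlib.sha1(text.encode()).hexdigest()
            target = Path(cache_path) / f"{func.__name__}-{digest}.pkl"
            if target.exists() and not cache_skip:
                with open(target, "rb") as f:
                    return pickle.load(f)
            # Cache missing or skipped: call the function and store the result
            result = func(*args, **kwargs)
            with open(target, "wb") as f:
                pickle.dump(result, f)
            return result

        return wrapper

    return decorator


calls = []

with tempfile.TemporaryDirectory() as tmp:

    @make_cache(tmp)
    def slow_square(x):
        calls.append(x)  # record each real invocation
        return x * x

    first = slow_square(4)   # computed, then written to the cache
    second = slow_square(4)  # read back from the cache

print(first, second, len(calls))  # 16 16 1
```

With cache_skip=True the wrapper ignores any existing file and calls the function every time.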
convert_pyam
The IAMC data format, as documented by pyam, includes columns named 'Model', 'Scenario', 'Region', 'Variable', 'Unit'; one of 'Year' or 'Time'; and 'value'.
Using convert_pyam:
- 'Model' and 'Scenario' are populated from the attributes of the object returned by the Reporter key scenario;
- 'Variable' contains the name(s) of the quantities;
- 'Unit' contains the units associated with the quantities; and
- 'Year' or 'Time' is created according to year_time_dim.
A callback function (collapse) can be supplied that modifies the data before it is converted to a pyam.IamDataFrame; for instance, to concatenate extra dimensions into the 'Variable' column. Other dimensions can simply be dropped (with drop). Dimensions that are not collapsed or dropped will appear as additional columns in the resulting pyam.IamDataFrame; this is valid, but non-standard IAMC data.
For example, here the values for the MESSAGEix technology and mode dimensions are appended to the 'Variable' column:
def m_t(df):
    """Callback for collapsing ACT columns."""
    # Build 'variable' from the t and m columns; they are removed afterwards via drop
    df['variable'] = 'Activity|' + df['t'] + '|' + df['m']
    return df

ACT = rep.full_key('ACT')
keys = rep.convert_pyam(ACT, 'ya', collapse=m_t, drop=['t', 'm'])
genno.Key
Quantities are indexed by 0 or more dimensions. A Key refers to a quantity using three components:
- a string name,
- zero or more ordered dims, and
- an optional tag.
For example, for a quantity with three dimensions:
# FIXME
>>> scenario.init_par('foo', ['a', 'b', 'c'], ['apple', 'bird', 'car'])
Key allows a specific, explicit reference to various forms of “foo”:
- in its full resolution, i.e. indexed by a, b, and c:
>>> k1 = Key('foo', ['a', 'b', 'c'])
>>> k1
<foo:a-b-c>
- in a partial sum over one dimension, e.g. summed across dimension c, with remaining dimensions a and b:
>>> k2 = k1.drop('c')
>>> k2 == 'foo:a-b'
True
- in a partial sum over multiple dimensions, etc.:
>>> k1.drop('a', 'c') == k2.drop('a') == 'foo:b'
True
- after it has been manipulated by other computations, e.g.
>>> k3 = k1.add_tag('normalized')
>>> k3
<foo:a-b-c:normalized>
>>> k4 = k3.add_tag('rescaled')
>>> k4
<foo:a-b-c:normalized+rescaled>
Notes:
- A Key has the same hash as, and compares equal to, its str representation. repr(key) prints the Key in angle brackets ('<>') to signify that it is a Key object.
>>> str(k1)
'foo:a-b-c'
>>> repr(k1)
'<foo:a-b-c>'
>>> hash(k1) == hash('foo:a-b-c')
True
- Keys are immutable: the properties name, dims, and tag are read-only, and the methods append, drop, and add_tag return new Key objects.
- Keys may be generated concisely by defining a convenience method:
>>> def foo(dims):
...     return Key('foo', dims.split())
>>> foo('a b c')
<foo:a-b-c>
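The notes above (string-compatible hash and equality, immutability through methods that return new objects) can be illustrated with a small stand-in class. This is a sketch only: MiniKey is a hypothetical name and omits most of what the real Key does.

```python
class MiniKey:
    """Toy key with a name, ordered dims, and an optional tag."""

    def __init__(self, name, dims=(), tag=None):
        self._name, self._dims, self._tag = name, tuple(dims), tag

    def __str__(self):
        parts = [self._name, "-".join(self._dims)]
        if self._tag:
            parts.append(self._tag)
        return ":".join(parts)

    def __repr__(self):
        return f"<{self}>"

    def __hash__(self):
        # Same hash as the string form, so either can be used for lookup
        return hash(str(self))

    def __eq__(self, other):
        return str(self) == str(other)

    def drop(self, *dims):
        # Returns a new object; the original is left unchanged
        kept = [d for d in self._dims if d not in dims]
        return MiniKey(self._name, kept, self._tag)

    def add_tag(self, tag):
        # Tags accumulate, joined by "+"
        return MiniKey(self._name, self._dims, "+".join(filter(None, [self._tag, tag])))


k1 = MiniKey("foo", ["a", "b", "c"])
lookup = {k1: "value"}

print(lookup["foo:a-b-c"])  # value
print(repr(k1.drop("c")))  # <foo:a-b>
print(repr(k1.add_tag("normalized").add_tag("rescaled")))  # <foo:a-b-c:normalized+rescaled>
print(repr(k1))  # <foo:a-b-c>
```

Because the hash and equality follow the string form, a graph keyed by such objects can be queried with plain strings.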
genno.Quantity
The Quantity constructor converts its arguments to an internal, xarray.DataArray-like data format:
import pandas as pd
from genno import Quantity

# Existing data
data = pd.Series(...)

# Convert to a Quantity for use in reporting calculations
qty = Quantity(data, name="Quantity name", units="kg")
rep.add("new_qty", qty)
Common genno usage, e.g. in message_ix, creates large, sparse data frames (billions of possible elements, but <1% populated); xarray.DataArray's default, 'dense' storage format would be too large for available memory.
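A rough back-of-the-envelope calculation shows the scale of the difference. Every figure below is an assumption chosen for illustration (four dimensions, 8-byte values, one 8-byte integer label per dimension for each stored value); none is measured from genno or xarray.

```python
# Assumed, illustrative figures; none of them come from genno itself
dim_sizes = [100, 100, 100, 1000]  # four dimensions, 1e9 possible elements
populated_percent = 1              # share of elements that actually hold a value
bytes_per_value = 8                # one 64-bit float
bytes_per_label = 8                # one integer label per dimension, per stored value

possible = 1
for size in dim_sizes:
    possible *= size

# Dense storage allocates every possible element
dense_bytes = possible * bytes_per_value

# Sparse storage keeps only populated values, plus their coordinates
stored = possible * populated_percent // 100
sparse_bytes = stored * (bytes_per_value + bytes_per_label * len(dim_sizes))

print(f"dense:  {dense_bytes / 1e9:.1f} GB")   # dense:  8.0 GB
print(f"sparse: {sparse_bytes / 1e9:.1f} GB")  # sparse: 0.4 GB
```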
- Currently, Quantity is AttrSeries, a wrapped pandas.Series that behaves like an xarray.DataArray.
- In the future, genno will use SparseDataArray, and eventually xarray.DataArray backed by sparse data, directly.
The goal is that all genno-based code, including built-in and user computations, can treat quantity arguments as if they were xarray.DataArray.
genno.computations
Unless otherwise specified, these methods accept and return Quantity objects for data arguments/return values.
Genno's compatibility modules each provide additional computations.
Calculations:
add aggregate apply_units broadcast_map combine disaggregate_shares group_sum pow product ratio select sum
Input and output:
load_file write_report
Data manipulation:
concat
genno.core.quantity
genno.core.attrseries
genno.core.sparsedataarray
genno.util
genno.caching