Proposed lifecycle and workflow for Model Variable Definitions and Metadata #25

Closed
mensch72 opened this issue Nov 15, 2016 · 1 comment

Comments


Goal: Allow for...

  • flexible addition of model variables (including static ones aka parameters) by model components
  • easy usage in other model components
  • possibility to be promoted into a centrally governed "master data model"
  • without having to maintain a variable's metadata (name, description, dimension, unit, scale type, bounds, quantum, default value, uninformed prior distribution, etc.) in more than one place.
  1. A new model variable (e.g. Cell.fossil_reserve) comes into being when some model component (e.g. pycopancore.components.fossil_extraction) introduces it for the first time by instantiating the Variable class with the variable's metadata in its entity interface class, e.g. in pycopancore.components.fossil_extraction.interface.py:
class Cell(object):
    fossil_reserve = Variable(name="cellular fossil reserve", desc="...", dimension=Carbon)
  2. Another component (e.g. pycopancore.components.general_equilibrium_energy_sector) can then access that variable by importing that interface module in its own interface module and assigning the same Variable object to its own entity interface class, e.g. in pycopancore.components.general_equilibrium_energy_sector.interface.py:
import pycopancore.components.fossil_extraction.interface as FE
class Cell(object):
    fossil_reserve = FE.Cell.fossil_reserve
  3. As soon as the variable gets approved by the copancore-community's "data model committee" (to be defined), its definition gets copied into the master data model (MDM), a special module of the same format as the components' interface modules, whose classes, however, are never inherited from directly. In the above example, one would then add to pycopancore.master_data_model.py:
class Cell(object):
    fossil_reserve = Variable(name="cellular fossil reserve", desc="...", dimension=Carbon)
  4. From that point on, all components using this variable should import it from the master data model, e.g., both pycopancore.components.fossil_extraction.interface.py and pycopancore.components.general_equilibrium_energy_sector.interface.py would change to:
import pycopancore.master_data_model as MDM
class Cell(object):
    fossil_reserve = MDM.Cell.fossil_reserve
  5. To maintain harmonized variable definitions and avoid coincidental name clashes, the method Model.configure() must make sure that if two of the components used in the model at hand define the same variable name, that name points to the same Variable object in both (as in the examples above). It may additionally make sure that if a component uses a variable name that exists in the MDM, that name points to the Variable object defined in the MDM.

  6. Still, components should probably be allowed to use another component's (or the master data model's) variables under a locally different name, as in:

import pycopancore.components.fossil_extraction.interface as FE
class Cell(object):
    geological_carbon_stock = FE.Cell.fossil_reserve
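
The sharing rules above could be checked roughly as follows. This is only a minimal, self-contained sketch and not pycopancore code: the Variable class is a bare stand-in, the classes MDM, FossilExtraction, EnergySector and Rogue are hypothetical stand-ins for interface modules, and check_consistency is an assumed helper showing the kind of test Model.configure() might perform.

```python
class Variable:
    """Bare stand-in for the proposed Variable class (metadata only)."""

    def __init__(self, name, desc="", **metadata):
        self.name = name
        self.desc = desc
        self.metadata = metadata


class MDM:
    """Hypothetical stand-in for the master data model module."""
    class Cell:
        fossil_reserve = Variable(name="cellular fossil reserve")


class FossilExtraction:
    """Hypothetical stand-in for a component interface module."""
    class Cell:
        fossil_reserve = MDM.Cell.fossil_reserve


class EnergySector:
    """Uses the same Variable object under a locally different name."""
    class Cell:
        geological_carbon_stock = MDM.Cell.fossil_reserve


class Rogue:
    """Reuses an existing name for a different Variable object (a clash)."""
    class Cell:
        fossil_reserve = Variable(name="something else")


def variables_of(entity_class):
    """Return {attribute name: Variable} for one entity interface class."""
    return {attr: val for attr, val in vars(entity_class).items()
            if isinstance(val, Variable)}


def check_consistency(components, master=None, entity="Cell"):
    """Raise ValueError if one name is bound to different Variable objects.

    The master data model (if given) is scanned first, so a component that
    reuses one of its names must point to the master's Variable object.
    """
    seen = {}  # attribute name -> Variable object
    sources = ([master] if master is not None else []) + list(components)
    for source in sources:
        for attr, var in variables_of(getattr(source, entity)).items():
            if attr in seen and seen[attr] is not var:
                raise ValueError(
                    f"{source.__name__}.{entity}.{attr} clashes with an "
                    "existing definition of the same name")
            seen[attr] = var
    return seen


# Passes: both components refer to the very same Variable object.
check_consistency([FossilExtraction, EnergySector], master=MDM)

# Fails: Rogue binds the name fossil_reserve to a different object.
try:
    check_consistency([FossilExtraction, Rogue], master=MDM)
except ValueError as err:
    print("rejected:", err)
```

Comparing by object identity (is) rather than by metadata equality is what makes local renaming harmless here: a different attribute name bound to the same object never triggers the check.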

mensch72 commented Nov 15, 2016

The second part of my proposal is a metadata model for variables in which the following metadata are stored as attributes in each instance of Variable:

  • name: human readable short name of the variable, to be used in labels etc.
  • desc: longer text
  • symbol: mathematical symbol or abbreviation to be used as a short label
  • scale: "ratio" (default), "interval", "ordinal", or "nominal" (see https://en.wikipedia.org/wiki/Level_of_measurement)
  • reference: some URI, e.g. a Wikipedia page
  • default: a default value or None
  • uninformed_prior: the prior distribution to be used in case the value is unknown, specified by a Python function that draws one pseudo-random sample per call, e.g. lambda: norm.rvs(loc=10, scale=5)

Additional metadata for ratio- and interval-scaled variables:

  • dimension: the physical dimension, a symbolic expression made from instances of the class Dimension, defined in a module pycopancore.dimensions.py, e.g. Carbon / Time
  • unit: the measurement unit, defaults to the dimension's default unit, a symbolic expression made from instances of the class Unit, defined in a module pycopancore.units.py, e.g. GtC / a
  • lower_bound: a lower bound (in the unit specified above), or None (default)
  • upper_bound: similar, or None (default)
  • quantum: None (default) if the variable can take any value, or a number q > 0 if it can only take integer multiples of q
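
To illustrate what "symbolic expression" could mean for dimension and unit, here is a rough sketch. It is an assumption about one possible design, not the actual pycopancore.dimensions or pycopancore.units modules: the exponent-dictionary representation, the factor attribute and all method names are invented for illustration.

```python
class Dimension:
    """A physical dimension stored as {base dimension name: exponent}."""

    def __init__(self, name=None, exponents=None):
        if exponents is not None:
            self.exponents = dict(exponents)
        else:
            self.exponents = {name: 1}

    def _combine(self, other, sign):
        # Add (multiplication) or subtract (division) the exponents.
        exps = dict(self.exponents)
        for base, exp in other.exponents.items():
            exps[base] = exps.get(base, 0) + sign * exp
            if exps[base] == 0:
                del exps[base]  # drop cancelled base dimensions
        return Dimension(exponents=exps)

    def __mul__(self, other):
        return self._combine(other, +1)

    def __truediv__(self, other):
        return self._combine(other, -1)

    def __eq__(self, other):
        return isinstance(other, Dimension) and self.exponents == other.exponents

    def __hash__(self):
        return hash(frozenset(self.exponents.items()))

    def __repr__(self):
        return f"Dimension({self.exponents})"


class Unit:
    """A unit with a symbol, a dimension, and a factor that converts it
    to the (assumed) default unit of its dimension."""

    def __init__(self, symbol, dimension, factor=1.0):
        self.symbol = symbol
        self.dimension = dimension
        self.factor = factor

    def __mul__(self, other):
        return Unit(f"{self.symbol}*{other.symbol}",
                    self.dimension * other.dimension,
                    self.factor * other.factor)

    def __truediv__(self, other):
        return Unit(f"{self.symbol}/{other.symbol}",
                    self.dimension / other.dimension,
                    self.factor / other.factor)


Carbon = Dimension("carbon")
Time = Dimension("time")
GtC = Unit("GtC", Carbon)  # gigatonnes of carbon
a = Unit("a", Time)        # years

flow_unit = GtC / a
print(flow_unit.symbol, flow_unit.dimension)
```

Because every Unit carries its Dimension, a Variable given only unit = GtC / a could derive dimension = Carbon / Time on its own, so the dimension would not have to be stated twice.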

Additional metadata for ordinal- and nominal-scaled variables:

  • levels: list of possible values (numerical and/or strings), or None if unrestricted (in which case the python operator < determines the order)

Example:

fossil_extraction = Variable(
    name="cellular fossil fuel extraction flow",
    desc="the amount of fossil fuel extracted from the geological carbon stock of the cell per time",
    symbol="F",  # to be compatible with the copan:GLOBAL model
    scale="ratio",  # unnecessary since this is the default
    reference="https://en.wikipedia.org/wiki/Fossil_fuel",
    default=0,
    uninformed_prior=None,  # unnecessary since this is the default
    dimension=Carbon / Time,  # unnecessary since this follows from unit
    unit=GtC / a,
    lower_bound=0,
    upper_bound=None,  # unnecessary since this is the default
    quantum=None,  # unnecessary since this is the default
)
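
One thing such metadata would make possible is generic validation and initialisation of values. The sketch below is hypothetical: the methods is_valid and initial_value are not part of the proposal, the rule "use default first, then fall back to the prior" is an assumption, and random.gauss from the standard library is swapped in for norm.rvs to keep the example dependency-free.

```python
import math
import random


class Variable:
    """Stand-in carrying only the metadata needed for this sketch."""

    def __init__(self, name, default=None, uninformed_prior=None,
                 lower_bound=None, upper_bound=None, quantum=None,
                 levels=None):
        self.name = name
        self.default = default
        self.uninformed_prior = uninformed_prior
        self.lower_bound = lower_bound
        self.upper_bound = upper_bound
        self.quantum = quantum
        self.levels = levels

    def is_valid(self, value):
        """Check a value against levels, bounds and quantum (if set)."""
        if self.levels is not None:
            return value in self.levels
        if self.lower_bound is not None and value < self.lower_bound:
            return False
        if self.upper_bound is not None and value > self.upper_bound:
            return False
        if self.quantum is not None:
            # value must be (numerically close to) an integer multiple of q
            multiples = value / self.quantum
            return math.isclose(multiples, round(multiples), abs_tol=1e-9)
        return True

    def initial_value(self):
        """Assumed rule: default if given, else one draw from the prior."""
        if self.default is not None:
            return self.default
        if self.uninformed_prior is not None:
            return self.uninformed_prior()
        return None


extraction = Variable("cellular fossil fuel extraction flow",
                      default=0, lower_bound=0)
households = Variable("number of households", lower_bound=0, quantum=1)
land_use = Variable("land use type", levels=["forest", "cropland"])
unknown = Variable("some unknown quantity",
                   uninformed_prior=lambda: random.gauss(10, 5))

print(extraction.is_valid(-1), households.is_valid(2.5),
      land_use.is_valid("forest"), unknown.initial_value())
```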
