
# [Meillionen](https://github.com/openmodelingfoundation/meillionen): Model Interfaces and Coupling

## [Open Modeling Foundation](https://openmodelingfoundation.github.io)

Calvin Pritchard (calvin.pritchard@asu.edu)

An Interface Definition Library for Existing Models

- supports remote access to models as command line programs
- helps validate, serialize, and deserialize model interfaces and messages
- intended to make wrapping existing models easier by inspecting their metadata

Concepts

- validators
- resources
- functions
- classes
- modules
- function requests
- method requests

Validators

```
table Validator {
    name: string;
    type_name: string;
    payload: [ubyte];
}
```

The validator schema is a payload with a name and a type_name. The type_name is a key that identifies the format of the payload. The payload is a binary buffer so that other libraries can add their own schema formats. The name is the name of the sink or source associated with the validator.
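The role of the three fields can be sketched in plain Python. This is an illustration only: `ValidatorRecord`, the `DECODERS` registry, the `'example.json_columns'` type name, and the JSON payload encoding are assumptions made for the sketch, not Meillionen's classes or wire format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatorRecord:
    """Hypothetical stand-in for the Validator table above."""
    name: str        # sink or source the validator is attached to
    type_name: str   # key identifying the payload format
    payload: bytes   # opaque bytes; only code that knows type_name can decode them


# Hypothetical registry: one decoder per payload format.
DECODERS = {
    'example.json_columns': lambda raw: json.loads(raw.decode('utf-8')),
}


def decode_payload(record):
    """Look up the decoder by type_name and apply it to the payload."""
    try:
        decoder = DECODERS[record.type_name]
    except KeyError:
        raise ValueError(f'unknown validator type: {record.type_name}') from None
    return decoder(record.payload)


soil = ValidatorRecord(
    name='soil',
    type_name='example.json_columns',
    payload=json.dumps({'fields': ['day_of_year']}).encode('utf-8'),
)
decoded = decode_payload(soil)
```

Because the payload stays opaque, a new schema format only needs a new `type_name` and a matching decoder.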

An example validator that loads and saves data using Pandas. Its schema follows the [Apache Arrow schema](https://arrow.apache.org/docs/python/data.html#schemas) format.

```python
PandasHandler.from_kwargs(
    description='Daily soil characteristics',
    columns={
        'fields': [
            {
                'name': name,
                'data_type': 'Float32'
            } for name in
            [
                'day_of_year',
                'soil_daily_runoff',
                'soil_daily_infiltration',
                'soil_daily_drainage',
                'soil_evapotranspiration',
                'soil_evaporation',
                'soil_water_storage_depth',
                'soil_water_profile_ratio',
                'soil_water_deficit_stress',
                'soil_water_excess_stress'
            ]]})
```
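What a column check of this kind involves can be sketched without Pandas or Arrow. The `check_columns` helper and the dict-of-lists table are hypothetical, and `'Float32'` is approximated here by Python `float`; this is not how `PandasHandler` is implemented.

```python
def check_columns(schema, table):
    """Return a list of problems; an empty list means the table matches."""
    problems = []
    expected = [field['name'] for field in schema['fields']]
    for name in expected:
        if name not in table:
            problems.append(f'missing column: {name}')
        elif not all(isinstance(value, float) for value in table[name]):
            problems.append(f'non-float values in column: {name}')
    for name in table:
        if name not in expected:
            problems.append(f'unexpected column: {name}')
    return problems


schema = {
    'fields': [
        {'name': 'day_of_year', 'data_type': 'Float32'},
        {'name': 'soil_daily_runoff', 'data_type': 'Float32'},
    ]
}
good = {'day_of_year': [1.0, 2.0], 'soil_daily_runoff': [0.0, 0.3]}
bad = {'day_of_year': [1.0, 2.0], 'rain': [0.1, 0.2]}

good_problems = check_columns(schema, good)
bad_problems = check_columns(schema, bad)
```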

Resources

Metadata used to load or save data. A resource can contain the actual data or just a reference to it.

```
table Resource {
    name: string;
    type_name: string;
    payload: [ubyte];
}
```

A reference to an ESRI ASCII file. The base path and name are usually filled in later, determined from the settings, the context, and the name associated with the resource.

```python
import os

BASE_DIR = '../../examples/crop-pipeline'
INPUT_DIR = os.path.join(BASE_DIR, 'workflows/inputs')
OUTPUT_DIR = os.path.join(BASE_DIR, 'workflows/outputs')

# global settings for an experiment reside in experiment settings
experiment = Experiment(
    sinks=PathSettings(base_path=OUTPUT_DIR),
    sources=PathSettings(base_path=INPUT_DIR)
)
# run specific settings reside in a trial
trial = experiment.trial("2021-05-22")

partial_elevation = FileResource(ext='.asc')

# This is typically handled for you by the framework.
# It is normally done inside the function request and method request classes
elevation = partial_elevation.build(settings=trial.sources, name='elevation')
elevation.serialize()

# b'{ "path": "../../examples/crop-pipeline/workflows/inputs/elevation.asc" }'
```
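The idea of a partial resource that is completed later can be sketched as follows. `PartialFile` and `ResolvedFile` are hypothetical names, and building the path by joining base path, name, and extension is an assumption for the sketch rather than the library's implementation.

```python
import json
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedFile:
    """Hypothetical fully specified file reference."""
    path: str

    def serialize(self):
        # Encode the reference, not the data it points to.
        return json.dumps({'path': self.path}).encode('utf-8')


@dataclass(frozen=True)
class PartialFile:
    """Hypothetical partial resource: only the extension is known up front."""
    ext: str

    def build(self, base_path, name):
        # The path is decided only once the settings and the name are available.
        return ResolvedFile(path=os.path.join(base_path, name + self.ext))


resolved = PartialFile(ext='.asc').build(base_path='inputs', name='elevation')
message = resolved.serialize()
```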

Function Interface

The function interface defines the name of the function and the arguments (sources and sinks) it expects.

```
table FunctionInterface {
    name: string;
    sinks: [Validator];
    sources: [Validator];
}
```

A function interface definition can be created in Python by specifying how to handle and validate the sources and sinks used by the model.

```python
simplecrop_func_interface = FuncInterfaceServer(
    name = 'run',
    sources = {
        'daily': PandasHandler.from_kwargs(...),
        'yearly': PandasHandler.from_kwargs(...)
    },
    sinks = {
        'plant': PandasHandler.from_kwargs(...),
        'soil': PandasHandler.from_kwargs(...)
    })
```

Class Interface

```
table ClassInterface {
    name: string;
    type_name: string;
    methods: [FunctionInterface];
}
```

A class interface definition can be created in Python by specifying the interface of each method.

```python
simplecrop_class_interface = ClassInterfaceServer(
    name = 'simplecrop',
    methods=[
        FuncInterfaceServer(
            name = 'default_year_config',
            sources = {},
            sinks = {
                'yearly': PandasHandler.from_kwargs(...)
            }),
        FuncInterfaceServer(
            name = 'run',
            sources = {
                'daily': PandasHandler.from_kwargs(...),
                'yearly': PandasHandler.from_kwargs(...)
            },
            sinks = {
                'plant': PandasHandler.from_kwargs(...),
                'soil': PandasHandler.from_kwargs(...)
            })
    ])
```

Module Interface

```
table ModuleInterface {
    functions: [FunctionInterface];
    classes: [ClassInterface];
}
```

A module interface definition can be created in Python by specifying its functions and classes.

```python
simplecrop_module = ModuleInterfaceServer(
    functions=[simplecrop_func_interface],
    classes=[simplecrop_class_interface]
)
```

It can be serialized to send to a client with:

```python
serialized = simplecrop_module.serialize()
```

Function Request

```
table FunctionRequest {
    name: string;
    sources: [Resource];
    sinks: [Resource];
}
```

A request to a function endpoint (a command line program):

```python
overlandflow = ClientFunctionModel.from_path(
    name='overlandflow', 
    path=os.path.join(BASE_DIR, 'overlandflow/model.py'),
    trial=trial
)

elevation = FileResource(".asc")
weather = FeatherResource()

sources = {
    'elevation': elevation,
    'weather': weather
}

sinks = overlandflow.run(sources=sources)
```
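Since a function request carries named sources and sinks, it can be compared against the names a function interface declares before the model runs. A minimal sketch, assuming plain dictionaries in place of the real request and interface objects; `check_request` and the `'depth'` sink name are hypothetical.

```python
def check_request(interface, request):
    """Compare request resource names with the names the interface declares."""
    problems = []
    for kind in ('sources', 'sinks'):
        expected = set(interface[kind])
        provided = set(request[kind])
        for name in sorted(expected - provided):
            problems.append(f'{kind}: missing {name}')
        for name in sorted(provided - expected):
            problems.append(f'{kind}: unexpected {name}')
    return problems


# Hypothetical interface for a function with two sources and one sink.
interface = {'name': 'run', 'sources': ['elevation', 'weather'], 'sinks': ['depth']}

complete = {
    'name': 'run',
    'sources': {'elevation': 'elevation.asc', 'weather': 'weather.feather'},
    'sinks': {'depth': 'depth.asc'},
}
incomplete = {
    'name': 'run',
    'sources': {'elevation': 'elevation.asc'},
    'sinks': {'depth': 'depth.asc'},
}

complete_problems = check_request(interface, complete)
incomplete_problems = check_request(interface, incomplete)
```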

Method Request

```
table MethodRequest {
    class_name: string;
    method_name: string;
    sources: [Resource];
    sinks: [Resource];
}
```

```python
simplecrop = ClientClassModel.from_path(
    name='simplecrop',
    class_name='simplecrop',
    path='simplecrop_omf',
    trial=trial)

sinks = simplecrop.default_year_config()
```
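On the receiving side, a method request has to be routed to the right class and method using `class_name` and `method_name`. The sketch below shows one way to do that with a registry; `SimpleCropStub`, `REGISTRY`, and `dispatch` are hypothetical and stand in for whatever the wrapped command line program actually does.

```python
class SimpleCropStub:
    """Hypothetical model class; a real one would write data to each sink."""

    def default_year_config(self, sources, sinks):
        # Report which sinks would be written instead of writing files.
        return {name: 'written' for name in sinks}


# Hypothetical registry mapping class names to model instances.
REGISTRY = {'simplecrop': SimpleCropStub()}


def dispatch(request):
    """Route a method request to the matching class and method."""
    try:
        target = REGISTRY[request['class_name']]
    except KeyError:
        raise LookupError(f"unknown class: {request['class_name']}") from None
    method = getattr(target, request['method_name'], None)
    if method is None:
        raise LookupError(f"unknown method: {request['method_name']}")
    return method(request['sources'], request['sinks'])


result = dispatch({
    'class_name': 'simplecrop',
    'method_name': 'default_year_config',
    'sources': {},
    'sinks': {'yearly': 'yearly.feather'},
})
```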

BMI Interface Supported

- initialize
- get_input_var_names
- get_input_var_type
- get_output_var_names
- get_output_var_type
- get_value
- set_value
- update
- finalize

BMI interface in action

```python
simplecrop_bmi = BMI(CLIRef(
    name='simplecrop', 
    class_name='simplecrop', 
    path='simplecrop_omf'))
simplecrop_bmi.initialize(trial=trial)

simplecrop_bmi.get_input_var_names()
simplecrop_bmi.get_output_var_names()

simplecrop_bmi.set_value('daily', FeatherResource())
simplecrop_bmi.set_value('yearly', FeatherResource())
simplecrop_bmi.update()
plant = simplecrop_bmi.get_value('plant')
soil = simplecrop_bmi.get_value('soil')
simplecrop_bmi.finalize() # no-op for this model
```
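The call order above (initialize, set inputs, update, read outputs, finalize) can be tried on a toy model. `ToyBmiModel` is hypothetical, its variable names and runoff rule are invented for illustration, and its method signatures are simplified in the same way as the usage above rather than following the full BMI specification.

```python
class ToyBmiModel:
    """Hypothetical model exposing a simplified BMI-style lifecycle."""

    def __init__(self):
        self._values = {}
        self._initialized = False

    def initialize(self):
        self._initialized = True

    def get_input_var_names(self):
        return ('rainfall',)

    def get_output_var_names(self):
        return ('runoff',)

    def set_value(self, name, value):
        if name not in self.get_input_var_names():
            raise KeyError(name)
        self._values[name] = value

    def update(self):
        if not self._initialized:
            raise RuntimeError('call initialize() first')
        # Toy rule: half of the rainfall becomes runoff.
        self._values['runoff'] = 0.5 * self._values['rainfall']

    def get_value(self, name):
        return self._values[name]

    def finalize(self):
        self._initialized = False


model = ToyBmiModel()
model.initialize()
model.set_value('rainfall', 10.0)
model.update()
runoff = model.get_value('runoff')
model.finalize()
```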

# Roadmap / Future Directions

- distribute packages on PyPI for macOS and Windows
- make resource handling easily available to other packages
- document how to build your own resource handlers
- support model communication via gRPC with [Arrow Flight](https://arrow.apache.org/docs/format/Flight.html) services
- improve integration with [Prefect](https://docs.prefect.io/) (workflow manager)
- contribute changes back to BMI / [PyMT](https://pymt.readthedocs.io/en/latest/)
- support additional languages such as Java, Julia and R

# Challenges: Coupled Remote Models

- How to call and set up remote coupled models (in the example, overlandflow is coupled with an infiltration model)
- The overlandflow example creates its coupled model interface manually
- It would be better to use existing model metadata to build an interface automatically (or at least partially automatically)
- Work is planned in the coming weeks on using metadata from existing BMI models to reduce the boilerplate needed to wrap a model
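Building an interface from existing metadata could look roughly like the sketch below, which reads the variable names a BMI-style model already reports. `StubModel` and `interface_from_metadata` are hypothetical, and the plain dictionary stands in for a real function interface; none of this exists in the library as shown.

```python
class StubModel:
    """Hypothetical model that reports its variables, BMI style."""

    def get_input_var_names(self):
        return ('daily', 'yearly')

    def get_output_var_names(self):
        return ('plant', 'soil')


def interface_from_metadata(name, model):
    """Derive source and sink names from what the model already declares."""
    return {
        'name': name,
        'sources': list(model.get_input_var_names()),
        'sinks': list(model.get_output_var_names()),
    }


generated = interface_from_metadata('run', StubModel())
```

Only the names are recovered here; schemas for each source and sink would still have to come from elsewhere, which is why the automation may end up being partial.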

# Challenges: Model Adapters

- Name mismatches in dataframe columns and tensor dimensions between models currently require adapters
- Resources for dataframes and tensors should support optional field selection and field aliases, so that adapter functions / classes are needed less often
- More involved data transformations can use DataFusion or other frameworks that provide an SQL interface, so that joins, aggregates, selects and filters can be applied to the data
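Field selection with aliases can be sketched on a dict-of-lists table. `select_and_rename` and the column names are hypothetical; the point is that a small mapping replaces a hand-written adapter for simple name mismatches.

```python
def select_and_rename(table, aliases):
    """Keep only the columns named in aliases and rename them."""
    missing = [source for source in aliases if source not in table]
    if missing:
        raise KeyError(f'columns not found: {missing}')
    return {target: table[source] for source, target in aliases.items()}


# Output of one model, with names the next model does not recognize.
table = {
    'rain_mm': [0.0, 1.2],
    'tmax': [30.1, 29.4],
    'station': ['a', 'a'],
}
aliases = {'rain_mm': 'rainfall', 'tmax': 'max_temperature'}

adapted = select_and_rename(table, aliases)
```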

# Challenges: Dynamic Interfaces

- You may want some model interfaces to change based on input from a previous function
  - Saving only particular variables in a simulation
  - Type of source constrains type of sink (keep projection of source and use it in sink type)
- This can be done with multiple methods and by explicitly providing schema information
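The case where the source constrains the sink can be sketched as a projection: the sink schema keeps only the source fields selected for saving. `sink_fields_for` and the field list are hypothetical, and schemas are plain lists of dictionaries here.

```python
def sink_fields_for(source_fields, saved_variables):
    """Project the source schema onto the variables chosen for saving."""
    known = {field['name'] for field in source_fields}
    unknown = [name for name in saved_variables if name not in known]
    if unknown:
        raise ValueError(f'not in source schema: {unknown}')
    # Preserve the order and data types of the source schema.
    return [field for field in source_fields if field['name'] in saved_variables]


source_fields = [
    {'name': 'day_of_year', 'data_type': 'Float32'},
    {'name': 'soil_daily_runoff', 'data_type': 'Float32'},
    {'name': 'soil_evaporation', 'data_type': 'Float32'},
]

sink_fields = sink_fields_for(source_fields, ['soil_daily_runoff', 'day_of_year'])
```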

# Resources

- [meillionen](https://github.com/openmodelingfoundation/meillionen)
- [PyMT](https://pymt.readthedocs.io/en/latest/)
- [OMF](https://openmodelingfoundation.github.io)