Ideas
The main idea of meshed is to provide tools to assist in the systematic assembly and operation of Python objects.
By "assembly" we mean the definition of how the different Python objects should interact. Typically, we define this assembly by writing lines of code: feeding some values into one function, getting its output, combining it with another value, assigning that result to a variable, etc.
By "systematic" we mean "through a system", so "systematic assembly" means that the structure of this assembly is specified through a separate logic.
By "operation" we mean the ability to operate on, and with, the objects created. For example, being able to add logging, or caching, or extra branches of computation to an already existing data flow.
The last important term to clarify: Python objects. Yes, this means that the scope is potentially everything Python. Most of the time, though, these objects will be callables.
There are two main aspects in meshed tools:
- defining the relationship of a set of objects
- defining how to use these relationships to execute computations
If at this point you need a concrete image in mind, think of the computational DAG (directed acyclic graph), pipelines, or even a simple composition of functions. But know, as you hold these instances in mind, that there could be many more such "assembly systems". Even so, the input-output relationship that a DAG represents could be executed in many different ways:
- as a single function computing leaf values from root values
- as a template from which to extract many different input-output functions, for example to make a class that fixes some attributes, then offers methods that depend on these
- as the expression of linked user story actions
- as the structure of a reactive system (e.g. reactive programming, @property, @cached_property, ...)
- as a guideline on how to dynamically find a computational path from available resources to a desired resource
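As a small illustration of two of these readings (a class that fixes some attributes, and a reactive structure), here is a minimal sketch using only the standard library. The class name `Computation` and the variables `a`, `b`, `y`, `x` and `mult` are made up for this example; this is not how meshed does it, just one way the same input-output relationship could be expressed.

```python
from functools import cached_property


class Computation:
    """Hypothetical class: fixes the root values, exposes derived values lazily."""

    def __init__(self, a, b=1, y=3):
        # Root values of the relationship are fixed as attributes
        self.a, self.b, self.y = a, b, y

    @cached_property
    def x(self):
        # Computed on first access, then cached on the instance
        return self.a + self.b

    @cached_property
    def mult(self):
        # Depends on x, which is itself computed on demand
        return self.x * self.y


c = Computation(a=2)
print(c.mult)  # x = 2 + 1 = 3, then mult = 3 * 3 = 9
```

Accessing `c.mult` pulls in `x` only when needed, so the dependency structure drives the order of computation rather than the order in which lines were written.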
Though the term pipeline has recently snowballed into a more general meaning, we take it as "function composition": a sequence of functions where the output of one is connected to the input of the next.
def this(a, b=1):
    return a + b

def that(x, b=2):
    return x * b

def greet(y):
    return f"Hello {y}!"

# Pipe is assumed to compose the functions left to right
pipe = Pipe([this, that, greet])
Here, pipe will take a number, add 1 to it, then multiply by 2, and finally return a string that says hello to the resulting number.
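To see how little is needed for this kind of composition, here is a minimal sketch. The `compose` helper is a hypothetical stand-in written for this example, not the actual Pipe implementation; it assumes that the first function may take any arguments and that every later function takes exactly one.

```python
from functools import reduce


def compose(*funcs):
    """Hypothetical helper: chain funcs so each output feeds the next input."""
    if not funcs:
        raise ValueError("compose needs at least one function")
    first, *rest = funcs

    def composed(*args, **kwargs):
        # Only the first function sees the caller's arguments
        return reduce(lambda value, func: func(value), rest, first(*args, **kwargs))

    return composed


def this(a, b=1):
    return a + b


def that(x, b=2):
    return x * b


def greet(y):
    return f"Hello {y}!"


pipe = compose(this, that, greet)
print(pipe(3))  # (3 + 1) * 2 = 8, so this prints: Hello 8!
```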
A DAG (Directed Acyclic Graph) subsumes the pipeline in that functions can be input-output connected in any way that doesn't create "directed cycles".
This includes pipelines like
    a
    │
    ▼
┌───────┐
│ this  │
└───────┘
    │
    ▼
    x
    │
    ▼
┌───────┐
│ that  │
└───────┘
    │
    ▼
    y
    │
    ▼
┌───────┐
│ greet │
└───────┘
but could also have structures such as
              a        ─┐
                        │
              │         │
              │         │
              ▼         │
       ┌─────────────┐  │
b= ──▶ │    _add     │  │
       └─────────────┘  │
              │         │
              │         │
              ▼         │
                        │
              x         │
                        │
              │         │
              │         │
              ▼         │
       ┌─────────────┐  │
y= ──▶ │ the_product │  │
       └─────────────┘  │
              │         │
              │         │
              ▼         │
                        │
            mult        │
                        │
              │         │
              │         │
              ▼         │
       ┌─────────────┐  │
       │    _exp     │ ◀┘
       └─────────────┘
              │
              │
              ▼
             exp
In the case of pipelines, specifying structure is easy: You just need to specify the sequence of functions, and if these functions were made so that all (but possibly the first function) of them have only one required argument, no further information is needed.
But in the case of DAGs, for each function that is not an "entry" function, you need to specify where it should get all its required arguments from.
One can specify each connection between function inputs and outputs explicitly, or specify rules that can infer these connections from properties of the functions themselves, such as type annotations, function names and/or argument names.
See in the example below (which creates the DAG shown above) how the output of add is connected to the first argument of mult via its argument name, and how the output of mult is connected to the first argument of exp via the function name.
from meshed.dag import DAG, FuncNode

def add(a, b=1):
    return a + b

def mult(x, y=3):
    return x * y

def exp(mult, a):
    return mult ** a

func_nodes = [
    FuncNode(add, out='x'),  # output named 'x' matches mult's argument x
    FuncNode(mult, name='the_product'),  # exp's argument mult matches the function name
    FuncNode(exp),
]
dag = DAG(func_nodes)
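To make the name-based wiring concrete without relying on meshed itself, here is a minimal pure-Python sketch of the same idea. The `run_by_name` helper and the `(function, output name)` pairs are hypothetical constructs for this illustration, not meshed's API: each function's arguments are looked up by name among the given inputs and the outputs already computed, and the standard library's `graphlib` supplies an execution order.

```python
import inspect
from graphlib import TopologicalSorter  # Python 3.9+


def add(a, b=1):
    return a + b


def mult(x, y=3):
    return x * y


def exp(mult, a):
    return mult ** a


# Hypothetical stand-in for function nodes: (function, name of its output)
nodes = [(add, "x"), (mult, "mult"), (exp, "exp")]


def run_by_name(nodes, **inputs):
    """Wire functions together by matching argument names to output names."""
    producers = {out: func for func, out in nodes}
    # For each output, the other outputs it needs (its predecessors)
    deps = {
        out: {p for p in inspect.signature(func).parameters if p in producers}
        for func, out in nodes
    }
    scope = dict(inputs)
    # static_order yields predecessors first and raises CycleError on cycles
    for out in TopologicalSorter(deps).static_order():
        func = producers[out]
        params = inspect.signature(func).parameters
        # Arguments missing from scope fall back to the function's defaults
        kwargs = {p: scope[p] for p in params if p in scope}
        scope[out] = func(**kwargs)
    return scope


result = run_by_name(nodes, a=2)
print(result)  # {'a': 2, 'x': 3, 'mult': 9, 'exp': 81}
```

Explicitly naming an output (as with `"x"` for `add`) plays the role that `out='x'` plays in the example above; everything else is inferred from argument names.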
Note: after writing this, I realized I already had another wiki page devoted to this, the Computational Path Resolution wiki (https://github.com/i2mint/meshed/wiki/Computational-path-resolution), now moved to the Computational Path Resolution discussion.
This digraph doesn't represent a DAG computation. In fact, it's not a DAG since it has a directed cycle!
Instead, each node here represents a variable kind, and an edge represents a function that can transform/compute one kind from another.
The general object illustrated here is one that relates sets of variable kinds to other ones. (Note that functions often require more than one variable, but we don't represent this case here, for simplicity.)
The intended use of such an object is to allow a user to get one set of variables from another set of variables without having to manually specify how to do so. For example, a user can specify that they need to have the wfs corresponding to a specific folder_path, and the object will be able to find a computational path to do so. There may be several ways to do so (as is the case here), in which case some rule will decide which path to offer. This rule could, for example, assume that the path with the fewest edges is the most efficient and use that.
Note as well that this object can have cycles: Here, we may want to save some wfs as files of a folder_path folder. Our current object enables us to do that too.
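The "fewest edges" rule can be sketched with a breadth-first search over the graph of variable kinds. Everything in the sketch below is an assumption made for illustration: the `shortest_path` helper, the intermediate kind `filepaths`, and the particular edges (only `folder_path` and `wfs` come from the text above). Note that the graph contains a cycle, which the search handles by remembering visited kinds.

```python
from collections import deque

# Hypothetical graph: each kind maps to the kinds computable from it.
# folder_path -> wfs exists both directly and via filepaths;
# wfs -> folder_path (saving waveforms to a folder) creates a cycle.
graph = {
    "folder_path": ["filepaths", "wfs"],
    "filepaths": ["wfs"],
    "wfs": ["folder_path"],
}


def shortest_path(graph, source, target):
    """Return a path with the fewest edges from source to target, or None."""
    queue = deque([[source]])
    seen = {source}  # visited kinds; this is what makes cycles harmless
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path(graph, "folder_path", "wfs"))  # ['folder_path', 'wfs']
print(shortest_path(graph, "filepaths", "folder_path"))  # ['filepaths', 'wfs', 'folder_path']
```

In a fuller version each edge would carry the function that performs the transformation, so that the returned path could be turned directly into a pipeline.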
One typical application of such a path finder object arises when we want to overload a function to "Postelize" it, enabling it to handle several types of inputs. For example, our function could have a wfs input: waveforms, which the function expects to be given as a list of lists.
- If we wanted to enable the user to specify a folder containing .wav files that should be used to create these waveforms, we could include some if... then... transform code in the function to handle this.
- Better would be to bring this code outside the main function, and use it inside that main function to do our bidding.
- Even more elegant, and reusable, would be to write a decorator that can be applied to any function using wfs (recognized by name and/or type, say).
- Our path finder object brings it to yet another level, allowing us to encapsulate not only this logic from folder_path to wfs, but from many different intertwined variable kinds.
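The decorator step in the list above could look something like the following sketch. The names `postelize`, `fake_load_folder` and `total_samples` are hypothetical, and the loader is a stub returning fixed data rather than actually reading .wav files, so that the example stays self-contained.

```python
import functools
import inspect


def postelize(arg_name, converters):
    """Hypothetical decorator: convert the argument named arg_name when needed.

    converters is a list of (accepts, convert) pairs; the first pair whose
    accepts(value) is true is used to transform the value.
    """

    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            if arg_name in bound.arguments:
                value = bound.arguments[arg_name]
                for accepts, convert in converters:
                    if accepts(value):
                        bound.arguments[arg_name] = convert(value)
                        break
            return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


def fake_load_folder(folder_path):
    # Stub standing in for "read every .wav file in folder_path"
    return [[0.0, 0.1], [0.2, 0.3, 0.4]]


@postelize("wfs", [(lambda v: isinstance(v, str), fake_load_folder)])
def total_samples(wfs):
    # The main function only ever deals with a list of lists
    return sum(len(wf) for wf in wfs)


print(total_samples([[1, 2], [3]]))  # 3
print(total_samples("some/folder"))  # 5, via the stub loader
```

A path finder object would generalize the `converters` list: instead of listing transformations per function, the decorator would ask the object for a path from whatever kind it was given to the kind the function expects.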
Express complex fallback structures for finding or computing things. (Not saying that's a good idea though.)
Self-organizing code is a scary idea, so let's just mention the utility of the core functionality of such an approach: Validation of an existing organization.
Enabling the specification of how the different components (say functions) can be connected.
Letting an automatic (or semi-automatic) system assemble (or propose assembly structures).
Connect by name, connect by type, connect by expected behavior, etc.