This repository has been archived by the owner on Dec 1, 2023. It is now read-only.

Commit

Merge pull request #53 from Quansight-Labs/intro-notebooks
Add example documentation
saulshanabrook committed May 9, 2019
2 parents e06f66e + 780ca72 commit 72f9c11
Showing 31 changed files with 4,015 additions and 475 deletions.
4 changes: 1 addition & 3 deletions .coveragerc
@@ -6,6 +6,4 @@ source =
omit =
*_test.py
**/test_*.py
nonumpy/*
metadsl/python/**
metadsl/nonumpy/**
metadsl/*/**
14 changes: 1 addition & 13 deletions README.md
@@ -4,12 +4,6 @@

A framework for creating domain specific language libraries in Python.

The initial use case is in scientific computing, where:

1. You want to use the APIs you know and love (ex. NumPy).
2. But you want it to execute in a new way (ex. on a GPU or distributed across machines).
3. And you want to optimize a chain of operations before executing (ex. `(x * y)[0]` -> `x[0] * y[0]` / [Mathematics of Arrays](https://paperpile.com/app/p/5de098dd-606d-0124-a25d-db5309f99394)).


## Guiding Principles

@@ -22,8 +16,6 @@ effort.

This means we have to explicitly expose the protocols of the different levels to foster distributed collaboration and reuse.

## [Roadmap](./ROADMAP.md)

## Development

Either use repo2docker:
@@ -55,9 +47,5 @@ open htmlcov/index.html
### Docs

```bash
cd docs/
# build
make html
# serve
python -m http.server -d _build/html/
sphinx-autobuild docs docs/_build/html/
```
1 change: 1 addition & 0 deletions conftest.py
@@ -0,0 +1 @@
collect_ignore = ["docs/conf.py"]
343 changes: 343 additions & 0 deletions docs/Concepts.ipynb
@@ -0,0 +1,343 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Concepts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`metadsl` inserts a layer between calling a function and computing its result, so that we can build up a bunch of calls, transform them, and then execute them all at once."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instances and Calls\n",
"\n",
"The two building blocks we start with are, `Call`s and `Instance`s:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import dataclasses\n",
"import metadsl\n",
"\n",
"@dataclasses.dataclass\n",
"class MyObject(metadsl.Instance):\n",
" @metadsl.call(lambda self: MyObject)\n",
" def do_things(self) -> \"MyObject\":\n",
" ...\n",
"\n",
" @metadsl.call(lambda self, other: MyObject)\n",
" def __add__(self, other: \"MyObject\") -> \"MyObject\":\n",
" ...\n",
"\n",
" \n",
"@metadsl.call(lambda x: MyObject)\n",
"def create_object(x: int) -> MyObject:\n",
" ..."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MyObject(_call=Call(create_object, (123,)))"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"o = create_object(123)\n",
"o"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MyObject(_call=Call(__add__, (MyObject(_call=Call(do_things, (MyObject(_call=Call(create_object, (123,))),))), MyObject(_call=Call(create_object, (123,))))))"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"o.do_things() + o"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is useful to keep in mind the strict typing constraints here, not all of which can be faithfully checked by MyPy:\n",
"\n",
"1. The arguments in a `Call` should fit the signature of the function in the call.\n",
"2. The call within the instance should have represent a function whose return type is the type of the instance.\n",
"3. The `type_fn` is the first argument passed into the `call` decorator. It should take in the same signature as the function itself, but return a function that maps from a `Call` to the return value.\n",
"\n",
"Let's check the first two of these for `o`. We see that the call's function is `create_object`, which takes in an `int` and returns a `MyObject`. It's argument is indeed an `int`, so the first is true. And the instances holding it is of type `MyObject`, which is its return type.\n",
"\n",
"For the third, we see that for both calls, it is simply creating a new `MyObject` with the call."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Expressions and Recursive Calls\n",
"\n",
"You might notice that this representation is a bit verbose. It actually contains unneccesary information, which is the type that each function returns. We don't need to store this,\n",
"because we can always recreate it given the type functions we have specfifed.\n",
"\n",
"We can translate these instances into `RecursiveCall`s, which have the same structure as `Call` object but different typing constraints:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RecursiveCall(__add__, (RecursiveCall(do_things, (RecursiveCall(create_object, (123,)),)), RecursiveCall(create_object, (123,))))"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"r = metadsl.to_expression(o.do_things() + o)\n",
"r"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"__add__(do_things(create_object(123)), create_object(123))\n"
]
}
],
"source": [
"print(r)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The relationships here are:\n",
"1. The result of `to_expression` is a `RecursiveCall` if the argument is a `Instance` and the original object otherwise. \n",
"2. The args in a `RecursiveCall` should either be the right type for the function, if they are not an `Instance`, or a `RecursiveCall` that will return that `Instance` subtype, if they are."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can always get back to original nested instance version of the call:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MyObject(_call=Call(__add__, (MyObject(_call=Call(do_things, (MyObject(_call=Call(create_object, (123,))),))), MyObject(_call=Call(create_object, (123,))))))"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metadsl.from_expression(r)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This will recursively transform expression recursively. This leads us to another relationship:\n",
"\n",
"1. `from_expression(to_expression(x)) == x` for all `x`, if it is in `Instance` or any other Python object."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why do we have these two forms? Well we need the `Instance` form, to provide a nice typed Python API. You should always be dealing in that form when you are calling the Python functions, so that the typing is enforced by MyPy. However, it's a bit more complicated to modify the graph in this form. So whenever we are traversing the graph in some way, we first convert it to expressions, so that we don't have to deal with the intermingled types."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Rules"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Often, it's helpful to think about replacements we do on the graph. By writing logic like this, we can then apply it to any nodes on the graph any number of times.\n",
"\n",
"We can combine a bunch of replacements and have them execute repeatedly on all nodes of the graph, until no more apply:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"rules = metadsl.RulesRepeatFold()\n",
"applier = metadsl.RuleApplier(rules)\n",
"\n",
"@rules.append\n",
"@metadsl.rule(None, None)\n",
"def _add(x: int, y: int):\n",
" return create_object(x) + create_object(y), lambda: create_object(x + y)\n",
"\n",
"@rules.append\n",
"@metadsl.rule(None)\n",
"def _do_things(x: int):\n",
" return create_object(x).do_things(), lambda: create_object(x * 2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The requirements for these replacements is that they take in some arguments, which can match any expression in the graph.\n",
"Their first return value, build up template expression based on the inputs, that shows what it should match again. The second\n",
"is a thunk that returns the resulting replacement. Note that both should have the same type, because all replacements should be equivalent.\n",
"\n",
"We can call these on an instance and it will return a replaced version of it:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"MyObject(_call=Call(__add__, (MyObject(_call=Call(create_object, (123,))), MyObject(_call=Call(create_object, (123,))))))\n",
"MyObject(_call=Call(create_object, (246,)))\n"
]
}
],
"source": [
"print(o+o)\n",
"print(applier(o + o))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like we have a function that creates a `MyObject` from an `int`, we can have a similar one that does the opposite:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"246"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@metadsl.call(lambda o: applier)\n",
"def unwrap_object(o: MyObject) -> int:\n",
" ...\n",
" \n",
"@rules.append\n",
"@metadsl.pure_rule(None)\n",
"def _unwrap_object(i: int):\n",
" return unwrap_object(create_object(i)), i\n",
"\n",
"unwrap_object(o + o)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is nice, because now we can write our unboxing as a replacement, which means it's nice and type safe.\n",
"\n",
"You notice that here we are returning `replacements` from the `type_fn`. This means that `replacements` will be called with the created `Call(unwrap_object, (o))` object, which will in turn call all the replacements on it, \n",
"including that which we defined below for the unwrapping. So this *should* return an `int`, so it does violate the typing constraints we first wrote out about `call`s."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
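
The notebook above relies on `metadsl` itself. As a rough, library-independent illustration of the same idea (record calls instead of executing them, then rewrite the recorded tree until no rule applies), here is a minimal sketch in plain Python. It does not use or reproduce the `metadsl` API: `Node`, `add`, `add_rule`, `do_things_rule`, `apply_once`, and `apply_until_fixed` are hypothetical names invented for this example, and the rules are hand-written matchers rather than templates built with `metadsl.rule`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """A recorded call: a function name plus its arguments (hypothetical stand-in)."""

    name: str
    args: Tuple[object, ...]

    def __str__(self) -> str:
        return f"{self.name}({', '.join(str(a) for a in self.args)})"


# Calling these functions records the call; nothing is computed yet.
def create_object(x: int) -> Node:
    return Node("create_object", (x,))


def do_things(o: Node) -> Node:
    return Node("do_things", (o,))


def add(a: Node, b: Node) -> Node:
    return Node("__add__", (a, b))


Rule = Callable[[Node], Optional[Node]]


def add_rule(node: Node) -> Optional[Node]:
    # __add__(create_object(x), create_object(y)) -> create_object(x + y)
    if node.name == "__add__":
        left, right = node.args
        if isinstance(left, Node) and isinstance(right, Node):
            if left.name == right.name == "create_object":
                return create_object(left.args[0] + right.args[0])
    return None


def do_things_rule(node: Node) -> Optional[Node]:
    # do_things(create_object(x)) -> create_object(x * 2)
    if node.name == "do_things":
        (inner,) = node.args
        if isinstance(inner, Node) and inner.name == "create_object":
            return create_object(inner.args[0] * 2)
    return None


def apply_once(expr: object, rules: List[Rule]) -> object:
    """Rewrite bottom-up: children first, then try each rule on the node itself."""
    if not isinstance(expr, Node):
        return expr
    node = Node(expr.name, tuple(apply_once(a, rules) for a in expr.args))
    for rule in rules:
        replaced = rule(node)
        if replaced is not None:
            return replaced
    return node


def apply_until_fixed(expr: object, rules: List[Rule]) -> object:
    """Repeat whole-tree passes until a pass changes nothing."""
    while True:
        new = apply_once(expr, rules)
        if new == expr:
            return new
        expr = new


o = create_object(123)
expr = add(do_things(o), o)
result = apply_until_fixed(expr, [add_rule, do_things_rule])
print(expr)    # __add__(do_things(create_object(123)), create_object(123))
print(result)  # create_object(369)
```

Because `Node` is a frozen dataclass, structural equality comes for free, which is what lets `apply_until_fixed` detect that a pass left the tree unchanged. This sketch drops the typed `Instance` layer entirely, so it shows only the expression side of the design described in the notebook.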
