Initial Functions Module Thread. Do not merge #17
Changes from 20 commits
4a5e84d
d82e0a3
66772a9
8a08e35
1a375fd
1c20509
e517c04
85d1961
3c76b65
8baaf55
33ce364
5c62b29
d733a4d
fb60c66
9571f7b
b1abced
c1a1ad5
405307c
41bbf7f
d34c7e6
5a38347
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,3 +14,179 @@ | |
|
||
https://en.wikipedia.org/wiki/Bagua | ||
``` | ||

# Tutorial

## Quick Start

A standalone example xun project file for computing Fibonacci numbers:

```python
import xun


context = xun.context(
    driver=xun.functions.driver.Sequential(),
    store=xun.functions.store.DiskCache('store'),
)


@context.function()
def fibonacci_number(n):
    return f_n_1 + f_n_2
    with ...:
        f_n_1 = (
            0 if n == 0 else
            1 if n == 1 else
            fibonacci_number(n - 1)
        )
        f_n_2 = fibonacci_number(n - 2) if n > 1 else 0


@context.function()
def fibonacci_sequence(n):
    return sequence
    with ...:
        sequence = [fibonacci_number(i) for i in range(n)]


def main():
    """
    Compute and print the first 10 fibonacci numbers
    """
    program = context.fibonacci_sequence.compile(10)
    sequence = program()
    for num in sequence:
        print(num)


if __name__ == '__main__':
    main()
```

Note that the `main` function defined here is optional. This project can either be run as is, or run using xun. To run the program using xun, run the following:

Review comment: This is one of the very first lines users will read, so maybe explicitly mark the difference between running the project as is and running it with xun, so that it is clear what xun is. Also, just calling xun without `main` won't print anything. As a user, where can I find the results of my computations?

```bash
xun exec examples/fibonacci.py "fibonacci_sequence(10)"
```

To see a visualization of the call graph:

```bash
xun graph examples/fibonacci.py "fibonacci_sequence(10)"
```

## A closer look

Let's break down the code from `fibonacci_number` in the example above into 4 parts.

Review comment: in()to 4 parts

```python
@context.function()
```
The decorator `@context.function()` marks this function as a context function, or a job. Context functions are functions that are meant to be executed in parallel, possibly on remote workers.

```python
def fibonacci_number(n):
```
The function definition is just a normal Python function definition.

```python
return f_n_1 + f_n_2
```
The body of the function is regular Python. As expected, it has access to the function arguments, but it also has access to the variables defined in the special with constants statement.

```python
with ...:
    f_n_1 = (
        0 if n == 0 else
        1 if n == 1 else
        fibonacci_number(n - 1)
    )
    f_n_2 = fibonacci_number(n - 2) if n > 1 else 0
```
Statements of the form `with ...:` are referred to as with constants statements. They introduce new syntax and rules that we'll get into in the next section. For now, the important point is that the recursive calls to `fibonacci_number` are memoized in the context store and can, after scheduling, be run in parallel.

Review comment: Is this sentence really so important for the user? Does this happen only for recursive calls? What is "memoized"? Why do I care that they have to be "in context store"? And you already promised me parallel execution of the whole method earlier.

In fact, `xun` works by first figuring out all the calls that will happen to context functions, building a call graph, and scheduling the calls such that any previous call that we may depend on is executed before we evaluate the current call. This requires the call graph to be a DAG, or directed acyclic graph.
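
To make the scheduling idea concrete, here is a minimal sketch in plain Python. It is not xun's implementation: it builds the call graph for `fibonacci_number(n)` by hand and relies on the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to produce an order in which every call runs after the calls it depends on. The helper names `dependencies`, `build_call_graph`, and `run` are hypothetical, and a plain dict stands in for the context store.

```python
from graphlib import TopologicalSorter


def dependencies(n):
    """Calls that fibonacci_number(n) depends on (hypothetical helper)."""
    return [n - 1, n - 2] if n > 1 else []


def build_call_graph(n):
    """Map every reachable call to the list of calls it depends on."""
    graph = {}
    pending = [n]
    while pending:
        call = pending.pop()
        if call not in graph:
            graph[call] = dependencies(call)
            pending.extend(graph[call])
    return graph


def run(n):
    """Evaluate calls in dependency order, memoizing results in a dict."""
    store = {}
    # static_order() raises graphlib.CycleError if the graph is not a DAG
    for call in TopologicalSorter(build_call_graph(n)).static_order():
        store[call] = call if call < 2 else store[call - 1] + store[call - 2]
    return store[n]


print(run(10))
```

Each call is evaluated exactly once, because its result is looked up in `store` rather than recomputed. A cycle in the graph would make it impossible to find such an order, which is why the call graph has to be acyclic.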

## With Constants Statement

Review comment: How will "with constants" help me? How will I, as a user of xun, know what to put into with constants? Python already knows that all the results must be available when dependent computations are done! Or, on the contrary, why can't I put everything into "with"? You say that "@context.function" marks a job. But some jobs are called in the "with constants" statement, some are called outside it! To summarize:

```python
@context.function()
def do_some_work(some_values):
    result = expensive_computation(data)
    return result
    with ...:
        data = dependency(fixed_values)
        fixed_values = [fix(v) for v in some_values]
```

Review comment (on `data = depencency(fixed_values)`): depen(d)ency

In the above example, a job takes an iterable, `some_values`, as its argument, polishes the values in it, and calls another context function that it depends on. Note that the order of the statements inside the with constants statement does not matter. The syntax of with constants statements is similar to where clauses in Haskell and has rules that differ from standard Python. In general, the following rules apply to with constants statements:

* Order of statements is arbitrary

Review comment: Note (because one more comment, one less - who cares anymore?):

* Calling context functions is only allowed within with constants statements
* Only assignments and free expressions are allowed
* There can only be one with constants statement per context function

Review comment: ... one "with constants" statement()

* Any code in with constants statements will be executed during scheduling, so the heavy lifting should be done entirely in the function body, and not inside the with constants statement

With constants statements allow xun to figure out the order of calls needed to execute a xun program.
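
The rule that statement order is arbitrary can be illustrated with a small sketch. It is an assumption about how such reordering could work, not a description of xun's internals: the standard `ast` module finds which names each assignment reads, and `graphlib.TopologicalSorter` (Python 3.9+, as is `ast.unparse`) puts the statements in dependency order. The function name `sort_statements` is hypothetical, and only simple `name = expression` assignments are handled.

```python
import ast
from graphlib import TopologicalSorter


def sort_statements(source):
    """Return the assignment statements of `source` in dependency order."""
    statements = ast.parse(source).body
    # Name assigned by each statement (only `name = expr` is supported)
    defined = {stmt.targets[0].id: stmt for stmt in statements}
    graph = {}
    for name, stmt in defined.items():
        used = {
            node.id
            for node in ast.walk(stmt.value)
            if isinstance(node, ast.Name)
        }
        # Keep only names that another statement in the block defines
        graph[name] = used & set(defined)
    order = TopologicalSorter(graph).static_order()
    return [ast.unparse(defined[name]) for name in order]


block = """
data = dependency(fixed_values)
fixed_values = [fix(v) for v in some_values]
"""

for line in sort_statements(block):
    print(line)
```

Here `fixed_values` is emitted before `data`, even though it is written second, because `data` reads it.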
Review comment: "With constants" statements allow()

## Stores

Review comment: For Stores and Drivers, maybe refer to the lines from the original script (where you create them), so it's obvious what you are talking about.

```python
# Imports added for completeness; StoreMeta is the metaclass named in the
# text below
import diskcache

from xun.functions.store import StoreMeta


class DiskCache(metaclass=StoreMeta):
    def __init__(self, cache_dir):
        self.cache_dir = cache_dir

    def __contains__(self, key):
        with diskcache.Cache(self.cache_dir) as cache:
            return key in cache

    def __delitem__(self, key):
        with diskcache.Cache(self.cache_dir) as cache:
            del cache[key]

    def __getitem__(self, key):
        with diskcache.Cache(self.cache_dir) as cache:
            return cache[key]

    def __iter__(self):
        with diskcache.Cache(self.cache_dir) as cache:
            return iter(cache)

    def __len__(self):
        with diskcache.Cache(self.cache_dir) as cache:
            return len(cache)

    def __setitem__(self, key, value):
        with diskcache.Cache(self.cache_dir) as cache:
            cache[key] = value
```

As calls to context functions are executed and finished, the results are saved in the store of the context. Stores are classes that satisfy the requirements of `collections.abc.MutableMapping`, are pickleable, and whose state is shared between all instances. Stores can be defined by users by specifying a class with metaclass `xun.functions.store.StoreMeta`. The class above is in fact the entire implementation of the `xun.functions.store.DiskCache` store.
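
As a rough illustration of the "shared state" requirement, the sketch below defines a mapping whose instances all read and write the same dict. It deliberately does not use `StoreMeta`, whose behavior is not shown here, and the class name `SharedDictStore` is hypothetical. A class-level dict only shares state within a single process, so this would not be enough for remote workers.

```python
from collections.abc import MutableMapping


class SharedDictStore(MutableMapping):
    """Hypothetical store: every instance reads and writes the same dict."""

    _shared = {}  # class-level state, shared by all instances in a process

    def __getitem__(self, key):
        return self._shared[key]

    def __setitem__(self, key, value):
        self._shared[key] = value

    def __delitem__(self, key):
        del self._shared[key]

    def __iter__(self):
        return iter(self._shared)

    def __len__(self):
        return len(self._shared)


a = SharedDictStore()
b = SharedDictStore()
a['fibonacci_number(10)'] = 55
# The value written through `a` is visible through `b`
print(b['fibonacci_number(10)'])
```

Deriving from `MutableMapping` supplies `__contains__`, `get`, `update`, and the other mapping methods from the five methods defined here.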

## Drivers

Drivers are the classes responsible for executing programs. This includes scheduling the calls of the call graph and managing any concurrency.
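
What a concurrent driver has to do can be sketched with the standard library alone. The following is an assumption-laden illustration, not one of xun's drivers: `execute` and `compute` are hypothetical names, the call graph is a plain dict mapping each call to the calls it depends on, and threads stand in for workers. It uses `graphlib.TopologicalSorter` (Python 3.9+) to find calls that are ready and `concurrent.futures` to run them.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def execute(graph, compute, max_workers=4):
    """Run compute(call, results) for each call once its dependencies are done.

    `graph` maps each call to the calls it depends on.
    """
    results = {}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        running = {}
        while sorter.is_active():
            # Submit every call whose dependencies have all finished
            for call in sorter.get_ready():
                running[pool.submit(compute, call, results)] = call
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                call = running.pop(future)
                results[call] = future.result()
                sorter.done(call)
    return results


graph = {'total': ['a', 'b'], 'a': [], 'b': []}


def compute(call, results):
    """Toy job: 'a' and 'b' are constants, 'total' adds them."""
    if call == 'total':
        return results['a'] + results['b']
    return {'a': 1, 'b': 2}[call]


print(execute(graph, compute)['total'])
```

In this sketch `a` and `b` have no dependencies, so they may run at the same time, while `total` is only submitted after both have finished.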

## The `@xun.make_shared` decorator

```python
from math import radians
import numpy as np


def not_installed():
    pass


@xun.make_shared
def not_installed_but_shared():
    pass


@context.function()
def context_function():
    not_installed()  # Not OK
    not_installed_but_shared()  # OK
    radians(180)  # OK because the function is part of the standard library
    np.array([1, 2, 3])  # OK because the function is defined in an installed module
```

Because context functions are pickled, any function they reference must either be installed on the system or be represented differently. `xun` comes with a decorator, `@xun.make_shared`, that can make many functions serializable. You need to use it if you wish to call functions defined in your project file.
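
The reason installed functions work can be seen with the standard `pickle` module alone. The sketch below does not use xun and does not show how `@xun.make_shared` works. It only demonstrates that pickle stores a function as a reference to its module and name, so the receiving process must be able to import it. The helper `can_pickle` is a hypothetical name.

```python
import pickle
from math import radians

# A function from an installed module is pickled as a reference:
# the payload holds the module and function names, not the code.
payload = pickle.dumps(radians)
print(b'math' in payload, b'radians' in payload)

# Unpickling imports the module and looks the name up again, so the
# receiving process must be able to import the same module.
print(pickle.loads(payload) is radians)


def can_pickle(function):
    """Report whether the standard pickle module accepts `function`."""
    try:
        pickle.dumps(function)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


# A lambda has no importable name, so pickle has nothing to refer to.
print(can_pickle(lambda x: x + 1))
```

A function defined in a project file is in a similar position on a remote worker: the reference points at a module the worker cannot import, which is the gap a decorator like `@xun.make_shared` is meant to close.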
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@
#!/usr/bin/env python3
import xun


context = xun.context(
    driver=xun.functions.driver.Sequential(),
    store=xun.functions.store.DiskCache('store'),
)


@context.function()
def fibonacci_number(n):
    return f_n_1 + f_n_2
    with ...:
        f_n_1 = (
            0 if n == 0 else
            1 if n == 1 else
            fibonacci_number(n - 1)
        )
        f_n_2 = fibonacci_number(n - 2) if n > 1 else 0


@context.function()
def fibonacci_sequence(n):
    return sequence
    with ...:
        sequence = [fibonacci_number(i) for i in range(n)]


def main():
    """
    Compute and print the first 10 fibonacci numbers
    """
    program = context.fibonacci_sequence.compile(10)
    sequence = program()
    for num in sequence:
        print(num)


if __name__ == '__main__':
    main()
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@
#!/usr/bin/env python3
import xun


"""
WARNING: in this example almost all the computation happens within the with
constants statement and will be run during scheduling. It cannot be
parallelized, and is intended only to show the syntax of with constants
statements.
"""


context = xun.context(
    driver=xun.functions.driver.Sequential(),
    store=xun.functions.store.Memory(),
)


@context.function()
def quicksort(iterable):
    result = []

    result.extend(lt_sorted)

    if len(pivot) == 1:
        result.append(pivot[0])

    result.extend(gt_sorted)

    return tuple(result)
    with ...:
        # Tuples are used because arguments must be hashable and lists are not
        lt_sorted = quicksort(lt) if len(lt) > 0 else tuple()
        gt_sorted = quicksort(gt) if len(gt) > 0 else tuple()

        # Workaround because generators can't be pickled, make list before tuple
        lt = tuple([item for item in L[1:] if item <= pivot[0]])
        gt = tuple([item for item in L[1:] if item > pivot[0]])

        pivot = L[:1]
        L = list(iterable)


def main():
    """
    Sort a tuple of integers and print the input and the sorted output
    """
    values = (8, 4, 7, 5, 6, 0, 9, 2, 3, 1)

    print('input:', values)

    program = context.quicksort.compile(values)
    output = program()

    print('output:', output)


if __name__ == '__main__':
    main()
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@
import xun


context = xun.context(
    driver=xun.functions.driver.Sequential(),
    store=xun.functions.store.Memory(),
)


v = 3


def add(a, b):
    return a + b


@context.function()
def three():
    return v


@context.function()
def add3(a):
    return add(a, thr)
    with ...:
        thr = three()


@context.function()
def script(value):
    print("Result:", result)
    return result
    with ...:
        result = add3(value)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@
import xun


context = xun.context(
    driver=xun.functions.driver.Sequential(),
    store=xun.functions.store.Memory(),
)


v = 3


def add(a, b):
    return a + b


@context.function()
def three():
    return v


@context.function()
def add3(a):
    return add(a, three)
    with ...:
        three = three()


@context.function()
def script():
    return result
    with ...:
        result = add3(2)


program = context.script.compile()
print(program())

Review comment (on `three = three()`): Could you mark explicitly somewhere what's wrong with this example?

Reply: I get that, will make an issue
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@
setuptools >=28
setuptools_scm
astor
astunparse
pyshd
pytest
pytest-runner
pyshd
setuptools >=28
setuptools_scm
sphinx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,5 @@
camille
diskcache
fastavro
matplotlib
networkx
Review comment: A one-liner before the tutorial stating what xun is in general would be good. Also, add a note about technical requirements, meaning at least exactly which Python versions are supported (I think some of the attributes you use appear only from a certain version).