# Getting started with Dask on Saturn Cloud


Dask is a framework that lets you run Python code in parallel across multiple machines. Below is a small example of using Dask on Saturn Cloud. The code defines a function that computes exponents and runs it across a list of inputs in parallel.

_For more details about the basics of Dask, read the [Parallelization in Python](https://www.saturncloud.io/docs/reference/dask_concepts/) article in the Saturn Cloud docs._ You can also look at the [Saturn Cloud Dask examples](https://www.saturncloud.io/docs/examples/dask/), and [the official Dask documentation](https://docs.dask.org/en/latest/).

Before running this example, you need to create a Dask cluster associated with this project. You can create the cluster through the [Saturn Cloud project page](https://www.saturncloud.io/docs/getting-started/create_cluster_ui/), or [programmatically in Python](https://www.saturncloud.io/docs/getting-started/create_cluster/#create-clustersaturncluster-object).

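This code chunk imports the Dask libraries and connects to a Dask cluster. As written, it starts a `LocalCluster` on the current machine; to run against your Saturn Cloud Dask cluster, swap the `LocalCluster` lines for the commented-out `SaturnCluster` lines.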

In [1]:
import dask

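# To connect to a Saturn Cloud Dask cluster, use these lines instead
# of the LocalCluster lines below:
#from dask_saturn import SaturnCluster
#from dask.distributed import Client
#cluster = SaturnCluster()
#client = Client(cluster)

# Local stand-in: start a Dask cluster on this machine and connect to it
from dask.distributed import Client, LocalCluster
cluster = LocalCluster()
client = Client(cluster)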

You can use the `@dask.delayed` decorator to turn a regular Python function into a lazily evaluated one. Calling the decorated function returns a `Delayed` object instead of a value: the computation does not run when the function is called, only when the result of that object is requested.

In [2]:
@dask.delayed
def lazy_exponent(args):
    """Define a lazily evaluating function"""
    x, y = args
    return x ** y
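
To make the lazy behavior concrete, here is a minimal sketch that needs no cluster. It assumes only that `dask` is installed; the variable names `pending`, `is_lazy`, and `value` are illustrative and not part of the original example.

```python
import dask
from dask.delayed import Delayed


@dask.delayed
def lazy_exponent(args):
    """Raise x to the power y, evaluated lazily."""
    x, y = args
    return x ** y


# Calling the function does no arithmetic yet; it only records the task.
pending = lazy_exponent([3, 4])
is_lazy = isinstance(pending, Delayed)

# The computation happens only when the result is requested.
value = pending.compute()
print(is_lazy, value)  # True 81
```

Without a distributed client, `.compute()` runs on Dask's default local scheduler, so this is a quick way to check a delayed function before sending it to a cluster.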

The Dask distributed client comes with several methods for managing lazy work and the futures that track it. The code below uses a few of them.

* `client.map(f, x)` - Run function `f` once per item in a list-like object `x`. Returns a list of futures, one per item.
* `client.gather(futures)` - Given a list of futures held by Dask workers, pull their results back to the client. In this example the function is delayed, so this returns a list of delayed results.
* `client.compute(delayed_results)` - Given a list of delayed results, submit them to the cluster for computation. Returns a list of futures (`sync=False`) or the actual function results (`sync=True`).
* `.result()` - Wait for a future to complete. Returns its actual value.
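
For comparison, here is a hedged sketch of the same methods applied to a plain (non-delayed) function, where `client.gather` returns the values directly. It assumes `dask.distributed` is installed and uses an in-process cluster (`processes=False`) rather than a Saturn Cloud cluster; the function `exponent` and the variable names are illustrative.

```python
from dask.distributed import Client


def exponent(args):
    """Raise x to the power y (a regular, eager function)."""
    x, y = args
    return x ** y


# Assumption: a throwaway in-process cluster stands in for a real one.
client = Client(processes=False, dashboard_address=None)

# One future per input item.
futures = client.map(exponent, [[1, 2], [3, 4], [5, 6]])

# With a plain function, gathering yields the computed values.
values = client.gather(futures)

# submit() schedules a single call; .result() waits for its value.
single = client.submit(exponent, [2, 10]).result()

client.close()
print(values, single)  # [1, 81, 15625] 1024
```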

Altogether, the code below takes the list of inputs, maps the exponent function over them to create futures held by Dask workers, gathers the resulting delayed objects onto the client machine, submits them for computation, and collects the results.

In [3]:
inputs = [[1, 2], [3, 4], [5, 6], [9, 10], [11, 12]]

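# Submit one task per input; each task's result is a delayed object
example_future = client.map(lazy_exponent, inputs)
# Pull the delayed objects back to the client
futures_gathered = client.gather(example_future)
# Submit the delayed objects for computation; returns a list of futures
futures_computed = client.compute(futures_gathered, sync=False)

# Wait for each future to finish and collect the values
results = [x.result() for x in futures_computed]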
results

[1, 81, 15625, 3486784401, 3138428376721]
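
If you only need the final values, a shorter route is to build the delayed objects on the client and pass them to `dask.compute`. This is a sketch of an alternative, not the approach used above; the name `delayed_results` is illustrative. With an active distributed client the work runs on the cluster, otherwise it runs on Dask's default local scheduler.

```python
import dask


@dask.delayed
def lazy_exponent(args):
    """Raise x to the power y, evaluated lazily."""
    x, y = args
    return x ** y


inputs = [[1, 2], [3, 4], [5, 6], [9, 10], [11, 12]]

# Build one delayed object per input; nothing is computed yet.
delayed_results = [lazy_exponent(pair) for pair in inputs]

# dask.compute evaluates them together and returns a tuple of values.
results = list(dask.compute(*delayed_results))
print(results)  # [1, 81, 15625, 3486784401, 3138428376721]
```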

This was _somewhat_ of a toy example: you probably want to do more complex computations than exponents with Dask. However, the concept of writing a function and then running it in a distributed fashion is at the core of what you can do with Dask on Saturn Cloud.


When you're done, you can close the connection to the cluster:

In [4]:
client.close()
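
When you use a `LocalCluster`, as this notebook does, one way to make sure both the client and the cluster are shut down is to use them as context managers. The sketch below assumes a small in-process cluster; the worker settings and the `total` variable are illustrative choices, not requirements.

```python
from dask.distributed import Client, LocalCluster

# Assumption: a single in-process worker is enough for this illustration.
with LocalCluster(
    n_workers=1,
    threads_per_worker=2,
    processes=False,
    dashboard_address=None,
) as cluster:
    with Client(cluster) as client:
        # Run one task on the cluster and wait for its value.
        total = client.submit(sum, [1, 2, 3]).result()

# Both the client and the cluster are closed when the blocks exit.
print(total)  # 6
```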