# Random Forests

You may want to switch to the `coiled/default` environment:

```
coiled install coiled/default
conda activate coiled-coiled-default
```

## Set up Coiled

In [None]:
import coiled
cluster = coiled.Cluster(10, configuration="coiled/default")

from dask.distributed import Client
client = Client(cluster)
client.dashboard_link

## Set up problem


In [None]:
from joblib import parallel_backend
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

digits = load_digits()
clf = RandomForestClassifier(n_estimators=45000, verbose=1)

## We could run locally

In principle we want to run the following:

```python
with parallel_backend('dask'):
    clf.fit(digits.data, digits.target)
```

But then our local session coordinates the fit and is in rapid communication with all of the workers. This is fine if the session itself runs in the cloud, but from a laptop it is better to run the `fit` call on a worker so that the chatter stays inside the cluster.

In [None]:
def train(clf, X, y):
    # Runs on a worker; joblib sends the per-tree work to the Dask cluster.
    with parallel_backend('dask'):
        clf.fit(X, y)  # use the arguments, not the global `digits`

    return clf

In [None]:
%%time

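# Train on a worker; `future` points at the fitted model on the cluster.
future = client.submit(train, clf, digits.data, digits.target)
# Ask only for the model's string form so the large model is not downloaded.
s = client.submit(str, future)
s.result()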
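
The same pattern of submitting a small function against the model future can be used to evaluate the model without downloading it. The following is a minimal sketch under stated assumptions: it uses a local in-process `Client` as a stand-in for the Coiled cluster, a much smaller forest so it finishes quickly, and a hypothetical helper named `accuracy` that is not part of the original notebook. It scores on the training data, so the number is only a sanity check and not a measure of generalization.

```python
from dask.distributed import Client
from joblib import parallel_backend
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier


def train(clf, X, y):
    # Runs on a worker; joblib dispatches the per-tree work to Dask.
    with parallel_backend("dask"):
        clf.fit(X, y)
    return clf


def accuracy(clf, X, y):
    # Hypothetical helper: returns one float instead of the whole model.
    return float(clf.score(X, y))


# Assumption: a local in-process cluster stands in for the Coiled cluster.
client = Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=None)

digits = load_digits()
# Assumption: a small forest keeps the sketch fast.
small_clf = RandomForestClassifier(n_estimators=20, random_state=0)

# The fitted model stays on the cluster as a future.
model_future = client.submit(train, small_clf, digits.data, digits.target)

# Only the resulting float travels back to the local session.
acc = client.submit(accuracy, model_future, digits.data, digits.target).result()
print(acc)

client.close()
```

If the fitted model is needed locally after all, `model_future.result()` would transfer it, at the cost of moving the full model over the network.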