# Dask Match Demonstration

We make a Dask client, mostly so we can use the dashboard.

In [None]:
from dask.distributed import Client
client = Client()

## Example DataFrame

In [None]:
import dask_expr as dx

df = dx.datasets.timeseries(
    start="2000-01-01", 
    end="2000-12-30", 
    freq="100ms",
)
df

## Expression trees and optimization

In [None]:
%%time
out = df[df.id == 1000].x.sum()
out.compute()

In [None]:
out.pprint()

In [None]:
out.optimize(fuse=False).pprint()

In [None]:
out.optimize().pprint()

In [None]:
%%time
out.optimize().compute()

## Class structure

In [None]:
out = df.x + 1
out

DataFrame/Series/Index objects live in `collections.py`.

In [None]:
type(out)

Collections used to hold `_meta`, `divisions`, `_name`, and `__dask_graph__` themselves. Now they hold only `expr`, which derives all of these from user inputs.

In [None]:
out.__dict__

In [None]:
out.expr
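To make the "collections hold only `expr`" idea concrete, here is a minimal standalone sketch. It does not use dask_expr, and `ToyExpr` and `ToyCollection` are hypothetical names invented for illustration. The real classes are more involved; this only imitates the delegation pattern described above.

```python
from functools import cached_property


class ToyExpr:
    """Hypothetical stand-in for an expression node (not dask_expr's class)."""

    def __init__(self, columns, npartitions):
        self.columns = list(columns)
        self.npartitions = npartitions

    @cached_property
    def _meta(self):
        # Derived from user inputs: an empty "schema" keyed by column name.
        return {name: [] for name in self.columns}

    @cached_property
    def divisions(self):
        # Unknown divisions: one more boundary than there are partitions.
        return (None,) * (self.npartitions + 1)

    @cached_property
    def _name(self):
        return f"toy-{'-'.join(self.columns)}-{self.npartitions}"


class ToyCollection:
    """Stores a single attribute, `expr`, and forwards other lookups to it."""

    def __init__(self, expr):
        self.expr = expr

    def __getattr__(self, key):
        # Called only when normal lookup fails, so `expr` is found first.
        if key == "expr":
            raise AttributeError(key)  # guard against recursion before __init__
        return getattr(self.expr, key)


coll = ToyCollection(ToyExpr(["x", "y"], npartitions=3))
print(coll.__dict__.keys())  # the collection itself stores only `expr`
print(coll.divisions)        # computed by the expression on demand
```

In this sketch the collection is a thin handle: metadata is computed and cached on the expression, so any rewritten expression gets consistent metadata without the collection copying state around.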

Expressions have a type hierarchy that reflects user commands.

In [None]:
type(out.expr)

In [None]:
type(out.expr).mro()

Expressions are composed of a *type* or *operation* (like `Add`) and *operands* (like `left` and `right`).

In [None]:
out._parameters

In [None]:
out.operands

In [None]:
type(out.left)

In [None]:
dict(
    zip(
        out.left._parameters, 
        out.left.operands,
    )
)

In [None]:
dict(
    zip(
        out.left.frame._parameters, 
        out.left.frame.operands,
    )
)
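The cells above pair each entry of `_parameters` with the operand in the same position. The following standalone sketch imitates that layout without dask_expr. The `Expr`, `Source`, `Projection`, and `Add` classes here are toy definitions written for this example (assumptions, not the library's implementation), and the parameter names simply mirror the ones seen above.

```python
class Expr:
    """Toy expression node: the class is the operation, operands are its inputs."""

    _parameters = []

    def __init__(self, *operands):
        if len(operands) != len(self._parameters):
            raise ValueError(f"expected {len(self._parameters)} operands")
        self.operands = list(operands)

    def __getattr__(self, key):
        # Resolve a parameter name to the operand stored at the same position.
        params = type(self)._parameters
        if key in params:
            return self.operands[params.index(key)]
        raise AttributeError(key)

    def __repr__(self):
        args = ", ".join(
            f"{p}={o!r}" for p, o in zip(self._parameters, self.operands)
        )
        return f"{type(self).__name__}({args})"


class Source(Expr):
    _parameters = ["name"]


class Projection(Expr):
    _parameters = ["frame", "columns"]


class Add(Expr):
    _parameters = ["left", "right"]


# Toy equivalent of `df.x + 1`
node = Add(Projection(Source("df"), "x"), 1)

print(type(node).__name__)                                  # the operation
print(dict(zip(node._parameters, node.operands)))           # named operands
print(dict(zip(node.left._parameters, node.left.operands)))
```

Storing operands positionally and naming them through a class-level list keeps every node uniform, which is what makes generic tree traversal and rewriting straightforward in a design like this.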