# Benchmarking Performance of History vs. Timeseries Files with [`ecgtools`](https://ecgtools.readthedocs.io/en/latest/), [`Intake-ESM`](https://intake-esm.readthedocs.io/en/latest/), and [`Dask`](https://dask.org/)

In this example, we will walk through "benchmarking" the performance of reading, applying calculations to, and visualizing data from the Community Earth System Model (CESM), using the following packages:
* [`ecgtools`](https://ecgtools.readthedocs.io/en/latest/)
* [`Intake-ESM`](https://intake-esm.readthedocs.io/en/latest/)
* [`Dask`](https://dask.org/)

Going into this analysis, we have a ***hypothesis*** that performance should be substantially better when reading from timeseries files, but let's take a look...

## Imports

### Installing packages via `conda-forge`

As of this week, [`ecgtools`](https://ecgtools.readthedocs.io/en/latest/) is available via `conda-forge`, which is very exciting! You can install the packages used here using the following:

```bash
conda install -c conda-forge ecgtools ncar-jobqueue distributed intake-esm
```
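Before running the imports, it can help to confirm that the environment actually has these packages. The snippet below is a minimal sketch, not part of the original workflow: `REQUIRED_MODULES` and `check_modules` are hypothetical names introduced here, and the list assumes the import names that correspond to the conda packages above (for example, `intake-esm` is imported as `intake_esm`).

```python
from importlib.util import find_spec

# Import names differ from the conda package names (e.g. intake-esm -> intake_esm).
# This list is an assumption based on the install command above.
REQUIRED_MODULES = ["ecgtools", "ncar_jobqueue", "distributed", "intake_esm"]


def check_modules(names):
    """Map each top-level module name to True if it can be located, else False."""
    # find_spec returns None for a top-level module that is not installed
    return {name: find_spec(name) is not None for name in names}


status = check_modules(REQUIRED_MODULES)
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as `MISSING` would need to be installed into the active environment before the rest of the notebook can run.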

```python
import intake
from distributed import Client
from ecgtools import Builder
from ecgtools.parsers.cesm import parse_cesm_history, parse_cesm_timeseries
from ncar_jobqueue import NCARCluster
```

## Build the Catalogs

Keep in mind that the [`ecgtools`](https://ecgtools.readthedocs.io/en/latest/) builder parallelizes across the cores you have available. For this section of the notebook, you will ideally want more than a single core, up to however many you see fit.
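To pick a sensible job count, you can first check how many cores the current session can use. This is a minimal sketch using only the standard library; `available_cores` and `pick_njobs` are hypothetical helpers introduced here, not part of `ecgtools`, and the default of 5 simply mirrors the `njobs=5` used in this notebook.

```python
import os


def available_cores():
    """Return the number of CPU cores this process may use (at least 1)."""
    # On Linux, the scheduler affinity accounts for CPU pinning applied to this process
    if hasattr(os, "sched_getaffinity"):
        return max(1, len(os.sched_getaffinity(0)))
    # Fallback for other platforms; os.cpu_count() may return None
    return os.cpu_count() or 1


def pick_njobs(requested=5):
    """Cap a requested job count at the number of usable cores (hypothetical helper)."""
    return max(1, min(requested, available_cores()))


njobs = pick_njobs(5)
print(f"usable cores: {available_cores()}, njobs: {njobs}")
```

The resulting value could then be passed as the `njobs` argument when creating the `Builder`.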

```python
b = Builder(
    "/glade/campaign/cesm/development/espwg/SMYLE/initial_conditions/SMYLE-FOSI/",
    depth=5,
    exclude_patterns=["*/rest/*", "*/logs/*", "*/proc/*"],
    njobs=5,
)
```