# **DASCore Application: Training**

December 1, 2024

This notebook shows an example DASCore application: tracking traffic in Salt Lake City. The signals are dominated by light rail trains.

The data were recorded in May of 2023. The original data were down-sampled to 4 Hz.

<a target="_blank" href="https://colab.research.google.com/github/DASDAE/seg_tutorial/blob/master/04_application.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

#### Useful links: 
* [DASCore Tutorial](https://dascore.org/tutorial/concepts.html)
* [Numpy Dates and Times](https://numpy.org/devdocs/reference/arrays.datetime.html)
* [Pint Units Library](https://pint.readthedocs.io/en/stable/)

In [None]:
%matplotlib widget

In [None]:
%%capture

# First ensure DASCore is installed. If not, install and restart the kernel.
try:
    import dascore as dc
except ImportError:
    !pip install dascore
    !pip install ipympl
    # Restart the kernel so the newly installed packages are picked up
    import IPython
    IPython.Application.instance().kernel.do_shutdown(True)  # automatically restarts kernel

from rich import print

In [None]:
import numpy as np

## Setup
First, we create a directory of DAS files to simulate the output of an acquisition. 

In [None]:
# Create a directory of DAS files
from pathlib import Path

path = Path("UoU_data")
path.mkdir(exist_ok=True)

# Get a spool of data with 5 minute patches
spool = dc.get_example_spool("UoU_lf_urban.hdf5").chunk(time=60*5)
for num, patch in enumerate(spool):
    patch_path = (path / f"UoU_{num}.hdf5")
    if patch_path.exists():
        continue
    patch.io.write(patch_path, "dasdae")

## Exploration

We initialize a spool on the directory of DAS files and explore a summary of the contents.

In [None]:
spool = dc.spool(path)

In [None]:
spool.get_contents()

There are 10 files, each containing 5 minutes of data. One example file looks like this:

In [None]:
patch = spool[2]
patch.viz.waterfall(scale=0.05);

Notice the linear features predominantly moving from left to right.

### **Exercise** (Train Speed)
Use Matplotlib's interactive mode or a [2D dft](https://dascore.org/api/dascore/transform/fourier/dft.html) to estimate the train velocity. Use [`Patch.slope_filter`](https://dascore.org/api/dascore/proc/filter/slope_filter.html) to make the signal pop.
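
As a hint for the interactive approach, the apparent velocity of a linear feature is its slope in the distance-time plane. The sketch below is plain Python and does not use DASCore; the function name `apparent_velocity` and the pick values are hypothetical placeholders, not measurements from this dataset.

```python
def apparent_velocity(pick_a, pick_b):
    """Return the apparent velocity (m/s) from two (time_s, distance_m) picks."""
    (t0, d0), (t1, d1) = pick_a, pick_b
    if t1 == t0:
        raise ValueError("The two picks must have different times.")
    return (d1 - d0) / (t1 - t0)


# Hypothetical picks read off a waterfall plot: (seconds, meters).
velocity = apparent_velocity((10.0, 200.0), (40.0, 650.0))
print(f"{velocity:.1f} m/s ({velocity * 3.6:.1f} km/h)")
```

A positive result means the feature moves toward increasing distance along the fiber; a negative result means it moves the other way.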

## Aggregations

Let's calculate some statistics of the data. For example, the maximum absolute value of each channel (distance) for each of the files.

In [None]:
abs_max = [
    x.abs()
    .aggregate(dim="time", method=np.max, dim_reduce="first") 
    for x in spool
]

In [None]:
patch = (
    # Merge all the aggregate patches together
    dc.spool(abs_max).chunk(time=None)[0]
    # Ensure data type reflects the new meaning
    .update_attrs(data_type="maximum absolute values")
)

In [None]:
patch.viz.waterfall()
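
To make the aggregation concrete, the same reduction can be sketched in plain NumPy: each file's time axis collapses to a single row, and the rows are stacked into one summary array. The arrays here are synthetic stand-ins with assumed shapes, not the example dataset, and the sketch does not reproduce how DASCore tracks coordinates.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-ins for three files, each shaped (time, distance).
files = [rng.normal(size=(20, 5)) for _ in range(3)]

# Reduce the time axis of each file to one row of maximum absolute values.
rows = [np.abs(f).max(axis=0, keepdims=True) for f in files]

# Stack the per-file rows so that each row summarizes one file.
summary = np.concatenate(rows, axis=0)
print(summary.shape)  # (3, 5)
```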

### **Exercise** (Aggregations 1)
Redo the aggregations above, but calculate the standard deviation for every 60 seconds of data with a 10-second overlap.
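
As a hint, it can help to see how overlapping windows are laid out: consecutive windows start `length - overlap` samples apart. The sketch below is plain Python with a hypothetical helper, `window_starts`. It assumes the 4 Hz sampling rate and 5-minute files described earlier and drops any partial trailing window, which may not match how DASCore handles the end of a file.

```python
def window_starts(n_samples, length, overlap):
    """Return start indices of full windows of `length` samples with `overlap`."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    stride = length - overlap
    return list(range(0, n_samples - length + 1, stride))


sampling_rate = 4  # Hz, as stated for the down-sampled data
n_samples = 5 * 60 * sampling_rate  # one 5-minute file
starts = window_starts(
    n_samples,
    length=60 * sampling_rate,   # 60-second windows
    overlap=10 * sampling_rate,  # 10-second overlap
)
print(starts)  # [0, 200, 400, 600, 800]
```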


## Moving windows

A moving window is a concept similar to aggregation. It is implemented with DASCore's [`Patch.rolling`](https://dascore.org/api/dascore/proc/rolling/rolling.html). Rolling windows are more efficient for overlapping or very dense operations.
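
What a rolling standard deviation with a step computes can be sketched in plain NumPy using `sliding_window_view`. This is only an illustration on synthetic data, not DASCore's implementation: the helper `rolling_std` is hypothetical, and the sketch does not reproduce window centering, edge padding, or coordinate handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_std(data, window, step=1):
    """Standard deviation over sliding windows along axis 0 (time)."""
    # The view has shape (n_windows, n_channels, window); no data are copied.
    windows = sliding_window_view(data, window_shape=window, axis=0)
    # Keep every `step`-th window, then reduce over the window axis.
    return windows[::step].std(axis=-1)


rng = np.random.default_rng(0)
data = rng.normal(size=(40, 3))  # synthetic: 40 time samples, 3 channels
out = rolling_std(data, window=8, step=4)
print(out.shape)  # (9, 3)
```

Each output row summarizes one window, so a larger `step` gives a coarser output along time without changing the window length.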



In [None]:
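# Chunk the spool into 120-second patches that overlap by 20 seconds
spool_mw = dc.spool(path).chunk(time=120, overlap=20, tolerance=10)

output = []
for patch in spool_mw:
    abs_patch = patch.abs()
    # Standard deviation in a centered 10-second window, evaluated every second
    rolling_std = abs_patch.rolling(center=True, time=10, step=1).std()
    # Drop the time samples that contain NaN values
    output.append(rolling_std.dropna("time"))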


In [None]:
patch = (
    dc.spool(output)
    .chunk(time=None)[0]
    .update_attrs(data_type="standard deviation")
)

In [None]:
patch.viz.waterfall(scale=0.5);

## Frequency Binning