# DASCore Quick Intro
May 13, 2024

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DASDAE/presentations/blob/master/cwp_2024/cwp_intro_2024.ipynb)

This tutorial will introduce new users to basic functionality of DASCore.

#### Useful links: 
* [DASCore tutorial](https://dascore.org/tutorial/concepts.html)

First, let's install DASCore using pip. If you run into issues running this on your local computer, use the button above to switch to [Google Colab](https://colab.research.google.com).

In [None]:
try: 
    import dascore
except ImportError:
    !pip install -q dascore
    print('Installed DASCore, stopping RUNTIME! Please restart.')
    import os; os.kill(os.getpid(), 9)

Now that we've got DASCore installed, we can import it for use. We also swap out the built-in `print` function for a [rich](https://github.com/Textualize/rich)er version. 

In [None]:
import numpy as np
from rich import print

import dascore as dc

# Overview

The main data structures in DASCore are the `Patch` and the `Spool`. The `Patch` is an array with metadata, and the `Spool` is an interface for accessing patches from various sources. Each of these has many methods for processing the underlying data. Here we will examine the `Patch` in more detail.


<div>
    <center>
        <img src="https://raw.githubusercontent.com/DASDAE/presentations/master/_static/images/patch_spool_disp.svg" width="900" align="center">
    </center>
</div>

## Patch

DASCore comes prepackaged with multiple example patches of data. Let's start by loading an example of an induced event recorded by a fiber in a borehole (from [Staněk et al. 2022](https://www.frontiersin.org/articles/10.3389/feart.2022.907749/full)). 

In [None]:
patch = dc.get_example_patch('example_event_1')

To get an idea of the patch contents and metadata, we can simply print it.

In [None]:
print(patch)

A few important patch components (attributes) are:

1. `data`
An array of the managed data

2. `coords`
An object to keep track of labels for each dimension and, optionally, extra information (like latitude/longitude)

3. `attrs`
A model for storing scalar metadata

4. `dims`
A tuple of the data's dimension labels

(Note: the patch design is based on Xarray's [DataArray](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.html))

### **Exercise**
Find the mean of the patch data.

### Coords

Coordinates provide useful functionality for orienting the data. Here are some examples:

In [None]:
coord = patch.coords.get_coord("distance")

In [None]:
print(coord)

In [None]:
# Find the total cable distance spanned by the patch:
coord.max() - coord.min()

In [None]:
# Get an array of coordinate values
dist_array = coord.values

## Visualization

The `Patch.viz` namespace contains a few options for plotting the data. The most commonly used is the waterfall plot.

(Note: Extreme values in the data often cause the waterfall plot to appear completely white. Use the `scale` parameter to saturate the colorbar.)

In [None]:
patch.viz.waterfall(scale=0.01);
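To build intuition for why a small `scale` helps, here is a minimal NumPy-only sketch. It assumes a relative scale, where the color limits are a fraction of half the data's dynamic range centered on the mean. That is an assumption made for illustration, not a copy of DASCore's plotting code, and the helper name `relative_color_limits` is hypothetical.

```python
import numpy as np


def relative_color_limits(data, scale):
    """Return (vmin, vmax) as a fraction of half the dynamic range.

    Hypothetical helper for illustration; assumes the limits are
    centered on the data mean.
    """
    half_range = (np.max(data) - np.min(data)) / 2
    center = np.mean(data)
    return center - scale * half_range, center + scale * half_range


rng = np.random.default_rng(0)
data = rng.normal(size=(100, 50))
data[10, 10] = 1_000.0  # one extreme value dominates the dynamic range

full = relative_color_limits(data, scale=1.0)
tight = relative_color_limits(data, scale=0.01)

# Fraction of samples that still fall inside the tight limits
inside = np.mean((data >= tight[0]) & (data <= tight[1]))
```

With the full range, nearly every sample maps to almost the same color; with the tight limits, the colorbar spans the bulk of the data and only the outlier saturates.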

### Updating Patches
Patches are immutable, meaning they cannot be changed in place. This helps prevent many types of errors, but does require a bit more thought when working with these data structures.

A few methods are useful here: `Patch.new`, `Patch.update_attrs`, `Patch.update_coords`, and `Patch.convert_units`.

#### New

`new` will create a new patch, using all the data in the old patch except for the parameters specified.

In [None]:
# Divide data by 1_000_000 to go from µε/s to ε/s, keep all other aspects of the patch the same.
new_data = patch.data / 1_000_000
patch_strain = patch.new(data=new_data)

In [None]:
print(patch_strain)
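The "copy with changes" pattern behind `new` can be sketched with a frozen dataclass from the standard library. `ToyPatch` is a hypothetical stand-in used only to illustrate immutability; it is not how DASCore implements `Patch`.

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class ToyPatch:
    """Hypothetical stand-in for an immutable array-plus-metadata object."""

    data: np.ndarray
    units: str

    def new(self, **kwargs):
        """Return a copy with the given fields replaced."""
        return replace(self, **kwargs)


original = ToyPatch(data=np.array([1.0e6, 2.0e6]), units="µε/s")
scaled = original.new(data=original.data / 1_000_000, units="ε/s")

try:
    original.units = "ε/s"  # assignment on a frozen dataclass fails
except AttributeError as err:
    blocked = type(err).__name__
```

The original object is untouched, so any earlier result that still refers to it stays valid.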

#### update_attrs
`update_attrs` is used to create a new patch with aspects of the metadata changed.

In [None]:
# Add the gauge length, data units, and a custom tag
patch_new_attrs = patch_strain.update_attrs(gauge_length=5, data_units="(1/s)", tag="induced_event")
print(patch_new_attrs)

#### update_coords
`update_coords` is used to change start, end, step, units, or array data of coordinates.

In [None]:
# Translate time coordinate to a plausible event origin time.
patch_coords = patch_new_attrs.update_coords(time_min="2020-04-01T01:12:23.1")

In [None]:
print(patch_coords)

#### convert_units
`convert_units` will convert (or set if undefined) units on the patch data or coordinates.

In [None]:
patch_units = patch_coords.convert_units(time="s", distance="m") 
print(patch_units)

### Processing

DASCore provides many [processing](https://dascore.org/api/dascore/proc.html) and [transformation](https://dascore.org/api/dascore/transform.html) methods, and many more are planned.

The following shows some of the most commonly used.

#### Select
`select` can be used to crop a patch.

In [None]:
# Get current start time
time = patch_units.get_coord("time")
time_min = time.min()

# Get desired time range (t1, t2)
t1 = time_min + dc.to_timedelta64(0.02)
t2 = t1 + dc.to_timedelta64(0.04)

# Select based on new time and distance ranges
patch_select = patch_units.select(distance=(600, 800), time=(t1, t2))

# Plot
patch_select.viz.waterfall(scale=0.1);
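Conceptually, cropping by coordinate values amounts to building one mask per dimension and applying them together. The sketch below shows that idea on plain NumPy arrays with made-up coordinates, assuming inclusive end points; it is an illustration, not DASCore's implementation.

```python
import numpy as np

# Toy coordinates: 1000 channels at 1 m spacing, 500 samples at 1 ms.
distance = np.arange(1000.0)  # metres
time_ms = np.arange(500)      # integer milliseconds
data = np.zeros((distance.size, time_ms.size))  # dims: (distance, time)

# Inclusive range masks, one per dimension.
dist_mask = (distance >= 600) & (distance <= 800)
time_mask = (time_ms >= 20) & (time_ms <= 60)

# np.ix_ builds an open mesh so both masks apply at once.
cropped = data[np.ix_(dist_mask, time_mask)]
cropped_distance = distance[dist_mask]
```

The coordinate arrays are cropped with the same masks, so the labels stay aligned with the data.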

#### Taper
`taper` applies a taper along a specified dimension of the patch.

In [None]:
# Apply a taper to the first and last 10 ms of the time axis
ms = dc.get_quantity('ms')
patch_taper = patch_units.taper(time=10 * ms)

# Then visualize.
patch_taper.viz.waterfall(scale=0.01);
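As a rough picture of what an edge taper does, the NumPy sketch below builds a window that ramps from 0 to 1 over the first samples and back to 0 over the last ones. The helper `edge_taper`, the Hann-shaped ramp, and the 1 kHz sampling rate are assumptions chosen for illustration.

```python
import numpy as np


def edge_taper(n_samples, n_edge):
    """Return a window that is 1 in the middle and ramps to 0 at both ends.

    Hypothetical helper: each ramp is half of a Hann (raised-cosine) window.
    """
    window = np.ones(n_samples)
    ramp = np.hanning(2 * n_edge)[:n_edge]  # rising half, starts at 0
    window[:n_edge] = ramp
    window[-n_edge:] = ramp[::-1]
    return window


sampling_rate = 1000.0  # Hz, assumed for this toy example
n_edge = int(round(0.010 * sampling_rate))  # 10 ms -> 10 samples
trace = np.ones(200)
tapered = trace * edge_taper(trace.size, n_edge)
```

For a 2D array with time on the last axis, the same window can be applied to every channel through broadcasting, for example `data * edge_taper(data.shape[-1], n_edge)`.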

#### Pass Filtering
The `Patch.pass_filter` method is used to apply bandpass, lowpass, and highpass filters to the data along a specified dimension.

In [None]:
patch_bp = patch_taper.pass_filter(time=(100, 300))  # apply a 100Hz to 300Hz bandpass
patch_lp = patch_taper.pass_filter(time=(None, 300))  # apply a 300Hz lowpass
patch_hp = patch_taper.pass_filter(time=(50, None))  # apply a 50Hz highpass
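The `(low, high)` convention, with `None` leaving one side open, can be illustrated with a simple brick-wall filter in the frequency domain. This swaps in a plain FFT mask purely for clarity; DASCore's own filter design may differ, and `fft_pass_filter` is a hypothetical helper.

```python
import numpy as np


def fft_pass_filter(trace, sampling_rate, low=None, high=None):
    """Zero out frequencies outside [low, high]; None leaves that side open.

    Hypothetical brick-wall filter for illustration only.
    """
    freqs = np.fft.rfftfreq(trace.size, d=1 / sampling_rate)
    keep = np.ones(freqs.size, dtype=bool)
    if low is not None:
        keep &= freqs >= low
    if high is not None:
        keep &= freqs <= high
    spectrum = np.fft.rfft(trace) * keep
    return np.fft.irfft(spectrum, n=trace.size)


sampling_rate = 2000.0  # Hz, assumed
t = np.arange(2000) / sampling_rate
slow = np.sin(2 * np.pi * 20 * t)   # 20 Hz component
fast = np.sin(2 * np.pi * 200 * t)  # 200 Hz component
trace = slow + fast

bandpassed = fft_pass_filter(trace, sampling_rate, low=100, high=300)
lowpassed = fft_pass_filter(trace, sampling_rate, high=100)
highpassed = fft_pass_filter(trace, sampling_rate, low=50)
```

Here the bandpass and highpass both keep only the 200 Hz component, while the lowpass keeps only the 20 Hz component.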

### **Exercise**
Plot each of the filtered patches above. Which filter did the best job of accentuating the event signal?

#### Detrend

`detrend` applies a linear detrend along a specified dimension (axis).

In [None]:
patch_dt_time = patch_lp.detrend("time")  # Linear detrend along time axis
patch_dt_dist = patch_lp.detrend("distance") # Linear detrend along distance axis
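A linear detrend fits a straight line to each trace along the chosen axis and subtracts it. The NumPy sketch below shows one way to do this with a least-squares fit; `linear_detrend` is a hypothetical helper, not DASCore's implementation.

```python
import numpy as np


def linear_detrend(data, axis=-1):
    """Subtract a least-squares straight line from each trace along axis."""
    moved = np.moveaxis(np.asarray(data, dtype=float), axis, -1)
    x = np.arange(moved.shape[-1])
    flat = moved.reshape(-1, moved.shape[-1])
    # polyfit fits every column of a 2D `y` argument at once.
    slope, intercept = np.polyfit(x, flat.T, deg=1)
    trend = slope[:, None] * x + intercept[:, None]
    detrended = (flat - trend).reshape(moved.shape)
    return np.moveaxis(detrended, -1, axis)


x = np.arange(100)
signal = np.sin(2 * np.pi * x / 10)
# Two traces with different linear trends, shape (2, 100)
data = np.stack([signal + 0.5 * x + 3, signal - 0.2 * x])
detrended = linear_detrend(data, axis=1)

# Refit the result: slope and intercept should now be close to zero.
residual_fit = np.polyfit(x, detrended.T, deg=1)
```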

#### Decimate
`decimate` reduces the number of samples in the specified dimension. By default, a lowpass filter is applied to avoid aliasing. 

In [None]:
patch_dec = patch_lp.decimate(time=10)  # keep every 10th sample along time axis
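The two steps of decimation, smoothing and then keeping every Nth sample, can be sketched in NumPy. The moving average below is a deliberately crude stand-in for a proper anti-alias lowpass filter, and the 1 kHz sampling rate is an assumption for this toy example.

```python
import numpy as np

factor = 10
sampling_rate = 1000.0  # Hz, assumed
t = np.arange(1000) / sampling_rate
trace = np.sin(2 * np.pi * 5 * t)

# Crude anti-alias step: a moving average over `factor` samples.
kernel = np.ones(factor) / factor
smoothed = np.convolve(trace, kernel, mode="same")

# Keep every `factor`-th sample.
decimated = smoothed[::factor]
new_sampling_rate = sampling_rate / factor
new_nyquist = new_sampling_rate / 2
```

After decimating by 10, the Nyquist frequency drops from 500 Hz to 50 Hz, which is why energy above the new Nyquist has to be removed first.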

#### Resample

Like `decimate`, `resample` is used to change the sampling rate of the patch along a specific dimension. Unlike `decimate`, however, non-integer multiples can be used.

In [None]:
patch_resamp = patch_lp.resample(distance=15)  # change spatial sampling to 15m
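Resampling to a spacing that is not an integer multiple of the original one requires interpolating between samples. The sketch below uses linear interpolation with `np.interp` on made-up data to show the idea; the interpolation scheme DASCore uses may differ.

```python
import numpy as np

old_distance = np.arange(0.0, 101.0, 10.0)  # 0, 10, ..., 100 m
values = 2.0 * old_distance + 1.0           # toy linear data

new_step = 15.0  # not an integer multiple of the 10 m spacing
new_distance = np.arange(old_distance[0], old_distance[-1] + 1e-9, new_step)
resampled = np.interp(new_distance, old_distance, values)
```

The new coordinate stops at the last multiple of 15 m that fits inside the original 0 to 100 m range.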

### **Exercise**

Using the processing methods above, or others you find in the [processing module documentation](https://dascore.org/api/dascore/proc.html), apply processing to the example patch to accentuate the primary phases of the event and plot it.

Compare your results to the ones from the DASCore presentation as a baseline.

In [None]:
# Baseline patch
patch_baseline = (
    patch.set_units("1/s", distance="m", time="s")
    .detrend("time")
    .taper(time=0.05)
    .pass_filter(time=(..., 300))
)
patch_baseline.viz.waterfall(show=True, scale=0.3);