# **Spool Basics**

December 1, 2024

This notebook introduces the basics of DASCore's [`Spool`](https://dascore.org/api/dascore/core/spool/BaseSpool.html). It is a shortened version of [DASCore's Spool tutorial](https://dascore.org/tutorial/spool.html).

<a target="_blank" href="https://colab.research.google.com/github/DASDAE/seg_tutorial/blob/master/03_spool.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

#### Useful links: 
* [DASCore Tutorial](https://dascore.org/tutorial/concepts.html)
* [Numpy Dates and Times](https://numpy.org/devdocs/reference/arrays.datetime.html)
* [Pint Units Library](https://pint.readthedocs.io/en/stable/)


In [None]:
%%capture

# First ensure DASCore is installed. If not, install and restart the kernel.
try:
    import dascore as dc
except ImportError:
    !pip install dascore
    # Restart the kernel so the newly installed package can be imported.
    import IPython
    IPython.Application.instance().kernel.do_shutdown(True)  # automatically restarts the kernel

from rich import print


# Spool
Spools manage a group of patches. They can be initialized in several ways, including:
- from in-memory patches
- from a single file
- from a directory of DAS files

In [None]:
in_memory_spool = dc.get_example_spool("diverse_das")

# save patches to disk
das_folder_path = dc.examples.spool_to_directory(in_memory_spool)
das_file_path = next(das_folder_path.glob("*.hdf5"))


In [None]:
# From a patch or list of patches
patch = dc.get_example_patch()  # an example patch, so that `patch` is defined
spool = dc.spool([patch])

In [None]:
# From a single file
spool = dc.spool(das_file_path)

In [None]:
# From a directory of files
# Update will create an index of the contents for fast querying/access
spool = dc.spool(das_folder_path).update()

In [None]:
print(spool)

In [None]:
# get contents of spool as a dataframe
contents_df = spool.get_contents()
contents_df.head()

### Accessing Patches

Patches are retrieved by iterating over or indexing the spool.

In [None]:
first_patch = spool[0]
last_patch = spool[-1]

In [None]:
for patch in spool:
    ...    

In [None]:
# spools can also be sliced (sub-indexed)
sub = spool[1:-1]

### Selecting

`Spool` contents can be selected (filtered) with `Spool.select`.

In [None]:
# Return a spool with patches that end before 1990
sub_spool = spool.select(time=(..., '1990-01-01'))
print(sub_spool)
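
One way to read the `(..., '1990-01-01')` tuple is as a range with an open lower bound. The sketch below uses only the standard library and hypothetical time spans. It assumes an entry is kept when its time span overlaps the query range, and it is not DASCore's implementation.

```python
from datetime import datetime


def overlaps(span, query):
    """Return True if the (start, end) span intersects the query range."""
    span_start, span_end = span
    query_start, query_end = query
    # Ellipsis (or None) marks an open bound on that side of the range.
    after_start = query_start in (..., None) or span_end >= query_start
    before_end = query_end in (..., None) or span_start <= query_end
    return after_start and before_end


# Hypothetical (start, end) spans, for illustration only.
spans = {
    "old": (datetime(1989, 5, 4), datetime(1989, 5, 4, 0, 0, 8)),
    "new": (datetime(2020, 1, 3), datetime(2020, 1, 3, 0, 0, 8)),
}

query = (..., datetime(1990, 1, 1))
kept = [name for name, span in spans.items() if overlaps(span, query)]
print(kept)
```

Only the entry whose span starts before 1990 passes the filter here.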

In [None]:
# Return a spool with patches whose station attribute is "wayout"
sub_spool = spool.select(station="wayout")
print(sub_spool)

In [None]:
# Return a spool with patches whose tag matches a Unix-style match string
sub_spool = spool.select(tag="*dom")
print(sub_spool)
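
Unix-style match strings use shell wildcards such as `*` and `?`. As a rough illustration, the snippet below applies Python's standard `fnmatch` module to hypothetical tag names. It shows how a pattern like `"*dom"` behaves and is not DASCore's own matching code.

```python
from fnmatch import fnmatchcase

# Hypothetical tag values, for illustration only.
tags = ["random", "some_tag", "wisdom", "domain"]

# "*" matches any run of characters, so "*dom" keeps tags ending in "dom".
matching = [tag for tag in tags if fnmatchcase(tag, "*dom")]
print(matching)
```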

### Chunking
`Spool.chunk` is used to merge contiguous/overlapping patches or create patches of new sizes.

In [None]:
# Chunk the spool into 3-second increments with 1-second overlaps
# and keep any segments at the end that don't span the full 3 seconds.
subspool = spool.chunk(time=3, overlap=1, keep_partial=True)

# Merge all contiguous segments along time dimension.
merged_spool = spool.chunk(time=None)
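
To make the window arithmetic concrete, here is a minimal pure-Python sketch of how fixed-length, overlapping windows could be laid out over a continuous interval. The function `chunk_windows` is hypothetical and only illustrates the idea behind `time`, `overlap`, and `keep_partial`. It is not how DASCore implements `Spool.chunk`.

```python
def chunk_windows(start, stop, length, overlap=0, keep_partial=False):
    """Return (begin, end) windows of a given length covering [start, stop]."""
    step = length - overlap  # each new window starts this far after the last
    if step <= 0:
        raise ValueError("overlap must be smaller than length")
    windows = []
    begin = start
    while begin < stop:
        end = begin + length
        if end <= stop:
            windows.append((begin, end))  # full-length window
        elif keep_partial:
            windows.append((begin, stop))  # shorter window at the end
        if end >= stop:
            break
        begin += step
    return windows


# 10 seconds of data, 3-second windows, 1-second overlap.
print(chunk_windows(0, 10, length=3, overlap=1, keep_partial=True))
print(chunk_windows(0, 10, length=3, overlap=1, keep_partial=False))
```

With a 1-second overlap, consecutive windows start 2 seconds apart, and the final 2-second remainder is kept only when `keep_partial=True`.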

If there are slight gaps in the data, the `tolerance` parameter of `Spool.chunk` may need to be adjusted so that nearly contiguous patches are still merged.