# Spool Basics

August 14, 2025

This notebook introduces the basics of DASCore's [`Spool`](https://dascore.org/api/dascore/core/spool/BaseSpool.html). It is a shortened version of the [DASCore Spool tutorial](https://dascore.org/tutorial/spool.html).

<a target="_blank" href="https://colab.research.google.com/github/DASDAE/ctemps_tutorial/blob/master/02_spool.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

#### Useful links: 
* [Colab link](https://colab.research.google.com/github/DASDAE/ctemps_tutorial/blob/master/02_spool.ipynb)
* [DASCore documentation](https://dascore.org)


In [1]:
%%capture

# First ensure DASCore is installed. If not, install and restart the kernel.
try:
    import dascore as dc
except ImportError:
    !pip install dascore
    !pip install ipympl
    # Restart the kernel so the newly installed packages are picked up.
    import IPython
    IPython.Application.instance().kernel.do_shutdown(True)  # automatically restarts the kernel
    
from rich import print

# Spool
The `Spool` class manages a group of patches. `Spool` instances can be initialized in several ways, including:
- from in-memory patches
- from a single file
- from a directory of DAS files

In [2]:
# Load an example spool held in memory and display its first patch.
in_memory_spool = dc.get_example_spool("diverse_das")
in_memory_spool[0]


DASCore Patch ⚡
---------------
➤ Coordinates (distance: 300, time: 2000)
    *distance: CoordRange( min: 0 max: 299 step: 1 shape: (300,) dtype: int64 units: m )
    *time: CoordRange( min: 2020-01-03 max: 2020-01-03T00:00:07.996 step: 0.004s shape: (2000,) dtype: datetime64[ns] units: s )
➤ Data (float64)
   [[0.778 0.238 0.824 ... 0.37  0.077 0.232]
    [0.497 0.442 0.703 ... 0.126 0.118 0.78 ]
    [0.207 0.195 0.174 ... 0.849 0.365 0.807]
    ...
    [0.619 0.105 0.669 ... 0.621 0.436 0.5  ]
    [0.757 0.259 0.091 ... 0.361 0.937 0.104]
    [0.158 0.295 0.585 ... 0.229 0.24  0.494]]
➤ Attributes
    tag: random
    category: DAS

In [3]:
# Save the patches to a temporary directory on disk.
das_folder_path = dc.examples.spool_to_directory(in_memory_spool)
# Take the first HDF5 file found in that directory (a single path).
das_file_paths = next(das_folder_path.glob("*.hdf5"))
das_file_paths

PosixPath('/var/folders/mc/2m8j_fw97wdcy2b9hwh6ns8h0000gp/T/tmp07z_yjdj/DAS_____smallg__random__2020_01_03T00_00_16__2020_01_03T00_00_24.hdf5')

In [4]:
# Create a memory spool from a patch or list of patches
spool = dc.spool([in_memory_spool[0]])

In [5]:
# Create a spool from a single file
spool = dc.spool(das_file_paths)

In [6]:
# From a directory of files
# Update will create an index of the contents for fast querying/access
spool = dc.spool(das_folder_path).update()

In [7]:
# Display the contents of a spool as a dataframe
contents_df = spool.get_contents()
contents_df.head()

| | station | time_min | time_max | network | file_format | instrument_id | data_type | path | dims | file_version | tag | time_step | experiment_id | data_category | _modified |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | wayout | 1989-05-04 | 1989-05-04 00:00:07.996 | | DASDAE | | | DAS_____wayout__random__1989_05_04__1989_05_04... | distance,time | 1 | random | 0 days 00:00:00.004000 | | | False |
| 1 | | 2020-01-03 | 2020-01-03 00:00:07.996 | das2 | DASDAE | | | DAS___das2____random__2020_01_03__2020_01_03T0... | distance,time | 1 | random | 0 days 00:00:00.004000 | | | False |
| 2 | smallg | 2020-01-03 | 2020-01-03 00:00:07.996 | | DASDAE | | | DAS_____smallg__random__2020_01_03__2020_01_03... | distance,time | 1 | random | 0 days 00:00:00.004000 | | | False |
| 3 | | 2020-01-03 | 2020-01-03 00:00:07.996 | | DASDAE | | | DAS_______random__2020_01_03__2020_01_03T00_00... | distance,time | 1 | random | 0 days 00:00:00.004000 | | | False |
| 4 | big_gaps | 2020-01-03 | 2020-01-03 00:00:07.996 | | DASDAE | | | DAS_____big_gaps__random__2020_01_03__2020_01_... | distance,time | 1 | random | 0 days 00:00:00.004000 | | | False |


### **Exercise** (Spool 1)

Using the diverse DAS spool, determine how many unique stations are represented.

In [8]:
diverse_spool = dc.get_example_spool("diverse_das")

### Accessing Patches

Patches are retrieved by iterating over a spool or by indexing it.

In [9]:
first_patch = spool[0]
last_patch = spool[-1]

In [10]:
for patch in spool:
    ...    

In [11]:
# spools can also be sliced (sub-indexed)
sub = spool[1:-1]

### **Exercise** (Spool 2)

Sort the diverse spool by time (using [`Spool.sort`](https://dascore.org/api/dascore/core/spool/DataFrameSpool/sort.html)), then create a sub-spool containing the last 4 patches. Print the attrs of each patch in this sub-spool.

### Selecting

`Spool` contents can be selected (filtered) with `Spool.select`.

In [12]:
# Return a spool restricted to data before 1990-01-01
sub_spool = spool.select(time=(..., '1990-01-01'))
print(sub_spool)

In [13]:
# Return a spool with patches whose station attribute is "wayout"
sub_spool = spool.select(station="wayout")
print(sub_spool)

In [14]:
# Return a spool with patches whose tag matches a Unix-style pattern
sub_spool = spool.select(tag="*dom")
print(sub_spool)
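
To make the Unix-style matching concrete, the sketch below uses Python's standard `fnmatch` module to show which strings a pattern such as `*dom` accepts. It illustrates the pattern syntax only; this notebook does not say whether DASCore uses `fnmatch` internally, and every tag value other than `random` is made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical tag values; only "random" appears in the example spool above.
tags = ["random", "freedom", "domain", "raw"]

# "*" matches any run of characters, so "*dom" accepts tags that end in "dom".
pattern = "*dom"
matching_tags = [tag for tag in tags if fnmatchcase(tag, pattern)]

print(matching_tags)
```

Tags such as `domain` are rejected because the pattern requires the string to end in `dom`, not merely contain it.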

### **Exercise** (Spool 3)

Create a sub-spool of the diverse spool by selecting all patches with a station code that ends with an 's'. 

### Chunking
`Spool.chunk` is used to merge contiguous/overlapping patches or create patches of new sizes.

In [15]:
example_spool = dc.get_example_spool("random_directory_das")
# Chunk the spool into 3-second patches that overlap by 1 second,
# keeping any segment at the end that is shorter than the full 3 seconds.
subspool = example_spool.chunk(time=3, overlap=1, keep_partial=True)

# Merge all contiguous segments along time dimension.
merged_spool = subspool.chunk(time=None)
merged_spool

DASCore DirectorySpool 🧵 (1 Patch)
    Path: /var/folders/mc/2m8j_fw97wdcy2b9hwh6ns8h0000gp/T/tmpvo1hfa5b

In [16]:
contents_df = merged_spool.get_contents()
contents_df.head()

| | time_min | time_max | time_step | station | network | instrument_id | data_type | dims | tag | experiment_id | data_category | _group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-01-03 | 2020-01-03 00:00:23.996 | 0 days 00:00:00.004000 | | | | | distance,time | random | | | 0_0_0 |
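
A rough way to see what `time=3, overlap=1, keep_partial=True` asks for is to lay out the windows by hand. The sketch below is plain Python, not DASCore's implementation: `chunk_windows` is a hypothetical helper, times are idealized seconds measured from the start of the data, and the 24-second span is read off the merged contents shown above. DASCore's exact window edges, which depend on the sample spacing, may differ.

```python
def chunk_windows(start, stop, length, overlap=0.0, keep_partial=False):
    """Return (window_start, window_stop) pairs laid out over [start, stop]."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    stride = length - overlap  # distance between consecutive window starts
    windows = []
    window_start = start
    while window_start < stop:
        window_stop = window_start + length
        if window_stop <= stop:
            windows.append((window_start, window_stop))
        elif keep_partial:
            # The final window is cut short at the end of the data.
            windows.append((window_start, stop))
        window_start += stride
    return windows


# Idealized version of the chunk call above: 24 s of data, 3 s windows, 1 s overlap.
windows = chunk_windows(0.0, 24.0, 3.0, overlap=1.0, keep_partial=True)
for window in windows:
    print(window)
```

Each window starts `length - overlap` seconds after the previous one, which is why a 1-second overlap on 3-second windows gives a 2-second stride.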


When small gaps separate otherwise contiguous patches, the `tolerance` parameter of `Spool.chunk` may need to be increased so that they are still merged.
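
The effect of a gap tolerance can be sketched in plain Python. This is an illustration under stated assumptions, not DASCore's code: `merge_segments` is a hypothetical helper, segments are (first sample, last sample) times in integer milliseconds, and the tolerance is assumed to be a multiple of the sample step, with 1.5 used here as the default.

```python
def merge_segments(segments, step, tolerance=1.5):
    """Merge (first, last) sample-time pairs whose gap is at most tolerance * step."""
    merged = []
    for first, last in sorted(segments):
        if merged and first - merged[-1][1] <= tolerance * step:
            # Close enough to the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(last, merged[-1][1]))
        else:
            merged.append((first, last))
    return merged


step_ms = 4  # 0.004 s sample spacing, as in the example patches
segments = [
    (0, 7996),       # first segment
    (8000, 15996),   # starts one step after the previous one ends: contiguous
    (16004, 23996),  # starts two steps after the previous one ends: one sample missing
]

strict = merge_segments(segments, step_ms)                # assumed default tolerance
loose = merge_segments(segments, step_ms, tolerance=2.5)  # tolerates the missing sample

print(strict)
print(loose)
```

With the stricter setting the missing sample splits the data into two pieces, while the looser setting merges everything into a single segment.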

### **Exercise** (Spool 4)

Do the following: 

1. Chunk the diverse spool into 8-second patches (no partial patches needed) with no overlap.
2. Combine all compatible patches along the time dimension. 

Determine how many patches are in these new spools.

In the next section, we show DASCore in action for processing an urban DAS dataset.
- [GitHub link](https://github.com/DASDAE/ctemps_tutorial/blob/master/03_application.ipynb)
- [Colab link](https://colab.research.google.com/github/DASDAE/ctemps_tutorial/blob/master/03_application.ipynb)