# ``das`` Package Tutorial

The ``das`` package is developed by the CoRDIAL project (DOE award number DE-SC0019654) for access and analysis of distributed acoustic sensing (DAS) data stored in the DAS-HDF5 format. The package consists of two modules:

* ``DasIo`` for data access
* ``DasStream`` for DAS data analysis

This notebook explains the current features and capabilities of the ``das`` package.

In [1]:
from cordial.io import DasIo
import obspy

The ``DasIo`` module provides access to DAS-HDF5 files, so it is the only one that must be imported. It supports both file system and cloud object store access via the [h5py_switch](https://github.com/ajelenak/h5py_switch) package.

Opening a DAS-HDF5 file served by the Kita server in read-only mode:

f = DasIo('http://hsdshdflab.hdfgroup.org/home/ajelenak/porotomo-das.h5', mode='r')

In [2]:
f = DasIo('https://hsdshdflab.hdfgroup.org/CoRDIAL/PoroTomo-Mar21-quake.h5', 'r', 
          bucket='cordial-hsds')

The file is now open. Display some basic information about it:

In [3]:
f

<DasIo("/CoRDIAL/PoroTomo-Mar21-quake.h5", "r") at 0x7f7cde499b38>

There are a number of properties that describe the file's data.

DAS-HDF5 file name:

In [4]:
f.filename

'/CoRDIAL/PoroTomo-Mar21-quake.h5'

The number of traces (DAS channels):

In [5]:
f.num_traces

8721

The number of time observations:

In [6]:
f.num_samples

14430000

Time of the first observation in the file (returns an [obspy.UTCDateTime](http://docs.obspy.org/packages/autogen/obspy.core.utcdatetime.UTCDateTime.html) object):

In [7]:
f.starttime

2016-03-21T05:00:21.404309Z

Time of the last observation in the file (returns an [obspy.UTCDateTime](http://docs.obspy.org/packages/autogen/obspy.core.utcdatetime.UTCDateTime.html) object):

In [8]:
f.endtime

2016-03-21T09:00:51.403309Z

Trace (DAS channel) identifiers as a numpy.ndarray:

In [9]:
f.trace_ids

array([ -20,  -19,  -18, ..., 8698, 8699, 8700], dtype=int32)

DAS instrument identifier:

In [10]:
f.instrument

'iDAS S/N: iDAS16043'

DAS data sampling, available either as an interval in seconds or as a rate in hertz:

In [11]:
f.sampling_rate_secs

0.001

In [12]:
f.sampling_rate_Hz

1000.0
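The properties shown so far are consistent with one another, which makes a quick sanity check possible. The sketch below uses only the values printed above and the standard library (not the ``das`` package); the variable names are illustrative. It assumes the samples are evenly spaced and that both the start and end times are sample times.

```python
from datetime import datetime

# Values copied from the outputs of f.starttime, f.endtime and f.sampling_rate_Hz.
start = datetime.fromisoformat("2016-03-21T05:00:21.404309")
end = datetime.fromisoformat("2016-03-21T09:00:51.403309")
sampling_rate_hz = 1000.0

# Sampling interval in seconds is the reciprocal of the rate in hertz.
sampling_interval_s = 1.0 / sampling_rate_hz

# Both endpoints are samples, so the count is duration * rate + 1.
duration_s = (end - start).total_seconds()
expected_samples = round(duration_s * sampling_rate_hz) + 1

print(sampling_interval_s, duration_s, expected_samples)
```

The computed count agrees with the value reported by `f.num_samples`.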

The DasIo class supports selecting DAS data via trace (channel) identifiers and/or time of observations. The output is an object of the DasStream class.

In [13]:
%time x = f.select(trace=slice(1967, 3425), time=slice('2016-03-21T07:00:00', '2016-03-21T07:18:00'))



CPU times: user 21.7 s, sys: 33.8 s, total: 55.4 s
Wall time: 14min 5s


The DasStream class subclasses the [obspy.Stream](http://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.html) class and is intended for DAS data analysis. New data analysis methods should be added to this class or to a new class that subclasses DasStream.

Basic information about the DasStream object:

In [14]:
x

DasStream from "/CoRDIAL/PoroTomo-Mar21-quake.h5" at 0x7f7b8ce816d8
1459 Trace(s) in Stream:

...1967 | 2016-03-21T07:00:00.000309Z - 2016-03-21T07:17:59.999309Z | 1000.0 Hz, 1080000 samples
...
(1457 other traces)
...
...3425 | 2016-03-21T07:00:00.000309Z - 2016-03-21T07:17:59.999309Z | 1000.0 Hz, 1080000 samples


Trace (DAS channel) identifiers in the DasStream object are available as a numpy.ndarray object:

In [15]:
x.trace_ids

array([1967, 1968, 1969, ..., 3423, 3424, 3425], dtype=int32)
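The trace count (1459) and the first and last identifiers above suggest that the trace slice is matched by identifier value and includes both endpoints, unlike ordinary positional Python slicing. This is inferred from the printed output, not from the package documentation. The following sketch reproduces the arithmetic with a plain list; `select_by_label` is a hypothetical helper, not part of the ``das`` package.

```python
# Trace identifiers as reported by f.trace_ids: -20 through 8700.
trace_ids = list(range(-20, 8701))


def select_by_label(ids, first, last):
    """Return the identifiers between first and last, both included."""
    return [t for t in ids if first <= t <= last]


# Label-based, inclusive selection (what the output above appears to show).
selected = select_by_label(trace_ids, 1967, 3425)

# Ordinary positional slicing, for contrast: the stop index is excluded
# and the indices count from the first element, whose identifier is -20.
positional = trace_ids[1967:3425]

print(len(selected), selected[0], selected[-1])
print(len(positional), positional[0], positional[-1])
```

The contrast matters because the identifiers in this file start at -20, so an identifier and its array position differ by 20.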

The source of the DasStream object's data:

In [16]:
x.source

'/CoRDIAL/PoroTomo-Mar21-quake.h5'

Access to the DAS trace data is available via the obspy.Stream/obspy.Trace methods. Below is the actual DAS strain rate of the 16th trace in the DasStream object:

In [17]:
x.traces[15].data

array([ 0.00121416, -0.00522994, -0.00494717, ...,  0.00322007,
        0.00101393, -0.00204339], dtype=float32)
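Since the trace data are 32-bit floats, the size of a selection can be estimated before requesting it. The sketch below is a rough estimate only: it uses the trace and sample counts printed for the selection above, assumes 4 bytes per value with no compression or metadata overhead, and uses the wall time reported by `%time`. All variable names are illustrative.

```python
# Counts taken from the DasStream summary printed earlier.
num_traces = 1459
samples_per_trace = 1_080_000
bytes_per_value = 4  # float32, per the dtype shown above

n_values = num_traces * samples_per_trace
n_bytes = n_values * bytes_per_value

# Wall time reported for the select() call: 14 min 5 s.
wall_time_s = 14 * 60 + 5
throughput_mb_per_s = n_bytes / 1e6 / wall_time_s

print(f"{n_bytes / 1e9:.2f} GB in memory, about {throughput_mb_per_s:.1f} MB/s")

# A single trace over the whole file (14,430,000 samples) is far smaller.
single_trace_bytes = 14_430_000 * bytes_per_value
print(f"{single_trace_bytes / 1e6:.1f} MB for one full-length trace")
```

An estimate like this helps decide whether a selection is small enough to fetch interactively.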

It is also possible to select a single trace (channel) only, in which case the output includes all of its available time observations:

---

**WARNING: Do not run the cell below for a very large number of time samples.**

---

Similarly, passing a single time value selects data from all traces (DAS channels). Time values can also be obspy.UTCDateTime objects. The output includes the time samples immediately before and after the given instant.

In [18]:
%time x = f.select(time=obspy.UTCDateTime('2016-03-21T06:39:50'))
x

CPU times: user 1.42 s, sys: 0 ns, total: 1.42 s
Wall time: 5.83 s


DasStream from "/CoRDIAL/PoroTomo-Mar21-quake.h5" at 0x7f7cdd3da470
8721 Trace(s) in Stream:

...-20 | 2016-03-21T06:39:49.999309Z - 2016-03-21T06:39:50.000309Z | 1000.0 Hz, 2 samples
...
(8719 other traces)
...
...8700 | 2016-03-21T06:39:49.999309Z - 2016-03-21T06:39:50.000309Z | 1000.0 Hz, 2 samples
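The two sample times in the output can be reproduced by locating the requested instant on the file's sampling grid. The sketch below shows only that arithmetic, using the standard library and the start time and 1 ms interval reported earlier; it is not the package's implementation, and `bracketing_samples` is a hypothetical helper.

```python
from datetime import datetime, timedelta

file_start = datetime.fromisoformat("2016-03-21T05:00:21.404309")
interval = timedelta(milliseconds=1)  # 1000 Hz sampling


def bracketing_samples(start, step, instant):
    """Return (index, time) pairs for the samples just before and after instant."""
    # Integer division of timedeltas gives the index of the sample at or
    # before the instant, with no floating-point rounding.
    before = (instant - start) // step
    on_grid = start + before * step == instant
    after = before if on_grid else before + 1
    return (before, start + before * step), (after, start + after * step)


requested = datetime.fromisoformat("2016-03-21T06:39:50")
(i0, t0), (i1, t1) = bracketing_samples(file_start, interval, requested)

print(i0, t0.isoformat())
print(i1, t1.isoformat())
```

Because the file's first sample falls at a fractional millisecond (.404309), a whole-second request never lands exactly on a sample, which is why two samples are returned.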


It is possible to use the traditional slicing syntax for subsetting, but care must be taken to apply the correct slice to each dimension: the first dimension is time, the second is trace (channel):

In [19]:
%time x = f['2016-03-21T07:25:50':'2016-03-21T07:35:50', 3456:3500:10]
x

CPU times: user 1.66 s, sys: 650 ms, total: 2.31 s
Wall time: 7.32 s


DasStream from "/CoRDIAL/PoroTomo-Mar21-quake.h5" at 0x7f7cdd3da898
5 Trace(s) in Stream:
...3456 | 2016-03-21T07:25:50.000309Z - 2016-03-21T07:35:49.999309Z | 1000.0 Hz, 600000 samples
...3466 | 2016-03-21T07:25:50.000309Z - 2016-03-21T07:35:49.999309Z | 1000.0 Hz, 600000 samples
...3476 | 2016-03-21T07:25:50.000309Z - 2016-03-21T07:35:49.999309Z | 1000.0 Hz, 600000 samples
...3486 | 2016-03-21T07:25:50.000309Z - 2016-03-21T07:35:49.999309Z | 1000.0 Hz, 600000 samples