# Add Waveforms
Here we combine the experiment-specific StationXML file created in GLNN_StationXML.ipynb with the AE waveforms recorded by TranAX (stored in tpc5 files) to initialize the experiment's ASDF file.

ElSys provides a module, tpc5.py, __[available on their website](https://www.elsys-instruments.com/en/support/tpc5_fileformat.php)__, for reading tpc5 files.

In [59]:
import tpc5 # get from ElSys, see link above
import h5py
import numpy as np
import pyasdf
from obspy import Stream, Trace
from obspy.core.util import AttribDict
from obspy.core import Stats
import obspy

First, set up the ASDF dataset to fill:

In [72]:
ds = pyasdf.ASDFDataSet('020918_shear/020918ASDF.h5', compression='gzip-3')
ds.add_stationxml('020918_shear/GLNNstations_020918.xml')

In [73]:
ds.waveforms.list()

['L0.AE09',
 'L0.AE11',
 'L0.AE16',
 'L0.AE17',
 'L0.AE18',
 'L0.AE21',
 'L0.AE22',
 'L0.AE23',
 'L0.AE27',
 'L0.AE28',
 'L0.AE32',
 'L0.AE33',
 'L0.AE35',
 'L0.AE38',
 'L0.AE41',
 'L0.AE43']

In [76]:
ds.waveforms.L0_AE09.StationXML.networks[0].stations[0]
# ASDF appears to dump the extra attribute of the StationXML
# Add the (x,y) locations as auxiliary data instead

Station AE09 (BP3)
	Station Code: AE09
	Channel Count: 1/None (Selected/Total)
	None - 
	Access: None 
	Latitude: 37.87, Longitude: -122.26, Elevation: 100.0 m
	Available Channels:
		AE09.00.FHZ

The stations are ordered as A1-D4 in the original StationXML but get re-sorted to increasing station codes when imported to ASDF. The ASDF format also drops the .extra attribute from the StationXML, which contained my local_location information. I need to temporarily access the StationXML outside of ASDF to bring back this lost information.

In [184]:
inv = obspy.read_inventory('020918_shear/GLNNstations_020918.xml', format='stationxml')
# station codes and local (x, y, z) coordinates, in the original A1-D4 order
ordered_stations = [sta.code for sta in inv[0].stations]
ordered_locations = [tuple(float(sta.extra.local_location.value[axis].value)
                           for axis in 'xyz')
                     for sta in inv[0].stations]
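
Because ASDF re-sorts stations by code, a station's position in `ds.waveforms.list()` no longer matches its TranAX channel number. A lookup keyed on station code keeps the two tied together. The sketch below uses a made-up four-station order in place of the real `ordered_stations`; `ordered_stations_demo` and `channel_of` are hypothetical names.

```python
# Hypothetical stand-in for ordered_stations: station codes in TranAX
# channel order (A1, A2, ...). The real list has 16 entries.
ordered_stations_demo = ["AE22", "AE09", "AE41", "AE17"]

# TranAX channels are numbered from 1, following the A1-D4 order.
channel_of = {code: chan for chan, code in enumerate(ordered_stations_demo, 1)}

# ASDF lists stations sorted by code, so recover each channel by name.
for code in sorted(ordered_stations_demo):
    print(code, "-> TranAX channel", channel_of[code])
```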


In [191]:
inv[0].stations[3].extra

AttribDict({'local_location': AttribDict({'namespace': 'GLNN', 'value': AttribDict({'x': AttribDict({'namespace': 'GLNN', 'value': '9.9568', 'attrib': {'unit': 'CENTIMETERS'}}), 'y': AttribDict({'namespace': 'GLNN', 'value': '19.05', 'attrib': {'unit': 'CENTIMETERS'}}), 'z': AttribDict({'namespace': 'GLNN', 'value': '0', 'attrib': {'unit': 'CENTIMETERS'}})})})})
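
The nesting in this output is easier to follow when unpacked step by step. The sketch below does so on a plain-dict copy of the structure; `local_xyz` and `extra_demo` are hypothetical names. AttribDict also supports key access, so the same function should work on the real object, though that is untested here.

```python
def local_xyz(extra):
    """Return ((x, y, z) as floats, unit) from a nested mapping shaped
    like the .extra output shown above."""
    axes = extra["local_location"]["value"]
    coords = tuple(float(axes[name]["value"]) for name in ("x", "y", "z"))
    # all three axes are expected to share one unit
    units = {axes[name]["attrib"]["unit"] for name in ("x", "y", "z")}
    if len(units) != 1:
        raise ValueError(f"mixed units: {sorted(units)}")
    return coords, units.pop()

# Plain-dict copy of the structure printed for inv[0].stations[3].extra
extra_demo = {"local_location": {"namespace": "GLNN", "value": {
    "x": {"namespace": "GLNN", "value": "9.9568", "attrib": {"unit": "CENTIMETERS"}},
    "y": {"namespace": "GLNN", "value": "19.05", "attrib": {"unit": "CENTIMETERS"}},
    "z": {"namespace": "GLNN", "value": "0", "attrib": {"unit": "CENTIMETERS"}},
}}}

coords, unit = local_xyz(extra_demo)
print(coords, unit)
```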

These bits of extra info are great candidates for ASDF auxiliary data.

In [185]:
ordered_locations

[(9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0),
 (9.9568, 19.05, 0.0)]

In [181]:
ds.add_auxiliary_data(data=np.array(ordered_locations),
                      data_type='LabStationInfo',
                      path='local_locations',
                      parameters={})

TypeError: No conversion path for dtype: dtype('<U4')
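
The dtype named in this error, `<U4`, is NumPy's four-character unicode string type, which is what an array of station codes such as 'AE09' would have. h5py has no direct mapping for NumPy unicode arrays, so one plausible reading is that the failing call was given station codes rather than float coordinates. A minimal sketch of one way around it, assuming it is acceptable to keep only numbers in the data array and to carry the text in `parameters`: the helper name and the parameter keys below are hypothetical, and the inputs are short stand-ins for `ordered_stations` and `ordered_locations`.

```python
# Hypothetical stand-ins for ordered_stations and ordered_locations.
station_codes = ["AE09", "AE11", "AE16"]
locations_cm = [(9.9568, 19.05, 0.0), (9.9568, 19.05, 0.0), (9.9568, 19.05, 0.0)]

def split_for_auxiliary_data(codes, locations):
    """Return (rows, parameters): purely numeric rows for the data array
    and a dict of plain-text attributes that describe them."""
    if len(codes) != len(locations):
        raise ValueError("need exactly one location per station code")
    rows = [[float(v) for v in loc] for loc in locations]
    parameters = {
        "station_order": ",".join(codes),  # row i belongs to the i-th code
        "columns": "x,y,z",
        "unit": "CENTIMETERS",
    }
    return rows, parameters

rows, parameters = split_for_auxiliary_data(station_codes, locations_cm)
print(parameters["station_order"])
```

`np.array(rows)` is then a float64 array, which could be passed as `data=` with `parameters=parameters` in the `ds.add_auxiliary_data` call above.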

Now on to importing the waveforms. An experiment might be made up of multiple files. Display the tpc5 files present in the experiment folder:

In [3]:
!ls 020918_shear/*.tpc5

020918_shear/bd1.tpc5  020918_shear/run.tpc5
020918_shear/bd2.tpc5  020918_shear/timing.tpc5


This experiment had two ball drops (bd1, bd2), a timing alignment signal (timing), and the full shear run (run). Each file is read the same way; the cell below opens the full shear run.

In [157]:
f = h5py.File("020918_shear/run.tpc5", "r")

In [154]:
# we need to know how many blocks to read
# all channels have the same number of blocks, use channel 1
cg = f[tpc5.getChannelGroupName(1)]
nblocks = len(cg['blocks'].keys())

In [8]:
# read the raw data from each channel into a stream, one trace at a time
# first build some basic Stats
statn_stats = Stats()
statn_stats.network = 'L0'
statn_stats.channel = 'FHZ'
statn_stats.location = '00'
statn_stats.sampling_rate = 20e6

# make sure times will retain full precision
# the ElSys max precision seems to be nanoseconds
obspy.UTCDateTime.DEFAULT_PRECISION = 9

# iterate through stations, following the ordered_stations A1-D4 sort
# the number of the (ordered) station is the TranAX channel A1-D4->1-16, here Tchan
for Tchan,statname in enumerate(ordered_stations,1):
    # add the station to the stats
    statn_stats.station = statname
    
    # create a stream for the station
    statn_stream = Stream()
    
    # iterate through continuous data segments
    # TranAX calls these Blocks, obspy calls them Traces
    for blk in range(nblocks):
        # get the trace start time
        statn_stats.starttime = (tpc5.getStartTime(f,1)  # gives the start of the whole recording
                                 + tpc5.getTrigger(f,1,block=blk)  # seconds from start to trigger
                                 - tpc5.getTriggerSample(f,1,block=blk)/statn_stats.sampling_rate)  # seconds from trigger to block start

        # add the trace of raw voltage data to the stream
        data = tpc5.getVoltageData(f, Tchan, block=blk)
        statn_stats.npts = len(data)  # keep the header's sample count in sync with the data
        statn_stream += Trace(data, statn_stats)

    # add the complete stream to the ASDF object
    # pyasdf requires a tag; 'raw_recording' is an assumed choice here
    ds.add_waveforms(statn_stream, tag='raw_recording')
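
The start-time arithmetic in the loop above can be checked by hand. The sketch below uses made-up values and works in integer nanoseconds (a choice made here for exactness, not something tpc5 or obspy requires); `block_start_ns` is a hypothetical helper.

```python
SAMPLING_RATE_HZ = 20_000_000  # 20 MHz, as in statn_stats
SAMPLE_INTERVAL_NS = 1_000_000_000 // SAMPLING_RATE_HZ  # 50 ns per sample

def block_start_ns(recording_start_ns, trigger_offset_ns, trigger_sample):
    """Time of a block's first sample, in nanoseconds.

    recording_start_ns: start of the whole recording
    trigger_offset_ns:  time from recording start to this block's trigger
    trigger_sample:     index of the trigger sample within the block
    """
    pre_trigger_ns = trigger_sample * SAMPLE_INTERVAL_NS
    return recording_start_ns + trigger_offset_ns - pre_trigger_ns

# Made-up block: trigger 2.5 s into the recording, 4096 pre-trigger samples.
start = block_start_ns(0, 2_500_000_000, 4096)
print(start / 1e9, "s after recording start")
```

With these numbers the block starts 204.8 microseconds before its trigger (4096 samples at 50 ns each).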

## Extra notes
Accessing tpc5 and hdf5 files can be a bit confusing. Here are some reminders:

In [None]:
# to view the keys of an hdf5 file:
list(f.keys())

# to then access one of the keys:
f['key']

# to control the precision of a UTCDateTime
t = obspy.UTCDateTime(precision=9)
# doing arithmetic creates a new UTCDateTime with the default precision!
# change the default precision for the session
obspy.UTCDateTime.DEFAULT_PRECISION = 9
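
Why nanoseconds matter at this sampling rate: at 20 MHz, consecutive samples are 50 ns apart, while a POSIX timestamp held as a 64-bit float resolves only a few hundred nanoseconds for present-day dates. The check below uses only the standard library and an arbitrary example date. It illustrates the float limitation in general and makes no claim about how obspy stores times internally.

```python
import math
from datetime import datetime, timezone

# Arbitrary example date, expressed as float seconds since the epoch.
t = datetime(2018, 2, 9, tzinfo=timezone.utc).timestamp()

sample_interval = 1 / 20e6  # 50 ns between samples at 20 MHz
resolution = math.ulp(t)    # gap between t and the next larger float

print(f"float resolution near t: {resolution * 1e9:.0f} ns")
print("adding one sample interval changes t:", t + sample_interval != t)
```

The float resolution near this date is about 238 ns, so adding a single 50 ns sample interval leaves the float unchanged.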