Seismological Use Case

Asela Rajapakse edited this page Apr 23, 2018 · 5 revisions

Computing Features of Seismic Waveform Data

Introduction

Seismology is witnessing a dramatic increase in the data volumes produced by dense, globally deployed sensor networks. To help users cope with this wealth of data, seismological data centres offer advanced services for data discovery, access and analysis. One such service is WFCatalog (see reference below), which provides programmatic access to metadata and features derived from seismological primary data, i.e. seismic waveforms. These features support better discovery of datasets, automatic processing of seismological data and the generation of products.

Scientific Context

A typical modern seismic station provides continuous, 3-component recordings of ground motion, usually sampled at between 1 and 100 samples per second. These recordings are acquired and archived in data centres, then processed and analysed by seismologists to extract seismological information (e.g. earthquake location and sub-surface structure). Data discovery and selection are important steps prior to any analysis. These steps are usually based on well-known parameters such as sensor descriptions (e.g. network, station and channel) and temporal descriptions (e.g. start time and end time). However, a more detailed view of the data streams is desirable to facilitate and support processing. We therefore qualify seismic waveforms according to well-defined and agreed data quality metrics, which can be derived from the waveforms themselves. Examples of such metrics are: number of data samples, gaps, overlaps, maximum, minimum and mean value of the samples, sample rate, encoding, etc. WFCatalog offers such metrics in a standardised JSON format that also includes Persistent Identifiers, attribution of the data providers and versioning.
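To make the metrics listed above concrete, the following is a minimal sketch of how a few of them could be computed for one channel. It is not the WFCatalog implementation. The `Segment` class, the `compute_metrics` function, the `tolerance` parameter and the output key names are all assumptions introduced for illustration, and the sketch works on plain Python lists rather than on MiniSEED records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import fmean


@dataclass
class Segment:
    """One contiguous run of samples (hypothetical structure, not a MiniSEED record)."""
    start: datetime
    sample_rate: float  # samples per second
    samples: list

    @property
    def end(self):
        # Time at which the next sample would be expected after this segment.
        return self.start + timedelta(seconds=len(self.samples) / self.sample_rate)


def compute_metrics(segments, tolerance=0.5):
    """Return simple quality metrics for the segments of a single channel.

    `tolerance` is the fraction of one sample interval by which consecutive
    segments may be misaligned before a gap or overlap is counted
    (an assumption made for this sketch). At least one sample is required.
    """
    segments = sorted(segments, key=lambda s: s.start)
    values = [v for seg in segments for v in seg.samples]
    gaps = overlaps = 0
    for prev, cur in zip(segments, segments[1:]):
        delta = (cur.start - prev.end).total_seconds()
        limit = tolerance / prev.sample_rate
        if delta > limit:
            gaps += 1        # next segment starts later than expected
        elif delta < -limit:
            overlaps += 1    # next segment starts before the previous one ends
    return {
        "num_samples": len(values),
        "num_gaps": gaps,
        "num_overlaps": overlaps,
        "sample_min": min(values),
        "sample_max": max(values),
        "sample_mean": fmean(values),
        "sample_rates": sorted({seg.sample_rate for seg in segments}),
    }
```

In this sketch a gap or overlap is detected by comparing the start of each segment with the expected end of the previous one. A production tool would also have to handle details such as mixed sample rates, which are ignored here.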

Rationale

The main idea behind this use case is to provide users with seismic waveform features extracted from their local archives and/or from user-defined data collections stored in the EUDAT CDI. GEF can be harnessed to bring the computation close to the data. In this case, the software module that computes the features is the same one that underpins WFCatalog, which ensures consistent and standardised results. Users can select the input source containing the raw data to analyse and trigger the computation of the related features close to the data. The output (in JSON format) can be used, for instance, to build local metadata catalogues or to feed additional processing modules.
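As an illustration of how the JSON output might feed a local metadata catalogue, here is a minimal sketch. The document layout, the field names and the functions `to_document` and `add_to_catalogue` are hypothetical and do not reflect the actual WFCatalog schema. The stream codes used in the usage lines are placeholders.

```python
import json


def to_document(network, station, channel, start, end, metrics):
    """Wrap metrics in a JSON-serialisable document (field names are hypothetical)."""
    return {
        "stream": {"network": network, "station": station, "channel": channel},
        "start_time": start,
        "end_time": end,
        "metrics": metrics,
    }


def add_to_catalogue(catalogue, document_json):
    """Index a JSON document in an in-memory catalogue, keyed by stream and start time."""
    doc = json.loads(document_json)
    stream = doc["stream"]
    key = "{}.{}.{}|{}".format(
        stream["network"], stream["station"], stream["channel"], doc["start_time"]
    )
    catalogue[key] = doc
    return key


# Usage with placeholder values: serialise one document and index it locally.
document = to_document(
    "XX", "TEST", "BHZ",
    "2018-04-23T00:00:00Z", "2018-04-24T00:00:00Z",
    {"num_samples": 14, "num_gaps": 1, "num_overlaps": 1},
)
catalogue = {}
key = add_to_catalogue(catalogue, json.dumps(document))
```

A plain dictionary stands in for the catalogue here. In practice the same documents could be loaded into a database or passed to a downstream processing module.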

Input Data

The application takes seismological waveform data in MiniSEED format as input. The input must be accessible via a URL. Examples of such URLs are:

Dockerfile and Execution Script

The associated Dockerfile can be found at https://github.com/EUDAT-GEF/GEF/blob/master/services/waveform-metadata-demo/Dockerfile. The execution script that runs inside the container is stored at https://github.com/EUDAT-GEF/GEF/blob/master/services/waveform-metadata-demo/collector.py.

References

Luca Trani, Mathijs Koymans, Malcolm Atkinson, Reinoud Sleeman, Rosa Filgueira: A catalogue for seismological waveform data, Computers & Geosciences, Volume 106, 2017, Pages 101-108, ISSN 0098-3004, http://dx.doi.org/10.1016/j.cageo.2017.06.008. (http://www.sciencedirect.com/science/article/pii/S0098300416308263)