# Open Science Data for Seismology

This notebook introduces key open science data types in seismology and provides hands-on exercises for accessing and exploring these datasets.

---

## Earthquake Catalogs, Observed Seismograms, Fault Geometry Models, Seismic Velocity Models

Seismology relies on diverse datasets to study the Earth's structure and seismic events:

- **Earthquake Catalogs**: Databases of earthquake locations, magnitudes, and times (e.g., USGS, IRIS).
- **Observed Seismograms**: Time-series data recorded by seismic stations.
- **Fault Geometry Models**: Representations of fault surfaces in 3D.
- **Seismic Velocity Models**: Maps of seismic wave speeds in the Earth's subsurface.

### Example Image: 
![Seismic Velocity Model](https://example.com/velocity_model.png)  
(*Replace with an actual image path or URL*)

---

## Find, Select, Download, View

Accessing seismological data involves these steps:
1. **Find**: Locate datasets using platforms like IRIS DMC or USGS Earthquake Catalog.
2. **Select**: Filter datasets based on criteria like location, time range, or magnitude.
3. **Download**: Retrieve data in formats such as CSV, SAC, or NetCDF.
4. **View**: Visualize data with tools like Python, ObsPy, or GIS software.

---

## Exercise One: Earthquake Catalog Data Access

In this exercise, you will find an earthquake data service, select earthquakes of interest, download the catalog to your computer, and view the retrieved dataset.

1. **Find**: Locate datasets using platforms like IRIS DMC or USGS Earthquake Catalog.
Researchers can find earthquake catalogs using the open-data services of the U.S. Geological Survey (USGS). The USGS integrates information from multiple seismic networks and provides open access to the resulting earthquake catalog through a web interface and a programmatic API.

Other seismological data services also provide earthquake catalogs, which differ in characteristics such as coverage region, coverage time period, magnitude completeness, and location methods.


This class introduces students to the basic steps of working with open science data: Find, Select, Download, and View. These steps align with key principles of the FAIR data guidelines and the broader open science frameworks developed by researchers such as Wilkinson et al. and organizations like the Research Data Alliance (RDA), Earth Science Information Partners (ESIP), and DataONE.

The exercises show how to access research data used in seismological simulations. The workshop shows how to find, retrieve, and use open seismological data, including earthquake catalogs, observed seismograms, fault geometry models, and seismic velocity models.

In [3]:
import webbrowser
url = "https://earthquake.usgs.gov/earthquakes/search/"

# Open the URL in the default web browser
webbrowser.open(url)

True


2. **Select**: Filter datasets based on criteria like location, time range, or magnitude.
Researchers select the earthquakes they want in their catalog by defining parameters including region, start date, end date, and minimum magnitude.

In the browser, the user selects:

- Magnitude: 4.5+
- Date & Time: Past 30 Days
- Geographical Region: Conterminous US

Search page: https://earthquake.usgs.gov/earthquakes/search/

Search results: 167 earthquakes


In [2]:
url = "https://earthquake.usgs.gov/earthquakes/map/?extent=7.53676%2C-145.98633&extent=60.15244%2C-43.94531&range=search&timeZone=utc&search=%7B%22name%22%3A%22Search+Results%22%2C%22params%22%3A%7B%22starttime%22%3A%222024-12-07+00%3A00%3A00%22%2C%22endtime%22%3A%222024-12-14+23%3A59%3A59%22%2C%22maxlatitude%22%3A50%2C%22minlatitude%22%3A24.6%2C%22maxlongitude%22%3A-65%2C%22minlongitude%22%3A-125%2C%22minmagnitude%22%3A2.5%2C%22orderby%22%3A%22time%22%7D%7D"
# Open the URL in the default web browser
webbrowser.open(url)

True
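The long map URL in the cell above carries the search itself as URL-encoded JSON in its `search` query parameter. As a minimal sketch using only the Python standard library, the query string can be decoded to check which filters a shared link actually applies. The helper name `decode_search_params` is hypothetical and not part of any USGS tool.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Same map URL as in the cell above, split across lines for readability
map_url = (
    "https://earthquake.usgs.gov/earthquakes/map/"
    "?extent=7.53676%2C-145.98633&extent=60.15244%2C-43.94531"
    "&range=search&timeZone=utc"
    "&search=%7B%22name%22%3A%22Search+Results%22%2C%22params%22%3A%7B"
    "%22starttime%22%3A%222024-12-07+00%3A00%3A00%22%2C"
    "%22endtime%22%3A%222024-12-14+23%3A59%3A59%22%2C"
    "%22maxlatitude%22%3A50%2C%22minlatitude%22%3A24.6%2C"
    "%22maxlongitude%22%3A-65%2C%22minlongitude%22%3A-125%2C"
    "%22minmagnitude%22%3A2.5%2C%22orderby%22%3A%22time%22%7D%7D"
)


def decode_search_params(url):
    """Return the search filters embedded in a USGS map URL (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)   # decodes %XX escapes and '+'
    search = json.loads(query["search"][0])  # the 'search' value is a JSON document
    return search["params"]


params = decode_search_params(map_url)
for name, value in sorted(params.items()):
    print(f"{name}: {value}")
```

For this link, the decoded filters are a minimum magnitude of 2.5 and a time window from 2024-12-07 through 2024-12-14, inside a latitude/longitude box covering the conterminous US.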



3. **Download**: Retrieve data in formats such as CSV, SAC, or NetCDF.
Researchers can download the data for use on their computer. Computational models of earthquake processes often require open science data sources. When downloading data, an important consideration is the data format. Make sure you know how you will read a data format before you download data in it.

For the earthquake catalog, return to the earthquake catalog search page, specify the same search parameters, and set the following under Format Options on the USGS page:

- Format: KML
- KML Specific Options: Color by depth
- Order By: Time - Newest First
- Limit Results: (leave blank)

If a Download button appears, select Download.


In [None]:
url = "https://earthquake.usgs.gov/earthquakes/search/"

# Open the URL in the default web browser
webbrowser.open(url)
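The same selection can also be written as a request to the USGS FDSN event web service, which makes the download repeatable. The following is a sketch under stated assumptions: the end date is fixed arbitrarily for illustration, the bounding box is the one embedded in the map URL earlier in this exercise, the helper `build_catalog_url` is hypothetical, and the `kmlcolorby` parameter name should be checked against the current USGS API documentation before relying on it.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

USGS_EVENT_QUERY = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_catalog_url(end_time, days=30, min_magnitude=4.5, fmt="csv", **extra):
    """Build a catalog query URL (hypothetical helper, not a USGS function)."""
    start_time = end_time - timedelta(days=days)
    params = {
        "format": fmt,
        "starttime": start_time.strftime("%Y-%m-%dT%H:%M:%S"),
        "endtime": end_time.strftime("%Y-%m-%dT%H:%M:%S"),
        "minmagnitude": min_magnitude,
        # Conterminous US box, taken from the map URL earlier in the notebook
        "minlatitude": 24.6,
        "maxlatitude": 50,
        "minlongitude": -125,
        "maxlongitude": -65,
        "orderby": "time",  # newest first
    }
    params.update(extra)  # format-specific options, e.g. kmlcolorby
    return f"{USGS_EVENT_QUERY}?{urlencode(params)}"


# Assumed end date, chosen only to make the example reproducible
end = datetime(2024, 12, 14, 23, 59, 59)
kml_url = build_catalog_url(end, fmt="kml", kmlcolorby="depth")
csv_url = build_catalog_url(end, fmt="csv")
print(kml_url)
print(csv_url)

# To download (requires network access), uncomment:
# from urllib.request import urlretrieve
# urlretrieve(kml_url, "query.kml")
```

Requesting `csv` instead of `kml` returns the same events in a form that is easier to process in Python or a spreadsheet.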


4. **View**: Visualize data with tools like Python, ObsPy, or GIS software.
Researchers view the data and metadata using platform-specific and data-specific tools. For earthquake catalogs, KML files can be read using the Chrome browser and Google Earth, which is good for visualizing the catalog. For processing the data, formats like Comma-Separated Values (CSV), which can be read with text editors or spreadsheet software, are often used.
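A CSV catalog can also be inspected directly in Python. The sketch below uses only the standard library and assumes the column names of the USGS CSV export (`time`, `latitude`, `longitude`, `depth`, `mag`, `place`). The three rows are synthetic placeholders, not real earthquakes, and `summarize_catalog` is a hypothetical helper.

```python
import csv
import io

# SYNTHETIC rows for illustration only; these are not real earthquakes.
sample_csv = """time,latitude,longitude,depth,mag,place
2024-01-01T00:00:00.000Z,35.0,-118.0,10.0,4.6,Synthetic event A
2024-01-02T00:00:00.000Z,40.0,-124.5,5.0,5.2,Synthetic event B
2024-01-03T00:00:00.000Z,37.5,-121.9,8.0,4.5,Synthetic event C
"""


def summarize_catalog(csv_file):
    """Return event count and magnitude range for an open CSV catalog file."""
    rows = list(csv.DictReader(csv_file))
    magnitudes = [float(row["mag"]) for row in rows]
    largest = max(rows, key=lambda row: float(row["mag"]))
    return {
        "count": len(rows),
        "min_mag": min(magnitudes),
        "max_mag": max(magnitudes),
        "largest_place": largest["place"],
    }


summary = summarize_catalog(io.StringIO(sample_csv))
print(summary)
```

With a downloaded file, the same function can be called as `summarize_catalog(open("query.csv", newline=""))`, where `query.csv` is an assumed file name.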


In [None]:


url = "https://earthquake.usgs.gov/earthquakes/search/"

# Open the URL in the default web browser
webbrowser.open(url)

## Exercise Two: Observed Seismograms Data Access

In this exercise, you will find a seismogram data service, select seismograms of interest, download them to your computer, and view the retrieved dataset.

1. **Find**: Locate datasets using platforms like IRIS DMC or USGS Earthquake Catalog.
Researchers can retrieve observed seismograms for significant earthquakes at many stations using the USGS open data services. A wide variety of data access options is available, and researchers select the most appropriate ones based on their intended use of the seismograms.

Other seismological data services also provide observed seismograms, including groups that focus on strong ground motion data used for civil engineering and earthquake engineering research.

In [4]:
url = "https://www.iris.edu/hq/programs/epo/resources_for_viewing_seismograms"

# Open the URL in the default web browser
webbrowser.open(url)

True

2. **Select**: Filter datasets based on criteria like location, time range, or magnitude.
Researchers select the seismograms they want by defining parameters including earthquake, site, and duration. Here we use the EarthScope SAGE data service and the IRIS URL Builder to specify the data for a Northern California mainshock.

As an example, retrieve the ground motions at USC from the magnitude 7.0 earthquake in Northern California. The selection uses both event information and station information. For the event, we use the recent M7.0 Northern California earthquake.

Visit the URL Builder to determine which selection criteria are needed:


In [None]:
url = "https://service.iris.edu/fdsnws/dataselect/docs/1/builder/"
# Open the URL in the default web browser
webbrowser.open(url)

Visit the IRIS Event Page to determine the event origin time:

In [None]:
url = "https://ds.iris.edu/ds/nodes/dmc/tools/event/11909383/"
# Open the URL in the default web browser
webbrowser.open(url)

Event Time: 2024-12-05 18:44:21


3. **Download**: Retrieve data in formats such as CSV, SAC, or NetCDF.
Researchers can download the data for use on their computer. Computational models of earthquake processes often require open science data sources. When downloading data, an important consideration is the data format. Make sure you know how you will read a data format before you download data in it.

For earthquake seismograms, researchers can specify ground motions by station, start time, and duration. Collections of seismograms from significant events are also available. Retrieving our seismogram of interest, the USC recording of the M7.0 Mendocino earthquake, requires these inputs:

Event page information: Event Time: 2024-12-05 18:44:21

We plan to request 5 minutes of data sampled at 20 samples per second.

The USC station and broadband 10 Hz channel are defined as:

- Network: CI
- Station: USC
- Channel: BHZ (vertical broadband, 20 sps)
- Location: --
- Start Time: 2024-12-05 18:44:21
- End Time: 2024-12-05 18:49:21
- Quality: -
- Format: GeoCSV


In [None]:
url = "https://service.iris.edu/fdsnws/event/1/query?starttime=2024-11-14&endtime=2024-12-14T23%3A59%3A59.999999&minmagnitude=3&maxmagnitude=10&mindepth=0&maxdepth=6371&limit=10000&output=text"

# Open the URL in the default web browser
webbrowser.open(url)
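The cell above queries the event service, whereas the station and time window listed for this step describe a waveform request. As a sketch, those inputs can be assembled into an FDSN dataselect URL like the one the URL Builder produces. The parameter names (`net`, `sta`, `loc`, `cha`, `starttime`, `endtime`) follow the FDSN dataselect convention. The `format=geocsv` value is assumed from the Format option listed above, and whether this particular station is served from this endpoint is not verified here.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

DATASELECT_QUERY = "https://service.iris.edu/fdsnws/dataselect/1/query"

# Inputs taken from the selection listed in this step
event_time = datetime(2024, 12, 5, 18, 44, 21)
duration = timedelta(minutes=5)
sample_rate = 20  # samples per second for the BHZ channel

start = event_time
end = event_time + duration

params = {
    "net": "CI",
    "sta": "USC",
    "loc": "--",
    "cha": "BHZ",
    "starttime": start.isoformat(),
    "endtime": end.isoformat(),
    "format": "geocsv",  # assumed value; check the URL Builder output
}
seismogram_url = f"{DATASELECT_QUERY}?{urlencode(params)}"

# A 5-minute window at 20 sps should hold about this many samples
expected_samples = int(duration.total_seconds()) * sample_rate

print(seismogram_url)
print("Expected samples:", expected_samples)
```

Comparing the expected sample count with the number of rows in the downloaded file is a quick check that the full time window was returned.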


4. **View**: Visualize data with tools like Python, ObsPy, or GIS software.
Researchers view the data and metadata using platform-specific and data-specific tools. For seismograms, formats like GeoCSV can be read and plotted using programs like Excel. Seismologists often use seismogram-specific visualization tools such as SAC and ObsPy to view seismogram data.
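Because GeoCSV is plain text, it can also be read with a few lines of standard-library Python. The sketch below assumes a two-column time/sample layout in which metadata lines start with `#` and a header row precedes the data. The sample text is synthetic, with made-up values, and real files may carry additional metadata keys or columns. `read_geocsv` is a hypothetical helper.

```python
import io
from datetime import datetime

# SYNTHETIC GeoCSV-style text for illustration; the sample values are made up.
sample_geocsv = """# dataset: GeoCSV 2.0
# delimiter: ,
# sample_rate_hz: 20
# field_unit: UTC, COUNTS
# field_type: datetime, INTEGER
Time, Sample
2024-12-05T18:44:21.000000Z, 12
2024-12-05T18:44:21.050000Z, -7
2024-12-05T18:44:21.100000Z, 30
"""


def read_geocsv(text_file):
    """Parse metadata, times, and samples from a two-column GeoCSV text file."""
    metadata, times, samples = {}, [], []
    for line in text_file:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Metadata lines have the form "# key: value"
            key, _, value = line[1:].partition(":")
            metadata[key.strip()] = value.strip()
        elif line.lower().startswith("time"):
            continue  # column header row
        else:
            time_text, value_text = (part.strip() for part in line.split(","))
            times.append(datetime.strptime(time_text, "%Y-%m-%dT%H:%M:%S.%fZ"))
            samples.append(float(value_text))
    return metadata, times, samples


metadata, times, samples = read_geocsv(io.StringIO(sample_geocsv))
print(metadata["sample_rate_hz"], len(samples), max(samples))

# To plot (requires matplotlib), uncomment:
# import matplotlib.pyplot as plt
# plt.plot(times, samples); plt.xlabel("Time (UTC)"); plt.ylabel("Counts"); plt.show()
```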

## Exercise Three: Fault Geometry Model Data Access

In this exercise, you will find a fault geometry data service, select fault models of interest, download them to your computer, and view the retrieved dataset.

1. **Find**: Locate datasets using platforms like IRIS DMC or USGS Earthquake Catalog.
Researchers can find fault geometry models using the open data services of the USGS, such as the faults pages of the Earthquake Hazards Program and the fault database files hosted on ScienceBase that are downloaded later in this exercise. As with other data types, researchers select the most appropriate access option based on their intended use of the fault model.

In [None]:


Four data types, and the four steps for using most datasets:

- **Earthquake Catalog**
  - Discover: https://earthquake.usgs.gov/data/comcat/
  - Select: https://earthquake.usgs.gov/earthquakes/search/
  - Download: query.kml
  - Read: Chrome Browser
- **Earthquake Seismograms**
  - Discover: https://www.iris.edu/hq/programs/epo/resources_for_viewing_seismograms
  - Select: https://ds.iris.edu/wilber3/find_event
  - Download: query.kml
  - Read: Chrome Browser
- **Fault Surface Geometry File**
  - Discover: https://www.usgs.gov/programs/earthquake-hazards/faults
  - Select: https://usgs.maps.arcgis.com/apps/webappviewer/index.html
  - Download: https://earthquake.usgs.gov/static/lfs/nshm/qfaults/qfaults.kmz
  - Read: Chrome Browser
- **Seismic Velocity Model**
  - Discover: http://moho.scec.org/UCVM_web/web/viewer.php
  - Select: http://moho.scec.org/UCVM_web/web/viewer.php (Search)
  - Download: bin, json, png
  - Read: Chrome Browser



In [None]:
import webbrowser
url = "https://www.sciencebase.gov/catalog/item/655bebe7d34ee4b6e05cc19f"

# Open the URL in the default web browser
webbrowser.open(url)

In [None]:
# KML and GeoJSON file download
usgs_fdb_url = "https://www.sciencebase.gov/catalog/item/655bebe7d34ee4b6e05cc19f"
kml_file = "NSHM23_FSD_v3.kml"
geojson_file = "NSHM23_FSD_v3.geojson"

# Build and run the wget command for the KML file (IPython shell escape)
cmd = f"wget '{usgs_fdb_url}?name={kml_file}' -O ./{kml_file}"
print(cmd)
!{cmd}

# Repeat for the GeoJSON file
cmd2 = f"wget '{usgs_fdb_url}?name={geojson_file}' -O ./{geojson_file}"
print(cmd2)
!{cmd2}
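After downloading, it is worth confirming that the file really contains GeoJSON before using it. The sketch below relies only on standard GeoJSON keys (`features`, `geometry`, `properties`). The feature collection is a synthetic stand-in with made-up coordinates and a made-up `name` property, and `summarize_geojson` is a hypothetical helper.

```python
import json
from collections import Counter

# SYNTHETIC FeatureCollection standing in for a downloaded fault file;
# the coordinates and the "name" property are made up for illustration.
sample_geojson = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"name": "Synthetic fault A"},
         "geometry": {"type": "LineString",
                      "coordinates": [[-118.0, 34.0], [-117.5, 34.2]]}},
        {"type": "Feature",
         "properties": {"name": "Synthetic fault B"},
         "geometry": {"type": "MultiLineString",
                      "coordinates": [[[-120.0, 36.0], [-119.8, 36.1]]]}},
    ],
})


def summarize_geojson(text):
    """Count features, geometry types, and property names in GeoJSON text."""
    collection = json.loads(text)  # raises an error if the text is not JSON
    features = collection.get("features", [])
    geometry_types = Counter(
        f["geometry"]["type"] for f in features if f.get("geometry")
    )
    property_names = sorted(
        {key for f in features for key in (f.get("properties") or {})}
    )
    return {
        "feature_count": len(features),
        "geometry_types": dict(geometry_types),
        "property_names": property_names,
    }


summary = summarize_geojson(sample_geojson)
print(summary)
```

With the downloaded file, the call would be `summarize_geojson(open("NSHM23_FSD_v3.geojson").read())`. If `json.loads` raises an error, the download may have returned a web page rather than the data file.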

In [None]:
# Standard-library imports
import os
import json
import time
import itertools
import datetime

# Third-party imports
import cartopy
import numpy as np
from scipy import ndimage as nd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import pandas as pd
import csep
import libcomcat
from libcomcat.dataframes import get_history_data_frame, split_history_frame, PRODUCTS
from libcomcat.search import get_event_by_id

In [None]:
from csep.utils import datasets, time_utils, comcat, plots
from csep.core import regions, catalog_evaluations

# set start and end date
start_time = time_utils.strptime_to_utc_datetime('2010-04-08 00:00:00.0')
end_time = time_utils.strptime_to_utc_datetime('2010-04-09 23:59:59.0')
# retrieve events in ComCat catalogue between start and end date
catalog = csep.query_comcat(start_time, end_time)
min_mw = 3.95 # minimum magnitude
max_mw = 8.95 # max magnitude after which is just one bin
dmw = 0.1 # bin width

# Create space and magnitude regions. The forecast is already filtered in space and magnitude
magnitudes = regions.magnitude_bins(min_mw, max_mw, dmw)
region = regions.california_relm_region()

# Bind region information to the forecast 
space_magnitude_region = regions.create_space_magnitude_region(region, magnitudes)


# filter magnitude below 3.95
catalog.filter('magnitude >= 3.95')
# filter events outside spatial region
catalog.filter_spatial(space_magnitude_region)
# print summary information 
print(catalog)

########################################
# CATALOGUE SHOULD CONTAIN 10 EVENTS ##
########################################
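The magnitude settings in the cell above can be worked through by hand. This plain-Python sketch assumes the bins are described by their lower edges, spaced `dmw` apart from `min_mw` to `max_mw`, with the last bin open-ended. It illustrates the arithmetic only and is not the pyCSEP implementation; `magnitude_bin` is a hypothetical helper.

```python
min_mw = 3.95  # minimum magnitude
max_mw = 8.95  # lower edge of the final, open-ended bin
dmw = 0.1      # bin width

# Number of lower edges from min_mw to max_mw inclusive
n_bins = round((max_mw - min_mw) / dmw) + 1
magnitude_edges = [round(min_mw + i * dmw, 2) for i in range(n_bins)]


def magnitude_bin(magnitude):
    """Return the bin index for a magnitude, or None if below min_mw."""
    if magnitude < min_mw:
        return None
    # Small tolerance guards against floating-point round-off at bin edges
    index = int((magnitude - min_mw) / dmw + 1e-9)
    return min(index, n_bins - 1)  # everything above max_mw shares the last bin


print(n_bins, magnitude_edges[0], magnitude_edges[-1])
print(magnitude_bin(3.95), magnitude_bin(9.5), magnitude_bin(3.0))
```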

In [None]:
#load forecast
forecast = csep.load_catalog_forecast('u3_fore_2010_04_08.bin',
                                      start_time = start_time,
                                      end_time = end_time,
                                      type='ucerf3',
                                      region = space_magnitude_region,
                                      filter_spatial = True,
                                      apply_filters = True,
                                      filters = 'magnitude >= 3.95')

In [None]:
evt_counts = forecast.get_event_counts()

In [None]:
##############################
# This sum should be 420734 ##
##############################
np.sum(evt_counts)

In [None]:
# compute N-test
number_test_result = catalog_evaluations.number_test(forecast, catalog)

In [None]:
##################################################################
## results should be delta_1 = 0.08, delta_2 = 0.93, omega = 10 ##
################################################################## 
ax = number_test_result.plot()
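To see what the two quantiles from the N-test express, it helps to compute them by hand on a small example. The sketch below is a simplified empirical version: `delta_1` is the fraction of simulated catalogs with at least as many events as observed, and `delta_2` is the fraction with at most as many. The counts are synthetic, `number_test_quantiles` is a hypothetical helper, and pyCSEP's own implementation may differ in details.

```python
def number_test_quantiles(simulated_counts, observed_count):
    """Empirical quantiles of an observed event count within simulated counts."""
    n = len(simulated_counts)
    # Fraction of simulations with at least as many events as observed
    delta_1 = sum(1 for c in simulated_counts if c >= observed_count) / n
    # Fraction of simulations with at most as many events as observed
    delta_2 = sum(1 for c in simulated_counts if c <= observed_count) / n
    return delta_1, delta_2


# SYNTHETIC event counts, one per simulated catalog; not from a real forecast
simulated_counts = [2, 3, 5, 8, 10, 10, 12, 20, 35, 40]
observed_count = 10

delta_1, delta_2 = number_test_quantiles(simulated_counts, observed_count)
print(delta_1, delta_2)
```

A very small `delta_1` would indicate that the forecast tends to predict fewer events than were observed, and a very small `delta_2` would indicate that it tends to predict more.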

In [None]:
#calculate expected rates per space-magnitude bin
expected_rates = forecast.get_expected_rates(verbose=True)

In [None]:
args_forecast = {'title': 'Landers aftershock forecast',
                 'grid_labels': True,
                 'borders': True,
                 'feature_lw': 0.5,
                 'basemap': 'ESRI_imagery',
                 'cmap': 'rainbow',
                 'alpha_exp': 0.9,
                 'projection': cartopy.crs.Mercator(),
                 'clim':[-9, 0]}
args_catalog = {'basemap': 'ESRI_terrain',
                'markercolor': 'black',
                'markersize': 4}
ax_1 = expected_rates.plot(plot_args=args_forecast)
ax_2 = catalog.plot(ax=ax_1, plot_args=args_catalog)
#########################################################
# Image only has observations in the right side corner ##
#########################################################