# How to Access EarthScope Data

EarthScope maintains archives of seismic and geodetic data. 

## Getting Seismic Data

Command-line download tools:
- ROVER  –  https://earthscope.github.io/rover/
- FetchData  –  https://earthscope.github.io/fetch-scripts/

Web services:
- Dataselect  –  https://service.iris.edu/

Python packages:
- ObsPy
- MsPASS

### Rover

ROVER is a command-line tool for robustly retrieving geophysical time series data from data centers such as EarthScope. It indexes the downloaded data to build a local repository, then compares that local index with the time series availability information published by the data center. This comparison lets a local archive stay synchronized with the remote data center.
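
The comparison step can be pictured as interval subtraction: whatever the data center reports as available, minus what the local index already covers, is what still needs downloading. The sketch below illustrates that idea only. It is not ROVER's implementation, and the function name and the use of plain numbers for times are assumptions made for brevity.

```python
def missing_spans(available, local):
    """Return the parts of the `available` spans not covered by `local` spans.

    Both arguments are lists of (start, end) tuples with start < end.
    """
    missing = []
    local = sorted(local)
    for start, end in sorted(available):
        cursor = start  # everything before the cursor is already accounted for
        for l_start, l_end in local:
            if l_end <= cursor:
                continue  # local span lies entirely before the cursor
            if l_start >= end:
                break  # local span lies entirely after this available span
            if l_start > cursor:
                missing.append((cursor, l_start))  # gap before the local span
            cursor = max(cursor, l_end)
            if cursor >= end:
                break
        if cursor < end:
            missing.append((cursor, end))  # tail not covered locally
    return missing


# Hypothetical example: hours 0-10 are available remotely,
# hours 2-4 and 6-8 are already in the local repository.
print(missing_spans([(0, 10)], [(2, 4), (6, 8)]))
```

Only the returned gaps would need to be requested from the data center on the next synchronization run.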

In [26]:
!pip install --no-cache-dir mseedindex==3.0.4
!pip install --no-cache-dir rover



Run ROVER in a terminal to create a local repository:

```
rover init-repository datarepo
cd datarepo
```

Run `rover retrieve` to fetch the data listed in the request file `request.txt`:

```
rover retrieve request.txt
```
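
The contents of `request.txt` are not shown here. As a sketch, assuming ROVER accepts one `network station location channel start end` entry per line (the same layout used by FDSN web service POST requests), such a file could be written from Python as follows. The station, channel, and time window are placeholders chosen for illustration.

```python
from pathlib import Path


def request_line(net, sta, loc, cha, start, end):
    """Format one request entry: NET STA LOC CHA START END."""
    return " ".join([net, sta, loc, cha, start, end])


# Hypothetical selection: one month of LHZ data from station IU.ANMO
lines = [
    request_line("IU", "ANMO", "*", "LHZ",
                 "2012-01-01T00:00:00", "2012-02-01T00:00:00"),
]

# Write one entry per line into the repository directory
Path("request.txt").write_text("\n".join(lines) + "\n")
```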

`rover list-summary` prints a summary of the retrieved data, from the earliest to the latest timespan.

### FetchData and FetchEvent

These scripts require Perl to run.

**FetchData** - Fetches time series and, optionally, related metadata, matching SAC Poles and Zeros files, and matching SEED RESP files. Time series data are returned in miniSEED format, and metadata is saved as a simple ASCII list.

To download the script in GeoLab and make it executable:

```
curl -O https://earthscope.github.io/fetch-scripts/FetchData
chmod +x FetchData
```

Usage documentation:

```
./FetchData
```

Example usage:

To request the first hour of the year 2011 for BHZ channels from GSN stations, execute the following command:

```
./FetchData -N _GSN -C BHZ -s 2011-01-01T00:00:00 -e 2011-01-01T01:00:00 -o GSN.mseed -m GSN.metadata
```

**FetchEvent** - Fetches event parameters and prints a simple text summary. Works with any fdsnws-event service.

To download the script in GeoLab and make it executable:

```
curl -O https://earthscope.github.io/fetch-scripts/FetchEvent
chmod +x FetchEvent
```

Usage documentation:

```
./FetchEvent
```

Example usage:

To request magnitude 6+ events within 20 degrees of the main shock of the Tohoku-Oki, Japan Earthquake on or after March 11th 2011, execute the following command:

```
./FetchEvent -s 2011-03-11 --radius 38.2:142.3:20 --mag 6
```
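
Because FetchEvent talks to fdsnws-event services, the same selection can be sketched as a plain query URL. The snippet assumes the event service endpoint is `https://service.iris.edu/fdsnws/event/1/query` and uses the standard FDSN parameter names. It only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust if your data center uses a different host
EVENT_URL = "https://service.iris.edu/fdsnws/event/1/query"


def event_query_url(start, lat, lon, max_radius_deg, min_mag):
    """Build an fdsnws-event query for events within a radius of a point."""
    params = {
        "starttime": start,
        "latitude": lat,
        "longitude": lon,
        "maxradius": max_radius_deg,  # degrees from (lat, lon)
        "minmagnitude": min_mag,
        "format": "text",
    }
    return EVENT_URL + "?" + urlencode(params)


# Same selection as the FetchEvent command: 2011-03-11 onward,
# within 20 degrees of 38.2 N, 142.3 E, magnitude 6 and above
url = event_query_url("2011-03-11", 38.2, 142.3, 20, 6)
print(url)
```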

More information about FetchData and FetchEvent is available on this [page](https://earthscope.github.io/fetch-scripts/docs/tutorial/).


### Dataselect to GeoLab directory

Dataselect is a web service for downloading data from the SAGE archive. API documentation is available on this [page](https://service.earthscope.org/fdsnws/dataselect/1/).

The following example downloads miniSEED files to GeoLab, one file per station and day.


In [13]:
import csv
import os
from datetime import date, timedelta
from pathlib import Path

import requests

# SAGE archive
URL = "http://service.iris.edu/fdsnws/dataselect/1/query?"

def download(station, directory_path):
    # first and last calendar day of the request (time of day is ignored)
    startdate = date.fromisoformat(station["starttime"].split("T")[0])
    enddate = date.fromisoformat(station["endtime"].split("T")[0])
    total_days = (enddate - startdate).days

    # download one miniSEED file per day
    for offset in range(total_days):
        day = startdate + timedelta(days=offset)

        # request only this day's data so each file holds a single day
        params = {"net": station["network"],
                  "sta": station["station"],
                  "loc": station["location"],
                  "cha": station["channel"],
                  "start": f"{day.isoformat()}T00:00:00",
                  "end": f"{(day + timedelta(days=1)).isoformat()}T00:00:00"}

        # file name format: STATION.NETWORK.YEAR.DAYOFYEAR
        file_name = ".".join([station["station"], station["network"],
                              day.strftime("%Y"), day.strftime("%j")])

        r = requests.get(URL, params=params, stream=True)
        if r.status_code == requests.codes.ok:
            # save the file
            with open(Path(directory_path) / file_name, 'wb') as f:
                for data in r.iter_content(chunk_size=8192):
                    f.write(data)
        else:
            # problem occurred
            print(f"failure: {r.status_code}, {r.reason}")


# create directory for data
directory_path = "./miniseed_data"
os.makedirs(directory_path, exist_ok=True)

stations_file = "five_stations.csv"

with open(stations_file, 'r') as file:
    csv_reader = csv.DictReader(file, delimiter=',', doublequote=False)
    for row in csv_reader:
        download(row, directory_path)
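
The layout of `five_stations.csv` is not shown. Judging from the keys that `download` reads, it needs a header row with `network`, `station`, `location`, `channel`, `starttime`, and `endtime`. The sketch below parses an in-memory sample with that header and checks that no column is missing. The row values are illustrative assumptions, not the contents of the original file.

```python
import csv
import io

# Hypothetical sample; only the column names are taken from the code above
SAMPLE = """network,station,location,channel,starttime,endtime
IU,WCI,10,BHZ,2014-01-01T00:00:00,2014-01-02T00:00:00
"""

REQUIRED = {"network", "station", "location", "channel", "starttime", "endtime"}


def read_stations(text):
    """Parse CSV text into a list of station dicts, checking the header."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


stations = read_stations(SAMPLE)
print(stations[0]["station"], stations[0]["starttime"])
```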



### Using Dataselect to copy miniseed to AWS S3

EarthScope will make the SAGE archive available on AWS S3 later this year. You will then be able to access miniSEED data in the cloud and process it without first downloading files to GeoLab. Because the data sits adjacent to GeoLab, access and processing times are shorter. To demonstrate working with data in the cloud, this example copies miniSEED files from EarthScope to an AWS S3 bucket.

In [21]:
import csv
from datetime import date, timedelta

import boto3
import requests
from botocore.exceptions import ClientError

# s3 setup, change profile name
session = boto3.Session(profile_name="spara")
s3_client = session.client('s3')
s3 = session.resource('s3')
bucket_name = "my-miniseed"
region_name = "us-east-2"

# create the bucket if it does not exist yet
try:
    s3_client.head_bucket(Bucket=bucket_name)
except ClientError as error:
    error_code = int(error.response['Error']['Code'])
    if error_code == 404:
        s3.create_bucket(Bucket=bucket_name, CreateBucketConfiguration={'LocationConstraint': region_name})


# SAGE archive
URL = "http://service.iris.edu/fdsnws/dataselect/1/query?"

def upload_to_s3(station):
    # first and last calendar day of the request (time of day is ignored)
    startdate = date.fromisoformat(station["starttime"].split("T")[0])
    enddate = date.fromisoformat(station["endtime"].split("T")[0])
    total_days = (enddate - startdate).days

    bucket = s3.Bucket(bucket_name)

    # upload one miniSEED object per day
    for offset in range(total_days):
        day = startdate + timedelta(days=offset)

        # request only this day's data so each object holds a single day
        params = {"net": station["network"],
                  "sta": station["station"],
                  "loc": station["location"],
                  "cha": station["channel"],
                  "start": f"{day.isoformat()}T00:00:00",
                  "end": f"{(day + timedelta(days=1)).isoformat()}T00:00:00"}

        # file name format: STATION.NETWORK.YEAR.DAYOFYEAR
        # key format: STATION/YEAR/DAYOFYEAR/STATION.NETWORK.YEAR.DAYOFYEAR
        year = day.strftime("%Y")
        doy = day.strftime("%j")
        file_name = ".".join([station["station"], station["network"], year, doy])
        key = "/".join([station["station"], year, doy, file_name])

        # copy data from archive to S3 without writing it to local disk
        r = requests.get(URL, params=params, stream=True)
        if r.status_code == requests.codes.ok:
            bucket.upload_fileobj(r.raw, key)
            print("copied %s to %s in S3" % (file_name, key))
        else:
            # problem occurred
            print(f"failure: {r.status_code}, {r.reason}")

# parse stations
stations_file = "five_stations.csv"

with open(stations_file, 'r') as file:
    csv_reader = csv.DictReader(file, delimiter=',', doublequote=False)
    for row in csv_reader:
        upload_to_s3(row)

copied WCI.IU.2014.001 to WCI/2014/001/WCI.IU.2014.001 in S3
copied KBS.IU.2014.001 to KBS/2014/001/KBS.IU.2014.001 in S3
copied TIXI.IU.2014.001 to TIXI/2014/001/TIXI.IU.2014.001 in S3
copied KIV.II.2014.001 to KIV/2014/001/KIV.II.2014.001 in S3
copied TRIS.G.2014.001 to TRIS/2014/001/TRIS.G.2014.001 in S3
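
The object keys in this output follow a `STATION/YEAR/DAYOFYEAR/STATION.NETWORK.YEAR.DAYOFYEAR` pattern. A small helper, sketched here under the hypothetical name `s3_key`, derives the key from a calendar date so that the year and the day of year always agree, including across a year boundary.

```python
from datetime import date


def s3_key(station, network, day):
    """Build the S3 object key for one station and one calendar day."""
    year = day.strftime("%Y")
    doy = day.strftime("%j")  # zero-padded day of year, 001-366
    file_name = ".".join([station, network, year, doy])
    return "/".join([station, year, doy, file_name])


print(s3_key("WCI", "IU", date(2014, 1, 1)))
# prints WCI/2014/001/WCI.IU.2014.001
```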


### Using a miniseed file from S3

This example demonstrates how to fetch a single miniSEED file from S3 into memory and parse it with ObsPy.

In [25]:
import io

import boto3
from obspy import read

# s3 setup, change profile name
session = boto3.Session(profile_name="spara")
s3_client = session.client('s3')

# Define bucket and key
bucket_name = 'my-miniseed'
object_key = 'WCI/2014/001/WCI.IU.2014.001'

# Download object to memory
response = s3_client.get_object(Bucket=bucket_name, Key=object_key)
data_stream = io.BytesIO(response['Body'].read())

# Parse with ObsPy
st = read(data_stream)

# Print the ObsPy Stream
print(st)

12 Trace(s) in Stream:
IU.WCI.10.BH1 | 2014-01-01T00:00:00.019500Z - 2014-01-01T23:59:59.994500Z | 40.0 Hz, 3456000 samples
IU.WCI.10.BH2 | 2014-01-01T00:00:00.019500Z - 2014-01-01T23:59:59.994500Z | 40.0 Hz, 3456000 samples
IU.WCI.10.BHZ | 2014-01-01T00:00:00.019500Z - 2014-01-01T23:59:59.994500Z | 40.0 Hz, 3456000 samples
IU.WCI.10.LH1 | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:59.069500Z | 1.0 Hz, 86400 samples
IU.WCI.10.LH2 | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:59.069500Z | 1.0 Hz, 86400 samples
IU.WCI.10.LHZ | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:59.069500Z | 1.0 Hz, 86400 samples
IU.WCI.10.VH1 | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:50.069500Z | 0.1 Hz, 8640 samples
IU.WCI.10.VH2 | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:50.069500Z | 0.1 Hz, 8640 samples
IU.WCI.10.VHZ | 2014-01-01T00:00:00.069500Z - 2014-01-01T23:59:50.069500Z | 0.1 Hz, 8640 samples
IU.WCI.10.VMU | 2014-01-01T00:00:09.000000Z - 2014-01-01T23:59:59.000000Z | 0.1 Hz, 8640 
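
The sample counts in this listing can be sanity-checked with simple arithmetic: a gap-free day holds the sampling rate multiplied by 86,400 seconds. The helper name below is an illustrative assumption.

```python
def expected_samples(sampling_rate_hz, seconds=86_400):
    """Number of samples in a gap-free window at the given sampling rate."""
    # round() absorbs floating-point error for rates such as 0.1 Hz
    return round(sampling_rate_hz * seconds)


# Rates taken from the BH*, LH*, and VH* channels listed above
for rate in (40.0, 1.0, 0.1):
    print(rate, expected_samples(rate))
```

A trace with fewer samples than this for a full day would point to gaps in the record.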

## Geodetic Data



In [30]:
!pip install earthscope_sdk==1.0.0b1 


In [31]:
import os
from pathlib import Path

import requests
from earthscope_sdk import EarthScopeClient

client = EarthScopeClient()

def get_token():
    # refresh the token if it has expired
    client.ctx.auth_flow.refresh_if_necessary()

    return client.ctx.auth_flow.access_token

def get_es_file(url, directory_to_save_file='./'):
    # get authorization Bearer token
    token = get_token()

    # request a file and provide the token in the Authorization header
    file_name = Path(url).name

    r = requests.get(url, headers={"authorization": f"Bearer {token}"})
    if r.status_code == requests.codes.ok:
        # save the file
        with open(Path(directory_to_save_file) / file_name, 'wb') as f:
            for data in r.iter_content(chunk_size=8192):
                f.write(data)
    else:
        # problem occurred
        print(f"failure: {r.status_code}, {r.reason}")

directory_path = "./rinex_data"

os.makedirs(directory_path, exist_ok=True)

# URL pattern:
# https://gage-data.earthscope.org/archive/gnss/rinex/obs/<year>/<day>/<station><day>0.<two digit year>d.Z
def download_rinex(doy, year, station, n_days=9):
    # download n_days consecutive daily files, starting at day of year doy
    # (does not roll over into the next year)
    two_digit_year = str(year)[2:]  # last two digits of the year, as a string
    for day in range(doy, doy + n_days):
        url = 'https://gage-data.earthscope.org/archive/gnss/rinex/obs/%d/%03d/%s%03d0.%sd.Z' % (year, day, station, day, two_digit_year)
        print('downloading: ', url)
        get_es_file(url, directory_path)


station = "p038"
doy = 1
year = 2024

download_rinex(doy, year, station)

downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/001/p0380010.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/002/p0380020.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/003/p0380030.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/004/p0380040.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/005/p0380050.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/006/p0380060.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/007/p0380070.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/008/p0380080.24d.Z
downloading:  https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/009/p0380090.24d.Z
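
The URLs printed here share one pattern: the year, the three-digit day of year, then a file name built from the station code, the day of year, a trailing `0`, and the two-digit year. As a sketch, the hypothetical helper `rinex_url` derives all of these from a calendar date, which avoids passing the year and day of year separately and handles year boundaries naturally.

```python
from datetime import date

BASE = "https://gage-data.earthscope.org/archive/gnss/rinex/obs"


def rinex_url(station, day):
    """Build the archive URL of the daily RINEX file for a station and date."""
    year = day.year
    doy = day.timetuple().tm_yday  # day of year, 1-366
    file_name = f"{station}{doy:03d}0.{year % 100:02d}d.Z"
    return f"{BASE}/{year}/{doy:03d}/{file_name}"


print(rinex_url("p038", date(2024, 1, 1)))
# prints https://gage-data.earthscope.org/archive/gnss/rinex/obs/2024/001/p0380010.24d.Z
```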
