# Web Services Data Exercise

This exercise is a script for downloading miniSEED and RINEX data from the SAGE and GAGE web services. All the functions are covered in the previous `4_data_access.ipynb` notebook. The download function accommodates either a SAGE or a GAGE web service request. Fill in the blanks to complete the script.

Download earthquake event data by passing query parameters to the SAGE web service. You can use the example from the previous notebook or choose another event and station.
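
To see how a dictionary of query parameters turns into a request URL, the sketch below builds one with the standard library and sends nothing over the network. The base URL is a hypothetical placeholder (not the real SAGE endpoint), and `build_query_url` is an illustrative helper that is not part of the exercise. `requests` performs a comparable encoding when you pass `params=`.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the SAGE endpoint from the previous notebook
EXAMPLE_BASE_URL = "https://example.org/fdsnws/dataselect/1/query"


def build_query_url(base_url, params):
    """Return base_url with params appended as a URL-encoded query string."""
    return f"{base_url}?{urlencode(params)}"


params = {
    "net": "IU",
    "sta": "ANMO",
    "loc": "00",
    "cha": "BHZ",
    "start": "2010-02-27T06:30:00",
    "end": "2010-02-27T10:30:00",
}

query_url = build_query_url(EXAMPLE_BASE_URL, params)
print(query_url)
```

Note that the colons in the timestamps are percent-encoded (`%3A`) in the resulting URL.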

Download a RINEX file by providing a date, a station, and the type of compression. You can use the example from the previous notebook or choose another date and station.
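
The RINEX request in this exercise takes the date as a year plus a day of year. If you choose a different calendar date, a minimal sketch like the following converts it. The helper name `day_of_year` is hypothetical and not part of the exercise script.

```python
from datetime import date


def day_of_year(year, month, day):
    """Return the day of year (1-366) for a calendar date."""
    return date(year, month, day).timetuple().tm_yday


doy = day_of_year(2010, 2, 27)   # 31 days of January + 27 days of February
doy_str = "{:03d}".format(doy)   # zero-padded to three digits for file names
print(doy, doy_str)              # prints: 58 058
```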

If the script is completed correctly, you will have a new directory called `./data` containing a miniSEED file and a RINEX file.
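
One way to confirm the result is to list what ended up in the data directory. This is a small sketch, assuming the default `./data` location used in the script. `list_downloads` is a hypothetical helper and not part of the exercise.

```python
from pathlib import Path


def list_downloads(data_dir="./data"):
    """Return sorted (file name, size in bytes) pairs for files in data_dir."""
    directory = Path(data_dir)
    if not directory.is_dir():
        # Nothing has been downloaded yet
        return []
    return sorted(
        (p.name, p.stat().st_size) for p in directory.iterdir() if p.is_file()
    )


for name, size in list_downloads():
    print(f"{name}: {size} bytes")
```

A file of zero bytes usually means the request did not return data, so the size is worth a glance.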

## Instructions
This exercise is a script for accessing EarthScope data using web services.

1. Create a notebook and title it `web_services_data_exercise.ipynb`.
2. Create three code cells and copy the exercise code into the respective code cell.
3. Fill in the blanks as required.

In [None]:
# ---------- Fill in the package imports for this script ----------
import ____  
from datetime import datetime
from pathlib import Path
from ____ import ____

# ---------- Create an EarthScope client to get token ----------
client = ____


# Fill this with the service endpoint for SAGE data
SAGE_URL = ____
# Fill this with the real base path for GAGE data
GAGE_URL = ____ 

DATA_DIR = Path("./data")
DATA_DIR.mkdir(parents=True, exist_ok=True)

# ---------- Get an EarthScope token ----------
def get_token():
    """
    Use the SDK to refresh (if needed) and return an access token.
    """
    # Refresh if necessary
    ____

    # Retrieve the token string
    token = ____
    return token


# ---------- Helper: common auth header ----------
def auth_header(token):
    """
    Helper function to create header with token
    """
    # Use a standard Bearer token header (lowercase 'authorization' is fine)
    return ____


# ---------- Download a file from EarthScope's web services ----------
# --------------------------------------------------------------------
def download_data(url, data_directory, params=None):  # params: optional dict of query parameters
    """
    Sends a GET request and saves the response body to a file.
    SAGE: pass query parameters in params (net, sta, loc, cha, start (ISO), end (ISO), etc.).
    GAGE: leave params empty; the URL itself points to the RINEX file.
    """
    # get authorization token
    token = get_token()

    # create the authorization header 
    header = auth_header(token) 


    # This creates a file name for either a miniSEED or RINEX file
    # depending upon if the params dictionary is empty. The bool comparator
    # returns True if the params dictionary is not empty 
    if bool(params): 
        # Parse the start datetime to build filename parts
        start_dt = datetime.strptime(params["start"], "%Y-%m-%dT%H:%M:%S")
        year = str(start_dt.year)
        # Day-of-year (001-366) for file naming
        doy =  "{:03d}".format(____)  # hint: day of year, not day of month; see start_dt.timetuple()
        
        # create a file name for a miniSEED
        file_name = ".".join([
            params["sta"],
            params["net"],
            params["loc"],
            params["cha"],
            year,
            doy,
            "mseed"
        ])
    else: # create a file name for a RINEX file if the params dictionary is empty
        file_name = Path(url).name

    # create a variable with the path to the data directory and file
    out_path = Path(Path(data_directory) / ____)

    # Make the request to the web service with params and bearer auth
    r = requests.____(
        url,
        params=____,
        headers=____,
        stream=True
    ) 

    if r.status_code == requests.codes.____:
        with open(out_path, "wb") as f:
            for data in r:
                f.write(data)
    else:
        # a problem occurred
        print(f"failure: {r.status_code}, {r.reason}")
        return None


# ---------- Creates URL to download data from the EarthScope GAGE web service ----------
# ---------------------------------------------------------------------------------------
def create_url(year, day, station, compression):
    """
    Construct a URL to a file in an archive given year, day-of-year, station, etc.
    Example output:
      {GAGE_URL}{year}/{DOY}/{station}{DOY}0.{YY}{compression}
    """
    doy =  "{:03d}".____(day)
    two_digit_year = ____  # Hint: use the string slice function in the previous notebook

    file_path = "/".join([str(year), doy])
    file_name = "".join(["/", station, doy, "0.", two_digit_year, compression])
    url = "".join([GAGE_URL, file_path, file_name])
    return url


In [None]:
# Download a miniSEED file for an event
# 
params = {"net" : 'IU',
          "sta" : 'ANMO',
          "loc" : '00',
          "cha" : 'BHZ',
          "start": '2010-02-27T06:30:00',
          "end": '2010-02-27T10:30:00'}

# Try the parameterized request
download_data(____, DATA_DIR, ____)

In [None]:
# Download a RINEX file
#
year = 2025
day = 1
station = 'p034'
# create_url() zero-pads the day of year, so no separate formatting is needed here
compression = 'd.Z'  # or ".Z" / "" depending on the archive
url = create_url(year, ____, station, compression)
download_data(url, ____)

## Main Answer Key

```{admonition} Click to see answer
:class: dropdown

<PRE>
# ---------- Imports ----------
import requests
import os
from datetime import datetime
from pathlib import Path
from earthscope_sdk import EarthScopeClient

# ---------- Create an EarthScope client to get token ----------
client = EarthScopeClient()

# Service endpoint for SAGE data
SAGE_URL = "http://service.iris.edu/fdsnws/dataselect/1/query?"
# Base path for GAGE data
GAGE_URL = 'https://gage-data.earthscope.org/archive/gnss/rinex/obs/'

# create a directory for the downloaded miniSEED and RINEX data
DATA_DIR = "./data"
os.makedirs(DATA_DIR, exist_ok=True)

# ---------- Get an EarthScope token ----------
def get_token():
    """
    Use the SDK to refresh (if needed) and return an access token.
    """
    # Refresh if necessary
    client.ctx.auth_flow.refresh_if_necessary()

    # Retrieve the token string
    token = client.ctx.auth_flow.access_token
    return token


# ---------- Helper: common auth header ----------
def auth_header(token):
    # Use a standard Bearer token header (lowercase 'authorization' is fine)
    return {"authorization": f"Bearer {token}"}


# ---------- Download a file from EarthScope's web services ----------
# --------------------------------------------------------------------
def download_data(url, data_directory, params=None):
    """
    Sends GET with query parameters and saves the response body to a file.
    Expected params include: net, sta, loc, cha, start (ISO), end (ISO), etc.
    """
    # get authorization token
    token = get_token()

    # create the authorization header 
    header = auth_header(token) 
    
    # Example filename: STA.NET.LOC.CHA.YEAR.DOY.mseed
    if bool(params):
        # Parse the start datetime to build filename parts
        start_dt = datetime.strptime(params["start"], "%Y-%m-%dT%H:%M:%S")
        year = str(start_dt.year)
        # Day-of-year (001-366) for file naming; start_dt.day would give the day of the month
        doy =  "{:03d}".format(start_dt.timetuple().tm_yday)
        file_name = ".".join([
            params["sta"],
            params["net"],
            params["loc"],
            params["cha"],
            year,
            doy,
            "mseed"
        ])
    else:
        file_name = Path(url).name
        
    out_path = Path(Path(data_directory) / file_name)

    # Make the request with params and bearer auth
    r = requests.get(
        url,
        params=params,
        headers=header,
        stream=True
    ) 

    if r.status_code == requests.codes.ok:
        with open(out_path, "wb") as f:
            for data in r:
                f.write(data)
    else:
        # a problem occurred
        print(f"failure: {r.status_code}, {r.reason}")
        return None


# ---------- Requesting data from the EarthScope GAGE web service ----------
# --------------------------------------------------------------------------
def create_url(year, day, station, compression):
    """
    Construct a URL to a file in an archive given year, day-of-year, station, etc.
    Example output:
      {GAGE_URL}{year}/{DOY}/{station}{DOY}0.{YY}{compression}
    """
    doy =  "{:03d}".format(day)
    two_digit_year = str(year)[2:]  # last two digits of the year, e.g. 2025 -> "25"

    file_path = "/".join([str(year), doy])
    file_name = "".join(["/", station, doy, "0.", two_digit_year, compression])
    url = "".join([GAGE_URL, file_path, file_name])
    return url
</PRE>
```
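
The answer key writes the response body by iterating over `r` directly. As a possible variation, the sketch below separates the file-writing step into its own function so the chunk size can be chosen by the caller. `write_chunks` is a hypothetical helper, not part of the exercise; `iter_content` is the `requests` method for reading a streamed body in chunks.

```python
from pathlib import Path


def write_chunks(chunks, out_path):
    """Write an iterable of byte chunks to out_path and return the byte count."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    total = 0
    with open(out_path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


# With a streamed requests response `r`, the call might look like:
#   write_chunks(r.iter_content(chunk_size=65536), out_path)
```

Returning the byte count gives a quick way to notice an empty download.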

## Answer Key for miniSEED Download

```{admonition} Click to see answer
:class: dropdown

<PRE>
# Example query parameters (adjust to a valid service)
params = {"net" : 'IU',
          "sta" : 'ANMO',
          "loc" : '00',
          "cha" : 'BHZ',
          "start": '2010-02-27T06:30:00',
          "end": '2010-02-27T10:30:00'}

# Try the parameterized request
download_data(SAGE_URL, DATA_DIR, params)
</PRE>
```

## Answer Key for RINEX download

```{admonition} Click to see answer
:class: dropdown

<PRE>
# Try the URL-building flow
year = 2025
day = 1
station = 'p034'
# create_url() zero-pads the day of year, so the integer day is passed directly
compression = 'd.Z'  # or ".Z" / "" depending on the archive
url = create_url(year, day, station, compression)
download_data(url, DATA_DIR)
</PRE>
```

## [< Previous](./4_web_services_data.ipynb)&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;[Next >](./6_cloud_native_data.ipynb)