# Cloud Native Data Exercise

## Instructions

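This exercise consists of two fill-in-the-blank scripts that access geophysical data through cloud services. Exercise A reads a miniSEED file from a public S3 bucket, and Exercise B requests GNSS observations with the EarthScope SDK.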

- Create a notebook named `cloud_native_data_exercise.ipynb`.
- Create a code cell and copy Exercise A into it.
- Create a second code cell and copy Exercise B into it.
- Fill in each blank (marked `____`), then run the cells.


## Exercise A

```python
# --- Fill in the blanks (marked ____ ) ---
import boto3
from botocore import UNSIGNED
from botocore.config import Config
import io
from obspy import read

# 1) Create an anonymous S3 client in the us-west-2 region
s3 = boto3.client('s3',
                  config=Config(signature_version=____),
                  region_name='____')

# 2) Identify what to fetch
BUCKET_NAME = '____'   # public bucket name
KEY = '____'           # object key/path inside the bucket

# 3) Download the object and put bytes into a buffer
response = s3.get_object(Bucket=____, Key=____)
data_stream = io.BytesIO(response['Body'].read())

# 4) Parse the miniSEED bytes with ObsPy
st = read(____)

# 5) Print a summary of the Stream
print(st)

# 6) (bonus) Print the first Trace's stats: network, station, channel, starttime, sampling_rate
tr = st[0]
print(tr.stats.network, tr.stats.station, tr.stats.channel, tr.stats.starttime, tr.stats.sampling_rate)
```
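
Step 3 relies on `io.BytesIO` to turn raw bytes into a file-like object, which is why the downloaded data can be handed to a reader without first writing a file to disk. The sketch below illustrates that behavior with a placeholder byte string standing in for the S3 response body. The payload and variable names are made up for illustration and are not miniSEED data.

```python
import io

# Placeholder bytes standing in for response['Body'].read(); not real miniSEED data.
payload = b"example bytes standing in for a downloaded object"

# Wrap the bytes in an in-memory, file-like buffer.
buffer = io.BytesIO(payload)

# The buffer supports the usual file methods: read, tell, seek.
first_word = buffer.read(7)   # consumes the first 7 bytes
position = buffer.tell()      # the read position has advanced past them
buffer.seek(0)                # rewind so the next reader starts at the beginning
everything = buffer.read()    # reads the full payload again

print(first_word, position, len(everything))
```

A freshly created `BytesIO` starts at position 0, so it can be passed straight to a reader. Rewinding with `seek(0)` only matters if something has already consumed part of the buffer.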


## Exercise B

```python
# --- Fill in the blanks (marked ____ ) ---
import datetime as dt
import pandas as pd
from earthscope_sdk import EarthScopeClient

# 1) Create the client
es = ____()

# 2) Define a start and end datetime (UTC)
start_dt = dt.datetime(____, ____, ____, ____)  # year, month, day, hour
end_dt   = dt.datetime(____, ____, ____, ____)  # ~30 hours later

# 3) Choose a station (4-character ID)
station = "____"

# 4) Make the request for GNSS observations and fetch as an Arrow table
arrow_table = es.data.gnss_observations(
    start_datetime=____,
    end_datetime=____,
    station_name=____,
).____()  # method that actually runs the request

# 5) Convert Arrow ➜ pandas DataFrame and view the first few rows
df = arrow_table.____()
print(df.head())

# 6) (bonus) How many rows did we get?
print("rows:", len(df))

# 7) (bonus) Show unique observation types if present (e.g., L1, L2, C1...)
if "observable" in df.columns:
    print("observables:", sorted(df["observable"].unique()))
```
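
Step 2 asks for an end time roughly 30 hours after the start. One way to avoid working out the calendar rollover by hand is to add a `datetime.timedelta` to the start time. The sketch below uses an arbitrary illustrative start date, not one the exercise requires, and keeps the datetimes naive, as in the template above.

```python
import datetime as dt

# Arbitrary illustrative start time (treated as UTC by convention here).
start_dt = dt.datetime(2024, 1, 1, 0)

# Add 30 hours; datetime handles the rollover into the next day.
end_dt = start_dt + dt.timedelta(hours=30)

# Confirm the length of the window in hours.
window_hours = (end_dt - start_dt).total_seconds() / 3600
print(start_dt.isoformat(), end_dt.isoformat(), window_hours)
```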

## Answer Key for Exercise A

```{admonition} Click to see answer
:class: dropdown

<PRE>
import boto3
from botocore import UNSIGNED
from botocore.config import Config
import io
from obspy import read

s3 = boto3.client('s3',
                  config=Config(signature_version=UNSIGNED),
                  region_name='us-west-2')

BUCKET_NAME = 'ncedc-pds'
KEY = 'continuous_waveforms/BK/2014/2014.236/PACP.BK.HHN.00.D.2014.236'

response = s3.get_object(Bucket=BUCKET_NAME, Key=KEY)
data_stream = io.BytesIO(response['Body'].read())

st = read(data_stream)
print(st)

tr = st[0]
print(tr.stats.network, tr.stats.station, tr.stats.channel, tr.stats.starttime, tr.stats.sampling_rate)
</PRE>

```
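
The key in the answer appears to encode the network, year, day of year, station, channel, and location code. Assuming that layout holds for other objects in the bucket (an inference from this single example, not a documented rule), a hypothetical helper such as `build_key` below could assemble keys from their parts. The literal `D` in the file name is copied as-is from the example.

```python
import datetime as dt


def build_key(network, station, channel, location, day):
    """Assemble an object key following the layout inferred from the example key.

    Hypothetical helper: the layout is an assumption based on one key,
    not a documented naming rule for the bucket.
    """
    year = day.year
    doy = day.timetuple().tm_yday  # day of year, 1-366
    folder = f"continuous_waveforms/{network}/{year}/{year}.{doy:03d}"
    filename = f"{station}.{network}.{channel}.{location}.D.{year}.{doy:03d}"
    return f"{folder}/{filename}"


key = build_key("BK", "PACP", "HHN", "00", dt.date(2014, 8, 24))
print(key)
```

August 24, 2014 is day 236 of the year, so this call reproduces the key used in the answer.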

## Answer Key for Exercise B

```{admonition} Click to see answer
:class: dropdown

<PRE>
import datetime as dt
import pandas as pd
from earthscope_sdk import EarthScopeClient

es = EarthScopeClient()

start_dt = dt.datetime(2025, 7, 20, 21)
end_dt   = dt.datetime(2025, 7, 22, 3)
station = "AC60"

arrow_table = es.data.gnss_observations(
    start_datetime=start_dt,
    end_datetime=end_dt,
    station_name=station,
).fetch()

df = arrow_table.to_pandas()
print(df.head())
print("rows:", len(df))

if "observable" in df.columns:
    print("observables:", sorted(df["observable"].unique()))

</PRE>
```


## [< Previous](./6_cloud_native_data.ipynb)&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;[Next >](./8_wrap_up.ipynb)