# Class Notebooks

* [HAPI_01.ipynb - Basics](HAPI_01.ipynb) 
* [HAPI_02.ipynb - Data structures](HAPI_02.ipynb)
* [HAPI_03.ipynb - Plotting](HAPI_03.ipynb)
* **[HAPI_04.ipynb - Problems](HAPI_04.ipynb) (this Notebook)**

# Instructions

Use the rest of the class time to work on these problems. Each problem is expected to take approximately one hour, so start with a problem that interests you.

For in-person students, we encourage you to work with one or more neighbors on a problem.

Several HAPI experts are available for questions on chat, and three of us will be on-site for in-person questions.

# Working with Metadata I.


## Basic


Starting with

```python
import pickle
import logging

with open('/tmp/availability.pkl', 'rb') as f:
    datasets = pickle.load(f)
    logging.info('Read availability.pkl')
print(datasets)
```

use the information in `datasets` to

1. Create a table showing the time interval of availability of ephemeris data from the SSCWeb HAPI server. The table should have the form

    ```
    ace              1997-08-25T17:48:00.000Z  2022-07-04T23:48:00.000Z
    active           1989-09-29T00:00:00.000Z  1991-10-04T08:00:00.000Z
    aec              1973-12-17T08:01:00.000Z  1978-12-10T00:00:00.000Z
    ... (243 more lines)
    ```

1. Create a plot showing the time interval of availability of ephemeris data from the SSCWeb HAPI server. The plot should have the form

    `TODO`


### Solution

1\. Table

```python
import pickle
import logging

with open('/tmp/availability.pkl', 'rb') as f:
    datasets = pickle.load(f)
    logging.info('Read availability.pkl')

for dataset in datasets:
    # Left-justify each id in a 15-character column so the dates line up
    print(f'{dataset["id"]:15s}  {dataset["startDate"]}  {dataset["stopDate"]}')
```

2\. Plot

```python
TODO
```
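One possible approach to the plot is sketched below. It is not the reference solution. It assumes that `datasets` is a list of dictionaries with `id`, `startDate`, and `stopDate` keys, that the times use the format shown in the table above, and that Matplotlib is installed. The function names `availability_intervals` and `plot_availability` are hypothetical.

```python
from datetime import datetime

# Assumption: times look like those in the table above, e.g. 1997-08-25T17:48:00.000Z.
# HAPI allows other time formats, which this pattern does not handle.
HAPI_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def availability_intervals(datasets):
    """Return (id, start, stop) tuples sorted by start time."""
    intervals = []
    for dataset in datasets:
        start = datetime.strptime(dataset["startDate"], HAPI_FORMAT)
        stop = datetime.strptime(dataset["stopDate"], HAPI_FORMAT)
        intervals.append((dataset["id"], start, stop))
    return sorted(intervals, key=lambda item: item[1])


def plot_availability(datasets):
    """Draw one horizontal line per dataset from startDate to stopDate."""
    # Imported here so availability_intervals() works without Matplotlib
    import matplotlib.pyplot as plt

    intervals = availability_intervals(datasets)
    fig, ax = plt.subplots(figsize=(8, 0.15 * len(intervals) + 1))
    for row, (_, start, stop) in enumerate(intervals):
        ax.plot([start, stop], [row, row], linewidth=2)
    ax.set_yticks(range(len(intervals)))
    ax.set_yticklabels([item[0] for item in intervals], fontsize=6)
    ax.set_xlabel("Year")
    ax.set_title("SSCWeb ephemeris availability")
    fig.tight_layout()
    return fig


# Two rows copied from the example table above
sample = [
    {"id": "ace", "startDate": "1997-08-25T17:48:00.000Z",
     "stopDate": "2022-07-04T23:48:00.000Z"},
    {"id": "active", "startDate": "1989-09-29T00:00:00.000Z",
     "stopDate": "1991-10-04T08:00:00.000Z"},
]
intervals = availability_intervals(sample)
```

Calling `plot_availability(datasets)` with the full list returns a Matplotlib figure. Sorting by start time makes the launch history easier to read than alphabetical order would.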

## Advanced

Write a program that creates the information in `datasets` by querying the SSCWeb HAPI server.

### Solution

```python
import os
import pickle
import logging

from hapiclient import hapi

# Change INFO to WARNING or ERROR to suppress logging messages in this script
logging.basicConfig(level=logging.INFO)

if not os.path.exists("availability.pkl"):

    server = 'https://hapi-server.org/servers/SSCWeb/hapi'
    
    resp = hapi(server)
    logging.info(resp)

    datasets = resp['catalog']
    logging.info(datasets)

    for idx, dataset in enumerate(datasets):
        logging.info(f'Working on dataset id {dataset["id"]}')
        resp = hapi(server, dataset["id"], logging=True)
        startDate = resp["startDate"]
        stopDate = resp["stopDate"]
        # Add start/stop to each element in datasets list
        datasets[idx]["startDate"] = startDate
        datasets[idx]["stopDate"] = stopDate
        logging.info(f'  start = {startDate}\tstop = {stopDate}')

    # Save result so we don't need to recreate when we modify table and plot code.
    with open('availability.pkl', 'wb') as f:
        pickle.dump(datasets, f)
        logging.info('Saved availability.pkl')
else:
    with open('availability.pkl', 'rb') as f:
        datasets = pickle.load(f)
        logging.info('Read availability.pkl')

```

# Working with Metadata II.

Starting with

```python
import pickle
import logging

with open('/tmp/availability.pkl', 'rb') as f:
    datasets = pickle.load(f)
    logging.info('Read availability.pkl')
print(datasets)

start = "2003-10-31T23:00:00Z"
stop = "2003-10-31T23:59:00Z"
```

create a table that shows the spacecraft region at `2003-10-31T23:00:00Z` for every spacecraft available from SSCWeb. The table should have the columns `Spacecraft`, `Region`, and `Radial distance from Earth`.

## Solution

```python
import os
import pickle
import logging

from hapiclient import hapi
from hapiclient import hapitime2datetime

# Change INFO to WARNING or ERROR to suppress logging messages in this script
logging.basicConfig(level=logging.INFO)

short_run = True # If True, only get data for first four s/c (loops stop once n > 3)

server = 'https://hapi-server.org/servers/SSCWeb/hapi'
start = "2003-10-31T23:00:00Z"
stop = "2003-10-31T23:59:00Z"


# Warning: This if block was copied from availability.py
if not os.path.exists("availability.pkl"):

    resp = hapi(server)
    logging.info(resp)

    datasets = resp['catalog']
    logging.info(datasets)

    for idx, dataset in enumerate(datasets):
        logging.info(f'Working on dataset id {dataset["id"]}')
        resp = hapi(server, dataset["id"], logging=True)
        startDate = resp["startDate"]
        stopDate = resp["stopDate"]
        # Add start/stop to each element in datasets list
        datasets[idx]["startDate"] = startDate
        datasets[idx]["stopDate"] = stopDate
        logging.info(f'  start = {startDate}\tstop = {stopDate}')

    # Save result so we don't need to recreate when we modify table and plot code.
    with open('availability.pkl', 'wb') as f:
        pickle.dump(datasets, f)
        logging.info('Saved availability.pkl')
else:
    with open('availability.pkl', 'rb') as f:
        datasets = pickle.load(f)
        logging.info('Read availability.pkl')


print(80*"-")
print("Availability")
# Create table
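# hapitime2datetime returns an array; take the single element so that the
# comparisons below are between datetime objects
start_wanted = hapitime2datetime(start)[0]
stop_wanted  = hapitime2datetime(stop)[0]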
n = 0
for idx, dataset in enumerate(datasets):
    # Pad ids
    id = "{:15s}".format(datasets[idx]["id"])
    start_available = hapitime2datetime(datasets[idx]["startDate"])[0]
    stop_available = hapitime2datetime(datasets[idx]["stopDate"])[0]

    if start_available <= start_wanted and stop_available >= stop_wanted:
        print(f'{id}  {datasets[idx]["startDate"]}  {datasets[idx]["stopDate"]}')
        n = n+1

        logging.info(f'Getting data for {datasets[idx]}')
        data, meta = hapi(server, datasets[idx]["id"], 'Spacecraft_Region', start, stop, logging=False)

        if len(data['Spacecraft_Region']) > 0:
            datasets[idx]["Spacecraft_Region"] = data['Spacecraft_Region'][0]
            datasets[idx]["First_Value"] = data['Time'][0].decode('utf-8')
        else:
            datasets[idx]["Spacecraft_Region"] = None

    if short_run and n > 3:
        break


print(f'\n{n} s/c have ephemeris data from {start} to {stop}')
print(80*"-")
print(f"Spacecraft region for first available data between {start} and {stop}")
print("")
n = 0
for idx, dataset in enumerate(datasets):
    id = "{:15s}".format(datasets[idx]["id"])
    if "Spacecraft_Region" in datasets[idx]:
        n = n + 1
        if datasets[idx]["Spacecraft_Region"] is not None:
            print(f'{id}  {datasets[idx]["Spacecraft_Region"]}\t {datasets[idx]["First_Value"]}')
        else:
            print(f'{id}  No values available')

    if short_run and n > 3:
        break

```
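The solution above prints the region and the time of the first value, but not the radial distance that the problem also asks for. A minimal sketch of that column follows, assuming the three position components have already been read in Earth radii. Parameter names such as `X_GSE`, `Y_GSE`, and `Z_GSE` are an assumption here; check the parameter list returned by `hapi(server, dataset_id)` before requesting them. The helper names and sample values are hypothetical.

```python
import math


def radial_distance(x, y, z):
    """Distance from Earth's center, in the same units as x, y, and z."""
    return math.sqrt(x * x + y * y + z * z)


def format_row(spacecraft, region, x, y, z):
    """Format one table row: spacecraft, region, radial distance."""
    r = radial_distance(x, y, z)
    return f"{spacecraft:15s}  {region:12s}  {r:8.2f}"


# Hypothetical values for illustration only; not real SSCWeb output
row = format_row("example_sc", "some_region", 3.0, 4.0, 12.0)
print(row)
```

In the loop above, `x`, `y`, and `z` would be the first element of each position parameter, taken in the same way as `data['Spacecraft_Region'][0]`.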

# Working with Data

Many datasets from CDAWeb contain ephemeris (position) data for the associated satellite.

Use https://hapi-server.org/servers/ or https://heliophysicsdata.gsfc.nasa.gov/

1. to find a CDAWeb dataset that contains the ephemeris of a satellite, and
2. to find an SSCWeb dataset that contains the ephemeris of the same satellite.

Create a plot that compares the results.
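The two sources may report position in different units and at different cadences, so some preparation is usually needed before plotting. The sketch below converts kilometers to Earth radii and computes differences at timestamps that both series share. The Earth radius value, the helper names, and the sample numbers are assumptions for illustration; check the metadata of each dataset for its actual units.

```python
# Assumption: Earth radius in km. Use the value documented by your data source.
R_E_KM = 6378.16


def km_to_re(values_km):
    """Convert a list of distances from km to Earth radii."""
    return [value / R_E_KM for value in values_km]


def differences_at_common_times(times_a, values_a, times_b, values_b):
    """Return (time, a - b) pairs for timestamps present in both series."""
    lookup_b = dict(zip(times_b, values_b))
    return [(t, a - lookup_b[t])
            for t, a in zip(times_a, values_a) if t in lookup_b]


# Hypothetical values for illustration only
times = ["2003-10-31T23:00:00Z", "2003-10-31T23:01:00Z"]
x_km = [10 * R_E_KM, 20 * R_E_KM]   # stands in for a series given in km
x_re = [10.0, 19.5]                 # stands in for a series given in R_E

diffs = differences_at_common_times(times, km_to_re(x_km), times, x_re)
print(diffs)
```

Matching on identical timestamps only works when the cadences line up; otherwise, interpolate one series onto the other's time grid before taking differences.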

# Plotting

Use any plotting software to create a stack plot similar to the following.

# Data Fusion

Use SunPy to obtain solar images for the event of DATE. On this date, the X spacecraft was 200 R_E upstream of Earth (and so in the interplanetary medium). Obtain solar wind velocity measurements from X from DATE to DATE+4 days.

Can you see the signature of the solar event in the X spacecraft solar wind velocity measurements?
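When deciding where in the four-day window to look, a rough travel-time estimate helps. The sketch below assumes a constant radial speed and a spacecraft about 1 AU from the Sun (200 R_E is roughly 1.3 million km, under 1% of 1 AU). The speeds are illustrative, not measurements.

```python
AU_KM = 149_597_870.7    # one astronomical unit in km
SECONDS_PER_DAY = 86_400


def travel_time_days(speed_km_s, distance_km=AU_KM):
    """Days needed to cover distance_km at a constant speed in km/s."""
    return distance_km / speed_km_s / SECONDS_PER_DAY


# Illustrative speeds only
for speed in (400, 800, 1600):
    print(f"{speed:5d} km/s -> {travel_time_days(speed):.1f} days")
```

A disturbance that is faster than the ambient solar wind arrives earlier in the window, so the arrival time in the velocity data gives a rough check on the average transit speed.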