# Data Search and Download

In [None]:
import glob

import numpy as np
import matplotlib.pyplot as plt

import astropy.time
import astropy.units as u
from astropy.visualization import ImageNormalize, LogStretch, AsymmetricPercentileInterval

import sunpy.map
from sunpy.net import Fido, attrs as a


## Overview of the `Fido` Unified Downloader

* Fido is sunpy's interface for searching and downloading solar physics data.
* It offers a unified interface for searching and fetching data, regardless of the underlying client or web service that provides the data.
* It can search and access multiple instruments and all available data providers in a single query.
* It supplies a single, easy, consistent, and *extendable* way to get most forms of solar physics data the community needs.

Fido offers access to data available through:

 * **VSO**
 * **JSOC**
 * **Individual data providers** from web-accessible sources (HTTP, FTP, etc.)
 * **HEK**
 * **HELIO**
 
As described above, Fido provides access to many sources of data through different clients. These clients can be defined inside sunpy or in other packages.
Let's print the current list of available clients within sunpy.

In [None]:
Fido

Sunpy also provides tab completion for attribute values. Type `a.Instrument.` and press Tab to list the available instrument names.

In [None]:
a.Instrument.

## Searching for Data

Sunpy uses *attributes* to search for data with Fido. These attributes live in the `attrs` submodule (imported above as `a`). They can be combined to construct a search query, for example a query for data from a certain instrument, at a certain wavelength, over a certain time period.

Different clients and providers have their own client-specific attributes, but the core attributes are:

* `a.Time`
* `a.Instrument`
* `a.Wavelength`

Let's use these different attributes to construct a query for our CME observation.

In [None]:
cme_start = "2022-03-28T11:00"
cme_end = "2022-03-28T14:00"

In [None]:
cme_time = a.Time(cme_start, cme_end)

We can inspect the instrument attribute to see which instruments are currently supported through sunpy. The output lists the instrument name (i.e. the name to be passed to the `a.Instrument` attribute), the client through which the data is available, and the full name of the instrument.

In [None]:
a.Instrument

We can combine our time and instrument attributes to search for AIA data within our selected time range using `Fido.search`.

In [None]:
Fido.search(cme_time & a.Instrument.aia)

We can further filter our results using the `Wavelength` search attribute.

In [None]:
Fido.search(cme_time & a.Instrument.aia & a.Wavelength(304*u.angstrom))

<div class="alert alert-block alert-warning">
    <h3><u>EXERCISE:</u> <br><br>We want to query the AIA data at a 5 minute cadence rather than the full 12 second cadence. How would we modify our above query to accomplish this?</h3>
</div>

In [None]:
# aia_query = cme_time & a.Wavelength(304*u.angstrom) & a.Instrument.aia & a.Sample(5*u.min)
aia_query = ... # put 5 minute cadence query here

In [None]:
Fido.search(aia_query)
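
To get a feel for why a coarser cadence is useful, the following sketch estimates how many images one passband produces in our three-hour window at each cadence. It uses only the standard library and assumes evenly spaced observations with no gaps, so the numbers are idealized rather than what a real search returns.

```python
from datetime import datetime, timedelta

# Same window as cme_start and cme_end above.
window = datetime(2022, 3, 28, 14, 0) - datetime(2022, 3, 28, 11, 0)

# Idealized image counts for one passband, assuming evenly spaced
# observations with no gaps (real data will differ slightly).
n_full = window // timedelta(seconds=12)
n_sampled = window // timedelta(minutes=5)

print(f"12 s cadence:  ~{n_full} images")
print(f"5 min cadence: ~{n_sampled} images")
```

Under these assumptions, sampling reduces the result from roughly 900 images to roughly 36.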

<img src="images/Solar-MACH_2022-02-16_00-00-00.png" width="55%"></img>

<div class="alert alert-block alert-warning">
    <h3><u>EXERCISE:</u> <br><br>We've written a query for the AIA data above. How would we write a query for EUVI data from STEREO-A for the same time range, cadence, and wavelength?</h3>
</div>

In [None]:
stereo_query = ...

### Combining Queries

In addition to making queries for individual instruments, we can also logically combine queries for multiple instruments at once. For example, we can search for data from both AIA and SECCHI for the same time range and passband:

In [None]:
Fido.search(cme_time, a.Instrument.aia | a.Instrument.secchi, a.Wavelength(304*u.angstrom), a.Sample(10*u.minute))

What if we also wanted to look for the GOES XRS data during this same interval?
GOES/XRS data does not have a `Wavelength` or `Sample` associated with it, but we can still combine the queries for all three of these instruments.

In [None]:
aia_or_secchi = (a.Instrument.aia | a.Instrument.secchi) & a.Wavelength(304*u.angstrom) & a.Sample(5*u.minute)

In [None]:
goes_query = a.Instrument.xrs & a.goes.SatelliteNumber(17)

In [None]:
combined_query = Fido.search(cme_time, aia_or_secchi | goes_query)

In [None]:
combined_query
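
The way `&` and `|` combine in the query above can be illustrated with a toy model in plain Python. This is not sunpy's implementation: the helper names `attr`, `all_of`, and `any_of` and the string labels are hypothetical. Each query is modeled as a list of alternatives, and each alternative is a set of conditions that must all hold.

```python
from itertools import product


def attr(label):
    """A single condition: one alternative containing one label."""
    return [frozenset([label])]


def any_of(*queries):
    """Model of `|`: the alternatives of every query are kept side by side."""
    return [branch for query in queries for branch in query]


def all_of(*queries):
    """Model of `&`: pick one alternative from each query and merge them."""
    return [frozenset().union(*combo) for combo in product(*queries)]


time = attr("Time 11:00-14:00")
aia_or_secchi = all_of(
    any_of(attr("aia"), attr("secchi")),
    attr("Wavelength 304 A"),
    attr("Sample 5 min"),
)
goes = all_of(attr("xrs"), attr("SatelliteNumber 17"))

# Mirrors Fido.search(cme_time, aia_or_secchi | goes_query)
combined = all_of(time, any_of(aia_or_secchi, goes))

for branch in combined:
    print(" & ".join(sorted(branch)))
```

In this model the combined query expands into three alternatives, and the wavelength and sample conditions appear only in the AIA and SECCHI alternatives. That is why the GOES/XRS query can sit alongside them even though it has no `Wavelength` or `Sample`.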

### Using External `Fido` Clients and Post-search filtering

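External packages can register additional `Fido` clients. For example, importing `sunpy_soar` registers a client for the Solar Orbiter Archive (SOAR), which then appears in the list of available clients.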

In [None]:
import sunpy_soar
from sunpy_soar.attrs import Product

In [None]:
Fido

We'll search for the level 2 EUI data over the same time range.

In [None]:
fsi_304_low_cadence = Fido.search(
    a.Time(cme_start, cme_end) & a.Level(2) & a.soar.Product('EUI-FSI304-IMAGE')
)

In [None]:
fsi_304_low_cadence

Unlike the VSO, our SOAR search does not support the `Sample` attribute for adjusting the cadence of our search.
Thus, this query returns the data at the full cadence (~2-4 minutes).
For the phenomenon we're interested in looking at, a ~10-15 minute cadence is sufficient so we'll select every 5th file.
This also means we'll have fewer files to deal with.
We can accomplish this by applying a post-search filter to our search result.
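
The "every 5th file" selection is a stride slice. The sketch below shows the idea on a plain Python list. The timestamps are made up for illustration (a fixed 3 minute cadence standing in for the ~2-4 minute FSI cadence), and the list is only a stand-in for the rows of a search result. Search results in sunpy are table-like, so a similar `[::5]` slice on the rows of a single-client result is expected to work, but that is an assumption here and worth checking against the sunpy documentation.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for a search result: one timestamp per file,
# at a fixed 3 minute cadence across the CME interval.
start = datetime(2022, 3, 28, 11, 0)
end = datetime(2022, 3, 28, 14, 0)
full_cadence = []
t = start
while t < end:
    full_cadence.append(t)
    t += timedelta(minutes=3)

# Keep every 5th entry, starting with the first.
low_cadence = full_cadence[::5]

print(len(full_cadence), "files at full cadence")
print(len(low_cadence), "files after filtering")
print("new cadence:", low_cadence[1] - low_cadence[0])
```

With these made-up timestamps, the slice reduces 60 files to 12 and the spacing between files becomes 15 minutes.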

## Downloading Data

We can easily make a single download request from all of our different clients by passing in our combined query for AIA, STEREO, and GOES XRS as well as our filtered EUI query.

In [None]:
files = Fido.fetch(combined_query, fsi_304_low_cadence, path='data/{instrument}')

In [None]:
files
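
The `{instrument}` placeholder in the `path` argument is filled in for each file from the search result metadata, so files from different instruments end up in separate directories. A rough sketch of that expansion using ordinary string formatting is shown below. The metadata rows are made up, and the exact values that sunpy substitutes (for example, their capitalization) are an assumption.

```python
from pathlib import PurePosixPath

template = "data/{instrument}"

# Hypothetical metadata, one dict per file in a search result.
rows = [
    {"instrument": "AIA"},
    {"instrument": "AIA"},
    {"instrument": "SECCHI"},
    {"instrument": "XRS"},
]

# Fill in the template for each file and collect the distinct directories.
dirs = sorted({str(PurePosixPath(template.format(**row))) for row in rows})
print(dirs)
```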