<a id="top"></a>
# Downloading HST Data by Proposal

This tutorial is aimed at researchers of any level looking for _specific_ observations from an HST program. It covers the basics of authentication, data search, and data downloads.

By the end of this tutorial, you will:

* Know how to log in and out to access data with astroquery.
* Be able to search for data based on proposal ID.
* Be able to download filtered data products from the MAST Archive.

## Table of Contents
* [Imports](#Imports)
* [Proprietary Data: Logging In/Out](#log)
* [Searching for HST Observations](#search)
* [Getting Associated Data Products](#adp)
* [Filtering and Downloading Data](#filtdown)
    * [Filtering for Desired Products](#filt)
    * [Direct Downloads](#direct)
    * [Downloading via cURL](#curl)

## Imports
The `astroquery.mast` module is the quickest, most convenient way to access MAST data in Python. To read more about its features, visit the [astroquery.mast readthedocs](https://astroquery.readthedocs.io/en/latest/mast/mast.html).

The `Observations` class is the only import we'll need.

In [1]:
from astroquery.mast import Observations

---
<a id="log"></a>
## Proprietary Data: Logging In/Out
Most datasets in MAST are publicly accessible. However, during the exclusive access period (EAP), observations are only available to the PI and their team. Accessing this data requires authentication.

To start accessing proprietary data, you'll need an authorized [MyST account](https://archive.stsci.edu/registration/index.html).

For access through the API, you'll need to generate an API token. To create and view tokens associated with your account, visit https://auth.mast.stsci.edu/tokens.

There are several ways to enter your token, including:
1. Manual response to prompt from `Observations.login()` (must be done every time)
2. Python keyring; either through the `keyring` library or `Observations.login`
3. Storing it in the bash environment variable `$MAST_API_TOKEN`

This flexibility can be overwhelming at first; let's take a look at some examples of these methods below.

In [None]:
# Option 1: Respond to prompt. Uncomment the line below to use this option.
#Observations.login()

This works well for infrequent API users, but storing the token is far more convenient for repeated logins. You can store it using the built-in `store_token` flag:

In [None]:
# Option 2: Store Token. Uncomment the line below to use this option.
#Observations.login(store_token=True)

Using `store_token=True` allows us to log in automatically, without re-entering the token, for as long as the token remains valid. Note that tokens expire after 10 days of inactivity or 60 days after creation, whichever comes first. Once a token expires, generate a new one and log in with `reenter_token=True` to overwrite the stored token.
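To make the "whichever comes first" rule concrete, here is a minimal sketch that computes when a token would expire. The 10-day and 60-day limits are the values quoted above; the function name `token_expiry` and the example dates are hypothetical and only serve as illustration.

```python
from datetime import datetime, timedelta

# Lifetimes quoted in this tutorial; treat them as illustrative values.
INACTIVITY_LIMIT = timedelta(days=10)
MAX_LIFETIME = timedelta(days=60)


def token_expiry(created, last_used):
    """Return the earlier of the two expiry deadlines for a token."""
    return min(created + MAX_LIFETIME, last_used + INACTIVITY_LIMIT)


# Hypothetical dates: a token created on Jan 1 and last used on Jan 20.
created = datetime(2023, 1, 1)
last_used = datetime(2023, 1, 20)
print("Token expires on:", token_expiry(created, last_used).date())
```

In this example the inactivity deadline arrives well before the 60-day limit, so a token that is used only occasionally will usually expire from inactivity.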

The third option is to store the token as the bash environment variable `$MAST_API_TOKEN`. This method varies from system to system; for more details, you can check out [this guide](https://www3.ntu.edu.sg/home/ehchua/programming/howto/Environment_Variables.html) (links to a non-STScI site).
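If you go the environment-variable route, a quick check from Python can confirm that the variable is actually visible to your session. The sketch below assumes the token is stored in `MAST_API_TOKEN`; the helper names `token_from_env` and `login_with_env_token` are hypothetical. It passes the value explicitly through the `token` argument of `Observations.login`, and the login helper is defined but not called here.

```python
import os


def token_from_env(var_name="MAST_API_TOKEN"):
    """Return the token stored in the environment, or None if it is unset or empty."""
    token = os.environ.get(var_name, "").strip()
    return token or None


def login_with_env_token():
    """Log in to MAST with the token from the environment (hypothetical helper)."""
    token = token_from_env()
    if token is None:
        raise RuntimeError("MAST_API_TOKEN is not set in this environment")
    # Imported here so the check above works even without astroquery installed.
    from astroquery.mast import Observations
    Observations.login(token=token)


if token_from_env():
    print("Token found in environment.")
else:
    print("No token found; only public data will be accessible.")
```

If the check reports no token even though you exported it, the variable was most likely set in a different shell session than the one that launched Python.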

Let's take a minute to verify that our login was successful:

In [None]:
session_info = Observations.session_info()

You should see all of your information above. If not, verify that your token and MyST account are active.

And of course, if the need arises, we can log out:

In [None]:
Observations.logout()
session_info = Observations.session_info()

<a id="search"></a>
## Searching for HST Observations
The `query_criteria` function is the most versatile option for searching for your data. We'll use this to specify a proposal ID, but you can also find a list of all other available filters on the [field descriptions page](https://mast.stsci.edu/api/v0/_c_a_o_mfields.html).

In [15]:
proposal_obs = Observations.query_criteria(proposal_id=7291)
print("Number of observations:",len(proposal_obs))

Number of observations: 5


To look at the first five results returned by this query, run the cell below.

In [16]:
# Preview the first five observations in a table
proposal_obs

dataproduct_type,calib_level,obs_collection,obs_id,target_name,s_ra,s_dec,t_min,t_max,t_exptime,wavelength_region,filters,em_min,em_max,target_classification,obs_title,t_obs_release,instrument_name,proposal_pi,proposal_id,proposal_type,project,sequence_number,provenance_name,s_region,jpegURL,dataURL,dataRights,mtFlag,srcDen,intentType,obsid,objID
str8,int64,str3,str9,str7,float64,float64,float64,float64,float64,str7,str6,float64,float64,str6,str48,float64,str13,str16,str4,str2,str3,int64,str7,str132,str34,str35,str6,bool,float64,str7,str8,str9
spectrum,3,HST,o58701010,TON1480,183.7882083333,33.16516666667,51190.01461732639,51190.0397909375,2175.0,UV,G140L,115.0,173.0,GALAXY,Cooled Gas in X-Ray Emitting Elliptical Galaxies,51555.03978005,STIS/FUV-MAMA,"Bregman, Joel N.",7291,GO,HST,--,CALSTIS,POLYGON 183.79519926 33.16953302 183.78125026 33.16059724 183.78122982 33.16061959 183.79517882 33.16955537 183.79519926 33.16953302,mast:HST/product/o58701010_x1d.png,mast:HST/product/o58701010_flt.fits,PUBLIC,False,,science,24919901,153504018
spectrum,3,HST,o58701030,TON1480,183.7882083333,33.16516666667,51190.13977920139,51190.17388799768,2947.0,UV,G140L,115.0,173.0,GALAXY,Cooled Gas in X-Ray Emitting Elliptical Galaxies,51555.1738773,STIS/FUV-MAMA,"Bregman, Joel N.",7291,GO,HST,--,CALSTIS,POLYGON 183.79519926 33.16953302 183.78125026 33.16059724 183.78122982 33.16061959 183.79517882 33.16955537 183.79519926 33.16953302,mast:HST/product/o58701030_x1d.png,mast:HST/product/o58701030_flt.fits,PUBLIC,False,,science,24919903,153503852
image,1,HST,o58701byq,TON1480,183.7882083333,33.16516666667,51190.00671253472,51190.00872850695,1.5,Optical,MIRVIS,,,GALAXY,Cooled Gas in X-Ray Emitting Elliptical Galaxies,51555.00872681,STIS/CCD,"Bregman, Joel N.",7291,GO,HST,--,CALSTIS,POLYGON 183.79705483 33.16183874 183.78950215 33.15707092 183.77936635 33.16832094 183.78691941 33.17308937 183.79705483 33.16183874,--,mast:HST/product/o58701byq_raw.fits,PUBLIC,False,,science,24454469,121753849
spectrum,3,HST,o58701020,TON1480,183.7882083333,33.16516666667,51190.07262658565,51190.106735416666,2947.0,UV,G140L,115.0,173.0,GALAXY,Cooled Gas in X-Ray Emitting Elliptical Galaxies,51555.10672453,STIS/FUV-MAMA,"Bregman, Joel N.",7291,GO,HST,--,CALSTIS,POLYGON 183.79519926 33.16953302 183.78125026 33.16059724 183.78122982 33.16061959 183.79517882 33.16955537 183.79519926 33.16953302,mast:HST/product/o58701020_x1d.png,mast:HST/product/o58701020_flt.fits,PUBLIC,False,,science,24919902,153503579
image,1,HST,o58701bzq,TON1480,183.7882083333,33.16516666667,51190.01169290509,51190.011701006944,0.7,Optical,MIRVIS,,,GALAXY,Cooled Gas in X-Ray Emitting Elliptical Galaxies,51555.0116898,STIS/CCD,"Bregman, Joel N.",7291,GO,HST,--,CALSTIS,POLYGON 183.79519926 33.16953302 183.78125026 33.16059724 183.78122982 33.16061959 183.79517882 33.16955537 183.79519926 33.16953302,--,mast:HST/product/o58701bzq_raw.fits,PUBLIC,False,,science,24454470,124583200
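The `t_min`, `t_max`, and `t_obs_release` columns are hard to read at a glance. Assuming they are Modified Julian Dates, as the field descriptions page indicates, the sketch below converts two values from the first row into calendar dates. The helper `mjd_to_datetime` is hypothetical and ignores the distinction between time scales, which does not matter at day-level precision.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to an approximate UTC datetime."""
    return MJD_EPOCH + timedelta(days=mjd)


# Values copied from the first row of the table above (obs_id o58701010).
t_min = 51190.01461732639
t_obs_release = 51555.03978005

print("Observation start:", mjd_to_datetime(t_min).date())
print("Public release:   ", mjd_to_datetime(t_obs_release).date())
print("Days between:      {:.1f}".format(t_obs_release - t_min))
```

The gap between the observation and its release date corresponds to the exclusive access period described earlier; once the release date has passed, `dataRights` reads `PUBLIC`.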


<a id="adp"></a>
## Getting Associated Data Products

Let's get the data products associated with the first two observations.

In [17]:
data_products = Observations.get_product_list(proposal_obs[0:2])

# Print the results
print("Number of results:",len(data_products))
print(data_products)

Number of results: 30
 obsID   obs_collection dataproduct_type ... dataRights calib_level
-------- -------------- ---------------- ... ---------- -----------
24919901            HST         spectrum ...     PUBLIC           1
24919901            HST         spectrum ...     PUBLIC           1
24919901            HST         spectrum ...     PUBLIC           1
24919901            HST         spectrum ...     PUBLIC           1
24919901            HST         spectrum ...     PUBLIC           1
24919901            HST         spectrum ...     PUBLIC           1
     ...            ...              ... ...        ...         ...
24919903            HST         spectrum ...     PUBLIC           1
24919903            HST         spectrum ...     PUBLIC           2
24919903            HST         spectrum ...     PUBLIC           1
24919903            HST         spectrum ...     PUBLIC           2
24919903            HST         spectrum ...     PUBLIC           2
24919903            HST   

Note that even though we used just two observations, there are thirty data products available for download. 

<a id="filtdown"></a>
## Filtering and Downloading Data
<a id="filt"></a>
### Filtering for Desired Products

You can apply filter keyword arguments to download only data products that meet your given criteria. Available filters are `mrp_only` ([minimum recommended products](https://outerspace.stsci.edu/display/MASTDOCS/Minimum+Recommended+Products)), `extension` (file extension), `calib_level` (calibration level), and all product fields listed [here](https://mast.stsci.edu/api/v0/_productsfields.html).

In this example, let's try filtering for only the level 2, calibrated exposures. Let's also filter by "SCIENCE" type products; this will exclude the preview images from our download.

In [24]:
filtered_prod = Observations.filter_products(data_products, calib_level=[2], productType="SCIENCE")

# Display columns of interest for convenience
disp_col = ['obsID','dataproduct_type','productFilename','size','calib_level']
filtered_prod[disp_col]

obsID,dataproduct_type,productFilename,size,calib_level
str8,str8,str18,int64,int64
24919901,spectrum,o58701010_x1d.fits,77760,2
24919901,spectrum,o58701010_x2d.fits,14474880,2
24919901,spectrum,o58701010_flt.fits,10535040,2
24919903,spectrum,o58701030_x1d.fits,77760,2
24919903,spectrum,o58701030_x2d.fits,14474880,2
24919903,spectrum,o58701030_flt.fits,10535040,2


We've reduced the number of files from 30 down to 6. Before we download, let's add up the sizes to make sure we have the necessary space on our computers to download.

In [26]:
total = sum(filtered_prod['size'])
print('{:.2f} MB'.format(total/10**6))

50.18 MB


This isn't too bad. For selections larger than a few gigabytes, avoid a direct download; cURL is more robust against connection issues.
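The size check can be taken one step further by comparing the total against the free space on the target disk. This is a minimal sketch using only the standard library; the sizes are hard-coded from the filtered table above so the block runs on its own, and in the notebook you would use `filtered_prod['size']` instead. The current directory as download location is an assumption.

```python
import shutil

# Sizes in bytes, copied from the filtered product table above.
sizes = [77760, 14474880, 10535040, 77760, 14474880, 10535040]
needed = sum(sizes)

# Free space on the disk holding the current working directory.
free = shutil.disk_usage(".").free

print("Download size: {:.2f} MB".format(needed / 10**6))
print("Free space:    {:.2f} MB".format(free / 10**6))

if needed > free:
    print("Not enough free space; choose another location or filter further.")
else:
    print("Enough free space for this download.")
```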

<a id="direct"></a>
### Direct Downloads
Let's pass our filtered products to `download_products`. This will immediately send a request to MAST and begin the download. 

This function also produces a manifest that we'll want to capture and print.

In [28]:
manifest = Observations.download_products(filtered_prod)
print(manifest)

INFO: Found cached file ./mastDownload/HST/o58701010/o58701010_x1d.fits with expected size 77760. [astroquery.query]
INFO: Found cached file ./mastDownload/HST/o58701010/o58701010_x2d.fits with expected size 14474880. [astroquery.query]
INFO: Found cached file ./mastDownload/HST/o58701010/o58701010_flt.fits with expected size 10535040. [astroquery.query]
INFO: Found cached file ./mastDownload/HST/o58701030/o58701030_x1d.fits with expected size 77760. [astroquery.query]
INFO: Found cached file ./mastDownload/HST/o58701030/o58701030_x2d.fits with expected size 14474880. [astroquery.query]
INFO: Found cached file ./mastDownload/HST/o58701030/o58701030_flt.fits with expected size 10535040. [astroquery.query]
                   Local Path                    Status  Message URL 
----------------------------------------------- -------- ------- ----
./mastDownload/HST/o58701010/o58701010_x1d.fits COMPLETE    None None
./mastDownload/HST/o58701010/o58701010_x2d.fits COMPLETE    None None
./mast

The manifest tells us where each file was downloaded to, as well as the status of the download. If there was an error, the 'Message' field will contain a description of what went wrong.
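For downloads with many files, scanning the manifest by eye gets tedious. The sketch below collects every row whose status is not `COMPLETE`. The helper `failed_downloads` is hypothetical; it only relies on row-wise iteration and access by column name, so it should work on the manifest table as well as on the stand-in list of dictionaries used here. The second row of the example, including its message, is made up for illustration.

```python
def failed_downloads(manifest):
    """Return (path, message) pairs for rows that did not complete."""
    return [
        (row["Local Path"], row["Message"])
        for row in manifest
        if row["Status"] != "COMPLETE"
    ]


# Stand-in manifest; the ERROR row is invented to show the failure case.
example_manifest = [
    {"Local Path": "./mastDownload/HST/o58701010/o58701010_x1d.fits",
     "Status": "COMPLETE", "Message": None},
    {"Local Path": "./mastDownload/HST/o58701030/o58701030_x1d.fits",
     "Status": "ERROR", "Message": "hypothetical network failure"},
]

for path, message in failed_downloads(example_manifest):
    print("Problem with {}: {}".format(path, message))
```

In the notebook, calling `failed_downloads(manifest)` on the real manifest and getting an empty list would indicate that every file completed.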

<a id="curl"></a>
### Downloading via cURL

For large downloads, or those which contain many files, it is wise to use cURL. cURL is more robust against network disruptions, which is helpful when a download will take a while to complete.

When we set `curl_flag=True`, we only download a bash script, which in turn uses cURL to fetch the files. You'll have to run that script on your local machine.

In [31]:
manifest = Observations.download_products(filtered_prod, curl_flag=True)
print(manifest)

Downloading URL https://mast.stsci.edu/api/v0.1/Download/bundle.sh to ./mastDownload_20230120115011.sh ... [Done]
           Local Path             Status  Message
-------------------------------- -------- -------
./mastDownload_20230120115011.sh COMPLETE    None


You can run the script in your terminal by navigating to the desired download location and typing `bash [filename].sh`. For Windows users, this will require [Cygwin](https://www.cygwin.com) or another program that supports bash scripts. You may be prompted for your API token.

## About this Notebook
For additional questions, comments, or feedback, please email `archive@stsci.edu`.

**Authors:** Thomas Dutkiewicz <br>
**Keywords:** HST, MAST, authentication <br>
**Last Updated:** Jan 2023 <br>

## Citations

If you use `astroquery` for published research, please [cite](https://github.com/astropy/astroquery/blob/main/astroquery/CITATION) the
authors.

[Top of Page](#top)
<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/notebooks/master/assets/stsci_pri_combo_mark_horizonal_white_bkgd.png" alt="Space Telescope Logo" width="200px"/> 