<a id="top"></a>
# ULLYSES Data Download Tutorial

***

## Learning Goals

By the end of this tutorial, you will:

- Know how to use Astroquery to download ULLYSES HLSPs
- Be able to use `ullyses_utils.select_pids` to get program IDs (PIDs) for desired subsets of ULLYSES programs, and know how to download those datasets
- Understand where to find and how to download the files that went into a ULLYSES HLSP


## Table of Contents
**0. [Introduction](#introduction)**

**1. [Downloading HLSP Files Using Astroquery](#astroquery)**

**2. [Using ullyses_utils.select_pids](#selectpids)**

**3. [Downloading HLSP Constituent Raw Data Files](#constiuent)**

## Introduction
The Hubble Space Telescope’s (HST) Ultraviolet Legacy Library of Young Stars as Essential Standards ([ULLYSES](https://ullyses.stsci.edu/index.html)) program has devoted approximately 1,000 HST orbits to the production of an ultraviolet spectroscopic library of young high- and low-mass stars in the local universe. This Director’s Discretionary program has been designed to take advantage of HST’s unique UV capabilities, as both high- and low-mass stars feature different complex UV emission processes that strongly impact their surroundings, but are difficult to model. The UV emission from star formation is central to a wide range of vital astrophysical issues, ranging from cosmic reionization to the formation of planets.

The ULLYSES program has uniformly sampled the fundamental astrophysical parameter space for each mass regime — including spectral type, luminosity class, and metallicity for massive OB stars (in the Magellanic Clouds and two other lower-metallicity nearby galaxies) and the mass, age, and disk accretion rate for low-mass T Tauri stars (in eight young Galactic associations). The data were gathered over a three-year period, from Cycle 27 through Cycle 29 (2020-2022).

The ULLYSES team produces several types of High Level Science Products (HLSPs). Products are made using both archival data and new HST observations obtained through the ULLYSES program. Data products are available from the [ULLYSES website](https://ullyses.stsci.edu/ullyses-download.html) (HLSPs and contributing data), the [MAST Data Discovery Portal](https://mast.stsci.edu/) (HLSPs and contributing data), directly as a High-Level Science Product collection using the [DOI](https://archive.stsci.edu/hlsp/ullyses) (HLSPs only), or using the Python package [Astroquery](https://astroquery.readthedocs.io/en/latest/) to search for and download files from Python scripts you write.

This notebook will guide users through downloading HLSPs and raw ULLYSES data through various means using Astroquery.

### Imports
<!-- - *numpy* to handle array functions -->
- *astropy.io fits* for accessing FITS files
- *astropy.table Table* for creating tidy tables of the data
<!-- - *matplotlib.pyplot* for plotting data -->
<!-- - *Path* to create product and data directories -->
- *os* to build file paths
- *shutil* to perform directory and file operations
- *glob* to work with multiple files in our directories
<!-- - *pandas* to support data analysis -->
- *astroquery* to download HLSPs and raw data files
- *ullyses_utils* to select ULLYSES program IDs (PIDs)

In [None]:
# %matplotlib inline
# import numpy as np
from astropy.io import fits
# from astropy.wcs import WCS
from astropy.table import Table
# import matplotlib.pyplot as plt
# plt.rcParams['figure.figsize']=10,6
# plt.style.use('seaborn-v0_8-notebook')
import os
# from pathlib import Path
import shutil
import glob
from astroquery.mast import Observations

# ullyses_utils is a separate package and must be installed before this import will work

from ullyses_utils.select_pids import select_all_pids, select_pids

***

<a id="astroquery"></a>
# Downloading HLSP Files Using Astroquery

The `Observations` class in `astroquery.mast` provides several functions for searching and downloading data products from MAST. We can use the `query_criteria` function to search for all ULLYSES HLSP data available on MAST with the search criterion `provenance_name='ULLYSES'`:

In [None]:
search = Observations.query_criteria(provenance_name='ULLYSES')

The above cell returns `search`, an Astropy `Table` that holds our search results. We can print the results next:

In [None]:
# This down-selects the number of columns printed to just a few, which makes it easier to read
search.pprint_include_names = ('target_name', 's_ra', 's_dec', 'proposal_id', 'instrument_name', 'filters')

search.pprint()

# To print all columns use:
# search.pprint_all()

# Or to print a single column use:
# search['target_name'].pprint()

As you can see, there are several results returned for the initial search. We can down-select our search by adding the name of the target we want, using the criteria `target_name='V505-ORI'`.

In [None]:
search_target = Observations.query_criteria(target_name='V505-ORI', provenance_name='ULLYSES')

Next, we use the function `get_product_list` with the input of the observation IDs that are returned from the query in the previous cell to get a table of all the data products that are available in MAST for these observations.

In [None]:
data_products = Observations.get_product_list(search_target['obsid'])
print(data_products['productFilename'])

Then, we can use the function `download_products` to download all the files from our search. The user can specify where to download the data products with `download_dir`. Note that we specify `extension=['fits']`; `yaml` files are also available, and you can download them by adding `'yaml'` to the extension list. We must also specify `obs_collection=['HLSP']` so that we only get the high level science products, and not the constituent data. If you want both, omit the `obs_collection` argument altogether.

In [None]:
output_dir = './v505-ori/'
Observations.download_products(data_products, download_dir=output_dir, extension=['fits'], obs_collection=['HLSP'])

All ULLYSES HLSP data products are now downloaded to your `download_dir`, inside nested sub-directories under `mastDownload`. We can move these into a flatter folder structure using the following function.

In [None]:
def mv_downloads(output_path):
    # Specify the path to where the downloads were placed
    mast_path = os.path.join(output_path, 'mastDownload')

    # Get a list of all obs_id folders. Each folder contains the FITS files
    obs_id_dirs = glob.glob(os.path.join(mast_path, '*', '*'))

    # Iterate through each of the sub-folders to change the path of each FITS file
    for subdir in obs_id_dirs:

        # Get a list of all FITS files in the current ./mastDownload/*/<obs_id> folder
        sub_files = glob.glob(os.path.join(subdir, '*fits'))

        # Move each file to the top level of output_path (./v505-ori/ in this example)
        for file in sub_files:
            new_path = os.path.join(output_path, os.path.basename(file))
            shutil.move(file, new_path)

    # Last, remove the mastDownload directory
    shutil.rmtree(mast_path)

mv_downloads(output_dir)

Now, all your HLSP files should be nice and tidy in your output directory!
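
The same tidy-up step can also be sketched with `pathlib`. The function name `flatten_downloads` and its `pattern` and `dry_run` parameters are hypothetical, introduced only for this illustration. Unlike the function above, this version assumes you would rather skip a file than overwrite one that already exists, and it only removes `mastDownload` once no files remain inside it.

```python
from pathlib import Path
import shutil


def flatten_downloads(output_path, pattern="*.fits", dry_run=False):
    """Move files matching `pattern` from <output_path>/mastDownload/ up to <output_path>.

    Returns the list of destination paths. A file whose name already exists
    at the destination is left where it is and not overwritten.
    """
    output_path = Path(output_path)
    mast_path = output_path / "mastDownload"
    moved = []
    if not mast_path.is_dir():
        return moved  # nothing to tidy up

    # rglob searches every level below mastDownload/
    for src in sorted(mast_path.rglob(pattern)):
        dest = output_path / src.name
        if dest.exists():
            continue  # skip rather than overwrite
        if not dry_run:
            shutil.move(str(src), str(dest))
        moved.append(dest)

    # Remove mastDownload/ only when no files are left inside it
    if not dry_run and not any(p.is_file() for p in mast_path.rglob("*")):
        shutil.rmtree(mast_path)
    return moved
```

Calling `flatten_downloads('./v505-ori/', dry_run=True)` first lists what would be moved without touching any files.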

<a id="selectpids"></a>
# Using ullyses_utils.select_pids

The `select_pids` module in `ullyses_utils` has two functions that allow users to select the PIDs of both ULLYSES-observed and archival datasets for specific subsets of data. For example, one can select a specific region or target type. We will show some examples of these selections next.

The function `select_all_pids()` is useful for selecting different target types across many regions. We can see more information in the function's docstring by typing the following:

In [None]:
select_all_pids?

The function `select_pids` gives all the PIDs of a certain region regardless of target type. See the options in the docstring below:

In [None]:
select_pids?

Example 1: select all ULLYSES PIDs

In [None]:
all_ullyses_pids = select_all_pids()
print(all_ullyses_pids)

Example 2: select all SMC and LMC targets and their extras

In [None]:
all_smc_lmc_pids = select_pids('smc') + select_pids('lmc')
print(all_smc_lmc_pids)
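
Concatenating two PID lists with `+` keeps any PID that appears in both. Whether the SMC and LMC lists actually overlap is not something this notebook confirms, so the following is only a minimal sketch of how duplicates could be dropped if they do occur. The helper name `merge_pid_lists` is hypothetical, and the PID values in the example are made-up placeholders, not real program IDs.

```python
def merge_pid_lists(*pid_lists):
    """Combine several PID lists, keeping first-seen order and dropping repeats."""
    seen = set()
    merged = []
    for pids in pid_lists:
        for pid in pids:
            if pid not in seen:
                seen.add(pid)
                merged.append(pid)
    return merged


# Placeholder values for illustration only
print(merge_pid_lists([10001, 10002], [10002, 10003]))
```

With the real lists, this would be called as `merge_pid_lists(select_pids('smc'), select_pids('lmc'))`. The values are passed through unchanged, so the result can be used wherever the concatenated list is used.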

Example 3: select all massive stars without the extras, and return the PIDs separated into ULLYSES PIDs and archival PIDs

In [None]:
massive_pids = select_all_pids(massive=True, extra=False, single_list=False)
print(massive_pids['ULLYSES'])
print(massive_pids['ARCHIVAL'])

We can again use Astroquery to download the HLSPs from these selected PIDs. The following example is for our second selection on all SMC and LMC targets and their extras.

In [None]:
search_pids = Observations.query_criteria(proposal_id=all_smc_lmc_pids, provenance_name='ULLYSES')
search_pids.pprint()

data_products_pids = Observations.get_product_list(search_pids['obsid'])

output_dir_smc_lmc = './all_smc_lmc'

Observations.download_products(data_products_pids, download_dir=output_dir_smc_lmc, extension=['fits'], obs_collection=['HLSP'])

mv_downloads(output_dir_smc_lmc)

<a id="constiuent"></a>
# Downloading HLSP Constituent Raw Data Files

For the last example, we'll show how to download the raw data files that made up an HLSP. We will use one of the data sets that we downloaded in the previous example. Our HLSP will be the file `hlsp_ullyses_hst_cos_ngc1818-rob-d1_g130m-g160m-g185m_dr6_cspec.fits`, which is a level 4 product for the target NGC1818-ROB-D1. We can look at the provenance table in this file to find the constituent spectra.

In [None]:
hlsp_file_path = os.path.join(output_dir_smc_lmc, 'hlsp_ullyses_hst_cos_ngc1818-rob-d1_g130m-g160m-g185m_dr6_cspec.fits')
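
The HLSP filename itself encodes the target, gratings, and data release. As a rough sketch, it can be split on underscores into named pieces. The field labels below are assumptions chosen for this illustration, inferred only from the single example filename used here, and `parse_hlsp_filename` is a hypothetical helper, not part of `ullyses_utils`.

```python
from pathlib import Path

# Labels are illustrative guesses based on one example filename
FIELDS = ("prefix", "project", "telescope", "instrument",
          "target", "gratings", "release", "product")


def parse_hlsp_filename(path):
    """Split an HLSP filename into labelled parts (a sketch, not an official parser)."""
    name = Path(path).name
    if name.endswith(".fits"):
        name = name[:-len(".fits")]
    parts = name.split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(f"Expected {len(FIELDS)} underscore-separated parts, got {len(parts)}")
    info = dict(zip(FIELDS, parts))
    # Multiple gratings are joined with hyphens in the example filename
    info["gratings"] = info["gratings"].split("-")
    return info


info = parse_hlsp_filename(
    "hlsp_ullyses_hst_cos_ngc1818-rob-d1_g130m-g160m-g185m_dr6_cspec.fits"
)
print(info["target"], info["gratings"], info["release"])
```

A filename that does not split into eight parts raises a `ValueError` rather than being silently mislabelled.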

In [None]:
prov_table = Table(fits.getdata(hlsp_file_path, ext=2))
prov_table.pprint_include_names = ('FILENAME', 'PROPOSID', 'INSTRUMENT', 'DETECTOR', 'DISPERSER', 'APERTURE', 'XPOSURE')
prov_table.pprint()

In [None]:
# Make a list that is just the constituent files from the provenance data
# We will use this for comparison later
spectra_filenames = prov_table['FILENAME']

Next, we'll run a search using Astroquery on the target name and specifying the provenance name as ULLYSES. Then, we can see the data products associated with this search.

In [None]:
search_target = Observations.query_criteria(target_name='ngc1818-rob-d1', provenance_name='ULLYSES')

data_products = Observations.get_product_list(search_target['obsid'])
print(data_products['productFilename'])

Let's use a for-loop to compare the constituent filenames from the provenance table to the data products returned from our query. We'll only save the observation IDs of the matching data products for later downloading.

In [None]:
to_download = []
for assocfile, obsid in zip(data_products['productFilename'], data_products['obsID']):
    if assocfile in spectra_filenames:
        to_download.append(obsid)
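
The matching loop can also be written as a small helper that is a little more tolerant of how filenames are stored. This is only a sketch: `match_obsids` is a hypothetical name, and it assumes that filenames read from a FITS table may arrive as bytes or with trailing spaces, which may not apply to your files. It also returns each observation ID only once, which is a design choice of this example.

```python
def match_obsids(product_filenames, obsids, wanted_filenames):
    """Return the unique obsIDs whose product filename is in wanted_filenames."""

    def normalize(name):
        # Assumption: names may be bytes or padded with whitespace
        if isinstance(name, bytes):
            name = name.decode("ascii")
        return str(name).strip()

    wanted = {normalize(name) for name in wanted_filenames}
    matched = []
    seen = set()
    for filename, obsid in zip(product_filenames, obsids):
        if normalize(filename) in wanted and obsid not in seen:
            seen.add(obsid)
            matched.append(obsid)
    return matched


# Made-up filenames and IDs for illustration only
products = ["aaa_x1d.fits", "aaa_rawtag.fits", "bbb_x1d.fits"]
ids = ["1", "1", "2"]
print(match_obsids(products, ids, [b"aaa_x1d.fits ", "bbb_x1d.fits"]))
```

With the tables from this notebook, the call would be `match_obsids(data_products['productFilename'], data_products['obsID'], spectra_filenames)`.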

Last, we can download the products that we saved in `to_download`. In the call to `download_products` in the following cell, note that we must specify `obs_collection=['HST']`, which ensures we will not download any HLSPs, as well as `productSubGroupDescription=['X1D', 'SX1']`; otherwise all the other COS or STIS data products will be downloaded as well. You can change these parameters if more data products are desired.

In [None]:
output_dir = './ngc1818-rob-d1/'
Observations.download_products(to_download, download_dir=output_dir, extension=['fits'], obs_collection=['HST'], productSubGroupDescription=['X1D', 'SX1'])
mv_downloads(output_dir)
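
After the download, it can be worth checking that every file named in the provenance table actually landed in the output directory. The following is a minimal sketch of such a check. `missing_files` is a hypothetical helper, and it assumes the expected names are plain strings that match the on-disk filenames exactly.

```python
from pathlib import Path


def missing_files(directory, expected_filenames):
    """Return the expected filenames that are not present in `directory`."""
    directory = Path(directory)
    if directory.is_dir():
        present = {p.name for p in directory.iterdir() if p.is_file()}
    else:
        present = set()  # a missing directory means every file is missing
    expected = {str(name).strip() for name in expected_filenames}
    return sorted(expected - present)
```

For the example above, `missing_files(output_dir, spectra_filenames)` would return an empty list if every constituent spectrum was downloaded.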

#### Now, go forth and download and make beautiful discoveries with our ULLYSES products!

***

## Additional Resources

- [ULLYSES](https://ullyses.stsci.edu)
- [MAST API](https://mast.stsci.edu/api/v0/index.html)

## About this Notebook
For support, contact us at the [ULLYSES Helpdesk](https://stsci.service-now.com/hst?id=sc_cat_item&sys_id=a3b8ec5edbb7985033b55dd5ce961990&sysparm_category=ac85189bdb4683c033b55dd5ce96199c).

**Author:**  Elaine M Frazer

**Updated On:** 2023-12-07

## Citations
* See the [ULLYSES website](https://ullyses.stsci.edu/ullyses-cite.html) for citation guidelines.

***

[Top of Page](#top)
<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/notebooks/master/assets/stsci_pri_combo_mark_horizonal_white_bkgd.png" alt="Space Telescope Logo" width="200px"/> 