# How to Access TESS Data: `astroquery`

## Introduction

This notebook uses the MAST Portal's advanced search options to retrieve (1) all TESS light curves for a single target, and (2) all data products from an example Guest Investigator program. The notebook shows how to run an advanced query with `astroquery.mast`, filter out unwanted data products, and then download the files of interest from the AWS S3 bucket.

NOTE -- The list of program IDs can be found at the [TESS GI List of Approved Programs](https://heasarc.gsfc.nasa.gov/docs/tess/approved-programs.html).



## Imports
- The `Observations` object from `astroquery.mast` is needed to make the query and download the data.

NOTE -- If you are running this notebook on the TIKE, as recommended, you should not need to install or update your `astroquery` package.

In [None]:
from astroquery.mast import Observations

All data from the TESS Mission are available for free on Amazon Web Services in public S3 buckets. (See the small caveat to this at the end of this notebook.) At this point, we'll also tell `astroquery` to download any data directly from the S3 bucket.

NOTE -- Using AWS resources to access public MAST data **no longer requires an AWS account**, regardless of AWS region. See the [`astroquery.mast` documentation](https://astroquery.readthedocs.io/en/latest/mast/mast.html#cloud-data-access) for more details.

In [None]:
Observations.enable_cloud_dataset(provider='AWS')

## Query for a specific target
In this example, we want to retrieve all TESS data (light curves, target pixel files, and data validation files) for the target TIC 7854182 (a known $\delta$ Scuti star in an eclipsing binary; Chen et al. 2022, Kahraman Aliçavuş et al. 2017, Liakos & Niarchos 2017).

Feel free to use **your** favorite target here!

In [None]:
target_name = "TIC 7854182"

In [None]:
obs = Observations.query_object(target_name, radius="0s")


In [None]:
print(f"TOTAL Number of Observations available for {target_name}: \n{len(obs)}")

The total number of Observations includes all missions (HST, JWST, etc.) in the MAST collection, as we have not yet filtered down to just TESS data.
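To see how those Observations are spread across missions before filtering, one option is to tally the `obs_collection` column. The helper below is a minimal sketch: `count_by_collection` is a hypothetical name (not part of `astroquery`), and the sample values are invented for illustration.

```python
from collections import Counter


def count_by_collection(collections):
    """Tally how many observations belong to each mission/collection."""
    # str() normalizes table values (e.g., NumPy strings) to plain Python strings
    return Counter(str(name) for name in collections)


# Invented values for illustration; in this notebook, pass obs['obs_collection'] instead.
sample = ["TESS", "TESS", "HST", "GALEX", "TESS"]
counts = count_by_collection(sample)
for mission, n_obs in counts.most_common():
    print(f"{mission}: {n_obs}")
```

In the notebook, `count_by_collection(obs['obs_collection'])` should give the same kind of per-mission summary for the real query results.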

Next, we want to apply the advanced search filters to narrow our results, and retrieve the list of data products that are associated with each Observation.

The column names that can be used to filter the Observations table before it is passed to the `get_product_list` function are listed on the [MAST API Field Description Page](https://mast.stsci.edu/api/v0/_c_a_o_mfields.html).

To keep this search simple, we will use the following filters:
- `obs_collection` to specify that we only want TESS data
- `dataproduct_type` to specify that we want timeseries data

In [None]:
# Boolean mask that keeps only TESS time series Observations
filters = (obs['obs_collection'] == "TESS") & (obs['dataproduct_type'] == 'timeseries')

# Retrieve the data products associated with the Observations that pass the filters
data_prod = Observations.get_product_list(obs[filters])


In [None]:
print(f"Number of TESS time series data products available for {target_name}: \n{len(data_prod)}")

In [None]:
display(data_prod[-5:])
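The same selection could, in principle, be applied on the server side by passing the filters to `Observations.query_criteria` together with `objectname` and `radius`. The sketch below assumes that `query_criteria` accepts the same target name and radius values used with `query_object` above; `tess_timeseries_criteria` and `query_tess_timeseries` are hypothetical helper names. The import is deferred so that building the criteria needs no network connection.

```python
def tess_timeseries_criteria(target_name, radius="0s"):
    """Build keyword arguments selecting TESS time series data for one target."""
    return {
        "objectname": target_name,
        "radius": radius,
        "obs_collection": "TESS",
        "dataproduct_type": "timeseries",
    }


def query_tess_timeseries(target_name, radius="0s"):
    """Run the filtered query against MAST (requires network access)."""
    from astroquery.mast import Observations  # deferred import

    return Observations.query_criteria(**tess_timeseries_criteria(target_name, radius))


criteria = tess_timeseries_criteria("TIC 7854182")
print(criteria)
# obs_tess = query_tess_timeseries("TIC 7854182")  # uncomment to query MAST
```

Filtering the full table locally, as done above, has the advantage that the unfiltered `obs` table stays available for inspection.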

If we decide that we only want to download, e.g., the light curve files, we can further filter the products list based on the columns/fields in the table.

Descriptions of each of the product filters that can be used in the `filter_products` function are available on the [MAST API Product Field Description page](https://mast.stsci.edu/api/v0/_productsfields.html).

In [None]:
data_prod.colnames


In [None]:
# Select which files to download from the S3 bucket by applying additional filters
filt_prod = Observations.filter_products(data_prod, description='Light curves')


In [None]:
print(f"Number of TESS light curves available for {target_name}: \n{len(filt_prod)}")

In [None]:
display(filt_prod)
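Before settling on a filter, it can help to see which kinds of files a product list contains. The sketch below tallies products by file-name suffix; `tally_by_suffix` is a hypothetical helper, and the file names are invented stand-ins for the `productFilename` column.

```python
from collections import Counter


def tally_by_suffix(filenames):
    """Count files by the text after the last underscore (e.g., 'lc.fits')."""
    return Counter(str(name).rsplit("_", 1)[-1] for name in filenames)


# Invented file names for illustration; in this notebook,
# pass data_prod['productFilename'] instead.
sample_files = [
    "example-s0001_lc.fits",
    "example-s0001_tp.fits",
    "example-s0002_lc.fits",
    "example-s0001-s0002_dvt.fits",
]
suffix_counts = tally_by_suffix(sample_files)
for suffix, n_files in sorted(suffix_counts.items()):
    print(f"{suffix}: {n_files}")
```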

Now, we're ready to download the data from the AWS S3 bucket!

When set to True, the `cloud_only` parameter in `download_products` skips any data products that are not available in the cloud; all TESS Mission data are available through AWS, so none of the selected data products should be skipped.

NOTE -- If you try to download the same file(s) more than once (e.g., by running the cell below multiple times), you should get the message "Found cached file" instead of "Downloading URL" in the printed manifest.

In [None]:
manifest = Observations.download_products(filt_prod, cloud_only=True)


All TESS light curves for TIC 7854182 have been downloaded, and we're ready to start our science!
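The manifest returned by `download_products` records where each file was saved, which is handy for loading the files later. The following is a minimal sketch that collects the local paths of completed downloads; `completed_paths` is a hypothetical helper, it assumes the manifest has `Local Path` and `Status` columns, and the rows shown are invented stand-ins.

```python
def completed_paths(manifest):
    """Return the local paths of all rows whose Status is 'COMPLETE'."""
    return [str(row["Local Path"]) for row in manifest if row["Status"] == "COMPLETE"]


# Invented rows for illustration; in this notebook, pass the `manifest` table itself.
sample_manifest = [
    {"Local Path": "./mastDownload/TESS/obs_a/file_a_lc.fits", "Status": "COMPLETE"},
    {"Local Path": "./mastDownload/TESS/obs_b/file_b_lc.fits", "Status": "ERROR"},
]
paths = completed_paths(sample_manifest)
print(paths)
```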

## Query for a specific Guest Investigator program
In this example, we want to retrieve all TESS data (light curves, target pixel files, and data validation files) associated with Guest Investigator program G05101 from Cycle 5 (PI: Susan Mullally).

Feel free to use any program here! Again, the list of program IDs can be found at the [TESS GI List of Approved Programs](https://heasarc.gsfc.nasa.gov/docs/tess/approved-programs.html).

In [None]:
pid = "G05101"

In [None]:
obs_pid = Observations.query_criteria(obs_collection="TESS",
                                      proposal_id=f"*{pid}*")

In [None]:
print(f"TOTAL Number of Observations available for {pid}: \n{len(obs_pid)}")

In [None]:
display(obs_pid[:5])
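The asterisks in `f"*{pid}*"` act as wildcards, so the query matches any `proposal_id` that contains the program ID rather than only exact matches. This matters if an Observation lists more than one program. The sketch below mimics that matching locally with Python's `fnmatch` module; the `proposal_id` strings are invented examples, and the server-side matching performed by MAST may differ in detail.

```python
from fnmatch import fnmatchcase

pid = "G05101"
pattern = f"*{pid}*"

# Invented proposal_id values for illustration
sample_ids = ["G05101", "G05002_G05101", "G05101_G05077", "G05002"]

# Keep every value that contains the program ID anywhere in the string
matches = [value for value in sample_ids if fnmatchcase(value, pattern)]
print(matches)
```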

Next, we will retrieve the list of data products that are associated with each Observation.


In [None]:
# Pick which products we want to retrieve
data_prod_pid = Observations.get_product_list(obs_pid)

In [None]:
print(f"Number of TESS data products available for {pid}: \n{len(data_prod_pid)}")

In [None]:
display(data_prod_pid[10:17])

As above, if we decide that we only want to download, e.g., the light curve files, we can further filter the products list based on the columns/fields in the table.

For our purposes, this step is optional; we will download **all** of the available data products for this program ID.

In [None]:
# OPTIONAL
# Select which files to download from the S3 bucket by applying additional filters
#filt_prod = Observations.filter_products(data_prod_pid, description='Light curves')

Now, we're ready to download the data from the AWS S3 bucket!

By default, all downloads will be placed in a `./mastDownload/` directory on the TIKE. If you'd like to change this directory, use the `download_dir` parameter of the `download_products` function. For example, you may want to place all products downloaded for GI program G05101 in a directory named `./G05101`, as shown below.


In [None]:
manifest = Observations.download_products(data_prod_pid, download_dir=f'{pid}', cloud_only=True)


If you scroll through the above manifest, you may notice an error:
```
ERROR: Error pulling from S3 bucket: 'productFilename' [astroquery.mast.observations]
WARNING: Skipping file... [astroquery.mast.observations]
```

So, what happened here? Let's check the manifest to see which files were not downloaded.


In [None]:
display(manifest[manifest['Status']!='COMPLETE'])

When we say that "ALL" data from the TESS Mission are available in AWS S3 buckets, there is a small caveat.

The TESS Full Data Validation Reports (`*_dvr.pdf`) and Mini Data Validation Reports (`*_dvm.pdf`), which are produced for all threshold crossing events (TCEs) associated with a particular host star, are **not currently** hosted on AWS.
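To estimate in advance which products a cloud-only download would skip, one rough approach is to separate the PDF reports from everything else by file extension. This is only a sketch: it assumes, based on the caveat above, that the PDF validation reports are the only products without a cloud copy. `split_pdf_reports` is a hypothetical helper, and the file names are invented.

```python
def split_pdf_reports(filenames):
    """Split file names into (PDF reports, all other products)."""
    pdf_reports, other_products = [], []
    for name in filenames:
        name = str(name)
        if name.lower().endswith(".pdf"):
            pdf_reports.append(name)
        else:
            other_products.append(name)
    return pdf_reports, other_products


# Invented file names for illustration; in this notebook,
# pass data_prod_pid['productFilename'] instead.
sample_names = ["example_lc.fits", "example_dvr.pdf", "example_tp.fits"]
pdf_reports, other_products = split_pdf_reports(sample_names)
print(f"PDF reports: {pdf_reports}")
print(f"Other products: {other_products}")
```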

As MAST moves toward a more cloud-based model for data access, it is considering adding these types of data products to the AWS S3 bucket. For now, if we want them in our local directory, we need to download them directly from MAST by setting `cloud_only=False`. With `cloud_only=False` and cloud data access enabled (`enable_cloud_dataset` above), files that are not found in the cloud are downloaded from MAST instead.

In [None]:
manifest = Observations.download_products(data_prod_pid, download_dir=f'{pid}', cloud_only=False)


All TESS data products (light curves, target pixel files, and data validation files) associated with Guest Investigator program G05101 have been downloaded.
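As a final check, the download directory can be summarized by file type. The sketch below counts files by extension with `pathlib`; `summarize_downloads` is a hypothetical helper, and the temporary directory with empty files is a mock layout used only so the example runs without any real downloads.

```python
import tempfile
from collections import Counter
from pathlib import Path


def summarize_downloads(download_dir):
    """Count the files under download_dir, grouped by file extension."""
    root = Path(download_dir)
    return Counter(path.suffix for path in root.rglob("*") if path.is_file())


# Mock directory tree for illustration; in this notebook, call
# summarize_downloads(pid) on the real download directory instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "mastDownload" / "TESS" / "obs_a"
    demo_dir.mkdir(parents=True)
    (demo_dir / "file_a_lc.fits").touch()
    (demo_dir / "file_a_dvr.pdf").touch()
    summary = summarize_downloads(tmp)

print(summary)
```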

## About this Notebook

**Author:** Hannah M. Lewis, STScI Data Scientist

**Updated On:** 2023-01-05

<img style="float: right;" src="https://raw.githubusercontent.com/spacetelescope/notebooks/master/assets/stsci_pri_combo_mark_horizonal_white_bkgd.png" alt="STScI logo" width="200px"/>