# Using Cloud-Hosted Data

All TESS, Kepler, and K2 science data are available in a public AWS S3 bucket. With a small change to your code, astroquery will first try to retrieve the data from the cloud-hosted copy on AWS before attempting to retrieve it from MAST.

In this tutorial you will learn:
 - How to retrieve data from the public bucket using astroquery.
 - How to specify your AWS credentials.
 - How to retrieve the directory listing of the contents of the bucket.

## Enable Cloud Dataset
To make astroquery retrieve data from the AWS public buckets, do the following:

In [2]:
from astroquery.mast import Observations
Observations.enable_cloud_dataset(provider='AWS', profile='default')

INFO: Using the S3 STScI public dataset [astroquery.mast.core]
INFO: See Request Pricing in https://aws.amazon.com/s3/pricing/ for details [astroquery.mast.core]
INFO: If you have not configured boto3, follow the instructions here: https://boto3.readthedocs.io/en/latest/guide/configuration.html [astroquery.mast.core]

The `Observations` class in `astroquery.mast` provides a method called `enable_cloud_dataset`. Once this method has been called, subsequent downloads first attempt to retrieve the data from the cloud.
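The "cloud first, then MAST" behaviour described above is an instance of a general try-in-order pattern. The sketch below illustrates that pattern only; it is not astroquery's implementation, and `fetch_with_fallback`, `fetch_from_cloud`, and `fetch_from_mast` are hypothetical stand-ins that perform no network access.

```python
def fetch_with_fallback(product, fetchers):
    """Try each (name, fetch) pair in order; return (name, result) from the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            return name, fetch(product)
        except Exception as exc:  # a real client would catch narrower exceptions
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


# Hypothetical stand-in fetchers: the cloud one always fails, the MAST one succeeds.
def fetch_from_cloud(product):
    raise FileNotFoundError(f"{product} not in bucket")


def fetch_from_mast(product):
    return f"downloaded {product} from MAST"


source, result = fetch_with_fallback(
    "example_lc.fits",
    [("cloud", fetch_from_cloud), ("mast", fetch_from_mast)],
)
print(source, result)
```

Because the sources are tried in list order, the preferred source simply goes first.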

Here is a full example that uses astroquery to download a TESS light curve for Kepler-10. See the Astroquery notebooks for more details on finding MAST data with astroquery.

In [10]:
from astroquery.mast import Observations
Observations.enable_cloud_dataset(provider='AWS', profile='default')

target = "Kepler-10"

#Do a cone search and find the TESS 2-minute (120 s) cadence data for your target
obs = Observations.query_object(target, radius="0s")
want = (obs['obs_collection'] == "TESS") & (obs['t_exptime'] == 120)

#Get Product List for that observation
data_prod = Observations.get_product_list(obs[want])

#Move data from the S3 bucket to the default astroquery location. 
#cloud_only=True means that data will only be retrieved if available on AWS S3
manifest = Observations.download_products(data_prod, cloud_only=True)

print(manifest)

INFO: Using the S3 STScI public dataset [astroquery.mast.core]
INFO: See Request Pricing in https://aws.amazon.com/s3/pricing/ for details [astroquery.mast.core]
INFO: If you have not configured boto3, follow the instructions here: https://boto3.readthedocs.io/en/latest/guide/configuration.html [astroquery.mast.core]

Downloading URL s3://stpubdata/tess/public/tid/s0014/0000/0003/7778/0790/tess2019198215352-s0014-0000000377780790-0150-s_lc.fits to ./mastDownload/TESS/tess2019198215352-s0014-0000000377780790-0150-s/tess2019198215352-s0014-0000000377780790-0150-s_lc.fits ... [Done]
Downloading URL s3://stpubdata/tess/public/tid/s0014/0000/0003/7778/0790/tess2019198215352-s0014-0000000377780790-0150-s_tp.fits to ./mastDownload/TESS/tess2019198215352-s0014-0000000377780790-0150-s/tess2019198215352-s0014-0000000377780790-0150-s_tp.fits ... [Done]
                                                         Local Path                                                         ...
--------------------------------------------------------------------------------------------------------------------------- ...
./mastDownload/TESS/tess2019198215352-s0014-0000000377780790-0150-s/tess2019198215352-s0014-0000000377780790-0150-s_lc.fits ...
./mastDownload/TESS/tess2019198215352-s0014-0000000377780790-0150-s/tess2019198215

In [11]:
#Get cloud URIs
data_uri = Observations.get_cloud_uris(data_prod)

print(data_uri)

AttributeError: 'ObservationsClass' object has no attribute 'Observations'
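Cloud locations such as those in the download log above have the form `s3://<bucket>/<key>`. As a minimal sketch, the standard library can split such a URI into the bucket name and the object key, which is the pair of values that `boto3` calls like `download_file` work with. The helper name `split_s3_uri` is hypothetical and not part of astroquery.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// URI into (bucket, key). Hypothetical helper, not an astroquery function."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # netloc holds the bucket name; the path minus its leading slash is the object key
    return parsed.netloc, parsed.path.lstrip("/")


# URI copied from the download log above
uri = ("s3://stpubdata/tess/public/tid/s0014/0000/0003/7778/0790/"
       "tess2019198215352-s0014-0000000377780790-0150-s_lc.fits")
bucket_name, key = split_s3_uri(uri)
print(bucket_name)
print(key)
```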

## AWS Credentials
You may specify which profile in your AWS credentials file to use, which ensures that the correct account is charged. On the MAST platform you do not need to set a profile: the data is in the same region as the machine you are working on, so access is free.
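The value passed as `profile` names a section of the AWS shared credentials file, an INI-style file that lives at `~/.aws/credentials` by default. As a minimal sketch for checking which profile names are available, the standard library can read that file. The helper name `list_aws_profiles` is hypothetical, and the sample file below contains placeholder keys only.

```python
import configparser
import os
import tempfile


def list_aws_profiles(path):
    """Return the profile names found in an AWS credentials file. Hypothetical helper."""
    parser = configparser.ConfigParser()
    parser.read(path)  # yields no sections if the file does not exist
    return parser.sections()


# Demonstrate on a throwaway file with placeholder (non-working) keys.
sample = (
    "[default]\n"
    "aws_access_key_id = EXAMPLEKEYID\n"
    "aws_secret_access_key = examplesecret\n"
    "\n"
    "[research]\n"
    "aws_access_key_id = OTHERKEYID\n"
    "aws_secret_access_key = othersecret\n"
)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "credentials")
    with open(demo_path, "w") as fh:
        fh.write(sample)
    profiles = list_aws_profiles(demo_path)

print(profiles)
```

To inspect your own setup, call `list_aws_profiles(os.path.expanduser("~/.aws/credentials"))`. Profiles defined only in `~/.aws/config` would not appear in this listing.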

You can also specify them using

## Manifest Files

Sometimes it is useful to browse the contents of the directories, which S3 does not make easy. For this reason the public buckets include a manifest file. You can request the manifest directly with `boto3`, the AWS SDK for Python.

The example below downloads the TESS manifest to a local file called `tess-manifest.txt.gz`. The bucket is named `stpubdata`, and the manifest is stored in it under the key `tess/public/manifest.txt.gz`.

In [8]:
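import boto3

#Open the public bucket and copy the TESS manifest to the working directory.
#RequestPayer="requester" acknowledges that your AWS account pays for the request.
s3 = boto3.resource('s3')
bucket = s3.Bucket('stpubdata')
bucket.download_file("tess/public/manifest.txt.gz", "tess-manifest.txt.gz",
                     ExtraArgs={"RequestPayer": "requester"})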
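Once downloaded, the manifest can be searched locally without further S3 requests. The sketch below assumes the manifest is gzip-compressed text with one entry per line and the object path appearing somewhere in each line. That layout is an assumption, so inspect the first few lines of the real file before relying on it. The function name `find_in_manifest` is hypothetical, and the demonstration entries are made up.

```python
import gzip
import os
import tempfile


def find_in_manifest(manifest_path, substring, limit=10):
    """Return up to `limit` manifest lines that contain `substring`."""
    matches = []
    # Mode "rt" decompresses on the fly and yields text lines one at a time
    with gzip.open(manifest_path, "rt") as fh:
        for line in fh:
            if substring in line:
                matches.append(line.rstrip("\n"))
                if len(matches) >= limit:
                    break
    return matches


# Demonstrate on a tiny stand-in manifest; these entries are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_manifest = os.path.join(tmp, "demo-manifest.txt.gz")
    with gzip.open(demo_manifest, "wt") as fh:
        fh.write("tess/public/tid/s0014/example_lc.fits\n")
        fh.write("tess/public/tid/s0014/example_tp.fits\n")
        fh.write("tess/public/ffi/s0014/example_ffic.fits\n")
    hits = find_in_manifest(demo_manifest, "_lc.fits")

print(hits)
```

For the real file, the call would be `find_in_manifest("tess-manifest.txt.gz", "s0014")` or similar. The `limit` argument keeps the result small, since the full manifest may be large.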