# Lesson 0: Working on TIKE, with Cloud Data

## Learning Goals: 
- Understand what TIKE is, and the principles behind cloud platforms
- Define cloud terminology: what is a “bucket” or a server? For that matter, what is the “cloud”?
- Access MAST data through astroquery by name, region, or criteria
- Download TESS data and show an image

## What is TIKE?

TIKE stands for the *Timeseries Integrated Knowledge Engine*.

TIKE uses a web-based platform, called JupyterHub, to allow you to run [Jupyter Notebooks](https://jupyterlab.readthedocs.io/en/latest/) and other software "on the cloud" using your web browser: you don't need to install anything on your local computer. TIKE has access to a cloud copy of the [MAST Archive](https://archive.stsci.edu), enabling anyone to access and analyze data from NASA's [TESS mission](https://archive.stsci.edu/missions-and-data/tess). We also have copies of other mission datasets, including data from HST, GALEX, and Pan-STARRS. They are generally cataloged in full on the MAST Public Datasets page, so check there for an updated list.

TIKE is continually maintained and updated by humans, so if you run into issues please let us know. Don't hesitate to send us your suggestions for packages and tutorials, either through the [MAST help desk](mailto:archive@stsci.edu) or the [tike_content repository](https://github.com/spacetelescope/tike_content).

## What is the "cloud"?

The "cloud", or cloud computing, refers to the practice of remotely accessing computing resources, rather than hosting them yourself. This term might also be used to refer to software and databases running on those servers. As Randall Munroe put it, "turns out the cloud is just other people's computers".

In our case, "the cloud" is the AWS US East (N. Virginia) region, a group of data centers in northern Virginia. TIKE runs in proximity to the cloud copy of MAST data, so the data travels within a data center rather than over the public internet. This leads to faster access, since data centers have high-quality (likely fiber optic) connections between their machines.

### Why would I want to work on the cloud?
Using the cloud has several benefits; principally, as mentioned above, there's no need to download data to your local machine. This speeds up data access, and allows you to perform analyses that wouldn't be possible without a major upgrade to your hard drive capacity or internet service. You can access data whenever and wherever you want to, from any device, as long as you have an internet connection. 

![tike-cloud](TIKE-Cloud-Photo.png)

### What's the difference between working on the cloud and working on TIKE?
Although you might choose to work directly with data stored on the cloud, it can be complex to configure such a system. TIKE handles this complexity, making it as easy as opening a Jupyter Notebook.

### How can I access cloud-hosted data?

There are two approaches to accessing cloud-hosted data:
1. While on TIKE, loading files directly into memory (recommended)
2. A traditional download to your local machine from the cloud-hosted copy of MAST

Whenever possible, it's best to use the first method. The vast majority of users, with small tweaks to existing code, should be able to access data this way.

## Imports and Setup

We'll use the standard tools to open and plot a FITS file:
- `astropy.io.fits` to read in the FITS file
- `astropy.wcs` to convert pixel positions to sky coordinates
- `matplotlib` to create the plot
- `numpy` to automatically set brightness limits in the plot

To access the cloud data, we need
- `astroquery.mast` to search for and select data
- `s3fs` to access cloud files as though they were local

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import s3fs

from astropy.io import fits
from astroquery.mast import Observations
from astropy.wcs import WCS

The most important step in this process is to enable cloud data access. Once we do, we'll be able to get cloud filenames and access files directly. If you're working locally, you can use this command to download data from the cloud copy of MAST data.
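
A minimal sketch of enabling cloud access, assuming astroquery's `Observations.enable_cloud_dataset()` method with AWS as the provider. The wrapper name `enable_cloud_access` is hypothetical; the import sits inside the function so that the function can be defined even where astroquery is not installed.

```python
def enable_cloud_access(provider="AWS"):
    """Tell astroquery to resolve MAST products to their cloud-hosted copies."""
    # Imported here so that defining the function does not require astroquery
    from astroquery.mast import Observations

    # After this call, Observations can report cloud (S3) locations for products
    Observations.enable_cloud_dataset(provider=provider)
    return Observations


# On TIKE, or on a local machine with astroquery installed, you would call:
# Observations = enable_cloud_access()
```

The call only changes where astroquery looks for files; the queries themselves are written the same way with or without it.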

## Querying for MAST Observations
The [`astroquery.mast` package](https://astroquery.readthedocs.io/en/latest/mast/mast.html) is an astronomer-friendly way to programmatically query MAST data. This is what we'll use in this Notebook and throughout the course of the webinar, since all of the functions we need are built-in.

### Workflow
Before we dive into the details, let's take a look at the bigger picture. The path from "I want MAST data" to getting it takes three steps:

1. Filter MAST Observations using metadata, such as RA/Dec, exposure time, and wavelength.
2. Filter the underlying files associated with each Observation (e.g. using calibration level or file type).
3. Access the data, by downloading it or loading it directly into memory.

### Step 1: Observation Metadata Query
Metadata queries can be done with three functions:
- `query_region()`: queries within a radius of the target coordinates (default 0.2 degrees)
- `query_object()`: uses SIMBAD and NED to turn an object name into coordinates$^{**}$, then queries the region
- `query_criteria()`: allows for versatile queries, using values like mission and exposure time ([full list of criteria here](https://mast.stsci.edu/api/v0/_c_a_o_mfields.html))

\** In a vanishingly small but non-zero number of cases, a target may be incorrectly resolved. Double-check the output coordinates before defending any PhD dissertations.

Each function adds additional features, so you'll probably want to use the last one for most queries.

#### Warmup: Count Results
You can append `_count` to any of the above functions to get the number of matching results. For example, we can query within 1 arcminute of the coordinates of Fomalhaut:

In [None]:
Observations.query_region_count("22h57m39.04625s -29d37m20.0533s", radius="1 arcmin")

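The coordinate string in the cell above is written in sexagesimal form: right ascension in hours, minutes, and seconds, and declination in degrees, arcminutes, and arcseconds. astroquery parses such strings for you, while result columns like `s_ra` and `s_dec` are given in decimal degrees. To make the connection concrete, here is a small sketch of the conversion by hand. The helper `sexagesimal_to_degrees` is hypothetical (it is not part of astroquery) and only handles this one string layout.

```python
import re


def sexagesimal_to_degrees(coord):
    """Convert an 'HHhMMmSSs +/-DDdMMmSSs' string to (ra_deg, dec_deg)."""
    pattern = (
        r"(\d+)h(\d+)m([\d.]+)s\s+"      # right ascension: hours, minutes, seconds
        r"([+-]?)(\d+)d(\d+)m([\d.]+)s"  # declination: sign, degrees, arcmin, arcsec
    )
    match = re.fullmatch(pattern, coord.strip())
    if match is None:
        raise ValueError(f"Unrecognized coordinate string: {coord!r}")
    h, m, s, sign, d, arcmin, arcsec = match.groups()

    # 24 hours of right ascension span 360 degrees, so one hour is 15 degrees
    ra_deg = 15.0 * (int(h) + int(m) / 60 + float(s) / 3600)
    dec_deg = int(d) + int(arcmin) / 60 + float(arcsec) / 3600
    if sign == "-":
        dec_deg = -dec_deg
    return ra_deg, dec_deg


ra, dec = sexagesimal_to_degrees("22h57m39.04625s -29d37m20.0533s")
print(f"RA = {ra:.4f} deg, Dec = {dec:.4f} deg")
```

In practice you would let astropy or astroquery do this parsing; the point is only that both notations describe the same position.
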
Now it's your turn! How many Observations in MAST are within 2 arcseconds of TRAPPIST-1?

In [None]:
# TYPE ANSWER HERE


In [None]:
# hint: uncomment and run
#Observations.query_object?

#### Querying for an Image of Fomalhaut

Let's stick with our example star Fomalhaut, the brightest star in the southern constellation of Piscis Austrinus, the "Southern Fish", and one of the brightest stars in the night sky.

<img src="https://upload.wikimedia.org/wikipedia/commons/a/ae/Heic0821f.jpg" width="300">

`Image Credit: NASA, ESA, and the Digitized Sky Survey 2. Acknowledgment: Davide De Martin (ESA/Hubble)`

We'll use the `query_criteria` function to look for TESS Observations in the vicinity.

The full table can be a bit overwhelming. Let's only show a subset of columns.

In [None]:
cols = ['target_name', 's_ra', 's_dec', 'dataproduct_type', 'calib_level', 't_exptime', 'sequence_number', 'dataRights', 'distance']
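
One way this query could be written, sketched under assumptions: `objectname` and `obs_collection` are real `query_criteria` keywords, but the exact criteria used in the webinar are not shown here, so treat the values as a guess. The helper name `find_tess_observations` is hypothetical, and the astroquery import is kept inside it because the call needs a network connection.

```python
# Columns to display, matching the cell above
cols = ['target_name', 's_ra', 's_dec', 'dataproduct_type', 'calib_level',
        't_exptime', 'sequence_number', 'dataRights', 'distance']

# Assumed criteria: every TESS Observation whose footprint is near Fomalhaut
criteria = {"objectname": "Fomalhaut", "obs_collection": "TESS"}


def find_tess_observations(criteria):
    """Run the metadata query and return the full results table."""
    from astroquery.mast import Observations  # needs astroquery and network access

    return Observations.query_criteria(**criteria)


# tess_obs = find_tess_observations(criteria)
# tess_obs[cols]  # indexing a Table with a list of names shows only those columns
```

Keeping the full table in `tess_obs` and subsetting only for display means that later steps still have access to every column.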


The `distance` to all of these observations is zero, even though their coordinates (`s_ra` and `s_dec`) are different. What gives?

As it turns out, `distance` is a measure of the separation (in arcseconds) of our input coordinates and the Observation footprint. So long as our coordinates are within the footprint, the `distance` will be zero.
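
To see why a target inside a footprint gets a `distance` of zero, here is a toy model. It is a deliberate simplification: real footprints are polygons and MAST's own calculation is not shown in this lesson, so the circular footprint, its 6-degree radius, and the footprint centers below are all hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcseconds out)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def footprint_distance(target, center, radius_arcsec):
    """Toy model: distance from a target to a circular footprint, 0 if inside."""
    separation = angular_separation_arcsec(*target, *center)
    return max(0.0, separation - radius_arcsec)


fomalhaut = (344.4127, -29.6222)  # (RA, Dec) in degrees

# Hypothetical footprints: one that contains Fomalhaut and one far away
inside = footprint_distance(fomalhaut, (341.0, -31.0), 6 * 3600)
outside = footprint_distance(fomalhaut, (300.0, -29.6222), 6 * 3600)
print(inside, outside > 0)
```

The footprint center can sit degrees away from the target and the distance is still zero, which is the situation in the table above.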

Since we want to plot an image, we'll select one of the FFIs. Let's use sector 29. We could use standard Python indexing for this, but we could also just reformat our query:


In [None]:
# option 1: use bitwise and 
# match = np.bitwise_and(tess_obs['sequence_number']==29, tess_obs['target_name']=="TESS FFI")
# tess_obs[match]

# option 2: format the query
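
A sketch of what option 2 could look like, assuming `sequence_number` and `target_name` are passed to `query_criteria` as additional criteria (both appear as columns in the results). The helper name `query_sector_ffi` and the variable `obs` are hypothetical.

```python
# Same positional search, narrowed to the sector 29 full frame image
ffi_criteria = {
    "objectname": "Fomalhaut",
    "obs_collection": "TESS",
    "sequence_number": 29,       # TESS sector
    "target_name": "TESS FFI",   # full frame images only
}


def query_sector_ffi(criteria):
    """Run the narrowed metadata query (requires astroquery and network access)."""
    from astroquery.mast import Observations

    return Observations.query_criteria(**criteria)


# obs = query_sector_ffi(ffi_criteria)
# obs[['target_name', 'sequence_number', 'distance']]
```

Filtering in the query means less data is transferred than when everything is fetched and then masked with option 1.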

As expected, we only get one matching observation back.

## Step 2: Get Products

Now that we have our Observation, we'll use `get_product_list()` to find the underlying files.

Wow! That's a lot of products. Each FFI image is stored individually, and we're looking at a month's worth of data. There's not a straightforward way to sort these FFIs, so we'll skip over those details for now. Instead, we'll filter on a fixed ID.

In [None]:
# fixing for reproducibility
fixed_id = "tess2020243183914-s0029-1-1-0193-s"

# filter for the product with the matching ID

# Confirm we have the product we want
filtered_image
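
A sketch of Step 2 end to end, under stated assumptions: that the product table has `obs_id` and `productSubGroupDescription` columns, that the calibrated FFI is labeled `"FFIC"`, and that `obs` holds the Observation from Step 1. `Observations.get_product_list()` is the real astroquery call; the helpers `matching_rows` and `get_filtered_image` are hypothetical, and the two-row `demo` list stands in for a real product table so the filter can be tried offline.

```python
fixed_id = "tess2020243183914-s0029-1-1-0193-s"


def matching_rows(products, obs_id, subgroup="FFIC"):
    """Indices of products with the given obs_id and product subgroup.

    Works on any sequence of rows that support row["column"] lookups,
    such as an astropy Table or a list of dictionaries.
    """
    return [
        i for i, row in enumerate(products)
        if row["obs_id"] == obs_id
        and row["productSubGroupDescription"] == subgroup
    ]


def get_filtered_image(observation, obs_id):
    """Fetch the product list and keep the matching calibrated FFI."""
    from astroquery.mast import Observations  # needs astroquery and network access

    products = Observations.get_product_list(observation)
    # Indexing a Table with a list of row numbers returns a smaller Table
    return products[matching_rows(products, obs_id)]


# Offline demonstration with a hypothetical two-row product list
demo = [
    {"obs_id": fixed_id, "productSubGroupDescription": "FFIC"},
    {"obs_id": fixed_id, "productSubGroupDescription": "FFIR"},
]
print(matching_rows(demo, fixed_id))

# filtered_image = get_filtered_image(obs, fixed_id)
```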

As expected, we have one result.

## Step 3: Data Access

Once you've identified your file(s) of interest, you must choose your access method.

### Downloading

We won't say much about this method, since downloading is not recommended when working on the cloud. Just know that the option exists, both on TIKE and on your local machine.

In [None]:
# img_path = Observations.download_products(filtered_image)

### Streaming to Memory
A downloaded file has a path on your computer (e.g. `Downloads/docs/copy-of-untitled1.txt`). We need to use the cloud equivalent of this. Fortunately, there's a function for that!
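
The cloud equivalent of a file path is an S3 URI, which names a bucket (a top-level container) and a key (the file's location inside that bucket). Once cloud access is enabled, astroquery's `Observations.get_cloud_uris()` returns URIs for a table of products. The sketch below only takes one apart with the standard library. The bucket name `stpubdata` is MAST's public bucket, but the directory layout shown is an assumption for illustration, so take the real value from astroquery rather than building it by hand.

```python
from urllib.parse import urlsplit

# Illustrative URI for the FFI used in this lesson (layout assumed)
uri = ("s3://stpubdata/tess/public/ffi/s0029/2020/243/1-1/"
       "tess2020243183914-s0029-1-1-0193-s_ffic.fits")

parts = urlsplit(uri)
bucket = parts.netloc              # the bucket: a top-level container for files
key = parts.path.lstrip("/")       # the key: the file's "path" inside the bucket
filename = key.rsplit("/", 1)[-1]  # the last piece of the key

print(bucket)
print(key)
print(filename)
```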

Whether opening a file on the cloud, or on your local machine, it's best practice to close the file once you're done using it. This is most easily done using Python's `with/open` syntax, as we'll do below.

**Note:** Code in the `with` statement should only be used to extract data from the file. Relatively slow computations and plot generation should go outside of this statement so that the file can close in a timely manner.

In [None]:
# Open the file in AWS: 'F' is the S3 file

    # Now actually read in the FITS file 

# Preview the first ten lines of the header
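
One possible way to fill in this cell, as a sketch rather than the webinar's exact code. It assumes anonymous S3 access through `s3fs`, that the calibrated image lives in extension 1, and that `filtered_image` is the one-row product table from Step 2. The helper name `read_cloud_fits` is hypothetical.

```python
def read_cloud_fits(s3_uri, ext=1):
    """Stream one FITS extension from S3 and return (data, header, wcs)."""
    # Third-party imports live here so the function can be defined anywhere
    import s3fs
    from astropy.io import fits
    from astropy.wcs import WCS

    # Public MAST data needs no AWS credentials, hence anonymous access
    fs = s3fs.S3FileSystem(anon=True)

    # Both the S3 file and the FITS file close when their `with` blocks end
    with fs.open(s3_uri, "rb") as f:
        with fits.open(f) as hdul:
            header = hdul[ext].header
            data = hdul[ext].data.copy()  # copy so the array outlives the file
            wcs = WCS(header)

    return data, header, wcs


# uri = Observations.get_cloud_uris(filtered_image)[0]
# data, header, wcs = read_cloud_fits(uri)
# print(repr(header[:10]))  # preview the first ten header cards
```

Only data extraction happens inside the `with` blocks, in line with the note above; plotting and any slow computation can use the returned objects after the file has closed.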


We can see that, in addition to the primary HDU, this file contains calibrated values and associated uncertainties. This is why we read in the 

### Display the Image

Finally, let's plot our full frame image of Fomalhaut. 

In [None]:
# Create a 12x12 figure


# Use our WCS information to set the coordinates


# Plot the image, adjusting some settings along the way


# Create some labels for our axes
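
A sketch of how the plotting steps above might be filled in, assuming `data` is the 2D image array and `wcs` is an `astropy.wcs.WCS` object built from the same header. The percentile limits, colormap, and axis labels are illustrative choices, and the helper name `plot_ffi` is hypothetical.

```python
def plot_ffi(data, wcs, percentiles=(5, 95)):
    """Plot a 2D image on sky coordinates with percentile-based brightness limits."""
    import matplotlib.pyplot as plt
    import numpy as np

    # Create a 12x12 figure whose axes understand the image's WCS
    fig = plt.figure(figsize=(12, 12))
    ax = fig.add_subplot(111, projection=wcs)

    # Clip the color scale so a few very bright pixels do not wash out the rest
    vmin, vmax = np.nanpercentile(data, percentiles)
    ax.imshow(data, origin="lower", cmap="gray", vmin=vmin, vmax=vmax)

    # Label the axes with the sky coordinates they represent
    ax.set_xlabel("Right Ascension")
    ax.set_ylabel("Declination")
    return fig, ax


# fig, ax = plot_ffi(data, wcs)
```

Percentile limits are used instead of the minimum and maximum because a single saturated star would otherwise set the scale for the whole image.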


There are three things we can notice in this image:
- Fomalhaut is visible in the top left of the image; since it is a bright star, it suffers from charge bleed and has long vertical spikes.
- The blotchy glowing patches in the lower left corner are likely caused by stars outside of the field of view.
- The background is not totally dark. It is brightest near the top and gets dimmer near the bottom, likely due to scattered light from the Earth or the Moon.

For full details on how TESS collects and processes images, see the [TESS Instrument Handbook](https://archive.stsci.edu/missions/tess/doc/TESS_Instrument_Handbook_v0.1.pdf).

Ok! With the image plotted, that's all for this lesson.

## Next time on "MAST Summer Webinar"...

In the next lesson, we will plot the light curve of an exoplanet. We'll talk more about the TESS mission, time series data, and how we detect exoplanets. Stay tuned!

## Additional Resources
Can't get enough? Here are some links to more information!

If you need an introduction (or a refresher!) to basic Python syntax, there are several great resources available online. [Codecademy](https://www.codecademy.com/learn/learn-python-3) is a great service with a totally free option for getting started with Python; note that you will have to create an account to use it. Additionally, the YouTube channel freeCodeCamp.org has a great [video tutorial](https://www.youtube.com/watch?v=rfscVS0vtbw) on everything you need to get started programming in Python. Another good resource is the [Python for Everybody](https://www.py4e.com/) book.

The full astropy documentation can be found [here](https://docs.astropy.org/en/stable/index.html).

For more info on FITS files, here is a link to the [FITS NASA site](https://fits.gsfc.nasa.gov/). 

SIMBAD is a web-based query service from the University of Strasbourg. It is a great resource for getting quick info on stars and other astronomical targets. Here is the link to [Fomalhaut's SIMBAD page](https://simbad.u-strasbg.fr/simbad/sim-id?Ident=fomalhaut&NbIdent=1&Radius=2&Radius.unit=arcmin&submit=submit+id).

## Acknowledgements

If you write a paper using TESS data from MAST, please acknowledge this using the following template:

> This paper includes data collected with the TESS mission, obtained from the MAST data archive at the Space Telescope Science Institute (STScI). Funding for the TESS mission is provided by the NASA Explorer Program. STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5–26555.

Any published work that uses Astroquery should include a citation which can be found at [this link](https://github.com/astropy/astroquery/blob/main/astroquery/CITATION), or can be printed out in a code cell with: `astroquery.__citation__` as long as the astroquery package is imported. 

### About this Notebook:
If you have comments or questions on this notebook, please open a [GitHub issue on tike_content](https://github.com/spacetelescope/tike_content/issues/new) or contact us through the [Archive Help Desk e-mail](mailto:archive@stsci.edu).

**Authors:** Emma Lieb, Thomas Dutkiewicz

**Last Updated:** May 2024

[Top of Page](#top)

<img style=float:right; src="https://raw.githubusercontent.com/spacetelescope/notebooks/master/assets/stsci_pri_combo_mark_horizonal_white_bkgd.png" alt="Space Telescope Logo" width="200px"/> 