# Downstream Exploitation of Space Data
## Python Crash Course Part 4: Fits and Databases

### Learning Objectives

You will: 
* know what a .fits file is and how to open it
* know how to query objects with Lightkurve
* be able to download light curves from Lightkurve and MAST

There are no exercises in this part as it is covered by sessions 1, 2, and 5.

### FITS files

Files in the FITS format (Flexible Image Transport System, extension .fits) are common in astronomy. The astropy library makes working with them in Python very convenient. Its documentation is available here: https://docs.astropy.org/en/stable/
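
For intuition about what astropy reads for us: a FITS file is a sequence of header-data units, and each header is plain ASCII text made of 80-character "cards" padded to blocks of 2880 bytes. The sketch below builds and parses a toy header by hand using only the standard library. `make_header` and `parse_header` are hypothetical helpers written for this illustration (they are not astropy functions), and the parser ignores details such as quoted string values. In practice you would always use astropy instead.

```python
CARD = 80      # every header card is 80 ASCII characters long
BLOCK = 2880   # FITS files are organised in 2880-byte blocks

def make_header(cards):
    """Build a padded header block from (keyword, value_string) pairs (hypothetical helper)."""
    lines = [f"{key:<8}= {value:>20}".ljust(CARD) for key, value in cards]
    lines.append("END".ljust(CARD))  # the END card closes the header
    raw = "".join(lines)
    # pad with spaces up to the next multiple of the block size
    padded_len = -(-len(raw) // BLOCK) * BLOCK
    return raw.ljust(padded_len).encode("ascii")

def parse_header(raw):
    """Read keyword/value pairs from header bytes until the END card (hypothetical helper)."""
    header = {}
    for start in range(0, len(raw), CARD):
        card = raw[start:start + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # drop an optional comment after '/' and surrounding spaces
            header[keyword] = card[10:].split("/")[0].strip()
    return header

# toy primary header with illustrative values only
raw_header = make_header([("SIMPLE", "T"), ("BITPIX", "8"), ("NAXIS", "0")])
header = parse_header(raw_header)
print(len(raw_header), header)
```

All values come back as strings here; astropy converts them to proper Python types and also handles the data part of each unit.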

Let's import the astropy library:

In [None]:
from astropy.io import fits # we only import the part we need, not the entire library

In [None]:
# opening the fits file
fits_file = 'hlsp_qlp_tess_ffi_s0055-0000000027768398_tess_v01_llc.fits' # this fits file is a TESS observation
hdul = fits.open(fits_file)

In [None]:
# printing information about the file
hdul.info()

In [None]:
# exploring header of the file
hdul[0].header

In [None]:
# exploring data of the file
data = hdul[1].data
data

In [None]:
# closing the file after use
hdul.close()

Let's plot the data:

In [None]:
import numpy as np # a math library
import matplotlib.pyplot as plt

In [None]:
with fits.open(fits_file, memmap=False) as hdul: # never mind this for now, we will talk about it during session 5
    data = hdul[1].data
    time = data['TIME']
    flux = data['SAP_FLUX']

    # NaN-aware statistics, so that missing flux values do not turn the mean and std into NaN
    flux_mean = np.nanmean(flux)
    flux_std = np.nanstd(flux)

    # keep only points within 10 standard deviations of the mean
    no_outl = np.abs(flux - flux_mean) < 10 * flux_std
    time_no_outl = time[no_outl]
    flux_no_outl = flux[no_outl]

In [None]:
fig, ax = plt.subplots(figsize=(10, 3))
    
ax.plot(time_no_outl, flux_no_outl, color='black', linewidth=0.5)
ax.set_xlabel('Time [d]', fontsize=12)
ax.set_ylabel('Normalized Flux', fontsize=12)

plt.tight_layout()

This .fits file contained a light curve (see notes for the SSE course) of a single star, which is very different from the content of the .fits file we opened during session 1.
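
The outlier removal used for the plot can be tried on synthetic data. The sketch below is a minimal illustration with made-up numbers, not the TESS data: it assumes a flux array with one artificial outlier and one missing (NaN) value, and shows that `np.mean` returns NaN in that case while `np.nanmean` and `np.nanstd` ignore the missing point.

```python
import numpy as np

# synthetic light curve (made-up values, for illustration only)
rng = np.random.default_rng(0)
time = np.linspace(0.0, 10.0, 500)
flux = 1.0 + 0.001 * rng.standard_normal(time.size)
flux[100] = 5.0      # one artificial outlier
flux[200] = np.nan   # one missing measurement

# a single NaN is enough to make the plain mean NaN
plain_mean = np.mean(flux)

# NaN-aware statistics ignore the missing point
flux_mean = np.nanmean(flux)
flux_std = np.nanstd(flux)

# keep points within 10 standard deviations of the mean;
# comparisons with NaN are False, so the missing point is dropped too
keep = np.abs(flux - flux_mean) < 10 * flux_std
time_clean = time[keep]
flux_clean = flux[keep]

print(plain_mean, keep.sum())
```

A threshold of 10 standard deviations is very loose, so only extreme points are removed. Lowering it clips more aggressively.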

### Lightkurve

Another great astronomy library is lightkurve. It can do many things, the most basic of which is querying astronomical objects.

As usual, we first import libraries:

In [None]:
import lightkurve as lk

In [None]:
search = lk.search_lightcurve('TIC 320586229', author='QLP') # the first argument is the object we are looking for, the second selects the pipeline that produced the science product
search

In [None]:
lc = search[1].download() # opening the second search result (the extended TESS mission starts from Sector 27 and its data are of better quality)
lc

In [None]:
# two columns we are interested in
time = lc['time'].value # note that the column names are lowercase here
flux = lc['sap_flux'].value

Let's see how it looks:

In [None]:
fig, ax = plt.subplots(figsize=(10, 3))
    
ax.plot(time, flux, color='black', linewidth=0.5)
ax.set_xlabel('Time [d]', fontsize=12)
ax.set_ylabel('Normalized Flux', fontsize=12)

plt.tight_layout()

The light curve contains outliers, which prevent us from taking a good look. Let's remove them (more in session 5):

In [None]:
quality_flags = lc.quality # quality flag of each data point; 0 means that no problem was flagged

good_quality_mask = (quality_flags == 0)
time1 = time[good_quality_mask]
flux1 = flux[good_quality_mask]

flux_mean = np.mean(flux1)
flux_std = np.std(flux1)

# keep only points within 10 standard deviations of the mean
no_outl = np.abs(flux1 - flux_mean) < 10 * flux_std
time_no_outl = time1[no_outl]
flux_no_outl = flux1[no_outl]

fig, ax = plt.subplots(figsize=(10, 3))
    
ax.plot(time_no_outl, flux_no_outl, color='black', linewidth=0.5)
ax.set_xlabel('Time [d]', fontsize=12)
ax.set_ylabel('Normalized Flux', fontsize=12)

plt.tight_layout()
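
Quality flags are bit-encoded integers: each bit marks a different kind of problem, and 0 means that nothing was flagged. The sketch below uses made-up flag values and made-up fluxes to compare the strict selection used above (`quality == 0`) with a bitmask that rejects only some flags. The names `FLAG_A`, `FLAG_B`, and `FLAG_C` are hypothetical; the real meaning of each bit is defined by the mission pipeline.

```python
import numpy as np

# hypothetical flag bits, for illustration only
FLAG_A = 1    # bit 0
FLAG_B = 4    # bit 2
FLAG_C = 32   # bit 5

# made-up quality flags and fluxes for seven data points
quality = np.array([0, 1, 0, 4, 36, 0, 32])
flux = np.array([1.00, 1.50, 1.01, 0.20, 3.00, 0.99, 1.02])

# strict selection: keep only points with no flag set at all
strict = quality == 0

# bitmask selection: reject FLAG_A and FLAG_B, but tolerate FLAG_C
reject = FLAG_A | FLAG_B
tolerant = (quality & reject) == 0

print(flux[strict])
print(flux[tolerant])
```

The last point carries only `FLAG_C`, so the tolerant selection keeps it while the strict one drops it. The point with value 36 (bits 2 and 5) is rejected by both.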

We can also save it as a .fits file:

In [None]:
lc = search[1].download(quality_bitmask='default', flux_column='sap_flux', download_dir='./test') 
# the download_dir argument specifies where the .fits file is saved

### MAST

MAST is an astronomical data archive that contains data from several space missions, including Hubble and TESS. We can use it to download data for multiple objects, in contrast to downloading light curves one by one like we just did.

In [None]:
from astroquery.mast import Observations

In [None]:
tic_ids = [320586229] # this would be a list of object IDs for TESS
# for didactic purposes, there is just one object in this list, but you can include several (we will use it during session 5)

In [None]:
obsTable = Observations.query_criteria(provenance_name='QLP',
                                       target_name=tic_ids)
data = Observations.get_product_list(obsTable)
download_lc = Observations.download_products(data)

This has downloaded all available TESS light curves for this object into a folder called mastDownload; you can check its contents to make sure.
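
One way to check the contents from Python is to search the folder recursively for .fits files. The sketch below is a minimal example that does not depend on how the download folder is organised. `find_fits_files` is a hypothetical helper, and the demonstration runs on a temporary directory with made-up file names rather than on real downloads.

```python
from pathlib import Path
import tempfile

def find_fits_files(root):
    """Return all .fits files below root, sorted by path (hypothetical helper)."""
    return sorted(Path(root).rglob("*.fits"))

# demonstration on a throwaway directory with made-up sub-folders and file names
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "mastDownload"
    (demo_root / "sub_a").mkdir(parents=True)
    (demo_root / "sub_b").mkdir(parents=True)
    (demo_root / "sub_a" / "star1_llc.fits").touch()
    (demo_root / "sub_b" / "star2_llc.fits").touch()
    (demo_root / "sub_b" / "notes.txt").touch()  # not a .fits file, so it is skipped
    found = [path.name for path in find_fits_files(demo_root)]

print(found)
```

For the real download, calling `find_fits_files('mastDownload')` from the notebook's working directory should list the downloaded files, each of which can then be opened with `fits.open`.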

This approach is still impractical (and too slow) for downloading hundreds to millions of light curves. There are other ways to do that via MAST, but we will not need them for this course :)