# How fast can TESS FFI cutouts be accessed on TIKE?

*Prepared by Geert Barentsen on Aug 23, 2021.*

## Purpose of this notebook

This notebook investigates the performance of obtaining TESS FFI cutouts in two different ways:

1. from MAST using the [TESSCut API](https://mast.stsci.edu/tesscut/) accessed via the `astroquery` package;
2. from AWS S3 using the experimental [s3-support](https://github.com/spacetelescope/astrocut/pull/44) branch of the `astrocut` package.

## Step 1: Install extra dependencies

The following dependencies are not available on TIKE by default:

In [None]:
!pip install -q --upgrade tess-locator multiprocess

We also need to install the experimental `s3-support` branch of the `astrocut` package for this notebook to be able to access the S3-hosted cube files:

In [None]:
!pip install -q git+https://github.com/barentsen/astrocut.git@s3-support

In [None]:
import astroquery
astroquery.__version__

In [None]:
import astrocut
astrocut.__version__

## Step 2: Obtaining random cutout positions

We define the `get_random_coordinates` function, which returns a list of random TESS pixel positions drawn from sectors 30 to 39, cameras and CCDs 1 to 4, and columns and rows between 100 and 2000:

In [None]:
from tess_locator import TessCoord, TessCoordList
from random import randint

def get_random_coordinates(n=10, sector=None, camera=None, ccd=None) -> TessCoordList:
    """Returns a list of `n` random TESS pixel positions.

    Any of `sector`, `camera`, or `ccd` left as None is drawn at random.
    """
    return TessCoordList(
        [TessCoord(sector=sector if sector is not None else randint(30, 39),
                   camera=camera if camera is not None else randint(1, 4),
                   ccd=ccd if ccd is not None else randint(1, 4),
                   column=randint(100, 2000),
                   row=randint(100, 2000))
         for _ in range(n)])

Example use:

In [None]:
get_random_coordinates(n=3)
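
Because the positions are drawn with `random.randint`, each run benchmarks a different set of cutouts. A minimal sketch of how seeding could make the draw repeatable is shown below. It uses plain tuples rather than `TessCoord` objects so that it runs without `tess-locator`, and the `random_pixel_positions` helper name is hypothetical.

```python
import random

def random_pixel_positions(n=10, seed=None):
    """Return n (sector, camera, ccd, column, row) tuples.

    Hypothetical stand-in for `get_random_coordinates`, using the same ranges.
    """
    rng = random.Random(seed)  # private generator; leaves the global state alone
    return [(rng.randint(30, 39), rng.randint(1, 4), rng.randint(1, 4),
             rng.randint(100, 2000), rng.randint(100, 2000))
            for _ in range(n)]

# The same seed reproduces the same positions
first = random_pixel_positions(n=3, seed=42)
second = random_pixel_positions(n=3, seed=42)
print(first == second)
```

With the notebook's own function, calling `random.seed(42)` before `get_random_coordinates` should have a similar effect, since that function draws from the global `random` generator.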

## Step 3: Define helper functions

Below we define the `run_astrocut_s3`, `run_tesscut`, and `run_benchmark` helper functions which will be used to execute the benchmarks in the next step.

In [None]:
# This cell implements the helper function to obtain a single cutout using Astrocut.

from astrocut import CutoutFactory

CUTOUT_SIZE = 3

def run_astrocut_s3(crd):
    """Obtain a single cutout using the S3-powered version of astrocut.
    
    Parameters
    ----------
    crd : TessCoord
        TESS pixel position to cut out.
    
    Returns
    -------
    target_pixel_file : str
        Local filename of the extracted Target Pixel File,
        or the string "error" if the cutout failed.
    """
    print(f"Starting {crd}")
    # Where is the cube file located on S3?
    cube_file = f"s3://stpubdata/tess/public/mast/tess-s{crd.sector:04d}-{crd.camera}-{crd.ccd}-cube.fits"
    # Name of the output file
    target_pixel_file = f"astrocut{hash(str(crd))}.fits"
    # Create and return the cutout
    try:
        CutoutFactory().cube_cut(cube_file,
                                 coordinates=crd.to_skycoord(),
                                 cutout_size=CUTOUT_SIZE,
                                 target_pixel_file=target_pixel_file)
        return target_pixel_file
    except Exception as e:
        print(f"Exception encountered for {crd}:\n\n{e}")
        return "error"
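
The cube location above is built from the sector, camera, and CCD numbers. The sketch below isolates that pattern in a hypothetical `cube_uri` helper. As an assumption about what may be preferable, it also derives the output name from a SHA-256 digest instead of `hash()`, whose value for strings changes between Python sessions unless `PYTHONHASHSEED` is fixed. The `stable_filename` helper and its label format are illustrative only.

```python
import hashlib

def cube_uri(sector, camera, ccd):
    """Return the S3 URI of a TESS FFI cube, following the pattern used above."""
    return f"s3://stpubdata/tess/public/mast/tess-s{sector:04d}-{camera}-{ccd}-cube.fits"

def stable_filename(label):
    """Return an output filename that is the same in every Python session."""
    digest = hashlib.sha256(label.encode("utf-8")).hexdigest()[:12]
    return f"astrocut-{digest}.fits"

print(cube_uri(sector=30, camera=1, ccd=2))
print(stable_filename("sector=30, camera=1, ccd=2"))
```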

In [None]:
# This cell implements the helper function to obtain a single cutout using TESSCut.

from astroquery.mast import Tesscut

def run_tesscut(crd):
    """Obtain a single cutout using the TESSCut API hosted at MAST.
    
    Parameters
    ----------
    crd : TessCoord
        TESS pixel position to cut out.

    Returns
    -------
    path : str
        Local filename of the extracted Target Pixel File,
        or the string "error" if the cutout failed.
    """
    print(f"Starting {crd}")
    try:
        result = Tesscut.download_cutouts(crd.to_skycoord(),
                                          size=CUTOUT_SIZE,
                                          sector=crd.sector,
                                          path=".",
                                          inflate=False)
        return result["Local Path"][0]
    except Exception as e:
        print(f"Exception encountered for {crd}:\n\n{e}")
        return "error"

In [None]:
# This cell implements the `run_benchmark` helper function.

# We use `multiprocess` rather than the standard library's `multiprocessing`:
# it serializes with `dill`, which copes better with functions defined
# interactively in a notebook.
from multiprocess import Pool

def run_benchmark(func, n_cutouts=1, cutout_size=10, sector=None, processes=1):
    """Uses the `func` helper function to obtain `n_cutouts` random cutouts.

    Parameters
    ----------
    func : `run_tesscut` or `run_astrocut_s3`
        Helper function that will be used to obtain each cutout.
    n_cutouts : int
        Total number of random cutouts to obtain.
    cutout_size : int
        Side length of each square cutout, in pixels.
    sector : int or None
        Restrict cutouts to a specific sector?  Pass None to use random sectors.
    processes : int
        Number of parallel processes to use.

    Returns
    -------
    cutouts : list of str
        List containing the local paths of the Target Pixel Files obtained;
        failed cutouts appear as the string "error".
    """
    # Generate random positions to cut out
    crdlist = get_random_coordinates(n=n_cutouts, sector=sector)

    # Hack: use a global constant to pass cutout size as an argument to `func`
    global CUTOUT_SIZE
    CUTOUT_SIZE = cutout_size
    
    # Run the target function
    with Pool(processes) as p:
        result = p.map(func, crdlist)

    return result
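
The global-constant hack can in principle be avoided by binding the cutout size with `functools.partial` before mapping. The sketch below illustrates the pattern with a hypothetical `fake_cutout` function and the thread-backed `multiprocessing.dummy.Pool`, which exposes the same `map` interface. Whether it carries over unchanged to `multiprocess.Pool` and the helper functions above is an assumption that has not been tested here.

```python
from functools import partial
from multiprocessing.dummy import Pool  # thread-backed pool with the same API

def fake_cutout(crd, cutout_size):
    """Hypothetical stand-in for `run_tesscut`; returns a descriptive string."""
    return f"{crd}: {cutout_size}x{cutout_size} px"

positions = ["pos-a", "pos-b", "pos-c"]

# Bind the keyword argument so that `map` only needs to supply `crd`
func = partial(fake_cutout, cutout_size=10)

with Pool(2) as p:
    result = p.map(func, positions)

print(result)
```

Adopting this would require `run_tesscut` and `run_astrocut_s3` to accept the cutout size as a second argument instead of reading `CUTOUT_SIZE`.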

## Step 4: Run the benchmarks

We can now execute the benchmarks for a specific number of cutouts (`n_cutouts`), a cutout size (`cutout_size`), and a number of parallel processes (`processes`).

For example:

In [None]:
%%time
# Request three 10-by-10px cutouts from TESSCut in a single process as follows:
result = run_benchmark(run_tesscut, n_cutouts=3, cutout_size=10, processes=1)
result

In [None]:
%%time
# Request three 10-by-10px cutouts from Astrocut in a single process as follows:
result = run_benchmark(run_astrocut_s3, n_cutouts=3, cutout_size=10, processes=1)
result
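
`%%time` reports the total wall time but not how many cutouts succeeded. A rough sketch of how the result list could be summarized follows. The `summarize` helper is hypothetical and relies only on the convention above that failed cutouts are returned as the string `"error"`.

```python
def summarize(result, elapsed_seconds):
    """Summarize a benchmark run given its result list and wall time."""
    n_failed = sum(1 for path in result if path == "error")
    n_ok = len(result) - n_failed
    return {
        "n_ok": n_ok,
        "n_failed": n_failed,
        # Average wall time per successful cutout; None if nothing succeeded
        "seconds_per_cutout": elapsed_seconds / n_ok if n_ok else None,
    }

# Made-up inputs for illustration only, not measured values
example = summarize(["a.fits", "error", "b.fits"], elapsed_seconds=6.0)
print(example)
```

In the notebook, `elapsed_seconds` could be measured by calling `time.perf_counter()` before and after `run_benchmark`.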