## LINCC Frameworks Workflow Template
This notebook serves as a template for creating a "wishlist" (pseudo-code) workflow for a particular science use case for LINCC Frameworks software. The following sections are meant to serve as some structure to aid in workflow creation, but should not be considered as boundaries if that structure does not fit.

# Workflow: Rapid & Efficient Transit Finding in LSST

Summary: Take in light curves from LSST (tested for now with ZTF) and search for periodic transits in a time-efficient manner. The expected input is about 100 million stars. This is a first-order search focused on removing outliers; it would be followed by a "proper" search on the remaining candidates.

Stakeholders: Tansu Daylan, Neven Caplar, Wilson Beebe, Doug Branton

## Imports: What packages are being used? (only the most important is fine)

In [1]:
# Import List

# import astropy

# a few lines, will eventually live in a repo
# 1. finds outliers, may have to detrend first
# 2. take outliers, match in period-space
# once you find two pairs that have the same period, call BLS
import astropy # for BLS, but Tansu has his own implementation as well
import tansu_code
from tape import Ensemble # batches per-light-curve functions over the dataset

#maybe
import astroquery # query for stellar information

#or instead
import lsdb # pull in stellar information from a dedicated catalog (for ZTF, Gaia is good enough)

## Data I/O: What data are we using? How do we load it?

In [3]:
# Datapaths

# path = /path/to/my/data/data.extension
ztf_path = "ztf_data"
gaia_path = "gaia_data"

In [None]:
# Loading 

ztf_cat = lsdb.catalog(ztf_path)
gaia_cat = lsdb.catalog(gaia_path)

xmatch = lsdb.cross_match(ztf_cat, gaia_cat)

# Cut on stellar priors; it may be worth testing without the cut to get the largest possible data volume
xmatch = xmatch.query("parameters ><= thresholds") # placeholder condition; keep the filtered result

ens = Ensemble()

ens.from_dask_dataframes(xmatch._source_dataframe,
                        xmatch._object_dataframe)

# my_data = package.load(path)

## Data Preprocessing: What do we need to do to the data prior to analysis?

In [4]:
# Pre-processing
# cross-matches, normalization, pre-filtering?

# Light curves usually arrive detrended, but may not, so check for trends first
def check_trend():
    # using some known stats (from scipy)
    # return some metric of trend
    pass
res = ens.batch(check_trend)

# Detrend where needed
def detrend():
    pass
res = ens.batch(detrend) # replace our source dataframe


# maybe some ztf specific data quality cuts
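
The `check_trend` and `detrend` placeholders above could be filled in along the following lines. This is only a sketch under stated assumptions, not the workflow's actual implementation: the `(time, flux)` signatures, the choice of a line-fit slope as the trend metric, and the low-order polynomial detrending are all illustrative choices, and NumPy is used in place of the scipy statistics mentioned in the cell.

```python
import numpy as np

def check_trend(time, flux):
    """Flux change of the best-fit line over the baseline, in units of robust residual scatter.

    Hypothetical metric: values near 0 suggest no linear trend.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    t = time - time.mean()
    slope, intercept = np.polyfit(t, flux, 1)
    resid = flux - (slope * t + intercept)
    # Robust scatter estimate: 1.4826 * MAD approximates a Gaussian sigma
    scatter = 1.4826 * np.median(np.abs(resid - np.median(resid)))
    scatter = max(scatter, np.finfo(float).tiny)  # avoid division by zero
    return slope * (time.max() - time.min()) / scatter

def detrend(time, flux, deg=2):
    """Divide out a low-order polynomial fit, leaving flux normalized around 1.

    The polynomial degree is an assumed default, not a tuned value.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    t = time - time.mean()
    model = np.polyval(np.polyfit(t, flux, deg), t)
    return flux / model

# Synthetic light curve with a slow linear drift (illustrative values only)
rng = np.random.default_rng(0)
time = np.linspace(0.0, 100.0, 500)
flux = 1.0 + 1e-3 * time + rng.normal(0.0, 2e-3, time.size)
print(check_trend(time, flux))                 # large for a trended light curve
print(check_trend(time, detrend(time, flux)))  # near zero after detrending
```

A threshold on the `check_trend` value could decide which light curves get passed to `detrend`, so that already-detrended sources are left untouched.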

## Analysis: What are we using the data to do?

In [None]:
# Analysis

# a few lines, will eventually live in a repo
# 1. finds outliers, may have to detrend first
# 2. take outliers, match in period-space
# once you find two pairs that have the same period, call BLS
res = ens.batch(tansu_code) # returns a period and an epoch; expect to find maybe ~100 candidates (optimistic)
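
To make the two steps in the comments above concrete, here is a minimal sketch of one way they might look. It is not the `tansu_code` implementation: the function names `find_dips` and `candidate_periods`, the 5-sigma threshold, the separation tolerance, and the minimum separation are all assumptions chosen for illustration.

```python
import numpy as np
from itertools import combinations

def find_dips(time, flux, n_sigma=5.0):
    """Step 1: times of points more than n_sigma robust sigmas below the median flux."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    med = np.median(flux)
    sigma = 1.4826 * np.median(np.abs(flux - med))
    return time[flux < med - n_sigma * sigma]

def candidate_periods(dip_times, tol=0.05, min_sep=0.5):
    """Step 2: time separations shared by at least two pairs of dips.

    Separations below min_sep are ignored so that points inside the
    same transit are not mistaken for a period.
    """
    dip_times = np.sort(np.asarray(dip_times, dtype=float))
    seps = np.array([b - a for a, b in combinations(dip_times, 2)])
    seps = np.sort(seps[seps >= min_sep])
    if seps.size < 2:
        return np.array([])
    # Two adjacent sorted separations within tol mean two pairs agree on a period
    close = np.diff(seps) <= tol
    return 0.5 * (seps[:-1][close] + seps[1:][close])

# Synthetic light curve: small ripple plus box-shaped transits (illustrative values only)
time = np.linspace(0.0, 30.0, 3001)
flux = 1.0 + 1e-3 * np.sin(50.0 * time)
flux[(time - 2.0) % 7.3 < 0.02] -= 0.02

dips = find_dips(time, flux)
periods = candidate_periods(dips)
print(periods.min())  # smallest separation that two pairs of dips agree on
```

Only light curves that yield at least one candidate period would then be handed to a full BLS search (for example `astropy.timeseries.BoxLeastSquares`), which keeps the expensive step off the bulk of the sample. Multiples of the true period also show up among the candidates, so the smallest one is a natural starting point.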

## Results Inspection/Visualization: How would we like to interact with the results?

In [None]:
# Plotting results, sanity checking?

# plot some results for sanity, but workflow mainly ends after the above.
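
One common sanity check, given the period and epoch returned by the analysis step, is to phase-fold a candidate's light curve and look for the dip at phase zero. The helper below is a sketch only; `phase_fold` and its signature are hypothetical and not part of any of the packages imported above.

```python
import numpy as np

def phase_fold(time, flux, period, epoch):
    """Fold a light curve so that the transit epoch sits at phase 0.

    Returns phase in [-0.5, 0.5) and flux, both sorted by phase.
    """
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    phase = ((time - epoch + 0.5 * period) % period) / period - 0.5
    order = np.argsort(phase)
    return phase[order], flux[order]

# Three illustrative points: two in transit, one a quarter period later
phase, folded = phase_fold([2.0, 9.3, 3.825], [0.98, 0.98, 1.0], period=7.3, epoch=2.0)
```

The returned arrays can be passed directly to a scatter plot; in-transit points from different orbits should stack up near phase 0 if the period and epoch are right.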