<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250 style="padding: 10px"> 
<b>Send a Flipbook of Variable Stars Images to Zooniverse</b> <br>
Author(s): Becky Nevin and Sreevani Jarugula <br>
Contact author: Becky Nevin<br>
Last verified to run: 2024-05-02 <br>
LSST Science Pipelines version: Weekly 2024_04 <br>
Container size: small or medium <br>
Targeted learning level: intermediate

**Description:**
Query and send a flipbook of variable star images from RSP to Zooniverse

**Skills:** Use various TAP tables, including joining multiple tables. Get calexp images. Extract time-series photometry.

**LSST Data Products:** TAP tables dp02_dc2_catalogs.DiaObject, ForcedSourceOnDiaObject, CcdVisit<br>

**Packages:** rubin.citsci, astropy, lsst.daf.butler, lsst.afw.display, lsst.geom

**Credit:** Rubin tutorial notebooks 03a, 04a, 04b, 07b, and 08

**Support:** Support is available and questions are welcome - (cscience@lsst.org)

## 1. Introduction <a class="anchor" id="first-bullet"></a>
This notebook will guide a PI through the process of sending a flipbook of five images of a variable star from the Rubin Science Platform (RSP) to the Zooniverse. It is recommended to run the `01_Introduction_to_Citsci_Pipeline.ipynb` notebook first, which provides an introduction to sending images to Zooniverse.

### 1.1 Package imports <a class="anchor" id="second-bullet"></a>

#### Install Pipeline Package

First, install the Rubin Citizen Science Pipeline package by doing the following:

1. Open up a New Launcher tab
2. In the "Other" section of the New Launcher tab, click "Terminal"
3. Use `pip` to install the `rubin.citsci` package by entering the following command:
```
pip install rubin.citsci
```
Note that this package will soon be installed directly on RSP.

If this package is already installed, make sure it is updated:
```
pip install -U rubin.citsci
```

4. Confirm the next cell containing `from rubin.citsci import pipeline` works as expected and does not throw an error

In [None]:
from rubin.citsci import pipeline
import utils
import numpy as np
import pandas as pd
import os
import astropy
from matplotlib import image as mpimg
import matplotlib.pyplot as plt
import lsst.afw.display as afwdisplay
afwdisplay.setDefaultBackend("matplotlib")

### 1.2 Define functions and parameters <a class="anchor" id="third-bullet"></a>
If you haven't already, [make a Zooniverse account](https://www.zooniverse.org/accounts/register) and create your project.

IMPORTANT: Your Zooniverse project must be set to "public"; a "private" project will not work. Select this setting under the "Visibility" tab (the project does not need to be set to live).

Supply the email associated with your Zooniverse account and project slug below.

A "slug" is the string of your Zooniverse username and your project name without the leading forward slash, for instance: "username/project-name". [Click here for more details](https://www.zooniverse.org/talk/18/967061?comment=1898157&page=1).
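
As a quick sanity check before logging in, the slug format described above can be tested locally. This is a minimal sketch: `looks_like_slug` is a hypothetical helper, not part of `rubin.citsci`, and it only checks the shape of the string, not whether the project exists.

```python
import re

# Hypothetical helper, not part of rubin.citsci.
# A slug is "username/project-name": two non-empty parts, one slash, no spaces.
SLUG_PATTERN = re.compile(r"[^/\s]+/[^/\s]+")


def looks_like_slug(slug):
    """Return True if slug has exactly one '/' between two non-empty parts."""
    return SLUG_PATTERN.fullmatch(slug) is not None


print(looks_like_slug("username/project-name"))   # well-formed
print(looks_like_slug("/username/project-name"))  # leading slash is rejected
```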


In [None]:
email = ""
slug_name = ""
print("Running utilities to establish a link with Zooniverse")
print("Enter your Zooniverse username followed by password below")
cit_sci_pipeline = pipeline.CitSciPipeline()
cit_sci_pipeline.login_to_zooniverse(slug_name, email)

## 2. Make a subject set of a variable star to send to Zooniverse <a class="anchor" id="fourth-bullet"></a>
A subject set is a collection of data (images, plots, etc) that are shown to citizen scientists. It is also the unit of data that is sent to Zooniverse.

This notebook curates a subject set of flipbook images from a variable star to send to Zooniverse. This can be modified to create your own subject set. Your subject set must have 100 objects or fewer in the testing phase before your project is approved by the EPO Data Rights panel.
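
The size limit can be checked before anything is sent. The sketch below assumes the limit applies to the number of unique object IDs in the subject set; both the helper name `check_subject_count` and that interpretation are assumptions, not part of the pipeline.

```python
# Limit quoted in the text above; the helper itself is hypothetical.
MAX_TEST_OBJECTS = 100


def check_subject_count(object_ids, limit=MAX_TEST_OBJECTS):
    """Return the number of unique IDs, or raise ValueError if over the limit."""
    n_unique = len(set(object_ids))
    if n_unique > limit:
        raise ValueError(f"{n_unique} objects exceeds the limit of {limit}")
    return n_unique


# Five flipbook frames of the same star count as a single object here.
print(check_subject_count([1567428592185376787] * 5))
```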

This example makes one set of image cutouts of a confirmed variable star at five different moments in time.

### 2.1 Initialize the Butler

In [None]:
config = 'dp02'
collection = '2.2i/runs/DP0.2'
service, butler, skymap = utils.setup_query_tools(config, collection)

### 2.2 Get familiar with the DiaObject and ForcedSourceOnDiaObject tables
These are _difference_ image tables, created by identifying objects not present in the template deepcoadd images. For more information, see https://lse-163.lsst.io/.


In [None]:
pd.set_option('display.max_rows', 200, 'display.max_colwidth', 1000)
results_diaobject = service.search("SELECT column_name, datatype, description,\
                          unit from TAP_SCHEMA.columns\
                          WHERE table_name = 'dp02_dc2_catalogs.DiaObject'")

In [None]:
results_diaobject.to_table().to_pandas().head()

diaObjectId is the unique ID for each object in the table; note that these are different IDs from the ObjectId in the Object table. From https://lse-163.lsst.io/:

>There is no direct DIASource-to-Object match: in general, a time-domain object is not necessarily the same astrophysical object as a static-sky object, even if the two are positionally coincident (eg. an asteroid overlapping a galaxy). Therefore, adopted data model emphasizes that having a DIASource be positionally coincident with an Object does not imply it is physically related to it. Absent other information, the least presumptuous data model relationship is one of positional association, not physical identity.

It is also necessary to have visit information to create images for each visit. Obtain visit information from the ForcedSourceOnDiaObject table (below). 

In [None]:
results_forceddiaobject = service.search(
    "SELECT column_name, datatype, description, unit "
    "FROM TAP_SCHEMA.columns "
    "WHERE table_name = 'dp02_dc2_catalogs.ForcedSourceOnDiaObject'"
)

In [None]:
results_forceddiaobject.to_table().to_pandas().head()

Finally, examine the CcdVisit catalog, which is matched with the ForcedSourceOnDiaObject catalog in order to retrieve timing information of when the exposure was taken.
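
The matching described here is a join on `ccdVisitId`. The sketch below reproduces the idea in plain Python on a few toy rows; the values are made up for illustration and are not DP0.2 data.

```python
# Toy rows with made-up values, standing in for the two catalogs.
forced_sources = [
    {"diaObjectId": 1, "ccdVisitId": 101, "psfFlux": 1200.0},
    {"diaObjectId": 1, "ccdVisitId": 102, "psfFlux": 1500.0},
]
ccd_visits = [
    {"ccdVisitId": 101, "expMidptMJD": 59600.1},
    {"ccdVisitId": 102, "expMidptMJD": 59603.2},
]

# Index the visit table by its key, then attach the time to each measurement.
mjd_by_visit = {row["ccdVisitId"]: row["expMidptMJD"] for row in ccd_visits}
joined = [
    {**src, "expMidptMJD": mjd_by_visit[src["ccdVisitId"]]}
    for src in forced_sources
    if src["ccdVisitId"] in mjd_by_visit  # inner join: drop unmatched rows
]
for row in joined:
    print(row)
```

Each forced measurement keeps its own row and gains the exposure midpoint of the visit it came from.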

In [None]:
results_ccdvisit = service.search("SELECT column_name, datatype, description,\
                          unit from TAP_SCHEMA.columns\
                          WHERE table_name = 'dp02_dc2_catalogs.CcdVisit'")

In [None]:
results_ccdvisit.to_table().to_pandas().head()

In [None]:
del results_forceddiaobject, results_diaobject, results_ccdvisit

### 2.3 Do a search for variable stars
Perform this search by joining the three catalogs explored above.

For more details, please see the `DP02_07b_Variable_Star_Lightcurves.ipynb` notebook in the tutorial notebooks by Jeff Carlin and Ryan Lau. All the code in this section is derivative of that notebook.

Use the coordinates of a known variable star.

In [None]:
ra_known_rrl = 62.1479031
dec_known_rrl = -35.799138

The query below returns a long list of sources with one row per forced measurement, so each diaObjectId appears many times.
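
Two of the cuts in the query can be sketched locally: the fractional variability (`gTOTFluxSigma / gTOTFluxMean`) and the magnitude range. The conversion uses the standard AB zero point for fluxes in nanojanskys, which is assumed here to match the server-side `scisql_nanojanskyToAbMag`; the function names are hypothetical, and the Stetson J and `gPSFluxNdata` cuts are not reproduced.

```python
import math


def nanojansky_to_ab_mag(flux_njy):
    """AB magnitude for a positive flux in nanojanskys (zero point 31.4)."""
    return -2.5 * math.log10(flux_njy) + 31.4


def passes_variability_cut(flux_mean, flux_sigma):
    """Apply 0.25 < sigma/mean < 1.25 and 18 < mag < 23, as in the query."""
    frac = flux_sigma / flux_mean
    mag = nanojansky_to_ab_mag(flux_mean)
    return 0.25 < frac < 1.25 and 18 < mag < 23


# Made-up fluxes, for illustration only.
print(passes_variability_cut(3631.0, 1500.0))
print(passes_variability_cut(3631.0, 100.0))
```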

In [None]:
query = "SELECT diao.diaObjectId, "\
        "fsodo.forcedSourceOnDiaObjectId, "\
        "diao.ra, diao.decl, "\
        "diao.gPSFluxNdata, "\
        "diao.gPSFluxStetsonJ, "\
        "diao.gTOTFluxMean, diao.gTOTFluxSigma, "\
        "scisql_nanojanskyToAbMag(fsodo.psfFlux) as psfMag, "\
        "fsodo.diaObjectId, "\
        "fsodo.ccdVisitId, fsodo.band, fsodo.psfFlux, fsodo.psfFluxErr, "\
        "fsodo.psfDiffFlux, fsodo.psfDiffFluxErr, "\
        "cv.expMidptMJD, cv.detector, cv.visitId, "\
        "scisql_nanojanskyToAbMag(fsodo.psfFlux) as fsodo_gmag "\
        "FROM dp02_dc2_catalogs.DiaObject as diao "\
        "JOIN dp02_dc2_catalogs.ForcedSourceOnDiaObject as fsodo "\
        "ON fsodo.diaObjectId = diao.diaObjectId "\
        "JOIN dp02_dc2_catalogs.CcdVisit as cv "\
        "ON cv.ccdVisitId = fsodo.ccdVisitId "\
        "WHERE diao.gTOTFluxSigma/diao.gTOTFluxMean > 0.25 "\
        "AND diao.gTOTFluxSigma/diao.gTOTFluxMean < 1.25 "\
        "AND scisql_nanojanskyToAbMag(diao.gTOTFluxMean) > 18 "\
        "AND scisql_nanojanskyToAbMag(diao.gTOTFluxMean) < 23 "\
        "AND diao.gPSFluxNdata > 30 "\
        "AND diao.gPSFluxStetsonJ > 20 "\
        "AND CONTAINS(POINT('ICRS', diao.ra, diao.decl), "\
        "CIRCLE('ICRS',"+str(ra_known_rrl)+", "+str(dec_known_rrl)+", 5)) = 1 "

results = service.search(query)
fsodo_sources = results.to_table()
fsodo_sources

List by unique source instead.

In [None]:
select_cols = ['diaObjectId',
               'ra',
               'decl',
               'expMidptMJD',
               'band',
               'ccdVisitId',
               'visitId',
               'detector']
unique_variables = astropy.table.unique(fsodo_sources,
                                        keys='diaObjectId')[select_cols]
print("Length of unique variables: ", len(unique_variables))
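
What the de-duplication does can be sketched in plain Python. This assumes the default behaviour of `astropy.table.unique`, which keeps the first row found for each key; the toy rows below use made-up values.

```python
# Toy rows with made-up values; one row per forced measurement.
rows = [
    {"diaObjectId": 20, "ccdVisitId": 301},
    {"diaObjectId": 10, "ccdVisitId": 302},
    {"diaObjectId": 20, "ccdVisitId": 303},
]

first_seen = {}
for row in rows:
    # setdefault keeps the first row seen for each ID and ignores later ones.
    first_seen.setdefault(row["diaObjectId"], row)

# Order the result by key.
unique_rows = [first_seen[key] for key in sorted(first_seen)]
print(len(unique_rows), [r["diaObjectId"] for r in unique_rows])
```

Under that assumption, per-measurement columns such as `expMidptMJD`, `band`, and `ccdVisitId` in `unique_variables` describe only one measurement of each object, so only the per-object columns (`diaObjectId`, `ra`, `decl`) should be read from it.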

### 2.4 Select one variable star
Use the pre-selected diaObjectId below. Other IDs can be selected, but this is not recommended for this tutorial because many of the sources returned by the query are not true variable stars.

In [None]:
diaobjectid = 1567428592185376787
selection = unique_variables[unique_variables["diaObjectId"] == diaobjectid]
ra = selection['ra'].value[0]
dec = selection['decl'].value[0]
print('ra and dec of variable star', ra, dec)

### 2.5 Select some moments in time
To do this, go back to the original table to get all of the information needed to plot a series of images, including visit information.

In [None]:
columns_select = ['diaObjectId',
                  'ra',
                  'decl',
                  'ccdVisitId',
                  'visitId',
                  'band',
                  'psfFlux',
                  'psfFluxErr',
                  'expMidptMJD',
                  'detector',
                  'psfMag']
source = fsodo_sources[fsodo_sources["diaObjectId"] ==
                       diaobjectid][columns_select]

Create a dictionary of boolean masks, one per band, to select measurements by band.


In [None]:
plot_band_labels = ['u', 'g', 'r', 'i', 'z', 'y']
pick = {}
for band in plot_band_labels:
    pick[band] = (source['band'] == band)

From now on, select only the r-band images. Also order by date.

In [None]:
print(type(source[pick['r']]))
select_r = source[pick['r']]
sorted_sources = select_r[select_r['expMidptMJD'].argsort()]
sorted_sources[0:5]
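
The mask-then-sort pattern used above can be illustrated without any table library. This is a sketch on toy measurements with made-up values; `pick_demo` is a stand-in name for the `pick` dictionary.

```python
# Toy measurements with made-up values.
measurements = [
    {"band": "r", "expMidptMJD": 59610.0, "psfMag": 19.8},
    {"band": "g", "expMidptMJD": 59601.0, "psfMag": 20.3},
    {"band": "r", "expMidptMJD": 59605.0, "psfMag": 20.1},
]

# One boolean mask per band, the same idea as the `pick` dictionary.
bands = ["u", "g", "r", "i", "z", "y"]
pick_demo = {b: [m["band"] == b for m in measurements] for b in bands}

# Apply the r-band mask, then order the remaining rows by observation time.
r_only = [m for m, keep in zip(measurements, pick_demo["r"]) if keep]
r_sorted = sorted(r_only, key=lambda m: m["expMidptMJD"])
print([m["expMidptMJD"] for m in r_sorted])
```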

Select five moments in time by their index in the time-sorted r-band table. Keep these indices to observe a change in brightness, or select your own *at your own risk*.

In [None]:
idx_select = [10, 15, 25, 40, 63]

Show the selected moments against all dates.

In [None]:
fig = plt.figure(figsize=(6, 4))
plt.plot(sorted_sources['expMidptMJD'],
         sorted_sources['psfMag'],
         'k.', ms=10)
plt.plot(sorted_sources[idx_select]['expMidptMJD'],
         sorted_sources[idx_select]['psfMag'],
         'r.', ms=10, label='selected calexp')
plt.minorticks_on()
plt.xlabel('MJD (days)')
plt.ylabel('r')
plt.gca().invert_yaxis()
plt.legend(loc=2)
plt.show()

### 2.6 Save images and write the `manifest.csv` to file
This tutorial section utilizes plotting utilities, which are stored in the `utils.py` file.

Define the directory where the flipbook images will be saved (`batch_dir`). Then loop through the selected moments in time, create a calexp image for each, and add a row to the manifest for each image. Note that the diaObjectId is saved as `objectId` in this table; this naming scheme is required for the `manifest.csv` file.
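
To make the shape of a manifest concrete, here is a minimal sketch that writes one row per image with Python's `csv` module. Only the `objectId` column is taken from the text above; the `filename` column and the example file names are assumptions for illustration, and the actual columns are set by `utils.make_manifest_with_calexp_images` and the pipeline.

```python
import csv
import io

# Hypothetical rows: `filename` and the file names are assumed for illustration;
# only `objectId` is named in the text above.
rows = [
    {"filename": "1567428592185376787_0.png", "objectId": 1567428592185376787},
    {"filename": "1567428592185376787_1.png", "objectId": 1567428592185376787},
]

# Write to an in-memory buffer so the sketch leaves no files behind.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["filename", "objectId"])
writer.writeheader()
writer.writerows(rows)
manifest_text = buffer.getvalue()
print(manifest_text)
```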

In [None]:
print('Specify the directory that the cutouts will be output to')
batch_dir = './variable_stars_output/'
# Create the directory
os.makedirs(batch_dir, exist_ok=True)
print(f"Make the manifest file and save both the manifest and the cutout images in this folder: {batch_dir}")
manifest = utils.make_manifest_with_calexp_images(sorted_sources, diaobjectid, idx_select, butler, batch_dir)

In [None]:
manifest_path = cit_sci_pipeline.write_manifest_file(manifest, batch_dir)
print("The manifest CSV file can be found at the following relative path:")
print(manifest_path)

### 2.7 Display images in notebook
Display the images saved to the manifest file using the image directory (`batch_dir`).

In [None]:
image_dir = batch_dir
num_variable_images = 5
stars_matchid_list = [diaobjectid]
star_name = []
for id_star in stars_matchid_list:
    # Collect the numeric suffix of each "<diaObjectId>_<suffix>.png" cutout.
    img_ids = []
    for file in os.listdir(image_dir):
        stem, ext = os.path.splitext(file)
        if (os.path.isfile(os.path.join(image_dir, file)) and
                ext == '.png' and stem.split('_')[0] == str(id_star)):
            img_ids.append(int(stem.split('_')[1]))
    # os.listdir returns files in arbitrary order, so sort by suffix.
    # This assumes the suffix increases with time; verify for your files.
    for img_id in sorted(img_ids):
        star_name.append(str(id_star) + '_' + str(img_id) + '.png')
fig, axs = plt.subplots(1, num_variable_images, figsize=(20, 20))
print('star', stars_matchid_list[0])
for j in range(num_variable_images):
    image = mpimg.imread(os.path.join(image_dir, star_name[j]))
    axs[j].imshow(image)
    axs[j].axis('off')
plt.show()

try:
    print('star', stars_matchid_list[1])
    fig, axs = plt.subplots(1, 5, figsize=(20, 20))

    for j in range(num_variable_images):
        image = mpimg.imread(image_dir + star_name[j + num_variable_images])
        axs[j].imshow(image)
        axs[j].axis('off')
    plt.show()

except IndexError:  # this will happen if you have only one star
    print('only one star')


The third and fifth images should be the brightest.

### A word of caution
These are calexp images, which have not been warped onto a common grid the way individual visits are before being combined into a deepcoadd image. Therefore, the pixel scale is not guaranteed to be the same from one image to the next, and the astrometry is not guaranteed to align.

## 3. Send the data to Zooniverse
Zip up the data and send it to the Zooniverse.

### 3.1 Zip up the data
Running the below cell will zip up all the cutouts into a single file - this can take 5 to 10 minutes for large data sets (> 5k cutouts).
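
For readers curious about what zipping a directory of cutouts involves, here is a minimal sketch using the standard-library `zipfile` module. It is not the implementation of `zip_image_cutouts`; the helper name `zip_directory` and the placeholder file are hypothetical.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Write every regular file in src_dir (non-recursive) into zip_path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(os.listdir(src_dir)):
            full_path = os.path.join(src_dir, name)
            if os.path.isfile(full_path):
                # arcname stores the bare file name rather than the full path.
                archive.write(full_path, arcname=name)
    return zip_path


# Demonstration on a throwaway directory holding one placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "cutouts")
    os.makedirs(src)
    with open(os.path.join(src, "example.png"), "wb") as handle:
        handle.write(b"placeholder bytes, not a real image")
    out = zip_directory(src, os.path.join(tmp, "cutouts.zip"))
    with zipfile.ZipFile(out) as archive:
        names = archive.namelist()
print(names)
```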

In [None]:
zip_path = cit_sci_pipeline.zip_image_cutouts(batch_dir)
print(zip_path)

### 3.2 Send image data
This cell will let PIs send one subject set. Name the subject set as it will appear on Zooniverse.

Running this cell will also initiate the data transfer and make your data available on the Zooniverse platform.

In [None]:
subject_set_name = ""
cit_sci_pipeline.send_image_data(subject_set_name, zip_path, flipbook=True)