In [None]:
# %load_ext pycodestyle_magic
# %flake8_on
# import logging
# logging.getLogger("flake8").setLevel(logging.FATAL)

<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250 style="padding: 10px"> 
<b>Citizen Science Notebook</b> <br>
Contact author: Becky Nevin<br>
Last verified to run: 2024-01-04 <br>
LSST Science Pipelines version: Weekly 2023_47 <br>
Container size: small or medium <br>


**Description:**
Query variable star images from the Rubin Science Platform (RSP) and send them to Zooniverse. Light curves are not currently included.

**Skills:** Use various TAP tables, including joining multiple tables. Get calexp images. Extract time-series photometry.

**LSST Data Products:** TAP tables dp02_dc2_catalogs.DiaObject, ForcedSourceOnDiaObject, CcdVisit<br>

**Packages:** astropy, lsst.daf.butler, lsst.afw.display, lsst.geom 

**Credit:** Tutorial notebooks 03a, 04a, 04b, 07b, and 08

**Support:** Support is available and questions are welcome - (cscience@lsst.org)

**Debug version:** This version of the notebook contains additional debugging output, and the first cell needs to be run once.

## 1. Introduction <a class="anchor" id="first-bullet"></a>
This notebook will guide you through the process of sending images of variable stars from the Rubin Science Platform (RSP) to the Zooniverse.

### 1.1 Package imports <a class="anchor" id="second-bullet"></a>

#### Install Pipeline Package

First, install the Rubin Citizen Science Pipeline package by doing the following:

1. Open up a New Launcher tab
2. In the "Other" section of the New Launcher tab, click "Terminal"
3. Use `pip` to install the `rubin.citsci` package by entering the following command:
```
pip install rubin.citsci
```
4. Confirm the next cell containing `from rubin.citsci import pipeline` works as expected and does not throw an error

!pip install --upgrade --force-reinstall --no-deps rubin.citsci --quiet

In [None]:
from rubin.citsci import pipeline

In [None]:
import utils  # provides setup_butler (Section 2.1); some plotting functions will migrate here later
import matplotlib
from matplotlib import image as mpimg
import matplotlib.pyplot as plt
import gc # this is only used in remove_figure
import numpy as np
import pandas as pd
import os

# astropy imports
import astropy
from astropy.wcs import WCS
from astropy import units as u

import lsst.geom as geom

# image visualization routines
import lsst.afw.display as afwDisplay
# must explicitly set this to save figures
afwDisplay.setDefaultBackend("matplotlib")

### 1.2 Define functions and parameters <a class="anchor" id="third-bullet"></a>
If you haven't already, [make a Zooniverse account](https://www.zooniverse.org/accounts/register) and create your project.

IMPORTANT: Your Zooniverse project must be set to "public"; a "private" project will not work. Select this setting under the "Visibility" tab (the project does not need to be set to live).


A "slug" is the string of your Zooniverse username and your project name without the leading forward slash, for instance: "username/project-name". [Click here for more details](https://www.zooniverse.org/talk/18/967061?comment=1898157&page=1).
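
As a minimal sketch of the slug format described above, the helper below joins a username and project name and drops any stray forward slashes. The function name `build_slug` is hypothetical and is not part of `rubin.citsci` or any Zooniverse library.

```python
def build_slug(username: str, project_name: str) -> str:
    """Join a username and project name into a "username/project-name" slug."""
    # Strip whitespace and stray slashes so the slug has no leading forward slash.
    username = username.strip().strip("/")
    project_name = project_name.strip().strip("/")
    if not username or not project_name:
        raise ValueError("Both a username and a project name are required.")
    return f"{username}/{project_name}"


# A leading slash on the input is removed in the result.
slug_example = build_slug("/username", "project-name")
print(slug_example)
```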


In [None]:
email = "beckynevin@gmail.com"  
slug_name = "rebecca-dot-nevin/test-project" 

print("Running utilities to establish a link with Zooniverse")
print("Enter your Zooniverse username followed by password below")
cit_sci_pipeline = pipeline.CitSciPipeline()
cit_sci_pipeline.login_to_zooniverse(slug_name, email)

## 2. Make a subject set of a variable star to send to Zooniverse <a class="anchor" id="fourth-bullet"></a>
A subject set is a collection of data (images, plots, etc) that are shown to citizen scientists. It is also the unit of data that is sent to Zooniverse.

Here, we curate the subject set of objects to send to Zooniverse. This can be modified to create your own subject set. During the testing phase, before your project is approved by the EPO Data Rights panel, your subject set must contain 100 or fewer objects.
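
One way to guard against exceeding the testing-phase limit is a small check run before anything is sent. This is only a sketch: `check_subject_set_size` is a hypothetical helper, and counting unique IDs is an assumption about how objects are tallied.

```python
TESTING_PHASE_LIMIT = 100  # limit stated above for projects not yet approved


def check_subject_set_size(subject_ids, limit=TESTING_PHASE_LIMIT):
    """Return the number of unique subjects, raising if it exceeds the limit."""
    n_subjects = len(set(subject_ids))
    if n_subjects > limit:
        raise ValueError(
            f"Subject set has {n_subjects} objects; the limit is {limit}."
        )
    return n_subjects


# This tutorial sends a single variable star, so the check passes.
n_subjects = check_subject_set_size([1567428592185376787])
print(n_subjects)
```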

This example makes one set of image cutouts of a confirmed variable star at five different moments in time.
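
A subject of this kind can be described by a single manifest row that lists each cutout. The sketch below mirrors the column names this notebook writes in Section 2.6 (`sourceId`, `location:image_N`, `diaObjectId:image_N`, `filename`); the helper name and the ccdVisitId values are made up for illustration.

```python
import csv
import io


def build_flipbook_row(source_id, ccd_visit_ids):
    """Build one manifest row listing every cutout image of a single source."""
    row = {"sourceId": source_id}
    for i, ccd_visit_id in enumerate(ccd_visit_ids):
        filename = f"{source_id}_{ccd_visit_id}.png"
        row[f"location:image_{i}"] = filename
        row[f"diaObjectId:image_{i}"] = str(source_id)
        # Mirrors the notebook: this column ends up holding the last file name.
        row["filename"] = filename
    return row


# Hypothetical ccdVisitId values, for illustration only.
row = build_flipbook_row(1567428592185376787, [111001, 222002, 333003])

# Write the row as CSV text (header line plus one data line).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
manifest_csv = buffer.getvalue()
print(manifest_csv)
```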

### 2.1 Initialize Butler

In [None]:
config = 'dp02'
collection = '2.2i/runs/DP0.2'
service, butler, skymap = utils.setup_butler(config, collection)

### 2.2 Familiarize yourself with the tables we'll be using
This includes the DiaObject table and the ForcedSourceOnDiaObject table. Note that these are _difference_ image tables, created by identifying objects not present in the template deepcoadd images. For more information, see https://lse-163.lsst.io/.


In [None]:
pd.set_option('display.max_rows', 200, 'display.max_colwidth', 1000)
results_diaobject = service.search("SELECT column_name, datatype, description,\
                          unit from TAP_SCHEMA.columns\
                          WHERE table_name = 'dp02_dc2_catalogs.DiaObject'")

In [None]:
results_diaobject.to_table().to_pandas()

diaObjectId is the unique ID for each object in the table; note that these are different IDs from the ObjectId in the Object table. From https://lse-163.lsst.io/:

>There is no direct DIASource-to-Object match: in general, a time-domain object is not necessarily the same astrophysical object as a static-sky object, even if the two are positionally coincident (eg. an asteroid overlapping a galaxy). Therefore, adopted data model emphasizes that having a DIASource be positionally coincident with an Object does not imply it is physically related to it. Absent other information, the least presumptuous data model relationship is one of positional association, not physical identity.

We're currently lacking visit information, which we'll need for creating images of each visit. We can get it from the ForcedSourceOnDiaObject table (below).

In [None]:
results_forceddiaobject = service.search("SELECT column_name, datatype, description,\
                          unit from TAP_SCHEMA.columns\
                          WHERE table_name = 'dp02_dc2_catalogs.ForcedSourceOnDiaObject'")

In [None]:
results_forceddiaobject.to_table().to_pandas()

Finally, let's examine the CcdVisit catalog, which we will match with the ForcedSourceOnDiaObject catalog in order to retrieve timing information of when the exposure was taken.
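
The timing column used later in this notebook, `expMidptMJD`, is a Modified Julian Date. As a rough sketch, an MJD can be turned into a calendar date with the standard library, using the fact that MJD 0 falls on 1858-11-17 00:00. The helper name is hypothetical, and the conversion ignores leap seconds and whichever time scale the catalog uses.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17 00:00


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a naive datetime (no leap seconds)."""
    return MJD_EPOCH + timedelta(days=mjd)


# MJD 51544.0 is midnight at the start of 2000-01-01.
example_date = mjd_to_datetime(51544.0)
print(example_date)
```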

In [None]:
results_ccdvisit = service.search("SELECT column_name, datatype, description,\
                          unit from TAP_SCHEMA.columns\
                          WHERE table_name = 'dp02_dc2_catalogs.CcdVisit'")

In [None]:
results_ccdvisit.to_table().to_pandas()

In [None]:
del results_forceddiaobject, results_diaobject, results_ccdvisit

### 2.3 Do a search for variable stars
We will perform this search by joining the three catalogs we explored above.

For more details, please see the `DP02_07b_Variable_Star_Lightcurves.ipynb` notebook in the tutorial notebooks by Jeff Carlin and Ryan Lau. All the code in this section is derivative of that notebook.

We are using the coordinates of a known variable star.

In [None]:
ra_known_rrl = 62.1479031
dec_known_rrl = -35.799138

This query will return a massive list of sources, some of which are repeat object IDs.
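
The query in the next cell keeps objects whose g-band flux scatter is a sizeable fraction of the mean flux and whose mean magnitude lies between 18 and 23. As a rough sketch of those two cuts outside the database, assuming the standard AB zero point of 31.4 for fluxes in nanojansky (the helper names are illustrative and are not part of `scisql`; returning NaN for non-positive flux is a choice made here):

```python
import math


def nanojansky_to_ab_mag(flux_njy):
    """AB magnitude for a flux in nanojansky: m = -2.5 * log10(f) + 31.4."""
    if flux_njy <= 0:
        return float("nan")  # magnitude is undefined for non-positive flux
    return -2.5 * math.log10(flux_njy) + 31.4


def passes_cuts(flux_mean, flux_sigma):
    """Apply the fractional-scatter and magnitude cuts used in the query."""
    if flux_mean <= 0:
        return False
    fractional_scatter = flux_sigma / flux_mean
    mag = nanojansky_to_ab_mag(flux_mean)
    return 0.25 < fractional_scatter < 1.25 and 18 < mag < 23


# A mean flux of 3631 nJy is close to magnitude 22.5.
print(round(nanojansky_to_ab_mag(3631.0), 2))
print(passes_cuts(3631.0, 0.5 * 3631.0))
```

The query also requires more than 30 g-band data points and a Stetson J index above 20; those two cuts are left out of this sketch.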

In [None]:
query = "SELECT diao.diaObjectId, "\
        "fsodo.forcedSourceOnDiaObjectId, "\
        "diao.ra, diao.decl, "\
        "diao.gPSFluxNdata, "\
        "diao.gPSFluxStetsonJ, "\
        "diao.gTOTFluxMean, diao.gTOTFluxSigma, "\
        "scisql_nanojanskyToAbMag(fsodo.psfFlux) as psfMag, "\
        "fsodo.diaObjectId, "\
        "fsodo.ccdVisitId, fsodo.band, fsodo.psfFlux, fsodo.psfFluxErr, "\
        "fsodo.psfDiffFlux, fsodo.psfDiffFluxErr, "\
        "cv.expMidptMJD, cv.detector, cv.visitId, "\
        "scisql_nanojanskyToAbMag(fsodo.psfFlux) as fsodo_gmag "\
        "FROM dp02_dc2_catalogs.DiaObject as diao "\
        "JOIN dp02_dc2_catalogs.ForcedSourceOnDiaObject as fsodo "\
        "ON fsodo.diaObjectId = diao.diaObjectId "\
        "JOIN dp02_dc2_catalogs.CcdVisit as cv ON cv.ccdVisitId = fsodo.ccdVisitId "\
        "WHERE diao.gTOTFluxSigma/diao.gTOTFluxMean > 0.25 "\
        "AND diao.gTOTFluxSigma/diao.gTOTFluxMean < 1.25 "\
        "AND scisql_nanojanskyToAbMag(diao.gTOTFluxMean) > 18 "\
        "AND scisql_nanojanskyToAbMag(diao.gTOTFluxMean) < 23 "\
        "AND diao.gPSFluxNdata > 30 "\
        "AND diao.gPSFluxStetsonJ > 20 "\
        "AND CONTAINS(POINT('ICRS', diao.ra, diao.decl), "\
        "CIRCLE('ICRS',"+str(ra_known_rrl)+", "+str(dec_known_rrl)+", 5)) = 1 "

results = service.search(query)
fsodo_sources = results.to_table()
fsodo_sources 

List by unique source instead.
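
To illustrate the idea with plain Python, the sketch below keeps the first row seen for each key value and returns the result sorted by key, which is how `astropy.table.unique` is understood to behave by default. The rows and the helper name are hypothetical.

```python
def first_row_per_key(rows, key):
    """Keep the first row for each distinct key value, sorted by key."""
    seen = {}
    for row in rows:
        # setdefault stores a row only the first time its key value appears.
        seen.setdefault(row[key], row)
    return [seen[value] for value in sorted(seen)]


# Hypothetical forced-source rows: object 7 was observed twice.
rows = [
    {"diaObjectId": 7, "ccdVisitId": 111001},
    {"diaObjectId": 3, "ccdVisitId": 222002},
    {"diaObjectId": 7, "ccdVisitId": 333003},
]
unique_rows = first_row_per_key(rows, "diaObjectId")
print(unique_rows)
```

One consequence is that per-visit columns in the de-duplicated table (such as `expMidptMJD`, `band`, and `ccdVisitId`) describe a single visit of each object, not all of them.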

In [None]:
unique_variables = astropy.table.unique(fsodo_sources, keys='diaObjectId')[
    'diaObjectId', 'ra', 'decl', 'expMidptMJD', 'band',
    'ccdVisitId', 'visitId', 'detector']
unique_variables

### 2.4 Select one variable star
For the purposes of this tutorial, we will select one preselected source. You can choose another, but be warned that many of these sources are not true variable stars.

In [None]:
diaobjectID = 1567428592185376787
selection = unique_variables[unique_variables["diaObjectId"]==diaobjectID]
ra = selection['ra'].value[0]
dec = selection['decl'].value[0]
print('ra and dec of variable star', ra, dec)

### 2.5 Select some moments in time
To do this, we go back to the original table, which holds everything needed to plot a series of images, including the visit information.

In [None]:
source = fsodo_sources[fsodo_sources["diaObjectId"]==diaobjectID]['diaObjectId','ra','decl','ccdVisitId','visitId',
                                                         'band','psfFlux','psfFluxErr',
                                                         'expMidptMJD','detector','psfMag']
source

Create boolean masks that select rows by band.


In [None]:
plot_filter_labels = ['u', 'g', 'r', 'i', 'z', 'y']
plot_filter_colors = {'u': '#56b4e9', 'g': '#008060', 'r': '#ff4000',
                      'i': '#850000', 'z': '#6600cc', 'y': '#000000'}
plot_filter_symbols = {'u': 'o', 'g': '^', 'r': 'v', 'i': 's', 'z': '*', 'y': 'p'}

pick = {}
for band in plot_filter_labels:  # avoid shadowing the built-in filter()
    pick[band] = (source['band'] == band)

From now on, we'll only consider the r-band images.

In [None]:
# also select some key moments in time
# begin by ordering by mjd
print(type(source[pick['r']]))
select_r = source[pick['r']]
sorted_sources = select_r[select_r['expMidptMJD'].argsort()]
sorted_sources



Select a few moments in time. Keep these indices the same to observe a change in brightness, or select your own at your own risk.
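
If you do pick your own indices, it may help to confirm that each one exists in the time-sorted table before plotting. A minimal sketch, with a hypothetical helper name and row count:

```python
def validate_indices(indices, n_rows):
    """Return the indices in ascending order, raising if any is out of range."""
    out_of_range = [i for i in indices if not 0 <= i < n_rows]
    if out_of_range:
        raise IndexError(
            f"Indices {out_of_range} are outside a table of {n_rows} rows."
        )
    # Ascending indices keep the cutouts in time order for a sorted table.
    return sorted(indices)


# With 64 rows, the largest valid index is 63.
checked = validate_indices([10, 15, 25, 40, 63], n_rows=64)
print(checked)
```

In the notebook, `n_rows` would be `len(sorted_sources)`.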

In [None]:
idx_select = [10,15,25,40,63]

Show the selected moments against all dates.

In [None]:
fig = plt.figure(figsize=(6, 4))
plt.plot(sorted_sources['expMidptMJD'], sorted_sources['psfMag'],
         'k.', ms=10)
plt.plot(sorted_sources[idx_select]['expMidptMJD'],
         sorted_sources[idx_select]['psfMag'],
         'r.', ms=10, label = 'selected calexp')
plt.minorticks_on()
plt.xlabel('MJD (days)')
plt.ylabel('r')
plt.gca().invert_yaxis()
plt.legend(loc = 2)
plt.show()

### 2.6 Save images

In [None]:
def get_cutout_image(butler, ra_deg, dec_deg, visit, detector, cutoutSideLength):
    """
    Get the cutout image information from butler.
    Specifically for calexp datatype.
    This should be followed by make_calexp_fig.

    Parameters
    ----------
    butler : butler used to retrieve the calexp
    ra_deg : ra of source in degrees
    dec_deg : dec of source in degrees
    visit : visit id
    detector : detector number
    cutoutSideLength : side length of the cutout in pixels

    Returns
    -------
    Cutout image information
    """
    cutoutSize = geom.ExtentI(cutoutSideLength, cutoutSideLength)
    
    radec = geom.SpherePoint(ra_deg, dec_deg, geom.degrees)
    
    dataId = {'visit': visit, 'detector': detector}  
    calexp_wcs = butler.get('calexp.wcs', **dataId)
    
    print('calexp wcs: ', calexp_wcs)
    
    xy = geom.PointI(calexp_wcs.skyToPixel(radec))
    bbox = geom.BoxI(xy - cutoutSize // 2, cutoutSize)
    parameters = {'bbox': bbox}
    print('xy: ', xy)
    print('bbox: ', bbox)
    
    cutout_image = butler.get('calexp', parameters=parameters, **dataId)
    return cutout_image

def make_calexp_fig(cutout_image, out_name):
    """
    Create an image.
    Should be followed by remove_figure.

    Parameters
    ----------
    cutout_image : cutout_image from butler.get
    out_name : file name where you'd like to save it

    Returns
    -------
    fig : matplotlib figure containing the cutout image
    """
    # fig = plt.figure(figsize=(4, 4))
    # afw_display = afwDisplay.Display(frame=fig)
    # afw_display.scale('asinh', 'zscale')
    # afw_display.mtv(cutout_image.image)
    
#     cutout_wcs = cutout_image.getWcs()
#     radec = geom.SpherePoint(ra, dec, geom.degrees)
#     xy = geom.PointI(cutout_wcs.skyToPixel(radec))
    
#     afw_display.dot('x', xy.getX(), xy.getY(), size=1, ctype='orange')
#     plt.gca().axis('off')
#     plt.savefig(out_name)
    
    fig = plt.figure()
    plt.subplot(projection=WCS(cutout_image.getWcs().getFitsMetadata()))
    
    #print('wcs ra: ', cutout_image.getWcs().getFitsMetadata()['CRVAL1'])
    #print('wcs dec: ', cutout_image.getWcs().getFitsMetadata()['CRVAL2'])
    
    calexp_extent = (cutout_image.getBBox().beginX, cutout_image.getBBox().endX,
                 cutout_image.getBBox().beginY, cutout_image.getBBox().endY)
    im = plt.imshow(abs(cutout_image.image.array), cmap='gray', 
                extent=calexp_extent, origin='lower', norm = matplotlib.colors.LogNorm(vmin=1e1, vmax = 1e5))#, vmax=5e4))
    #im = plt.imshow(cutout_image.image.array, cmap='gray', vmin=-200.0, vmax=5000,
    #            extent=calexp_extent, origin='lower')
    plt.colorbar(location='right', anchor=(0, 0.1))
    # plt.gca().axis('off')
    plt.xlabel('Right Ascension')
    plt.ylabel('Declination')
    plt.savefig(out_name)
    
    return fig

def remove_figure(fig):
    """
    Remove a figure to reduce memory footprint.
    Parameters
    ----------
    fig: matplotlib.figure.Figure
        Figure to be removed.
    Returns
    -------
    None
    """
    # get the axes and clear their images
    for ax in fig.get_axes():
        for im in ax.get_images():
            im.remove()
    fig.clf()       # clear the figure
    plt.close(fig)  # close the figure

    gc.collect()    # call the garbage collector

In [None]:
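# main directory
batch_dir = './variable_stars_output/'
# create the image directory if it does not exist so savefig can write to it
os.makedirs(batch_dir + "images/", exist_ok=True)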

'''
star_id = diaobjectID # in sree's OG example, this was the object ID not the diaObjID, so maybe change
star_ccdid = 662532066

calexp_image = get_cutout_image(butler, 59.4814837, -37.7323315, 662532, 66, 'g', 50, datasetType='calexp')
figout = make_calexp_fig(calexp_image, 59.4814837, -37.7323315, batch_dir+"/images/"+str(star_id)+"_"+str(star_ccdid)+".png")
''' 

# 
figout_data = {"sourceId": diaobjectID}
'''
if "coord_ra" in fields_to_add:
    figout_data["coord_ra"] = stars_ra[j]
if "coord_dec" in fields_to_add:
    figout_data["coord_dec"] = stars_dec[j]
'''
cutouts = []
    
for i, idx in enumerate(idx_select):
    star_ra = sorted_sources['ra'][idx]
    star_dec = sorted_sources['decl'][idx]
    star_visitid = sorted_sources['visitId'][idx]
    star_detector = sorted_sources['detector'][idx]
    star_id = sorted_sources['diaObjectId'][idx] # WAS objectId
    star_ccdid = sorted_sources['ccdVisitId'][idx]

    calexp_image = get_cutout_image(butler,
                                    star_ra,
                                    star_dec,
                                    star_visitid,
                                    star_detector,
                                    50) 
    figout = make_calexp_fig(calexp_image,
                             batch_dir+"images/"+str(star_id)+"_"+str(star_ccdid)+".png")
    plt.show()
    remove_figure(figout)
    
    
    
    figout_data['location:image_'+str(i)] = str(star_id)+"_"+str(star_ccdid)+".png"
    figout_data['diaObjectId:image_'+str(i)] = str(star_id)
    figout_data['filename'] = str(star_id)+"_"+str(star_ccdid)+".png"
        
#cutouts.append(figout_data)

# manifest file
df_manifest = pd.DataFrame(data = figout_data, index=[0])


    
#df_manifest = pd.concat(df_final) # final manifest file with all variable stars
outfile = batch_dir+"images/manifest.csv"
df_manifest.to_csv(outfile, index=False, sep=',')


### 2.7 Display images in notebook
Do this using the image directory that you've already saved the images to: `variable_stars_output/images`
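
The saved cutouts are named `<diaObjectId>_<ccdVisitId>.png`. As a sketch of how such names could be parsed and ordered numerically (a plain string sort would place `1000005` before `999001`), using a hypothetical helper and made-up file names:

```python
import os


def parse_cutout_name(filename):
    """Return (diaObjectId, ccdVisitId) for a cutout PNG, or None otherwise."""
    stem, ext = os.path.splitext(filename)
    if ext != ".png" or "_" not in stem:
        return None
    object_id, ccd_visit_id = stem.split("_", 1)
    if not (object_id.isdigit() and ccd_visit_id.isdigit()):
        return None
    return int(object_id), int(ccd_visit_id)


# Made-up directory listing, including files that should be skipped.
names = [
    "1567428592185376787_1000005.png",
    "manifest.csv",
    "1567428592185376787_999001.png",
    ".ipynb_checkpoints",
]
parsed = sorted(p for p in map(parse_cutout_name, names) if p is not None)
print(parsed)
```

Whether ascending ccdVisitId matches time order is an assumption worth checking against `expMidptMJD`.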

In [None]:
image_dir = 'variable_stars_output/images/'
num_variable_images = 5
stars_matchid_list = [diaobjectID]

star_name = []
for id_star in stars_matchid_list:
    # collect this star's PNG files and sort them numerically by ccdVisitId
    # DOUBLE CHECK THAT THIS IS TIME ORDER
    matches = [file for file in os.listdir(image_dir)
               if file.endswith('.png') and file.split('_')[0] == str(id_star)]
    matches.sort(key=lambda name: int(name.split('_')[1].split('.')[0]))
    star_name.extend(matches)

# Okay now go through and plot each of these
fig, axs = plt.subplots(1,5, figsize = (20,20))
print('star', stars_matchid_list[0])
for j in range(num_variable_images):
    image = mpimg.imread(image_dir + star_name[j])
    axs[j].imshow(image)#, norm = matplotlib.colors.LogNorm())
    axs[j].axis('off')
plt.show()

try:
    print('star', stars_matchid_list[1])
    fig, axs = plt.subplots(1,5, figsize = (20,20))

    for j in range(num_variable_images):
        image = mpimg.imread(image_dir + star_name[j+num_variable_images])
        axs[j].imshow(image)
        axs[j].axis('off')
    plt.show()

except IndexError: # which will happen if you have only one star
    print('only one star')


The third and fifth images should be the brightest.

### A word of caution
Note that calexp images, unlike the individual visits that are combined into a deepcoadd image, have not been aligned with one another. Therefore, the pixel scale is not guaranteed to be the same from one image to the next, and the astrometry is not guaranteed to align.

## 3. Send manifest to Zooniverse

In [None]:
cutout_dir = batch_dir+"images/"
subject_set_name = "test_flipbook" 
cit_sci_pipeline.send_image_data(subject_set_name, cutout_dir)