<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250 style="padding: 10px"> 
<b>Citizen Science Notebook</b> <br>
Contact authors: Clare Higgs & Eric Rosas <br>
Last verified to run: 2023-07-06 <br>

## Table of Contents
* [Introduction](#first-bullet)
* [Set up necessary modules and log on to the Zooniverse platform](#second-bullet)
* [Make a subject set to send to Zooniverse](#third-bullet)
* [Create a manifest file](#fourth-bullet)
* [Send the data to Zooniverse](#fifth-bullet)
* [Retrieve the data](#sixth-bullet)

## Introduction <a class="anchor" id="first-bullet"></a>

<div class="alert alert-info">

This notebook is intended to guide a PI through the process of sending data from the Rubin Science Platform (RSP) to the Zooniverse and retrieving classifications from Zooniverse. We encourage PIs new to the Rubin dataset to explore the RSP tutorial notebooks and documentation ('/home/your_username/notebooks/tutorial-notebooks').

As explained in the guide, this notebook restricts the number of objects sent to the Zooniverse to 100. This limit lets you demonstrate your project prior to full approval from the EPO Data Rights Panel.
    
Support is available and questions are welcome (cscience@lsst.org).

</div>

## 1.0 Set up necessary modules and log on to the Zooniverse platform <a class="anchor" id="second-bullet"></a>



<div class="alert alert-info">

If you haven't already, [create a Zooniverse account](https://www.zooniverse.org/accounts/register) and create your project.

IMPORTANT: Your Zooniverse project must be set to "public"; a "private" project will not work. Select this setting under the "Visibility" tab (the project does not need to be set to live).

Supply your email and project slug below. 

> A "slug" is the string of your Zooniverse username and your project name without the leading forward slash, for instance: "username/project-name". [Click here for more details](https://www.zooniverse.org/talk/18/967061?comment=1898157&page=1).

</div>
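
Before logging in, it can help to check that the slug has the expected shape. The sketch below is a hypothetical helper, not part of the Rubin or Zooniverse packages, and its pattern is an assumption: two non-empty parts separated by a single forward slash, with no leading slash and no whitespace.

```python
import re

# Hypothetical helper; the pattern is an assumption about what a well-formed
# slug looks like ("username/project-name"), not an official Zooniverse rule.
_SLUG_PATTERN = re.compile(r"[^/\s]+/[^/\s]+")


def looks_like_slug(slug: str) -> bool:
    """Return True if slug has the form 'username/project-name' with no leading slash."""
    return _SLUG_PATTERN.fullmatch(slug) is not None


print(looks_like_slug("username/project-name"))   # well-formed slug
print(looks_like_slug("/username/project-name"))  # leading slash, rejected
```

A check like this only catches formatting mistakes; it does not confirm that the project exists or is public.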

In [None]:
import utils

In [None]:
email = "" # Email associated with Zooniverse account 
slug_name = "" # Do not include the leading forward-slash, see above 
%run Citizen_Science_Install.ipynb

from rubin_citsci_core_pipeline import CitSciPipeline
print("Loading and running utilities to establish a link with Zooniverse")
print("Enter your Zooniverse username followed by password below")
cit_sci_pipeline = CitSciPipeline()
cit_sci_pipeline.login_to_zooniverse(slug_name, email)

## 2.0 Make a subject set to send to Zooniverse <a class="anchor" id="third-bullet"></a>

<div class="alert alert-info">

> A subject set is a collection of data (images, plots, etc) that are shown to citizen scientists. It is also the unit of data that is sent to Zooniverse.

Here, we curate the subject set of objects to send to Zooniverse. This can be modified to create your own subject set. During the testing phase, before your project is approved by the EPO Data Rights Panel, your subject set must contain 100 objects or fewer.

This example makes a set of image cutouts of galaxies.
</div>
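
The 100-object testing limit can be enforced with a small guard before any query is run. This is a minimal sketch; `check_subject_count` and `MAX_TEST_SUBJECTS` are hypothetical names introduced here for illustration and are not part of `utils` or the pipeline package.

```python
MAX_TEST_SUBJECTS = 100  # testing-phase limit described in this notebook


def check_subject_count(requested: int, limit: int = MAX_TEST_SUBJECTS) -> int:
    """Hypothetical helper: return `requested` if it lies in 1..limit, else raise ValueError."""
    if not 1 <= requested <= limit:
        raise ValueError(
            f"Subject set size must be between 1 and {limit}, got {requested}"
        )
    return requested


number_to_send = check_subject_count(5)
print(number_to_send)
```

Failing early with a clear message is cheaper than discovering the limit after the cutouts have been generated.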

In [None]:
print('Establishing the connection to the Butler')
config = "dp02"
collection = "2.2i/runs/DP0.2"
service, butler, skymap = utils.setup_butler(config, collection)
print('Connected')

In [None]:
print('Setting the parameters for making image cutouts')
number_sources = 5  # change this to 100 for a full subject set test
use_center_coords = "62, -37"
use_radius = "1.0"

<div class="alert alert-info">
This query can be modified to select other types of sources.

For more details, see the RSP tutorial notebooks ('/home/your_username/notebooks/tutorial-notebooks').
</div>

In [None]:
print('Running the Butler query to return objects')
results = utils.run_butler_query(service, number_sources, use_center_coords, use_radius)

In [None]:
print('Preparing the table')
results_table = utils.prep_table(results, skymap)

<div class="alert alert-info">
Have a look at the table you'll use to save the cutout images.
</div>

In [None]:
results_table

## 3.0 Create a manifest file <a class="anchor" id="fourth-bullet"></a>


<div class="alert alert-info">

> A manifest file is a CSV file that is used to send all of the classification subjects to the Zooniverse. This file can be used to initiate options on the Zooniverse side. [Click here for an overview](https://about.pfe-preview.zooniverse.org/lab-how-to)

You may want to send data in addition to the image cutouts. To do so, edit the `make_manifest_and_images` utility. __Note:__ Object ID must be included.
</div>

In [None]:
print('Specify the directory that the cutouts will be output to')
batch_dir = "./cutouts/"
print(f"Make the manifest file and save both the manifest and the cutout images in this folder: {batch_dir}")
manifest = utils.make_manifest_and_images(results_table, butler, batch_dir)

<div class="alert alert-info">
Let's have a look at some of the cutout images you saved.
</div>

In [None]:
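import os
import matplotlib.pyplot as plt
from matplotlib import image as mpimg

# Display every file in batch_dir that matplotlib can read as an image
for file in os.listdir(batch_dir):
    try:
        image = mpimg.imread(os.path.join(batch_dir, file))
    except Exception:
        # Skip files that are not readable images (e.g. the manifest CSV)
        continue
    plt.title(file)
    plt.imshow(image)
    plt.axis('off')
    plt.show()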

### Option 1: Write the manifest file to the filesystem automatically

<div class="alert alert-info">

The below cell writes the `manifest.csv` file to the filesystem, which will be used by Zooniverse.
</div> 


In [None]:
manifest_path = cit_sci_pipeline.write_manifest_file(manifest, batch_dir)
print("The manifest CSV file can be found at the following relative path:")
print(manifest_path)

### Option 2: Specify the path to your own manifest file

<div class="alert alert-info">
If you prefer, supply the manifest CSV file manually. To do so, make sure that it is named `manifest.csv` and placed in the `./cutouts/` folder (or whatever folder you set the `batch_dir` variable to).
</div>
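
Placing your own file can be sketched with the standard library. `place_manifest` is a hypothetical helper introduced here for illustration; its default file name follows the `manifest.csv` requirement stated in this notebook, and the source path in the usage comment is a placeholder.

```python
import shutil
from pathlib import Path


def place_manifest(source_csv: str, batch_dir: str,
                   required_name: str = "manifest.csv") -> Path:
    """Hypothetical helper: copy `source_csv` into `batch_dir` under `required_name`."""
    target_dir = Path(batch_dir)
    # Create the batch directory if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / required_name
    shutil.copyfile(source_csv, target)
    return target


# Example usage (placeholder path, uncomment and adapt):
# manifest_path = place_manifest("my_table.csv", "./cutouts/")
```

Copying rather than moving leaves the original file untouched if the upload needs to be repeated.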


### Option 3: Make your own manifest file


<div class="alert alert-info">
    
You are welcome to edit the automatically created manifest file (option 1) or create a new manifest file (option 2).

The manifest file _must_ abide by [RFC 4180](https://datatracker.ietf.org/doc/html/rfc4180.html), as the backend service that parses the manifest file expects this format. A column may have no values, but every row _must_ still mark the empty value with its separating comma so that each row has the same number of fields as the header. For example:

Valid syntax for empty column:
```
column1,column2,empty_column,column4
1,1,,4
1,1,,4
1,1,,4
```

**Important**: The manifest file must be named `manifest.csv` in order for the processing on the backend to work correctly.
</div>
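
One way to produce a manifest in this shape is Python's built-in `csv` module, which writes `None` as an empty field and, with its default dialect, ends each record with CRLF as RFC 4180 describes. The rows below are made up and reuse the column names from the example above; this is a sketch, not the pipeline's own writer.

```python
import csv
import io

# Made-up rows for illustration; column names mirror the example above.
header = ["column1", "column2", "empty_column", "column4"]
rows = [
    {"column1": 1, "column2": 1, "empty_column": None, "column4": 4},
    {"column1": 1, "column2": 1, "empty_column": None, "column4": 4},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=header)  # default dialect uses CRLF line endings
writer.writeheader()
writer.writerows(rows)  # None is written as an empty field between two commas
manifest_text = buffer.getvalue()

# Sanity check: every record has exactly as many fields as the header.
parsed = list(csv.reader(io.StringIO(manifest_text, newline="")))
assert all(len(record) == len(header) for record in parsed)

print(manifest_text)
```

When writing to disk instead of a string buffer, open the file with `newline=""` so that Python does not alter the line endings the `csv` module produces.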

## 4.0 Send the data to Zooniverse <a class="anchor" id="fifth-bullet"></a>

<div class="alert alert-info">

This cell sends one subject set. If a subject set already exists on Zooniverse, the cell will notify you and fail. To send more data, delete what is on the Zooniverse and send again. You *may* get a warning that your set still exists, or a "Could not find subject_set with id=' '" error. If so, wait about 10 minutes and try again, as Zooniverse takes some time to process your changes. You may also have to re-run the "Look up your project" cell. Do not run the cell below multiple times; the upload will fail if multiple runs are attempted.

The upload has worked if you get a notification and an email saying your data has been sent.
</div>


<div class="alert alert-info">
Name your subject set as it will appear on the Zooniverse. Try not to reuse names.
</div>
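
A simple way to avoid reusing names is to append a timestamp to a descriptive prefix. The helper below is a hypothetical sketch, not part of the pipeline; the prefix and the timestamp format are arbitrary choices.

```python
from datetime import datetime, timezone


def make_subject_set_name(prefix: str) -> str:
    """Hypothetical helper: append a UTC timestamp so each upload gets a distinct name."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}-{stamp}"


print(make_subject_set_name("galaxy-cutouts"))
```

The result can be assigned to `subject_set_name` in the cell below; names generated within the same second would still collide.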

In [None]:
print('Send the data to Zooniverse')
subject_set_name = ""
cit_sci_pipeline.send_image_data(subject_set_name, batch_dir, manifest)

## 5.0 Retrieve the data <a class="anchor" id="sixth-bullet"></a>

<div class="alert alert-info">
There are two ways to do this:

1) By directly going to your Zooniverse project and downloading the output csv files found on the 'Data Exports' tab. Click the 'Request new classification report' button and per Zooniverse: "Please note some exports may take a long time to process. We will email you when they are ready. You can only request one of each type of data export within a 24-hour time period."

2) Programmatically (as we show below):
</div>

In [None]:
# This project_id is found on Zooniverse by selecting 'build a project' and then selecting the project
# You don't need to be the project owner.
print('Retrieve the classifications from Zooniverse')
project_id = 19539
df = retrieve_data(project_id)
df
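
If you download the classification report manually instead, the CSV can be summarized with the standard library. The sketch below assumes the export has `subject_ids` and `annotations` columns, that `annotations` holds a JSON list of task answers, and that each answer's `value` is a single string (a single-answer question task). The sample rows are made up and real exports contain more columns, so check these assumptions against your own file.

```python
import csv
import io
import json
from collections import Counter

# Made-up sample standing in for a downloaded export file.
sample_export = io.StringIO(
    'classification_id,subject_ids,annotations\r\n'
    '1,101,"[{""task"": ""T0"", ""value"": ""Spiral""}]"\r\n'
    '2,101,"[{""task"": ""T0"", ""value"": ""Elliptical""}]"\r\n'
    '3,102,"[{""task"": ""T0"", ""value"": ""Spiral""}]"\r\n',
    newline="",
)

# Count how often each answer was given for each subject.
answer_counts = Counter()
for row in csv.DictReader(sample_export):
    for annotation in json.loads(row["annotations"]):
        answer_counts[(row["subject_ids"], annotation["value"])] += 1

for (subject_id, answer), count in sorted(answer_counts.items()):
    print(subject_id, answer, count)
```

To use a real export, replace `sample_export` with a file opened via `open(path, newline="")`.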