<img align="left" src = https://project.lsst.org/sites/default/files/Rubin-O-Logo_0.png width=250 style="padding: 10px"> 
<b>Citizen Science Notebook</b> <br>
Contact author: Clare Higgs & Eric Rosas <br>
Last verified to run: 2022-10-20 <br>
LSST Science Pipelines version: Weekly 2022_40 <br>
Container size: medium <br>


## 1.0 Introduction
This notebook is intended to guide a PI through the process of sending data from the Rubin Science Platform (RSP) to the Zooniverse.
A detailed guide to Citizen Science projects, outlining the process, requirements and support available is here: (*link to citscipiguide*)
The data sent can be curated on the RSP as necessary and can take many forms. Here, we include an example of sending PNG cutout images.
We encourage PIs new to the Rubin dataset to explore the tutorial notebooks and documentation.

As explained in the guide, this notebook will restrict the number of objects sent to the Zooniverse to 100. This limit is intended to let you demonstrate your project prior to full approval from the EPO Data Rights Panel.
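
As a rough illustration of the 100-object cap, a manifest held as a list of dictionaries could be truncated before it is sent. The names `MAX_OBJECTS` and `limit_manifest` are hypothetical and are not part of this notebook; this is only a sketch of the idea, not a description of how the limit is actually enforced.

```python
MAX_OBJECTS = 100  # hypothetical constant mirroring the limit described above


def limit_manifest(manifest, max_objects=MAX_OBJECTS):
    """Return a new list holding at most max_objects manifest rows."""
    return list(manifest[:max_objects])


# Example: a made-up manifest with 250 rows is cut down to the cap.
example_manifest = [{"filename": f"cutout-{i}.png"} for i in range(250)]
limited = limit_manifest(example_manifest)
print(len(limited))
```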

Support is available and questions are welcome - (*some email/link etc*)


**DEBUG VERSION: note that this version of the notebook contains additional debugging, and the first cell will need to be run once.**

### Terminal Prep Work
The following cell will run the necessary terminal commands that make this notebook possible.

**This cell only needs to be run the first time this notebook is run and can be skipped afterwards!**
**This cell should be incorporated into the RSP and will not be part of the final notebook.**

In [1]:
# %load_ext pycodestyle_magic
# %flake8_on
# Install panoptes client package to dependencies and create necessary folders
!mkdir -p project/citizen-science/astro-cutouts/
print("Installing external dependencies...")
!python -m pip install google-cloud-storage --quiet
!pip install panoptes_client --quiet
print("Done installing external dependencies!")

# "!export" only affects a temporary subshell, so set the variables on the kernel process instead
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/opt/lsst/software/jupyterlab/butler-secret/butler-gcs-idf-creds.json"

# Temporary debugging, won't affect anything with this notebook, instead helps Zooniverse developers troubleshoot issues on their side
os.environ["PANOPTES_DEBUG"] = "1"

Installing external dependencies...
Done installing external dependencies!


## 2.0 Create a Zooniverse Account
If you haven't already, [create a Zooniverse account](https://www.zooniverse.org/) and [create your project](https://www.zooniverse.org/lab). Your project must be set to "public". To set your project to public, select the "Visibility" tab.
Note that you will need to enter your username, password, and [project slug](https://www.zooniverse.org/talk/18/967061?comment=1898157&page=1) below.

After creating your account and project, return to this notebook.

### Log in to Zooniverse
Now that you have a Zooniverse account, log in to the Zooniverse (Panoptes) client.

In [2]:
import panoptes_client
client = panoptes_client.Panoptes.connect(login="interactive")
print("You now are logged in to the Zooniverse platform.")

Enter your Zooniverse credentials...


Username:  sreevani
 ········


You now are logged in to the Zooniverse platform.


### Look Up Your Zooniverse Project
Supply your email and project slug below.
(If you don't know what a "slug" is in this context, see: https://www.zooniverse.org/talk/18/967061?comment=1898157&page=1)
Do not include the leading forward slash.

IMPORTANT: Your Zooniverse project must be set to "public"; a "private" project will not work. Select this setting under the "Visibility" tab (it does not need to be set to live).
The following code will not work if you have not authenticated in the cell titled "Log in to Zooniverse".
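
To illustrate the "no leading forward slash" requirement, here is a minimal sketch of a helper that tidies a slug before it is used. The function name `normalize_slug` and the example value are hypothetical, and the check assumes the slug has the form `owner/project-name`.

```python
def normalize_slug(raw):
    """Strip whitespace and any leading or trailing forward slashes from a slug."""
    slug = raw.strip().strip("/")
    # Assumption: a slug looks like "owner/project-name", i.e. exactly one slash.
    if slug.count("/") != 1:
        raise ValueError(f"Expected a slug like 'owner/project-name', got: {raw!r}")
    return slug


# Example with a made-up slug that was pasted with a leading slash.
print(normalize_slug("/example-user/example-project"))
```

The cleaned value could then be assigned to the slug variable used in the next cell.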

In [3]:
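from panoptes_client import Project, SubjectSet, Classification

# Fill in both values before running this cell.
email = ""     # the email address that should receive notifications about the transfer
slugName = ""  # your project slug, without the leading forward slash
project = Project.find(slug=slugName)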

### Run the below cell to activate the Rubin Citizen Science SDK
**just run this cell**

**this cell should be gone in the final version**

In [3]:
# HiPS astrocutout libraries
from astroquery.hips2fits import hips2fits
from IPython.display import display
import matplotlib.pyplot as plt
from matplotlib.colors import Colormap
import astropy.units as u
from astropy.coordinates import Longitude, Latitude, Angle
import csv

# GCP libraries
from google.cloud import storage

# Import organizational libraries
import uuid, os, shutil, json, logging, urllib.request
from datetime import datetime, timezone, timedelta

# Prep work
global email
vendor_batch_id = 0
_HIPS_CUTOUTS = "hips_cutouts"
project_id = project.id
guid = ""
cutouts_dir = ""
manifest_url = ""
edc_response = ""
step = 0

def clean_up_unused_subject_set():
    global client, vendor_batch_id
    log_step("Cleaning up unused subject set on the Zooniverse platform, vendor_batch_id : " + str(vendor_batch_id))
    
    try:
        subject_set = SubjectSet.find(str(vendor_batch_id))

        if subject_set.id == vendor_batch_id:
            subject_set.delete()

    except Exception:
        display(f"** Warning: Failed to find the subject set with id: {vendor_batch_id} - perhaps it's been deleted?")
    return

def send_zooniverse_manifest():
    global vendor_batch_id, manifest_url, client
    log_step("Sending the manifest URL to Zooniverse")
    display("** Information: subject_set.id: " + str(vendor_batch_id) + "; manifest: " + manifest_url)

    payload = {"subject_set_imports": {"source_url": manifest_url, "links": {"subject_set": str(vendor_batch_id)}}}
    json_response, etag = client.post(path='/subject_set_imports', json=payload)
    return

def create_new_subject_set(name):
    global project, panoptes_client, vendor_batch_id
    log_step("Creating a new Zooniverse subject set")
    
    # Create a new subject set
    subject_set = panoptes_client.SubjectSet()
    subject_set.links.project = project

    # Give the subject set a display name (that will only be visible to you on the Zooniverse platform)
    subject_set.display_name = name 
    subject_set.save()
    project.reload()
    vendor_batch_id = subject_set.id
    return vendor_batch_id

def check_status():
    global guid
    status_uri = "https://rsp-data-exporter-dot-skyviewer.uw.r.appspot.com/citizen-science-ingest-status?guid=" + guid
    raw_response = urllib.request.urlopen(status_uri).read()
    response = raw_response.decode('UTF-8')
    return json.loads(response)

def download_batch_metadata():
    global guid
    project_id_str = str(project_id)
    dl_response = "https://rsp-data-exporter-dot-skyviewer.uw.r.appspot.com/active-batch-metadata?vendor_project_id=" + project_id_str
    raw_response = urllib.request.urlopen(dl_response).read()
    response = raw_response.decode('UTF-8')
    return json.loads(response)


# Sends a batch of data to Zooniverse, after checking that no subject set is still active for the project
def send_data(subject_set_name, batch_dir, cutout_data = None):
    global manifest_url, edc_response, step
    step = 0
    log_step("Checking batch status")
    if has_active_batch():
        raise CitizenScienceError("You cannot send another batch of data while a subject set is still active on the Zooniverse platform - you can only send a new batch of data if all subject sets associated to a project have been completed.")
    if __cit_sci_data_type == _HIPS_CUTOUTS:
        zip_path = zip_hips_cutouts(batch_dir)
        upload_hips_cutouts(zip_path)
        subject_set_id = create_new_subject_set(subject_set_name)
        
        edc_response = alert_edc_of_new_citsci_data(subject_set_id)
        if edc_response is None:
            edc_response = { "status": "error", "messages": "An error occurred while processing the data transfer process upload" }
        else:
            edc_response = json.loads(edc_response)

    else:
        # Butler data transfer is not available yet: send_butler_data_to_edc() is commented out below
        raise CitizenScienceError("Only HiPS cutout data can be sent with this version of the notebook.")
    
    if edc_response["status"] == "success":
        manifest_url = edc_response["manifest_url"]
        if len(edc_response["messages"]) > 0:
            display("** Additional information:")
            for message in edc_response["messages"]:
                logging.warning(message)
                # display("    ** " + message)
        else:
            log_step("Success! The URL to the manifest file can be found here:")
            display(manifest_url)
    else:
        clean_up_unused_subject_set()
        logging.error("** One or more errors occurred during the last step **")
        logging.error(edc_response["messages"])
        logging.error(f"Email address: {email}")
        logging.error(f"Timestamp: {str(datetime.now(timezone(-timedelta(hours=7))))}")
        # for message in edc_response["messages"]:
        #     display("        ** " + message)
        return

    send_zooniverse_manifest()
    log_step("Transfer process complete, but further processing is required on the Zooniverse platform and you will receive an email at " + email)
    return

def write_metadata_file(manifest, batch_dir):
    manifest_filename = 'metadata.csv'
    # os.path.join works whether or not batch_dir ends with a slash
    manifest_path = os.path.join(batch_dir, manifest_filename)
    with open(manifest_path, 'w', newline='') as csvfile:
        fieldnames = list(manifest[0].keys())
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

        for cutout in manifest:
            writer.writerow(cutout)

    return manifest_path

def zip_hips_cutouts(batch_dir):
    global guid
    guid = str(uuid.uuid4())
    log_step("Zipping up all the astro cutouts - this can take a few minutes with large data sets, but unlikely more than 10 minutes.")
    shutil.make_archive("./" + guid, 'zip', batch_dir)
    return ["./" + guid + '.zip', guid + '.zip']

def upload_hips_cutouts(zip_path):
    log_step("Uploading the citizen science data")
    bucket_name = "citizen-science-data"
    destination_blob_name = zip_path[1]
    source_file_name = zip_path[0]

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)
    return

def alert_edc_of_new_citsci_data(vendor_batch_id):
    global guid
    project_id_str = str(project_id)
    log_step("Notifying the Rubin EPO Data Center of the new data, which will finish processing of the data and notify Zooniverse")
    
    try:
        edc_endpoint = "https://rsp-data-exporter-dot-skyviewer.uw.r.appspot.com/citizen-science-bucket-ingest?email=" + email + "&vendor_project_id=" + project_id_str + "&guid=" + guid + "&vendor_batch_id=" + str(vendor_batch_id) + "&debug=True"
        response = urllib.request.urlopen(edc_endpoint).read()
        manifestUrl = response.decode('UTF-8')
        return manifestUrl
    except Exception as e:
        clean_up_unused_subject_set()
        return None

# def send_butler_data_to_edc():
#     log_step("Notifying the Rubin EPO Data Center of the new data, which will finish processing of the data and notify Zooniverse")
#     edcEndpoint = "https://rsp-data-exporter-e3g4rcii3q-uc.a.run.app/citizen-science-butler-ingest?email=" + email + "&collection=" + datasetId + "&sourceId=" + sourceId + "&vendorProjectId=" + str(projectId) + "&vendor_batch_id=" + str(vendor_batch_id)
#     log_step('Processing data for Zooniverse, this may take up to a few minutes.')
#     response = urllib.request.urlopen(edcEndpoint).read()
#     manifestUrl = response.decode('UTF-8')
#     return

def has_active_batch():
    active_batch = False
    for subject_set in project.links.subject_sets:
        try:
            for completeness_percent in list(subject_set.completeness.values()):
                if completeness_percent == 1.0:
                    active_batch = True
                    break
            if active_batch:
                break
        except Exception:
            display("    ** Warning! - The Zooniverse client is throwing an error about a missing subject set, this can likely safely be ignored.")
    return active_batch

def log_step(msg):
    global step
    step += 1
    display(str(step) + ". " + msg)
    return

# Custom error handling for this notebook
class CitizenScienceError(Exception):
   
    # Constructor or Initializer
    def __init__(self, value):
        self.value = value
   
    # __str__ is to print() the value
    def __str__(self):
        return(repr(self.value))
    
print("Loaded Citizen Science SDK")              

NameError: name 'project' is not defined
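
The SDK cell above builds its request URLs by string concatenation. As a sketch of an alternative, `urllib.parse.urlencode` from the standard library percent-encodes values such as an email address before they are placed in a query string. The helper name `build_query_url`, the example host, and the parameter values below are hypothetical and only serve to illustrate the idea.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(base_url, **params):
    """Append percent-encoded query parameters to base_url."""
    return base_url + "?" + urlencode(params)


# Made-up endpoint and values; characters such as "@" and "+" are escaped.
url = build_query_url(
    "https://example.org/citizen-science-bucket-ingest",
    email="pi+test@example.org",
    vendor_project_id=1234,
    guid="abc",
)
print(url)

# Decoding the query string recovers the original values.
print(parse_qs(urlsplit(url).query)["email"])
```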