<img align="left" src="https://panoptes-uploads.zooniverse.org/project_avatar/86c23ca7-bbaa-4e84-8d8a-876819551431.png" type="image/png" height=100 width=100>
</img>


<h1 align="right">KSO Tutorials #12: Analyse Zooniverse classifications</h1>
<h3 align="right">Written by @jannesgg and @vykanton</h3>
<h5 align="right">Last updated: Feb 17th, 2022</h5>

# Set up and requirements

### Import Python packages

In [1]:
# Set the directory of the libraries
import sys
sys.path.append('..')

# Set to display dataframes as interactive tables
from itables import init_notebook_mode
init_notebook_mode(all_interactive=True)

# Import required modules
import kso_utils.tutorials_utils as t_utils
import kso_utils.t12_utils as t12

print("Packages loaded successfully")

Packages loaded successfully


### Choose your project

In [2]:
project = t_utils.choose_project()

Dropdown(description='Project:', options=('Koster_Seafloor_Obs', 'Spyfish_Aotearoa', 'SGU'), value='Koster_Sea…

### Set up initial information

In [3]:
db_info_dict, zoo_project, zoo_info_dict = t12.setup_initial_info(project.value)

Enter the key id for the aws server········
Enter the secret access key for the aws server········


..\db_starter\db_csv_info\sites_buv_doc.csv: 100%|██████████| 123k/123k [00:00<00:00, 430kB/s]
..\db_starter\db_csv_info\movies_buv_doc.csv: 100%|██████████| 14.7k/14.7k [00:00<00:00, 94.0kB/s]
..\db_starter\db_csv_info\species_buv_doc.csv: 100%|██████████| 7.45k/7.45k [00:00<00:00, 59.0kB/s]
..\db_starter\db_csv_info\surveys_buv_doc.csv: 100%|██████████| 1.78k/1.78k [00:00<00:00, 12.3kB/s]
..\db_starter\db_csv_info\choices_buv.csv: 100%|██████████| 3.29k/3.29k [00:00<00:00, 21.4kB/s]


Updated sites
Updated movies
Updated species
Enter your Zooniverse user········
Enter your Zooniverse password········
Retrieving subjects from Zooniverse
subjects were retrieved successfully
Retrieving workflows from Zooniverse
workflows were retrieved successfully
Retrieving classifications from Zooniverse
classifications were retrieved successfully
Updated subjects
The database has a total of 216 frame subjects and 4993 clip subjects have been updated


### Step 1: Specify the Zooniverse workflow id and version of interest

**Note:** A manual export in Zooniverse is required to get the most up-to-date classifications here.

Make sure your workflows in Zooniverse have different names to avoid issues when selecting the workflow id.

In [4]:
# Display a selectable list of workflow names and a list of versions of the workflow of interest
workflows_df = zoo_info_dict["workflows"]
wm = t12.WidgetMaker(workflows_df)
wm

WidgetMaker(children=(IntText(value=0, description='Number of workflows:', style=DescriptionStyle(description_…

In [None]:
# Retrieve classifications from the workflow of interest
class_df = t12.get_classifications(wm.checks,
                                   workflows_df,
                                   wm.checks['Subject type: #0'], 
                                   zoo_info_dict["classifications"], 
                                   db_info_dict["db_path"])
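
To give a feel for what retrieving the classifications of one workflow involves, here is a minimal sketch of filtering a classification table by workflow id and minimum version with pandas. It is not the `kso_utils` implementation. The column names `workflow_id` and `workflow_version` follow the standard Zooniverse classification export, while the helper `filter_by_workflow` and the toy data are hypothetical.

```python
import pandas as pd


def filter_by_workflow(classifications, workflow_id, min_version):
    """Keep rows from one workflow at or above a given version (hypothetical helper)."""
    # Simplifying assumption: treat the version string as a decimal number
    versions = classifications["workflow_version"].astype(float)
    mask = (classifications["workflow_id"] == workflow_id) & (versions >= min_version)
    return classifications[mask].reset_index(drop=True)


# Toy data standing in for a Zooniverse classification export
toy_classifications = pd.DataFrame(
    {
        "classification_id": [1, 2, 3, 4],
        "workflow_id": [101, 101, 202, 101],
        "workflow_version": ["1.0", "2.5", "2.5", "3.1"],
    }
)

filtered = filter_by_workflow(toy_classifications, workflow_id=101, min_version=2.0)
print(filtered)
```

Filtering on both the id and the version matters because a workflow can change between versions, so classifications from different versions may not be comparable.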

### Step 2: Aggregate classifications received on the workflow of interest

In [None]:
# Specify the agreement threshold required among citizen scientists
agg_params = t12.choose_agg_parameters(wm.checks['Subject type: #0'])

In [None]:
agg_class_df, raw_class_df = t12.aggregrate_classifications(class_df, 
                                                            wm.checks['Subject type: #0'], 
                                                            project.value, 
                                                            agg_params)
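
As an illustration of what aggregating by agreement can mean, the sketch below keeps a label for a subject only when a large enough share of the users who saw that subject chose it. This is an assumption about the general technique, not a description of what `kso_utils` does internally. The helper `aggregate_by_agreement`, its parameters `agreement_threshold` and `min_users`, and the toy data are all hypothetical; `user_name` is taken from the standard Zooniverse export.

```python
import pandas as pd


def aggregate_by_agreement(raw, agreement_threshold, min_users):
    """Keep, per subject, the labels chosen by a large enough share of users (hypothetical helper)."""
    # Number of distinct users who classified each subject
    users_per_subject = (
        raw.groupby("subject_ids")["user_name"].nunique().rename("n_users").reset_index()
    )
    # Number of distinct users who chose each label for each subject
    votes = (
        raw.groupby(["subject_ids", "label"])["user_name"]
        .nunique()
        .rename("n_votes")
        .reset_index()
    )
    votes = votes.merge(users_per_subject, on="subject_ids")
    # Share of the subject's users who agreed on this label
    votes["agreement"] = votes["n_votes"] / votes["n_users"]
    keep = (votes["n_users"] >= min_users) & (votes["agreement"] >= agreement_threshold)
    return votes[keep].reset_index(drop=True)


# Toy raw classifications: one row per user, subject, and label
toy_raw = pd.DataFrame(
    {
        "subject_ids": [1, 1, 1, 2, 2, 2, 3],
        "user_name": ["a", "b", "c", "a", "b", "c", "a"],
        "label": ["cod", "cod", "crab", "crab", "cod", "empty", "cod"],
    }
)

toy_agg = aggregate_by_agreement(toy_raw, agreement_threshold=0.6, min_users=2)
print(toy_agg)
```

In this toy data only subject 1 survives: two of its three users chose "cod". Subject 2 has no label that reaches the threshold, and subject 3 was seen by too few users.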

### Step 3: Summarise the number of classifications based on the agreement specified

In [None]:
agg_class_df.groupby("label")["subject_ids"].agg("count")
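
The summary above is a plain pandas group-and-count. The following sketch shows the same call on a small made-up table, assuming one row per aggregated subject and label; the values are invented for illustration and are not results from any project.

```python
import pandas as pd

# Toy stand-in for agg_class_df (hypothetical values)
toy_agg_class_df = pd.DataFrame(
    {
        "subject_ids": [10, 11, 12, 13],
        "label": ["cod", "crab", "cod", "empty"],
    }
)

# Count the non-null subject ids within each label group
counts = toy_agg_class_df.groupby("label")["subject_ids"].agg("count")
print(counts)
```

Note that `count` tallies rows rather than unique subjects, so a subject that kept more than one label would be counted once under each of them.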

### Step 4: Display the aggregated classifications in a table

In [None]:
# Display the dataframe as a table
t12.launch_table(agg_class_df, wm.checks['Subject type: #0'])

### Step 5: Use the subject explorer widget to visualise subjects and their aggregated classifications

In [None]:
# Launch the subject viewer
t12.launch_viewer(agg_class_df, wm.checks['Subject type: #0'])

### Step 6: Use the subject explorer widget to get more information about specific subjects and their "raw" classifications

In [None]:
# Launch the classifications_per_subject explorer
t12.explore_classifications_per_subject(raw_class_df, wm.checks['Subject type: #0'])

In [None]:
# END