<img align="left" src="https://panoptes-uploads.zooniverse.org/project_avatar/86c23ca7-bbaa-4e84-8d8a-876819551431.png" type="image/png" height=100 width=100>
</img>
<h1 align="right">KSO Tutorials #5: Add new frames to a Zooniverse workflow</h1>
<h3 align="right">Written by @jannesgg and @vykanton</h3>
<h5 align="right">Last updated: Dec 7th, 2021</h5>

# Set up and requirements

### Import Python packages

In [54]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [32]:
# Set the directory of the libraries
import sys
sys.path.append('..')

# Set to display dataframes as interactive tables
from itables import init_notebook_mode
init_notebook_mode(all_interactive=True)

# Import required modules
import kso_utils.tutorials_utils as t_utils
import kso_utils.server_utils as s_utils
import kso_utils.t5_utils as t5
import kso_utils.zooniverse_utils as zoo

print("Packages loaded successfully")

Packages loaded successfully


### Choose your project

In [3]:
project = t_utils.choose_project()

Dropdown(description='Project:', options=('Koster_Seafloor_Obs', 'Spyfish_Aotearoa', 'SGU'), value='Koster_Sea…

### Initiate the SQL database and the Zooniverse project

In [150]:
# Initiate db
db_info_dict = t_utils.initiate_db(project.value)

Enter your username for SNIC server········
Enter your password for SNIC server········
Updated sites
Updated movies
Updated species


In [151]:
# Connect to Zooniverse project
zoo_project = t_utils.connect_zoo_project(project.value)

Enter your Zooniverse user········
Enter your Zooniverse password········


# Retrieve info about Zooniverse clips

In [152]:
zoo_info_dict = t_utils.retrieve__populate_zoo_info(project_name = project.value, 
                                                    db_info_dict = db_info_dict,
                                                    zoo_project = zoo_project,
                                                    zoo_info = ["subjects", "classifications", "workflows"])

Retrieving subjects from Zooniverse
subjects were retrieved successfully
Retrieving classifications from Zooniverse
classifications were retrieved successfully
Retrieving workflows from Zooniverse
workflows were retrieved successfully
Updated subjects
The database has a total of 2342 frame subjects and 7362 clip subjects have been updated


# Retrieve frames

## Select the species of interest

In [74]:
# Specify the species of interest
species_i = t5.choose_species(db_info_dict["db_path"])

SelectMultiple(description='Species', index=(0,), options=('Angular crab', 'Ascidians (any species)', 'Bivalve…

### Create the frames

In [159]:
tuple(species_i.value)

('Deep water coral', 'Deeplet sea anemone')
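
The selection widget returns the chosen species as a tuple of names, while the aggregated output further down refers to species by a numeric `species_id`. The following is a minimal sketch of how a tuple of names could be mapped to ids with a parameterised SQLite query. The `species` table layout, its `label` column and the id values are assumptions made for illustration; they are not taken from the KSO database schema or from `kso_utils`.

```python
import sqlite3

# Stand-in database: table layout and ids are hypothetical
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE species (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO species (id, label) VALUES (?, ?)",
    [(1, "Angular crab"), (5, "Deep water coral"), (6, "Deeplet sea anemone")],
)

# Tuple of names, in the same shape as tuple(species_i.value)
selected = ("Deep water coral", "Deeplet sea anemone")

# One "?" placeholder per selected name keeps the query safely parameterised
placeholders = ", ".join("?" for _ in selected)
query = f"SELECT id, label FROM species WHERE label IN ({placeholders}) ORDER BY id"
species_ids = [row[0] for row in conn.execute(query, selected)]

print(species_ids)
```

Building the placeholder string from the length of the tuple avoids pasting the names into the SQL text, so names containing quotes or parentheses do not break the query.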

In [209]:
frames_to_upload_df = t5.get_frames(species_ids = species_i.value, 
                                    db_path = db_info_dict["db_path"],
                                    zoo_info_dict = zoo_info_dict,
                                    project_name = project.value,
                                    n_frames = 10)

There are 2747 classifications out of 54324 missing subject info. Maybe the subjects have been removed from Zooniverse?
Zooniverse classifications have been retrieved
Aggregrating the classifications
1878 classifications aggregated out of 5202 unique subjects available
1879
1825
UNIQUE constraint failed: agg_annotations_clip.species_id, agg_annotations_clip.subject_id
Updated agg_annotations_clip
   subject_id  first_seen  species_id  clip_start_time  movie_id  \
0    38166823         1.5           5                0        44   
1    38166827         0.0           5             1440        44   
2    38166828         0.0           5             1620        44   
3    38166830         3.0           5             1980        44   
4    38166833         3.0           5             2160        44   

   first_seen_movie  id                                              fpath  \
0               1.5  44  /cephyr/NOBACKUP/groups/snic2021-6-9/koster_mo...   
1            1440.0  44  /cephyr/NO
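
The `UNIQUE constraint failed` message in the output above suggests that some aggregated rows were inserted into a table that already held the same `(species_id, subject_id)` pair, which can happen when the cell is run more than once. Below is a minimal sketch of how that error arises in SQLite and one way to tolerate it. The table definition is a simplified assumption rather than the actual KSO schema, and `INSERT OR IGNORE` is shown as one option, not as what `kso_utils` does.

```python
import sqlite3

# In-memory database with a simplified, assumed version of the table
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE agg_annotations_clip (
        species_id INTEGER,
        subject_id INTEGER,
        first_seen REAL,
        UNIQUE (species_id, subject_id)
    )
    """
)

row = (5, 38166823, 1.5)
conn.execute("INSERT INTO agg_annotations_clip VALUES (?, ?, ?)", row)

# Inserting the same (species_id, subject_id) pair again violates the constraint
try:
    conn.execute("INSERT INTO agg_annotations_clip VALUES (?, ?, ?)", row)
    error_message = None
except sqlite3.IntegrityError as err:
    error_message = str(err)

# INSERT OR IGNORE skips rows that would violate the constraint
conn.execute("INSERT OR IGNORE INTO agg_annotations_clip VALUES (?, ?, ?)", row)
n_rows = conn.execute("SELECT COUNT(*) FROM agg_annotations_clip").fetchone()[0]

print(error_message)
print(n_rows)
```

Whether skipping duplicates is appropriate depends on whether the newer aggregation should overwrite the stored one; if it should, an update of the existing row would be needed instead.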

In [210]:
frames_to_upload_df.head()

first_seen,species_id,clip_start_time,movie_id,first_seen_movie,id,fpath,fps,exists,frame_number
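
In the rows printed earlier, `first_seen_movie` equals `clip_start_time + first_seen` (for example, 1440 + 0.0 = 1440.0). The sketch below shows how a `frame_number` could then be derived from that time and the movie's `fps`. Rounding down to a whole frame is an assumption about how `t5.get_frames` behaves, and the `fps` value is hypothetical.

```python
import math


def frame_number_in_movie(clip_start_time, first_seen, fps):
    """Return (first_seen_movie, frame_number) for one aggregated clip.

    clip_start_time: second of the movie at which the clip starts
    first_seen: second of the clip at which the species is first seen
    fps: frames per second of the movie
    """
    first_seen_movie = clip_start_time + first_seen
    # Assumption: the frame index is the elapsed time times fps, rounded down
    frame_number = math.floor(first_seen_movie * fps)
    return first_seen_movie, frame_number


# (clip_start_time, first_seen) pairs taken from the rows printed above
rows = [(0, 1.5), (1440, 0.0), (1620, 0.0), (1980, 3.0)]
fps = 25  # hypothetical frame rate

results = [frame_number_in_movie(start, seen, fps) for start, seen in rows]
for (start, seen), (seen_movie, frame) in zip(rows, results):
    print(start, seen, seen_movie, frame)
```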


## List available frames to upload to Zooniverse

In [33]:
# Specify the server to connect to
server_i = s_utils.connect_to_server(project.value)

In [35]:
server_i

{}

In [None]:
# Check availability of movies that correspond to the aggregated clips
t_utils.connect_server(server_i.value)

In [None]:
# Specify how many frames per clip


In [None]:
# Set the subject_set to upload the frames to


### Preview the frames

In [211]:
# Compare the original and modified frames
t5.compare_frames(df = frames_to_upload_df)

Dropdown(description='Select original frame:', layout=Layout(width='50%'), options=('/cephyr/NOBACKUP/groups/s…

Output()

### Set Zooniverse metadata

In [None]:
upload_to_zoo, sitename, created_on = t5.set_zoo_metadata(df = frames_to_upload_df,
                                                          project_name = project.value)

#upload_to_zoo = upload_to_zoo.fillna("unknown")



### Upload frames to Zooniverse

You may receive a file-size error if the clips exceed the recommended limit for Zooniverse uploads. In that case, we recommend shortening the clips until the file size is within the limit.
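
To catch file-size problems before the upload starts, the files can be checked locally first. This is a minimal sketch using only the standard library. The helper `find_oversized_files` and the `limit` value are hypothetical; the actual limit should be taken from the Zooniverse documentation for your project.

```python
import os
import tempfile


def find_oversized_files(paths, max_bytes):
    """Return the paths whose size on disk exceeds max_bytes."""
    return [p for p in paths if os.path.getsize(p) > max_bytes]


# Demonstration with two temporary files standing in for real uploads
with tempfile.TemporaryDirectory() as tmp_dir:
    small = os.path.join(tmp_dir, "small.jpg")
    large = os.path.join(tmp_dir, "large.jpg")
    with open(small, "wb") as f:
        f.write(b"\0" * 1_000)
    with open(large, "wb") as f:
        f.write(b"\0" * 5_000)

    limit = 2_000  # hypothetical limit in bytes, for illustration only
    oversized = [
        os.path.basename(p) for p in find_oversized_files([small, large], limit)
    ]

print(oversized)
```

In the notebook, the list of paths would come from whichever column of the upload dataframe holds the local file locations.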

In [None]:
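# Assumption: `project` expects the Zooniverse connection created earlier (zoo_project);
# zoo_clips_info_dict is not defined anywhere in this notebook
t5.upload_frames_to_zooniverse(upload_to_zoo = upload_to_zoo,
                               sitename = sitename,
                               created_on = created_on,
                               project = zoo_project)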

#END