# Data Prep for Crowd Annotation Pipeline

1. Collect raw data 
2. Adjust contrast of images
3. Chop up images into manageable pieces
4. Make into montages
5. Upload to Figure Eight

These scripts name their output files so that each code block can run directly after the previous one with minimal input. For this reason, we recommend running one set of images through the whole pipeline before starting on another set.

In [None]:
# import statements
from __future__ import absolute_import

from ipywidgets import interact, fixed, interactive
import numpy as np
import skimage as sk
import os
from scipy import ndimage
import scipy
import sys
from skimage.io import imread
import matplotlib.pyplot as plt

import scipy.ndimage as ndi

%matplotlib inline

from dcde.pre_annotation.montage_makers import montage_maker, multiple_montage_maker
from dcde.pre_annotation.overlapping_chopper import overlapping_crop_dir
from dcde.pre_annotation.aws_upload import aws_upload, upload
from dcde.pre_annotation.montage_to_csv import csv_maker
from dcde.pre_annotation.fig_eight_upload import fig_eight
from dcde.pre_annotation.contrast_adjustment import contrast, adjust_folder, adjust_overlay

from dcde.utils.io_utils import get_img_names
from dcde.utils.widget_utils import choose_img, edit_image, choose_img_pair, overlay_images

In [None]:
#sometimes raw images are in .tif stacks, not individual .tif files
#optional code block for turning into individual slices
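
The optional block above is empty in this notebook. As a minimal sketch of what it could do, the function below splits a 3D array into named 2D slices. It assumes the stack has shape (frames, height, width), which is what skimage.io.imread typically returns for a multi-page .tif. The function name `split_stack` and the `_slice_` file naming are hypothetical, not part of dcde.

```python
import numpy as np

def split_stack(stack, prefix):
    """Split a (frames, height, width) array into a dict of named 2D slices."""
    if stack.ndim != 3:
        raise ValueError("expected a 3D stack of shape (frames, height, width)")
    # Zero-pad the frame index so the file names sort in acquisition order.
    pad = len(str(stack.shape[0] - 1))
    return {
        "{}_slice_{}.tif".format(prefix, str(i).zfill(pad)): frame
        for i, frame in enumerate(stack)
    }

# Synthetic stand-in for a stack read from disk (assumption: 3 frames of 4x5 pixels).
stack = np.arange(3 * 4 * 5, dtype=np.uint16).reshape(3, 4, 5)
slices = split_stack(stack, "pos1")
print(sorted(slices))
```

Each slice could then be written to the raw folder with an image writer such as skimage.io.imsave, so the rest of the pipeline sees individual .tif files.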

## 2. Adjust contrast of images
Before doing anything else, adjust the contrast of the raw data. contrast_adjustment blurs the data with a Gaussian filter, finds the edges, inverts the result, and applies additional equalization if needed. Set the parameters with the widgets below.
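
To make the blur, edge-finding, and inversion steps concrete, here is a rough sketch built on scipy.ndimage. It is one plausible reading of the description, not the dcde implementation: the function name `sketch_contrast`, the use of the Sobel gradient magnitude, and the gamma step standing in for "additional equalization" are all assumptions.

```python
import numpy as np
from scipy import ndimage as ndi

def sketch_contrast(img, blur_sigma=1.0, gamma=1.0):
    """Blur, find edges, invert, then apply a gamma adjustment (illustrative only)."""
    img = img.astype(np.float64)
    # 1. Smooth out pixel noise with a Gaussian filter.
    blurred = ndi.gaussian_filter(img, sigma=blur_sigma)
    # 2. Edge strength as the Sobel gradient magnitude (assumed edge detector).
    edges = np.hypot(ndi.sobel(blurred, axis=0), ndi.sobel(blurred, axis=1))
    # 3. Rescale to [0, 1] so inversion and gamma behave predictably.
    span = edges.max() - edges.min()
    edges = (edges - edges.min()) / span if span > 0 else np.zeros_like(edges)
    # 4. Invert so that edges are dark on a light background.
    inverted = 1.0 - edges
    # 5. Gamma adjustment as a simple stand-in for further equalization.
    return inverted ** gamma

# Synthetic test image: a bright square on a dark background.
img = np.zeros((32, 32))
img[12:20, 12:20] = 255
adjusted = sketch_contrast(img, blur_sigma=1.0, gamma=0.5)
print(adjusted.shape, adjusted.min(), adjusted.max())
```

Flat regions come out white and the outline of the square comes out dark, which is the kind of image the widgets below are meant to help tune.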

### Option 1: Annotations of images, no overlays
Some images, such as those of fluorescent nuclei, are relatively easy to annotate. Use the following code blocks to adjust the contrast of those images and save them. For more difficult data, such as cytoplasmic images, you can overlay two images (such as phase and fluorescence) to help guide annotations. To overlay images for annotation, skip to Option 2.

In [None]:
# Define path to desired raw directory
base_dir = "/gnv_home/data/contrast_test/pics181220"
raw_folder = "pos1"
identifier = "test0"

dirpath = os.path.join(base_dir, raw_folder)

In [None]:
# Choose a raw image on which to test the contrast adjustment
choose_raw = interactive(choose_img, name=get_img_names(dirpath), dirpath=fixed(dirpath));
choose_raw

In [None]:
# Test with the chosen image to settle on adjustment parameters
img = imread(choose_raw.result)
edit_raw = interactive(edit_image, image=fixed(img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_raw

In [None]:
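# With the chosen parameters, process all the raw data in the folder
# NOTE: gaussian_sigma, hist, adapthist, and gamma are not defined earlier in this notebook;
# set them to the values chosen with the widget above before running this cell.
contrast(base_dir, raw_folder, identifier, gaussian_sigma, hist, adapthist, gamma)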

### Option 2: Overlay two image types for annotation

In [None]:
# Define path to desired raw and overlay directories
base_dir = "/gnv_home/data/contrast_overlay_test"
raw_folder = "FITC"
overlay_folder = "phase"
identifier = "overlay_test_eq"

raw_path = os.path.join(base_dir, raw_folder)
overlay_path = os.path.join(base_dir, overlay_folder)

In [None]:
#pick a matched pair of images to adjust contrast
#choose representative images for best results
# last valid frame index, assuming choose_img_pair indexes the image names from 0
max_frame = len(get_img_names(raw_path)) - 1

choose_pair = interactive(choose_img_pair, frame = (0, max_frame, 1), raw_dir = fixed(raw_path), overlay_dir = fixed(overlay_path), continuous_update = False);
choose_pair

In [None]:
#adjust raw image
raw_img = imread(choose_pair.result[0])
edit_raw = interactive(edit_image, image=fixed(raw_img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_raw

In [None]:
#adjust overlay image
overlay_img = imread(choose_pair.result[1])
edit_overlay = interactive(edit_image, image=fixed(overlay_img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_overlay

In [None]:
#overlay images
raw_adjusted = edit_raw.result
overlay_adjusted = edit_overlay.result
edit_combination = interactive(overlay_images, raw_img = fixed(raw_adjusted), overlay_img =fixed(overlay_adjusted), prop_raw =(0,1.0, 0.1), v_min = (0, 255, 1), v_max = (0, 255, 1))
edit_combination

In [None]:
#collect the chosen settings so they can be applied to all images in the folder
#modified images are saved to a new folder and do not overwrite originals
raw_settings = edit_raw.kwargs
overlay_settings = edit_overlay.kwargs
combined_settings = edit_combination.kwargs

In [None]:
adjust_overlay(base_dir, raw_folder, overlay_folder, identifier, raw_settings, overlay_settings, combined_settings)

## 3. Chop up images into manageable pieces

Each full-size image usually has many cells in it. This makes them difficult to fully annotate! For ease of annotation (and better results), each frame is chopped up into smaller, overlapping frames, ultimately creating a set of movies. 

These smaller movies can be made with overlapping edges, making it easier to stitch annotations together into one large annotated movie (in the post-annotation pipeline). A large overlap will result in redundant annotations.

Even if you want to process the full-sized image, run the chopper with num_x_segments and num_y_segments both set to 1. The montage makers are written to run on the output of the chopper.
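
As a sketch of how overlapping crops can be laid out, the helper below computes pixel bounds for each segment along one axis. It assumes that overlap_perc is a percentage of the segment size added on each interior side, which may differ from what overlapping_crop_dir actually does. The names `crop_bounds` and `crop_grid` are hypothetical.

```python
def crop_bounds(img_size, num_segments, overlap_perc):
    """Return (start, end) pixel bounds for each segment along one axis."""
    seg = img_size // num_segments
    # Assumption: the overlap is a percentage of the segment size.
    overlap = int(round(seg * overlap_perc / 100))
    bounds = []
    for i in range(num_segments):
        start = max(i * seg - overlap, 0)
        # The last segment always runs to the image edge so no pixels are dropped.
        if i == num_segments - 1:
            end = img_size
        else:
            end = min((i + 1) * seg + overlap, img_size)
        bounds.append((start, end))
    return bounds

def crop_grid(width, height, num_x_segments, num_y_segments, overlap_perc):
    """Return (x0, x1, y0, y1) for every crop, row by row."""
    return [
        (x0, x1, y0, y1)
        for (y0, y1) in crop_bounds(height, num_y_segments, overlap_perc)
        for (x0, x1) in crop_bounds(width, num_x_segments, overlap_perc)
    ]

print(crop_bounds(1000, 4, 10))
print(len(crop_grid(1000, 1000, 4, 4, 10)))
```

With a single segment the bounds cover the whole axis, which matches the advice above about running the chopper on full-sized images.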

In [None]:
# base_direc = "/home/geneva/Desktop/Nb_testing/"
# raw_direc = os.path.join(base_direc, "MouseBrain_s7_nuclear")
# identifier = "MouseBrain_s7_nuc"

num_x_segments = 4
num_y_segments = 4
overlap_perc = 10

In [None]:
raw_direc = "/gnv_home/data/contrast_overlay_test/FITC_overlay_phase_1"
overlapping_crop_dir(raw_direc, identifier, num_x_segments, num_y_segments, overlap_perc)

## 4. Make into montages
multiple_montage_maker is written to run on the output of the chopper, i.e., the folder where each chopped movie folder is saved. It makes montages of each subfolder according to the variables specified, and makes more than one montage per subfolder if there are enough frames to do so.
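
The sketch below shows how the montage variables could translate into a canvas size and frame positions. It assumes a buffer is placed around the border as well as between frames, and that only complete montages are made from each subfolder. Both are guesses about multiple_montage_maker, and the function names are hypothetical.

```python
import math

def montage_layout(frame_w, frame_h, montage_len, row_length, x_buffer, y_buffer):
    """Return the canvas size and the top-left corner of each frame in a montage."""
    n_rows = math.ceil(montage_len / row_length)
    n_cols = min(montage_len, row_length)
    # Assumption: one buffer before every frame plus one after the last frame.
    canvas_w = n_cols * frame_w + (n_cols + 1) * x_buffer
    canvas_h = n_rows * frame_h + (n_rows + 1) * y_buffer
    positions = []
    for i in range(montage_len):
        row, col = divmod(i, row_length)
        x = x_buffer + col * (frame_w + x_buffer)
        y = y_buffer + row * (frame_h + y_buffer)
        positions.append((x, y))
    return (canvas_w, canvas_h), positions

def montages_per_movie(num_frames, montage_len):
    """Number of complete montages (assumption: leftover frames are not used)."""
    return num_frames // montage_len

canvas, positions = montage_layout(256, 256, montage_len=10, row_length=10,
                                   x_buffer=20, y_buffer=20)
print(canvas, positions[0], positions[-1])
print(montages_per_movie(30, 10))
```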

The variables used in multiple_montage_maker are saved in a JSON file so they can be reused in post-annotation processing.
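
A minimal sketch of such a log, using only the standard library, might look like this. The file name pattern `<identifier>_montage_log.json` and the helper names are assumptions; the actual layout of the dcde log may differ.

```python
import json
import os
import tempfile

def save_montage_log(log_direc, identifier, **settings):
    """Write the montage settings to a JSON file and return its path."""
    os.makedirs(log_direc, exist_ok=True)
    # Hypothetical file name; the real log may be named differently.
    log_path = os.path.join(log_direc, identifier + "_montage_log.json")
    with open(log_path, "w") as f:
        json.dump(settings, f, indent=2, sort_keys=True)
    return log_path

def load_montage_log(log_path):
    """Read the settings back, e.g. during post-annotation processing."""
    with open(log_path) as f:
        return json.load(f)

# Demonstration in a temporary folder so nothing on the data server is touched.
demo_dir = tempfile.mkdtemp()
demo_path = save_montage_log(os.path.join(demo_dir, "json_logs"), "test0",
                             montage_len=10, row_length=10,
                             x_buffer=20, y_buffer=20,
                             num_x_segments=4, num_y_segments=4)
restored = load_montage_log(demo_path)
print(restored)
```

Storing the settings next to the data means the post-annotation step does not depend on the notebook kernel still holding these variables.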

In [None]:
montage_len = 10

direc = raw_direc + "_chopped_" + str(num_x_segments) + "_" + str(num_y_segments)
#direc = "/home/geneva/Desktop/Nb_testing/nuclear_test_chopped_4_4"

save_direc = os.path.join(base_dir, identifier + "_montages_" + str(num_x_segments) + "_" + str(num_y_segments))
#save_direc = "/home/geneva/Desktop/Nb_testing/montages"

log_direc = os.path.join(base_dir, "json_logs")

row_length = 10
x_buffer = 20
y_buffer = 20

In [None]:
multiple_montage_maker(montage_len, direc, save_direc, identifier, 
                       num_x_segments, num_y_segments, row_length, x_buffer, y_buffer, log_direc)

## 5. Upload to Figure Eight
Now that the images are processed into montages, they need to be uploaded to an AWS bucket and submitted to Figure Eight. This involves uploading the files to AWS, making a CSV file with the links to the uploaded images, and using that CSV file to create a Figure Eight job.

### Upload files to AWS
aws_upload will look for image files in the specified directory (folder_to_upload, set by default to be wherever the output of multiple_montage_maker was saved) and upload them into a bucket.

For the Van Valen lab, the default bucket is "figure-eight-deepcell" and keys (aws_folder + file names) correspond to the file structure of our data server.

aws_upload returns a list of the URLs to which images were uploaded.
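
To illustrate how keys and URLs relate, the sketch below builds an S3 key from aws_folder and a file name, then a path-style URL from the bucket and key. The helper names are hypothetical, and the default endpoint is the us-east-2 address that appears elsewhere in this notebook. The upload itself is left out.

```python
import posixpath
from urllib.parse import quote

def s3_key(aws_folder, filename):
    """Join the folder and file name with forward slashes, as S3 keys expect."""
    return posixpath.join(aws_folder, filename)

def s3_url(bucket_name, key, endpoint="https://s3.us-east-2.amazonaws.com"):
    """Build a path-style URL; quote() escapes spaces but keeps the slashes."""
    return "{}/{}/{}".format(endpoint, bucket_name, quote(key))

key = s3_key("Ed/3T3/set0/cyto_test", "montage_0.png")
url = s3_url("figure-eight-deepcell", key)
print(url)
```

Using posixpath rather than os.path keeps the keys slash-separated even if the notebook is run on Windows.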

In [None]:
#import os

bucket_name = "figure-eight-deepcell" #default
aws_folder = "Ed/3T3/set0/cyto_test"
#folder_to_upload = save_direc #usually .../montages
folder_to_upload = "/gnv_home/data/contrast_overlay_test/contrast_test_examples"

uploaded_montages = aws_upload(bucket_name, aws_folder, folder_to_upload)

#os.path.join("https://s3.us-east-2.amazonaws.com", bucket_name, aws_folder)
#print(uploaded_montages)
#from io_utils import get_img_names
#imgs_to_upload = get_img_names(folder_to_upload)
#for index, img in enumerate(imgs_to_upload):
#    print(img)
#    print(os.path.join(folder_to_upload, img))

### Make CSV file
Figure Eight jobs can be created easily by using a CSV file where each row contains information about one task. For our jobs, each row has the link to the location of one montage, and information about that montage (currently, just the "identifier" specified at the beginning of the pipeline). The CSV file is saved as "identifier".csv in a folder that only holds CSVs. CSV folders are usually in cell-type directories, so identifiers should be able to distinguish between sets, parts, etc.
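
A CSV of this shape can be written with the standard csv module. The sketch below is not csv_maker itself: the function name `write_task_csv` and the column names `image_url` and `identifier` are assumptions, and the example URL is a placeholder.

```python
import csv
import os
import tempfile

def write_task_csv(urls, identifier, csv_direc):
    """Write one row per montage URL and return the path of the CSV file."""
    os.makedirs(csv_direc, exist_ok=True)
    csv_path = os.path.join(csv_direc, identifier + ".csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        # Hypothetical column names; the job template determines the real ones.
        writer.writerow(["image_url", "identifier"])
        for url in urls:
            writer.writerow([url, identifier])
    return csv_path

# Demonstration with placeholder URLs in a temporary folder.
demo_urls = ["https://example.com/montage_0.png", "https://example.com/montage_1.png"]
demo_csv = write_task_csv(demo_urls, "test0", os.path.join(tempfile.mkdtemp(), "CSV"))
with open(demo_csv, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```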

In [None]:
#identifier = "test"
csv_direc = os.path.join(base_dir, "CSV")

In [None]:
csv_maker(uploaded_montages, identifier, csv_direc)

### Create Figure Eight job
The Figure Eight API allows us to create a new job and upload data to it from this notebook. However, since our jobs don't include required test questions, editing job information such as the title of the job must be done via the website. This section of the notebook uses the API to create a job and upload data to it, then reminds the user to finish editing the job on the website.

Some sample job IDs to copy are provided below.

In [None]:
#job_id_to_copy = 1344258 #Elowitz timelapse RFP pilot
job_id_to_copy = 1346216 #Deepcell MouseBrain 3x5
#job_id_to_copy = 1306431 #Deepcell overlapping Mibi
#job_id_to_copy = 1292179 #Deepcell HEK
#job_id_to_copy =

In [None]:
from dcde.pre_annotation.fig_eight_upload import fig_eight

fig_eight(csv_direc, identifier, job_id_to_copy)