# Data Prep for Crowd Annotation Pipeline - 2D data

2D data refers to data that doesn't need to be tracked across time or space.

1. Collect raw data 
2. Adjust contrast of images
3. Chop up images into manageable pieces
4. Upload to Figure Eight

The scripts name their output files so that the code blocks can run back-to-back with minimal input. For this reason, it is recommended that users run one set of images through the whole pipeline before processing another set.

In [1]:
# import statements
from __future__ import absolute_import

import os

from ipywidgets import fixed, interactive
from skimage.io import imread

%matplotlib inline

from deepcell_toolbox.pre_annotation.overlapping_chopper import overlapping_crop_dir
from deepcell_toolbox.pre_annotation.aws_upload import aws_upload, upload
from deepcell_toolbox.pre_annotation.montage_to_csv import csv_maker
from deepcell_toolbox.pre_annotation.fig_eight_upload import fig_eight
from deepcell_toolbox.pre_annotation.contrast_adjustment import adjust_folder, adjust_overlay

from deepcell_toolbox.utils.io_utils import get_img_names
from deepcell_toolbox.utils.widget_utils import choose_img, edit_image, choose_img_pair, overlay_images

### Select folder of images to process

"base_dir" is a directory where several subfolders will be created to hold intermediate processed images and files. For example, at the start of this pipeline, "/home/gnv/data/example" might hold a few folders of images (different channels of the same dataset). By the end of this pipeline, it will also hold:
 - a folder for contrast-adjusted images 
 - a folder for sub-images
 - a folder that contains JSON files (which store information about the variables used to process images)
 - a folder that contains a CSV file to upload to Figure Eight
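
The way folder names chain from one step to the next can be sketched as follows. The suffixes mirror the output paths printed later in this notebook (for example `raw_contrast_adjusted` and `..._chopped_4_4`); the helper names `contrast_folder` and `chopped_folder` are hypothetical and are not part of deepcell_toolbox.

```python
import os

# Hypothetical helpers; the suffixes mirror the paths printed by this notebook.
def contrast_folder(raw_folder):
    """Folder name for contrast-adjusted copies of raw_folder."""
    return raw_folder + "_contrast_adjusted"

def chopped_folder(input_folder, num_x_segments, num_y_segments):
    """Folder name for sub-images chopped from input_folder."""
    return "{}_chopped_{}_{}".format(input_folder, num_x_segments, num_y_segments)

base_dir = "/home/gnv/data/example"
adjusted = contrast_folder("raw")
chopped = chopped_folder(adjusted, 4, 4)
print(os.path.join(base_dir, adjusted))
print(os.path.join(base_dir, chopped))
```

Because each name is derived from the previous one, running the steps in order means later cells can locate their input without extra bookkeeping.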

In [2]:
# Define path to desired raw directory
base_dir = "/gnv_home/data/2D_upload_test/set0"
raw_folder = "raw"
identifier = "ecoli_s0"

raw_dir = os.path.join(base_dir, raw_folder)

In [None]:
# Sometimes raw images are stored as .tif stacks, not as individual .tif files
# Optional code block for splitting a stack into individual slices

## 2. Adjust contrast of images
Before doing anything else, we need to adjust the contrast of the raw data. The following section of this notebook allows the user to interactively choose how the raw images will be processed. The user should adjust the images to make them as clear as possible for annotators; these contrast-adjusted images will only be used for annotation.

### Option 1: Annotations of images, no overlays
Some images, such as those of fluorescent nuclei, are relatively easy to annotate. Use the following code blocks to adjust the contrast of those images and save them. For more difficult data, such as cytoplasmic images, you may overlay two images (such as phase and fluorescence) to help guide annotators. To overlay images for annotation, skip to option 2.

This widget will allow the user to adjust the following settings, then apply them to a directory of images:
 - "blur" sets the width (sigma) of the Gaussian filter used to smooth the image
 - "sobel_toggle" determines if a Sobel filter is applied on top of the original image; if on, the edges of objects in the image will have the highest contrast
 - "sobel_factor" changes how heavily the Sobel filter is applied to the original image, if "sobel_toggle" is on
 - "invert_img" inverts the intensity range of the image, so that the maximum value becomes the minimum, and vice versa
 - "gamma_adjust" changes the overall brightness of the image without interfering with histogram normalization of the image
 - "equalize_hist" uses histogram equalization of the whole image to rescale pixel values
 - "equalize_adapthist" uses histogram equalization applied to local regions of the image to rescale pixel values
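
To make two of these settings concrete, here is a dependency-free sketch of inversion and gamma adjustment on a tiny image stored as nested lists with values in [0, 1]. It is an illustration only: the `invert` and `gamma_adjust` functions below are hypothetical stand-ins, and `edit_image` may implement these steps differently (for example, on arrays with an image-processing library).

```python
def invert(image):
    """Flip intensities so the maximum becomes the minimum and vice versa."""
    flat = [value for row in image for value in row]
    lo, hi = min(flat), max(flat)
    return [[hi + lo - value for value in row] for row in image]

def gamma_adjust(image, gamma):
    """Power-law curve on values in [0, 1]: gamma < 1 brightens, gamma > 1 darkens."""
    return [[value ** gamma for value in row] for row in image]

image = [[0.0, 0.25], [0.5, 1.0]]
inverted = invert(image)
darker = gamma_adjust(image, 2.0)
print(inverted)
print(darker)
```

Note that the power-law curve leaves 0 and 1 fixed, so it shifts mid-range brightness without changing the overall intensity range.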

In [3]:
# Choose the raw image to use for testing the contrast adjustment
choose_raw = interactive(choose_img, name=get_img_names(raw_dir), dirpath=fixed(raw_dir));
choose_raw

interactive(children=(Dropdown(description='name', options=('tile_x001_y001.tif', 'tile_x001_y002.tif', 'tile_…

In [4]:
# Test with the chosen image to settle on adjustment parameters
img = imread(choose_raw.result)
edit_raw = interactive(edit_image, image=fixed(img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_raw

interactive(children=(FloatSlider(value=1.0, description='blur', max=4.0), Checkbox(value=True, description='s…

In [5]:
# With the chosen parameters, process all the raw data in the folder
sigma = edit_raw.kwargs['blur']
hist = edit_raw.kwargs['equalize_hist']
adapthist = edit_raw.kwargs['equalize_adapthist']
gamma = edit_raw.kwargs['gamma_adjust']
sobel_option = edit_raw.kwargs['sobel_toggle']
sobel = edit_raw.kwargs['sobel_factor']
invert = edit_raw.kwargs['invert_img']

adjust_folder(base_dir, raw_folder, identifier, sigma, hist, adapthist, gamma, sobel_option, sobel, invert)

Processed data will be located at /gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted
Processing image 1 of 25
Processing image 2 of 25
Processing image 3 of 25
Processing image 4 of 25
Processing image 5 of 25
Processing image 6 of 25
Processing image 7 of 25
Processing image 8 of 25
Processing image 9 of 25
Processing image 10 of 25
Processing image 11 of 25
Processing image 12 of 25
Processing image 13 of 25
Processing image 14 of 25
Processing image 15 of 25
Processing image 16 of 25
Processing image 17 of 25
Processing image 18 of 25
Processing image 19 of 25
Processing image 20 of 25
Processing image 21 of 25
Processing image 22 of 25
Processing image 23 of 25
Processing image 24 of 25
Processing image 25 of 25


### Option 2: Overlay two image types for annotation
First, define the folders where your images can be found. This assumes that the images you want to overlay are in separate subfolders. The directory that contains these subfolders, "base_dir", is where contrast-adjusted images and subsequent processed images will be saved (each in an appropriate subfolder). The subfolders should contain the same number of images; they are expected to be different channels of the same original image.

Next, a widget will load that allows you to scroll through the images contained in the source subfolders. The user can select a pair of images that are representative of the data set.

Next, a widget will load that allows the user to adjust image processing settings for the first image in the pair (the "raw" image). After you are happy with the image, move on to the next code block; the settings you have determined will be saved.
 
Next, a similar widget will load that allows the user to adjust the image that will be overlaid on the "raw" image. Once you are satisfied with this image, move on to the next code block; the settings you have determined will be saved.
 
Next, a widget will load that allows the user to adjust how the images are overlaid. The user can specify the weighting of the overlay, and change the brightness settings of the final image to increase contrast. The two images to be overlaid can be readjusted individually if they need to be, by going back to the previous widgets and changing the settings. Just re-run the overlay widget and the new settings will be loaded.
 
Finally, when you are satisfied with the adjustments made to the individual and overlaid images, running "adjust_overlay" will take the last-used settings from each widget, apply them to each image in the subfolders specified, and create an overlaid image. The adjusted images will be saved in a new folder; the original images will not be modified. The folder for the adjusted images will be named {raw}\_overlay\_{overlay} to indicate which source folders were combined.
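
The overlay step can be pictured with a small per-pixel sketch. It assumes that "prop_raw" is the fraction of the final pixel taken from the raw image and that "v_min" and "v_max" define an intensity window that is stretched to the full 0-255 range; both are assumptions about `overlay_images`, and all three function names below are hypothetical.

```python
def overlay_pixel(raw_value, overlay_value, prop_raw):
    """Weighted blend: prop_raw of the raw pixel, the remainder from the overlay."""
    return prop_raw * raw_value + (1.0 - prop_raw) * overlay_value

def rescale(value, v_min, v_max):
    """Clip to [v_min, v_max], then stretch that window to 0-255 (assumes v_max > v_min)."""
    clipped = min(max(value, v_min), v_max)
    return 255.0 * (clipped - v_min) / (v_max - v_min)

def overlay_folder_name(raw_folder, overlay_folder):
    """Output folder name following the {raw}_overlay_{overlay} pattern."""
    return "{}_overlay_{}".format(raw_folder, overlay_folder)

blended = overlay_pixel(100, 200, prop_raw=0.25)  # 0.25 * 100 + 0.75 * 200
stretched = rescale(blended, v_min=50, v_max=250)
folder = overlay_folder_name("FITC", "phase")
print(blended, stretched, folder)
```

Narrowing the window between "v_min" and "v_max" increases contrast, at the cost of saturating pixels that fall outside it.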

In [None]:
# Define path to desired raw and overlay directories
base_dir = "/home/gnv/data/example"
raw_folder = "FITC"
overlay_folder = "phase"
identifier = "overlay_example"

raw_path = os.path.join(base_dir, raw_folder)
overlay_path = os.path.join(base_dir, overlay_folder)

In [None]:
# Pick a matched pair of images to adjust contrast
# Choose representative images for best results
# Frame indices start at 0, so the last valid index is one less than the number of images
max_frame = len(get_img_names(raw_path)) - 1

choose_pair = interactive(choose_img_pair, frame = (0, max_frame, 1), raw_dir = fixed(raw_path), overlay_dir = fixed(overlay_path), continuous_update = False);
choose_pair

In [None]:
#adjust raw image
raw_img = imread(choose_pair.result[0])
edit_raw = interactive(edit_image, image=fixed(raw_img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_raw

In [None]:
#adjust overlay image
overlay_img = imread(choose_pair.result[1])
edit_overlay = interactive(edit_image, image=fixed(overlay_img), blur=(0.0,4,0.1), gamma_adjust=(0.1,4,0.1), sobel_factor=(10,10000,100));
edit_overlay

In [None]:
#overlay images
raw_adjusted = edit_raw.result
overlay_adjusted = edit_overlay.result
edit_combination = interactive(overlay_images, raw_img = fixed(raw_adjusted), overlay_img =fixed(overlay_adjusted), prop_raw =(0,1.0, 0.1), v_min = (0, 255, 1), v_max = (0, 255, 1))
edit_combination

In [None]:
#apply overlay settings to all images in folder
#modified images are saved to new folder and do not overwrite originals
raw_settings = edit_raw.kwargs
overlay_settings = edit_overlay.kwargs
combined_settings = edit_combination.kwargs

In [None]:
adjust_overlay(base_dir, raw_folder, overlay_folder, identifier, raw_settings, overlay_settings, combined_settings)

## 3. Chop up images into manageable pieces

Each full-size image usually has many cells in it, which makes it difficult to annotate fully. For ease of annotation (and better results), each image is chopped up into smaller, overlapping sub-images.

These sub-images can be made with overlapping edges, making it easier to stitch annotations back together into one large annotated image (in the post-annotation pipeline). A large overlap will result in redundant annotations.
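
As a rough sketch of the arithmetic along one axis: the function below reproduces the 482-pixel size reported in this notebook's output for a 1608-pixel image, 4 segments, and a 10% overlap. The padding scheme is an assumption about how `overlapping_crop_dir` works, and `crop_windows` is a hypothetical helper.

```python
def crop_windows(image_size, num_segments, overlap_perc):
    """Return (window_size, [(start, stop), ...]) along one axis of a padded image."""
    crop_size = image_size // num_segments
    overlap = int(crop_size * overlap_perc / 100)
    window = crop_size + 2 * overlap
    # Coordinates refer to an image padded by `overlap` pixels on each side
    # (an assumption), so border windows are the same size as interior ones.
    windows = [(i * crop_size, i * crop_size + window) for i in range(num_segments)]
    return window, windows

window, windows = crop_windows(1608, 4, 10)
print(window)
print(windows)
```

Neighboring windows share `2 * overlap` pixels, which is why a large overlap percentage quickly leads to the same cells being annotated more than once.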

"is_2D" toggles between two modes for saving the files; "is_2D = True" will save all chopped images in one folder together (instead of subfolders meant to contain movies) and names the chopped images slightly differently.

Even if you want to process the full-sized image, run the chopper with num_x_segments and num_y_segments set to 1. This is necessary for downstream scripts to work properly.

Information about which settings were used will be stored in a .json file in a folder "base_dir/json_logs" for later processing steps to reference, when needed.
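
A minimal sketch of what such a log could look like is shown below. The file name and the keys are assumptions chosen for illustration (they mirror the variable names used in this notebook), and `save_chop_log` is a hypothetical helper, not the function the chopper actually uses.

```python
import json
import os
import tempfile

def save_chop_log(base_dir, identifier, settings):
    """Write the chopper settings to base_dir/json_logs and return the file path."""
    log_dir = os.path.join(base_dir, "json_logs")
    os.makedirs(log_dir, exist_ok=True)
    # Hypothetical file name; the real scripts may name the log differently.
    log_path = os.path.join(log_dir, identifier + "_chop_log.json")
    with open(log_path, "w") as handle:
        json.dump(settings, handle, indent=2, sort_keys=True)
    return log_path

demo_dir = tempfile.mkdtemp()  # stand-in for base_dir
settings = {"num_x_segments": 4, "num_y_segments": 4, "overlap_perc": 10, "is_2D": True}
log_path = save_chop_log(demo_dir, "ecoli_s0", settings)

# A later processing step can read the same settings back.
with open(log_path) as handle:
    reloaded = json.load(handle)
print(reloaded)
```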

In [5]:
image_input_folder = "raw_contrast_adjusted_2"
image_input_dir = os.path.join(base_dir, image_input_folder)

num_x_segments = 4
num_y_segments = 4
overlap_perc = 10
is_2D = True

In [6]:
overlapping_crop_dir(image_input_dir, identifier, num_x_segments, num_y_segments, overlap_perc, is_2D)

Current Image Size:  (1608, 1608)
Correct dimensionality? (y/n): y
Your new images will be  482  pixels by  482  pixels in size.
Processing...


  warn('%s is a low contrast image' % fname)


Cropped files saved to /gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4


## 4. Upload to Figure Eight
Now that the images are processed into subimages, they need to be uploaded to an AWS bucket and submitted to Figure Eight. This involves uploading the files to AWS, making a CSV file with the links to the uploaded images, and using that CSV file to create a Figure Eight job.

### Upload files to AWS
aws_upload looks for image files in the specified directory (folder_to_upload, set here to the folder where the chopped images were saved) and uploads them into a bucket. If you don't want to include all of the images you have made in the Figure Eight job, move the images of interest to a new folder and upload that.

For the Van Valen lab, the default bucket is "figure-eight-deepcell" and keys (aws_folder + file names) correspond to the file structure of our data server.

aws_upload returns a list of the URLs to which the images were uploaded.
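
The relationship between a local file, its key, and its URL can be sketched as follows. The key is built as aws_folder plus the file name, as described above; the URL uses the standard virtual-hosted-style S3 form, which is an assumption about what `aws_upload` returns. The helper names `s3_key` and `s3_url` are hypothetical.

```python
import os
import posixpath

def s3_key(aws_folder, file_path):
    """Key for an uploaded file: aws_folder plus the file name."""
    return posixpath.join(aws_folder, os.path.basename(file_path))

def s3_url(bucket_name, key):
    """Virtual-hosted-style URL; the exact form returned by aws_upload is an assumption."""
    return "https://{}.s3.amazonaws.com/{}".format(bucket_name, key)

local_file = "/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_00.png"
key = s3_key("ecoli/set0", local_file)
url = s3_url("figure-eight-deepcell", key)
print(key)
print(url)
```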

In [7]:
bucket_name = "figure-eight-deepcell" #default
aws_folder = "ecoli/set0"
#folder_to_upload = save_dir
folder_to_upload = "/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4"
#folder_to_upload = "/home/gnv/data/example/only_some_of_the_montages"

uploaded_montages = aws_upload(bucket_name, aws_folder, folder_to_upload)

What is your AWS access key id? ········
What is your AWS secret access key id? ········
Connected to AWS
/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_00.png  93540 / 93540.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_01.png  88722 / 88722.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_02.png  87442 / 87442.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_03.png  91800 / 91800.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_04.png  9186 / 9186.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_05.png  84562 / 84562.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_00_frame_06.png  

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_11.png  94915 / 94915.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_12.png  99187 / 99187.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_13.png  93797 / 93797.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_14.png  86085 / 86085.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_15.png  97458 / 97458.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_16.png  95629 / 95629.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_00_y_02_frame_17.png  98930 / 98930.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_00_frame_23.png  92495 / 92495.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_00_frame_24.png  86140 / 86140.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_01_frame_00.png  110325 / 110325.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_01_frame_01.png  116181 / 116181.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_01_frame_02.png  109175 / 109175.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_01_frame_03.png  117662 / 117662.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_01_frame_04.png  11884 / 11884.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_09.png  95624 / 95624.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_10.png  71111 / 71111.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_11.png  96688 / 96688.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_12.png  97131 / 97131.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_13.png  94310 / 94310.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_14.png  84328 / 84328.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_01_y_03_frame_15.png  93975 / 93975.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_01_frame_20.png  103676 / 103676.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_01_frame_21.png  112032 / 112032.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_01_frame_22.png  102818 / 102818.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_01_frame_23.png  114099 / 114099.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_01_frame_24.png  110524 / 110524.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_02_frame_00.png  114373 / 114373.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_02_y_02_frame_01.png  116626 / 116626.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_cho

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_06.png  41726 / 41726.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_07.png  91805 / 91805.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_08.png  90590 / 90590.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_09.png  88283 / 88283.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_10.png  68034 / 68034.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_11.png  89788 / 89788.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_00_frame_12.png  87689 / 87689.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_18.png  99170 / 99170.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_19.png  91637 / 91637.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_20.png  92183 / 92183.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_21.png  87652 / 87652.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_22.png  87005 / 87005.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_23.png  92556 / 92556.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli_s0_x_03_y_02_frame_24.png  90211 / 90211.0  (100.00%)

/gnv_home/data/2D_upload_test/set0/raw_contrast_adjusted_2_chopped_4_4/ecoli

### Make CSV file
Figure Eight jobs can be created easily by using a CSV file where each row contains information about one task. For our jobs, each row has the link to one uploaded image and information about that image (currently, just the "identifier" specified at the beginning of the pipeline). The CSV file is saved as "{identifier}.csv" in a folder that only holds CSVs.
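
A minimal sketch of the CSV layout, using only the standard library, is shown below. The column names "image_url" and "identifier" are assumptions made for illustration, and `build_csv` is a hypothetical helper; the columns written by `csv_maker` may differ.

```python
import csv
import io

def build_csv(urls, identifier):
    """Return CSV text with one row per uploaded image."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image_url", "identifier"])  # assumed column names
    for url in urls:
        writer.writerow([url, identifier])
    return buffer.getvalue()

urls = [
    "https://figure-eight-deepcell.s3.amazonaws.com/ecoli/set0/ecoli_s0_x_00_y_00_frame_00.png",
    "https://figure-eight-deepcell.s3.amazonaws.com/ecoli/set0/ecoli_s0_x_00_y_00_frame_01.png",
]
csv_text = build_csv(urls, "ecoli_s0")
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of side effects; to save the file, the same text could be written to `os.path.join(csv_dir, identifier + ".csv")`.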

In [8]:
csv_dir = os.path.join(base_dir, "CSV")

In [9]:
csv_maker(uploaded_montages, identifier, csv_dir)

### Create Figure Eight job
The Figure Eight API allows us to create a new job and upload data to it from this notebook. However, since our jobs don't include required test questions, editing job information such as the title of the job must be done via the website. This section of the notebook uses the API to create a job and upload data to it, then reminds the user to finish editing the job on the website.

Some sample job IDs to copy are provided below.

In [11]:
#job_id_to_copy = 1366009 #E coli phase, adjusted with sobel
job_id_to_copy = 1306431 #Deepcell overlapping Mibi
#job_id_to_copy = 1292179 #Deepcell HEK

In [12]:
fig_eight(csv_dir, identifier, job_id_to_copy)

Figure eight api key? ········
New job ID is: 1366009
Added data
Now that the data is added, you should go to the Figure Eight website to: 
-change the job title 
-review the job design 
-confirm pricing 
-launch the job (or contact success manager)
