# This is the first step in the pipeline
### Spots are detected in this notebook. The input file is expected to be in zarr format

## Follow the instructions below to run the notebook properly

This notebook detects spots on your movie. The movie should be a zarr object; if it's not, run Final/Data Preparation/full_movie_to_zarr.ipynb

**Parameters to adjust below** 

* **channel_to_detect**: Which channel will be tracked? This should be the channel with the longest tracks (e.g. AP2). Options are channel 1, 2, or 3.

* **threshold_intensity**: What intensity value distinguishes background from signal? Open up a frame of the movie in Fiji or napari (at the end of this notebook) and mouse over different pixels to figure out this threshold value.

* **all_frames**: When initially optimizing, set this to False and set number_frames_to_detect to 2, so that detection runs on only two time points. This speeds up diagnosing detection quality at the end of the notebook.
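
If mousing over pixels gives an unclear picture, a rough starting value for threshold_intensity can also be estimated numerically. The sketch below is not part of the pipeline: it uses a small synthetic stack in place of a real frame, and the rule it applies (median plus five robust standard deviations) is an assumption chosen for illustration. Treat the result as a first guess to refine by eye.

```python
import numpy as np

# Synthetic stand-in for one (z, y, x) stack: background values 100..110
stack = (100 + np.arange(4 * 32 * 32) % 11).reshape(4, 32, 32).astype(np.uint16)
# Two bright 2x2 "spots" well above the background
stack[2, 10:12, 10:12] = 300
stack[1, 20:22, 5:7] = 320

# Robust background statistics: the median and MAD are barely affected by the few bright spots
background = np.median(stack)
mad = np.median(np.abs(stack.astype(float) - background))
robust_sigma = 1.4826 * mad  # converts MAD to a standard deviation for Gaussian noise

# Assumed rule of thumb: 5 robust sigmas above the background
suggested_threshold = background + 5 * robust_sigma
pixels_above = int((stack > suggested_threshold).sum())

print(f"background ~ {background:.0f}, suggested threshold ~ {suggested_threshold:.0f}")
print(f"pixels above threshold: {pixels_above}")
```

On a real movie, `stack` would be one channel of one time point, for example `z2[0, channel_to_detect - 1]`.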

Additional parameters for optimization:

* **dist_between_spots**: Half of this value is the minimum distance, in pixels, allowed between spots. For example, if you set this to 10, no other spot will be detected within 5 pixels of the center of a detected spot.
* **sigma_estimations**: The expected radius of the spots, in pixels, as [spread_in_z, spread_in_y, spread_in_x]. You can measure the width of a spot in Fiji and divide by two.
* **n_jobs**: The number of CPUs to use for detection. If you set it to -1, all of your machine's CPUs but one are used for processing.

* **number_frames_to_detect**: The number of frames to process. This is useful when you just want to test the parameters passed to the Detector object, such as spot_intensity, dist_between_spots and sigma_estimations.
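
The effect of dist_between_spots can be illustrated with a small, self-contained sketch. It is not the Detector's code, which is not shown in this notebook: `suppress_close_spots` is a hypothetical helper, and keeping the brightest candidate first is an assumption. Only the rule "nothing else is detected within dist_between_spots / 2 pixels of a spot" comes from the description above.

```python
import math

def suppress_close_spots(candidates, dist_between_spots):
    """Keep candidates that are farther than dist_between_spots / 2 from every kept spot.

    candidates: list of (intensity, (z, y, x)) tuples. Hypothetical helper for illustration.
    """
    min_separation = dist_between_spots / 2
    kept = []
    # Assumption: brighter candidates take priority over dimmer ones
    for intensity, position in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(math.dist(position, kept_pos) > min_separation for _, kept_pos in kept):
            kept.append((intensity, position))
    return kept

candidates = [
    (500, (10, 50, 50)),
    (300, (10, 50, 53)),  # 3 px from the brightest candidate
    (400, (10, 50, 60)),  # 10 px from the brightest candidate
]

kept_wide = suppress_close_spots(candidates, dist_between_spots=10)   # 5 px exclusion radius
kept_narrow = suppress_close_spots(candidates, dist_between_spots=4)  # 2 px exclusion radius
print(len(kept_wide), len(kept_narrow))
```

With dist_between_spots = 10 the candidate 3 pixels away is dropped; with the value of 4 used in this notebook it is kept.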



## Set all parameters in the below cell 

In [57]:
# Refer to the cell above for an explanation of each parameter
channel_to_detect = 3 
threshold_intensity = 180
all_frames = True

dist_between_spots = 4
sigma_estimations = [2,2,2]
n_jobs = -1
number_frames_to_detect = 1     
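
How all_frames and number_frames_to_detect interact can be sketched as follows. `frames_to_process` is a hypothetical helper, not a function of the pipeline, and taking the first frames of the movie as the subset is an assumption; the notebook only states that max_frames is ignored when all_frames is True.

```python
def frames_to_process(total_frames, max_frames, all_frames):
    """Return the frame indices that would be processed (hypothetical illustration)."""
    if all_frames:
        # max_frames is ignored in this case
        return list(range(total_frames))
    # Assumption: the subset consists of the first max_frames frames
    return list(range(min(max_frames, total_frames)))

# Settings from the cell above: every frame of a 35-frame movie is processed
full_run = frames_to_process(total_frames=35, max_frames=1, all_frames=True)

# Settings suggested for initial optimization: two time points only
test_run = frames_to_process(total_frames=35, max_frames=2, all_frames=False)

print(len(full_run), test_run)
```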

## Import Necessary Packages 

In [58]:
import pandas as pd
import time
import os
import sys
import zarr
import napari 
import dask.array as da 

pythonPackagePath = os.path.abspath('../src/')
sys.path.append(pythonPackagePath)
from parallel import Detector
from gaussian_visualization import visualize_3D_gaussians

### Do not change the code in the cell below

In [59]:
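# This assumes that your notebook is inside 'Jupyter Notebooks', which is at the same level as 'movie_data'
# Note: "__file__" is a string literal here, so the path is resolved against the current working directory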
base_dir = os.path.join(os.path.dirname(os.path.abspath("__file__")), '..', 'movie_data')
# base_dir = os.path.join(os.path.dirname(os.path.abspath("__file__")), '..', 'test_movie_1')

zarr_directory = 'zarr_file/all_channels_data'
zarr_full_path = os.path.join(base_dir, zarr_directory)

save_directory = 'datasets'
save_directory_full = os.path.join(base_dir, save_directory)

In [60]:
# Open the zarr array at the given path in read-only mode
z2 = zarr.open(zarr_full_path, mode='r')
frames = z2.shape[0]
print(f'the number of frames are {frames}')
z2.info

the number of frames are 35


0,1
Type,zarr.core.Array
Data type,uint16
Shape,"(35, 3, 61, 254, 340)"
Chunk shape,"(1, 1, 61, 254, 340)"
Order,C
Read-only,True
Compressor,"Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)"
Store type,zarr.storage.DirectoryStore
No. bytes,1106271600 (1.0G)
No. bytes stored,551241928 (525.7M)
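
The sizes reported by `z2.info` can be reproduced from the shape and data type alone, which is a quick way to check that the axes are in the expected (t, c, z, y, x) order. The sketch below uses only the numbers printed above; the variable names are illustrative.

```python
import math

shape = (35, 3, 61, 254, 340)        # (t, c, z, y, x), from z2.info
chunk_shape = (1, 1, 61, 254, 340)   # one chunk = one channel of one time point
bytes_per_value = 2                  # uint16
stored_bytes = 551241928             # "No. bytes stored" from z2.info

uncompressed_bytes = math.prod(shape) * bytes_per_value
chunk_bytes = math.prod(chunk_shape) * bytes_per_value
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunk_shape))
compression_ratio = uncompressed_bytes / stored_bytes

print(f"uncompressed: {uncompressed_bytes} bytes in {n_chunks} chunks")
print(f"one chunk: {chunk_bytes / 2**20:.1f} MiB, compression ratio: {compression_ratio:.2f}")
```

Because each chunk holds a full z-stack of a single channel and time point, reading one frame of the channel to detect touches exactly one chunk.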


## In the cell below, the Detector object is initialized to perform detection. More details on the Detector object can be obtained with the following line of code:
**copy and paste into a new cell**

?Detector

In [61]:
z2

<zarr.core.Array (35, 3, 61, 254, 340) uint16 read-only>

In [62]:
detector = Detector(
    zarr_obj = z2,
    save_directory = save_directory_full,
    spot_intensity = threshold_intensity,
    dist_between_spots = dist_between_spots,
    sigma_estimations = sigma_estimations,
    n_jobs = n_jobs,
    channel_to_detect = channel_to_detect,
)

In [63]:
# The following function returns the DataFrame and also saves it to the provided path in pkl format.
# Set all_frames = True to process all time frames.
# max_frames is useful when you want to run detection on only a subset of frames.
# Note: when all_frames = True, max_frames is ignored.
df = detector.run_parallel_frame_processing(max_frames = number_frames_to_detect, all_frames = all_frames)

  z_popt, z_pcov = curve_fit(gaussian_1d_output, z_range, z_data, bounds = ([peak-width_parameters,z_center-width_parameters,-np.inf],[peak+width_parameters,z_center+width_parameters,np.inf]))
Processing frames:   6%|▌         | 2/35 [00:20<04:44,  8.62s/it]

the number of times the gaussian fitting worked was 370 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 374 and the number of times the gaussian did not fit was 0


Processing frames:   9%|▊         | 3/35 [00:21<02:36,  4.88s/it]

the number of times the gaussian fitting worked was 398 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 407 and the number of times the gaussian did not fit was 0


Processing frames:  11%|█▏        | 4/35 [00:21<01:33,  3.03s/it]

the number of times the gaussian fitting worked was 402 and the number of times the gaussian did not fit was 0


Processing frames:  17%|█▋        | 6/35 [00:22<00:45,  1.56s/it]

the number of times the gaussian fitting worked was 422 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 436 and the number of times the gaussian did not fit was 0


Processing frames:  23%|██▎       | 8/35 [00:33<01:33,  3.46s/it]

the number of times the gaussian fitting worked was 393 and the number of times the gaussian did not fit was 0


Processing frames:  26%|██▌       | 9/35 [00:34<01:12,  2.77s/it]

the number of times the gaussian fitting worked was 406 and the number of times the gaussian did not fit was 0


Processing frames:  29%|██▊       | 10/35 [00:34<00:54,  2.19s/it]

the number of times the gaussian fitting worked was 369 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 393 and the number of times the gaussian did not fit was 0


Processing frames:  31%|███▏      | 11/35 [00:35<00:40,  1.68s/it]

the number of times the gaussian fitting worked was 361 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 416 and the number of times the gaussian did not fit was 0


Processing frames:  40%|████      | 14/35 [00:35<00:17,  1.20it/s]

the number of times the gaussian fitting worked was 425 and the number of times the gaussian did not fit was 0


Processing frames:  43%|████▎     | 15/35 [00:46<00:57,  2.89s/it]

the number of times the gaussian fitting worked was 341 and the number of times the gaussian did not fit was 0


Processing frames:  46%|████▌     | 16/35 [00:47<00:48,  2.55s/it]

the number of times the gaussian fitting worked was 389 and the number of times the gaussian did not fit was 0


Processing frames:  49%|████▊     | 17/35 [00:48<00:37,  2.10s/it]

the number of times the gaussian fitting worked was 373 and the number of times the gaussian did not fit was 0


Processing frames:  51%|█████▏    | 18/35 [00:48<00:28,  1.67s/it]

the number of times the gaussian fitting worked was 401 and the number of times the gaussian did not fit was 0


Processing frames:  54%|█████▍    | 19/35 [00:49<00:21,  1.34s/it]

the number of times the gaussian fitting worked was 413 and the number of times the gaussian did not fit was 0


Processing frames:  57%|█████▋    | 20/35 [00:49<00:15,  1.03s/it]

the number of times the gaussian fitting worked was 407 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 414 and the number of times the gaussian did not fit was 0


Processing frames:  63%|██████▎   | 22/35 [01:00<00:48,  3.76s/it]

the number of times the gaussian fitting worked was 407 and the number of times the gaussian did not fit was 0


Processing frames:  66%|██████▌   | 23/35 [01:02<00:37,  3.12s/it]

the number of times the gaussian fitting worked was 386 and the number of times the gaussian did not fit was 0


Processing frames:  69%|██████▊   | 24/35 [01:02<00:24,  2.27s/it]

the number of times the gaussian fitting worked was 423 and the number of times the gaussian did not fit was 0


Processing frames:  71%|███████▏  | 25/35 [01:03<00:17,  1.78s/it]

the number of times the gaussian fitting worked was 389 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 404 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 418 and the number of times the gaussian did not fit was 0

Processing frames:  80%|████████  | 28/35 [01:03<00:05,  1.28it/s]


the number of times the gaussian fitting worked was 416 and the number of times the gaussian did not fit was 0


Processing frames:  83%|████████▎ | 29/35 [01:14<00:19,  3.21s/it]

the number of times the gaussian fitting worked was 397 and the number of times the gaussian did not fit was 0


Processing frames:  89%|████████▊ | 31/35 [01:16<00:08,  2.04s/it]

the number of times the gaussian fitting worked was 392 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 372 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 403 and the number of times the gaussian did not fit was 0


Processing frames:  94%|█████████▍| 33/35 [01:16<00:02,  1.21s/it]

the number of times the gaussian fitting worked was 388 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 393 and the number of times the gaussian did not fit was 0


Processing frames: 100%|██████████| 35/35 [01:16<00:00,  2.19s/it]

the number of times the gaussian fitting worked was 385 and the number of times the gaussian did not fit was 0
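
The per-frame counts printed above can also be checked from the returned table once detection has finished. The sketch below uses a toy DataFrame, because the only column of `df` confirmed by this notebook is `frame`; the same two lines should work on the real `df`.

```python
import pandas as pd

# Toy stand-in for the detections table: one row per detected spot
toy_df = pd.DataFrame({"frame": [0, 0, 0, 1, 1, 2]})
total_frames = 4  # in this notebook this would be `frames`

# Number of detections in each frame
spots_per_frame = toy_df.groupby("frame").size()

# Frames that produced no detections at all
missing_frames = sorted(set(range(total_frames)) - set(toy_df["frame"]))

print(spots_per_frame)
print(f"frames without detections: {missing_frames}")
```

A frame with far fewer detections than its neighbors, or with none, is a hint to revisit threshold_intensity before moving on.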





# Visualising the Output
## Labels are only for time frame 0, for all z slices 

## Below, the detected spots are shown as masks on the original image; adjust the detection parameters if you think the spots are not detected correctly

### Once you are in the napari viewer, adjust the contrast and the opacity so that both the masks and the raw movie are clearly visible.

In [64]:
z2.info

0,1
Type,zarr.core.Array
Data type,uint16
Shape,"(35, 3, 61, 254, 340)"
Chunk shape,"(1, 1, 61, 254, 340)"
Order,C
Read-only,True
Compressor,"Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)"
Store type,zarr.storage.DirectoryStore
No. bytes,1106271600 (1.0G)
No. bytes stored,551241928 (525.7M)


In [65]:
# Make a mask of the first time point of the detections

masks = visualize_3D_gaussians(zarr_obj = z2, gaussians_df = df[df['frame'] == 0])
# masks = visualize_3D_gaussians(zarr_obj = z2, gaussians_df = df)

# Create a napari viewer
viewer = napari.Viewer()

# Wrap the zarr array (already opened in read mode) as a lazy dask array
dask_array = da.from_zarr(z2)

# first time point of the zarr file and the channel to detect
#the axis arrangement is (t,c,z,y,x)

dask_array_slice = dask_array[0,channel_to_detect-1,:,:,:]

# Add the 3D stack to the viewer
layer_raw = viewer.add_image(dask_array_slice, name='fluorescence', interpolation3d = 'nearest', blending = 'additive', colormap = 'magenta')

# layer_mask = viewer.add_image(masks, name = 'detections mask')
layer_mask = viewer.add_image(masks, name = 'detections', interpolation3d = 'nearest', blending = 'additive', colormap = 'green')

# Other useful parameters of add_image (lists apply when channel_axis is used):
# colormap = list
# contrast_limits = list of list

# Add Bounding Box
layer_raw.bounding_box.visible = True
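
Two details of the cell above are easy to get wrong: the channel index is zero-based in the array although channel_to_detect counts from 1, and sensible contrast limits depend on the data. The sketch below shows both on a small synthetic (t, c, z, y, x) array. The percentile choice (1 and 99.9) is an assumption for illustration, not a setting used by the pipeline.

```python
import numpy as np

# Synthetic movie with axes (t, c, z, y, x); only channel 3 of frame 0 carries signal
movie = np.zeros((2, 3, 4, 16, 16), dtype=np.uint16)
movie[0, 2] = 100 + (np.arange(4 * 16 * 16) % 50).reshape(4, 16, 16)

channel_to_detect = 3                      # 1-based, as in the parameter cell
stack = movie[0, channel_to_detect - 1]    # 0-based index into the channel axis

# Assumed starting point for the display range: clip the darkest 1% and brightest 0.1%
low, high = np.percentile(stack, [1, 99.9])
contrast_limits = [float(low), float(high)]

print(stack.shape, contrast_limits)
# In napari these limits could be passed as:
# viewer.add_image(stack, contrast_limits=contrast_limits)
```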


If the detections don't line up well with the spots in the image:
* Mouse over the spots in napari to get a sense of the intensity of the spots versus the background, and use the value that separates the two as threshold_intensity.
* Vary dist_between_spots: if the detections are denser than the visible spots, increase dist_between_spots. Conversely, if the visible spots are denser than the detections, lower dist_between_spots.
* If the detections are missing larger or smaller spots, try increasing or decreasing sigma_estimations.

If you see elongated detections, these will be filtered out in the next notebook.

# Move to 02.filtering_spots for the next steps