# This is the first step in the pipeline
### Spots are detected in this notebook. The input file is expected to be in the zarr format 

In [1]:
import pandas as pd
import time
import os
import sys
import zarr
import napari 
import dask.array as da 

pythonPackagePath = os.path.abspath('../src/')
sys.path.append(pythonPackagePath)
from parallel import Detector
from gaussian_visualization import visualize_3D_gaussians

### Do not change the code in the cell below

In [2]:
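# This assumes that your notebook is inside 'Jupyter Notebooks', which is at the same level as 'movie_data'.
# Note: "__file__" is a plain string here, so abspath resolves it against the current working directory.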
base_dir = os.path.join(os.path.dirname(os.path.abspath("__file__")), '..', 'movie_data')
# base_dir = os.path.join(os.path.dirname(os.path.abspath("__file__")), '..', 'test_movie_1')

zarr_directory = 'zarr_file/all_channels_data'
zarr_full_path = os.path.join(base_dir, zarr_directory)

save_directory = 'datasets'
save_directory_full = os.path.join(base_dir, save_directory)

## Follow the instructions below to run through the notebook properly

This notebook detects spots in your movie. The movie should be a zarr object; if it is not, first run Final/Data Preparation/full_movie_to_zarr.ipynb

**Parameters to adjust below** 

* **channel_to_detect**: Which channel will be tracked? This should be the channel with the longest tracks (e.g. AP2). Options are channel 1, 2, or 3 (numbered from 1).

* **threshold_intensity**: What intensity value distinguishes background from signal? Open a frame of the movie in Fiji or napari (at the end of this notebook) and mouse over background pixels and spot pixels to choose this threshold value.

* **all_frames**: When initially optimizing, set this to False and set number_frames_to_detect to 2 so that detection runs on only two time points. This makes it much faster to check detection quality at the end of the notebook.
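
As a complement to mousing over pixels, intensity percentiles of one 3D stack can give a rough starting point for threshold_intensity. The sketch below is only an illustration: `intensity_summary` is a hypothetical helper (not part of this pipeline), and the volume is synthetic so the example runs on its own.

```python
import numpy as np

def intensity_summary(volume, percentiles=(50, 90, 99, 99.9)):
    """Return {percentile: intensity} for a 3D (z, y, x) stack."""
    values = np.percentile(np.asarray(volume), percentiles)
    return {p: float(v) for p, v in zip(percentiles, values)}

# Synthetic stand-in for one frame of one channel:
# dim, noisy background plus one small bright "spot".
rng = np.random.default_rng(0)
demo_volume = rng.normal(loc=100, scale=5, size=(8, 64, 64))
demo_volume[4, 10:13, 20:23] = 400

summary = intensity_summary(demo_volume)
for p, v in summary.items():
    print(f"{p}th percentile: {v:.1f}")
```

With the real movie, the same helper could be applied to `z2[0, channel_to_detect - 1]` once `z2` has been opened further down. The median approximates the background level, and a sensible threshold usually lies somewhere between the background and the brightest voxels; the final value should still be confirmed by eye.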

Additional parameters for optimization:

* **dist_between_spots**: Half of this value is the minimum allowed separation between spots, in pixels. For example, if you set this to 10, any other spot within 5 pixels of the center of a detected spot will not be detected.
* **sigma_estimations**: The expected radius of your spots, in pixels, as [spread_in_z, spread_in_y, spread_in_x]. You can measure the width of a spot in Fiji and divide by two.
* **n_jobs**: The number of CPUs to use for detection. Setting it to -1 uses all but one of your machine's CPUs.

* **number_frames_to_detect**: The number of frames to process; it is only used when all_frames is False. This is useful when you just want to test the parameters passed to the Detector object, such as spot_intensity, dist_between_spots and sigma_estimations.
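
To make the dist_between_spots rule concrete, here is a minimal sketch of minimum-distance suppression. It is not the Detector's actual implementation, which is not shown in this notebook; `suppress_close_candidates` is a hypothetical helper that simply keeps the brightest candidates and drops any candidate closer than dist_between_spots / 2 to one already kept.

```python
import numpy as np

def suppress_close_candidates(coords, intensities, dist_between_spots):
    """Greedy suppression: brightest first, drop candidates too close to a kept one."""
    min_separation = dist_between_spots / 2
    coords = np.asarray(coords, dtype=float)
    order = np.argsort(intensities)[::-1]  # indices sorted from brightest to dimmest
    kept = []
    for idx in order:
        far_enough = all(
            np.linalg.norm(coords[idx] - coords[k]) >= min_separation for k in kept
        )
        if far_enough:
            kept.append(int(idx))
    return sorted(kept)

# Three candidate spots as (z, y, x); the second is 3 pixels from the first.
candidates = [(10, 10, 10), (10, 10, 13), (10, 10, 30)]
brightness = [500, 300, 250]

kept_wide = suppress_close_candidates(candidates, brightness, dist_between_spots=10)
kept_narrow = suppress_close_candidates(candidates, brightness, dist_between_spots=4)
print(kept_wide)    # minimum separation 5 pixels
print(kept_narrow)  # minimum separation 2 pixels
```

With a value of 10 the dimmer of the two neighboring candidates is dropped, while a value of 4 keeps all three. This is why raising dist_between_spots lowers the density of detections.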



## Set all parameters in the cell below

In [3]:
# Refer to the cell above for an explanation of each parameter
channel_to_detect = 3 
threshold_intensity = 190
all_frames = True

dist_between_spots = 6
sigma_estimations = [4,2,2]
n_jobs = -1
number_frames_to_detect = 2 
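
Because all_frames takes precedence over number_frames_to_detect, the values above process the whole movie. The sketch below spells out that rule. `frames_to_process` is a hypothetical helper, and the choice of the first N frames for the subset is an assumption, since the Detector's internal frame selection is not shown here.

```python
def frames_to_process(total_frames, all_frames, max_frames):
    """Frame indices to process: everything if all_frames, else the first max_frames."""
    if all_frames:
        # max_frames is ignored in this case
        return list(range(total_frames))
    # Assumption: the subset is taken from the start of the movie
    return list(range(min(max_frames, total_frames)))

full_run = frames_to_process(total_frames=60, all_frames=True, max_frames=2)
test_run = frames_to_process(total_frames=60, all_frames=False, max_frames=2)
print(len(full_run))
print(test_run)
```

For a quick parameter check, set all_frames to False before running the detection cell; switch it back to True for the final run.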

In [4]:
# Open the zarr array at zarr_full_path in read-only mode
z2 = zarr.open(zarr_full_path, mode='r')
frames = z2.shape[0]
print(f'the number of frames are {frames}')
z2.info

the number of frames are 60


0,1
Type,zarr.core.Array
Data type,uint16
Shape,"(60, 3, 114, 744, 303)"
Chunk shape,"(1, 1, 114, 744, 303)"
Order,C
Read-only,True
Compressor,"Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)"
Store type,zarr.storage.DirectoryStore
No. bytes,9251729280 (8.6G)
No. bytes stored,1837785304 (1.7G)
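
The numbers in this table can be checked by hand. The short calculation below reproduces the uncompressed size from the shape and data type, and also derives the size of one chunk and the compression ratio. All values are copied from the output above; the axis order (t, c, z, y, x) is the one used later in this notebook.

```python
import numpy as np

shape = (60, 3, 114, 744, 303)         # (t, c, z, y, x)
chunk_shape = (1, 1, 114, 744, 303)    # one chunk = one channel at one time point
itemsize = np.dtype("uint16").itemsize  # 2 bytes per voxel
stored_bytes = 1837785304              # "No. bytes stored" reported above

total_bytes = int(np.prod(shape, dtype=np.int64)) * itemsize
chunk_bytes = int(np.prod(chunk_shape, dtype=np.int64)) * itemsize
compression_ratio = total_bytes / stored_bytes

print(total_bytes)                          # matches "No. bytes" above
print(f"{chunk_bytes / 2**20:.1f} MiB per chunk")
print(f"compression ratio: {compression_ratio:.2f}")
```

Since each chunk holds a full 3D stack for one time point and one channel, reading a single frame of a single channel touches exactly one chunk.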


## The cell below initializes the Detector object that performs detection. For more details on the Detector object, run the following line of code:
**copy and paste into a new cell**

?Detector

In [5]:
detector = Detector(zarr_obj = z2, 
                    save_directory = save_directory_full, 
                    spot_intensity = threshold_intensity, 
                    dist_between_spots = dist_between_spots, 
                    sigma_estimations = sigma_estimations, 
                    n_jobs = n_jobs, 
                    channel_to_detect = channel_to_detect)

In [6]:
# The following function returns the detections as a dataframe and also saves it to save_directory in pkl format.
# Set all_frames = True to process every time frame.
# max_frames is useful when you want to run detection on only a subset of frames.
# Note: when all_frames = True, max_frames is ignored.
df = detector.run_parallel_frame_processing(max_frames = number_frames_to_detect, all_frames = all_frames)

Processing frames:   2%|▏         | 1/60 [00:19<19:09, 19.48s/it]

the number of times the gaussian fitting worked was 2831 and the number of times the gaussian did not fit was 0


Processing frames:   3%|▎         | 2/60 [00:19<07:57,  8.24s/it]

the number of times the gaussian fitting worked was 2878 and the number of times the gaussian did not fit was 1


Processing frames:   5%|▌         | 3/60 [00:20<04:24,  4.65s/it]

the number of times the gaussian fitting worked was 2906 and the number of times the gaussian did not fit was 0


Processing frames:   7%|▋         | 4/60 [00:21<03:04,  3.29s/it]

the number of times the gaussian fitting worked was 2828 and the number of times the gaussian did not fit was 0


Processing frames:   8%|▊         | 5/60 [00:21<02:01,  2.21s/it]

the number of times the gaussian fitting worked was 2859 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 2832 and the number of times the gaussian did not fit was 0


Processing frames:  12%|█▏        | 7/60 [00:22<01:01,  1.16s/it]

the number of times the gaussian fitting worked was 2850 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 2753 and the number of times the gaussian did not fit was 0


Processing frames:  15%|█▌        | 9/60 [00:22<00:38,  1.31it/s]

the number of times the gaussian fitting worked was 2904 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 2852 and the number of times the gaussian did not fit was 0


Processing frames:  18%|█▊        | 11/60 [00:22<00:25,  1.92it/s]

the number of times the gaussian fitting worked was 2857 and the number of times the gaussian did not fit was 0


Processing frames:  20%|██        | 12/60 [00:37<02:50,  3.55s/it]

the number of times the gaussian fitting worked was 3050 and the number of times the gaussian did not fit was 0


Processing frames:  22%|██▏       | 13/60 [00:39<02:28,  3.16s/it]

the number of times the gaussian fitting worked was 2923 and the number of times the gaussian did not fit was 0


Processing frames:  23%|██▎       | 14/60 [00:40<02:01,  2.65s/it]

the number of times the gaussian fitting worked was 2963 and the number of times the gaussian did not fit was 0


Processing frames:  28%|██▊       | 17/60 [00:43<01:05,  1.52s/it]

the number of times the gaussian fitting worked was 3038 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3026 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3015 and the number of times the gaussian did not fit was 0


Processing frames:  30%|███       | 18/60 [00:43<00:51,  1.22s/it]

the number of times the gaussian fitting worked was 3064 and the number of times the gaussian did not fit was 0


Processing frames:  32%|███▏      | 19/60 [00:43<00:41,  1.01s/it]

the number of times the gaussian fitting worked was 3059 and the number of times the gaussian did not fit was 0


Processing frames:  33%|███▎      | 20/60 [00:43<00:32,  1.24it/s]

the number of times the gaussian fitting worked was 3064 and the number of times the gaussian did not fit was 0


Processing frames:  35%|███▌      | 21/60 [00:44<00:30,  1.29it/s]

the number of times the gaussian fitting worked was 3097 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3092 and the number of times the gaussian did not fit was 0


Processing frames:  38%|███▊      | 23/60 [00:55<01:43,  2.81s/it]

the number of times the gaussian fitting worked was 3099 and the number of times the gaussian did not fit was 0


Processing frames:  40%|████      | 24/60 [01:00<02:02,  3.40s/it]

the number of times the gaussian fitting worked was 3092 and the number of times the gaussian did not fit was 0


Processing frames:  42%|████▏     | 25/60 [01:02<01:42,  2.92s/it]

the number of times the gaussian fitting worked was 3112 and the number of times the gaussian did not fit was 0


Processing frames:  43%|████▎     | 26/60 [01:04<01:35,  2.82s/it]

the number of times the gaussian fitting worked was 3158 and the number of times the gaussian did not fit was 0


Processing frames:  47%|████▋     | 28/60 [01:05<00:53,  1.68s/it]

the number of times the gaussian fitting worked was 3175 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3139 and the number of times the gaussian did not fit was 0


Processing frames:  48%|████▊     | 29/60 [01:06<00:40,  1.30s/it]

the number of times the gaussian fitting worked was 3196 and the number of times the gaussian did not fit was 0


Processing frames:  50%|█████     | 30/60 [01:06<00:30,  1.01s/it]

the number of times the gaussian fitting worked was 3141 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3190 and the number of times the gaussian did not fit was 0


Processing frames:  55%|█████▌    | 33/60 [01:07<00:15,  1.73it/s]

the number of times the gaussian fitting worked was 3218 and the number of times the gaussian did not fit was 0
the number of times the gaussian fitting worked was 3226 and the number of times the gaussian did not fit was 0


Processing frames:  57%|█████▋    | 34/60 [01:14<00:59,  2.28s/it]

the number of times the gaussian fitting worked was 3182 and the number of times the gaussian did not fit was 0


Processing frames:  58%|█████▊    | 35/60 [01:21<01:27,  3.49s/it]

the number of times the gaussian fitting worked was 3187 and the number of times the gaussian did not fit was 0


Processing frames:  60%|██████    | 36/60 [01:24<01:21,  3.41s/it]

the number of times the gaussian fitting worked was 3230 and the number of times the gaussian did not fit was 0


Processing frames:  62%|██████▏   | 37/60 [01:25<01:03,  2.76s/it]

the number of times the gaussian fitting worked was 3175 and the number of times the gaussian did not fit was 0


Processing frames:  63%|██████▎   | 38/60 [01:25<00:45,  2.07s/it]

the number of times the gaussian fitting worked was 3268 and the number of times the gaussian did not fit was 0


Processing frames:  65%|██████▌   | 39/60 [01:27<00:38,  1.82s/it]

the number of times the gaussian fitting worked was 3270 and the number of times the gaussian did not fit was 0


Processing frames:  67%|██████▋   | 40/60 [01:28<00:31,  1.60s/it]

the number of times the gaussian fitting worked was 3187 and the number of times the gaussian did not fit was 0


Processing frames:  68%|██████▊   | 41/60 [01:29<00:30,  1.61s/it]

the number of times the gaussian fitting worked was 3249 and the number of times the gaussian did not fit was 0


Processing frames:  70%|███████   | 42/60 [01:31<00:27,  1.52s/it]

the number of times the gaussian fitting worked was 3238 and the number of times the gaussian did not fit was 0


Processing frames:  72%|███████▏  | 43/60 [01:32<00:24,  1.42s/it]

the number of times the gaussian fitting worked was 3205 and the number of times the gaussian did not fit was 0


Processing frames:  73%|███████▎  | 44/60 [01:32<00:17,  1.11s/it]

the number of times the gaussian fitting worked was 3261 and the number of times the gaussian did not fit was 0


Processing frames:  75%|███████▌  | 45/60 [01:33<00:14,  1.02it/s]

the number of times the gaussian fitting worked was 3226 and the number of times the gaussian did not fit was 0


Processing frames:  77%|███████▋  | 46/60 [01:39<00:33,  2.38s/it]

the number of times the gaussian fitting worked was 3252 and the number of times the gaussian did not fit was 0


Processing frames:  78%|███████▊  | 47/60 [01:42<00:34,  2.64s/it]

the number of times the gaussian fitting worked was 3181 and the number of times the gaussian did not fit was 0


Processing frames:  80%|████████  | 48/60 [01:44<00:28,  2.40s/it]

the number of times the gaussian fitting worked was 3205 and the number of times the gaussian did not fit was 0


Processing frames:  82%|████████▏ | 49/60 [01:46<00:25,  2.36s/it]

the number of times the gaussian fitting worked was 3260 and the number of times the gaussian did not fit was 0


Processing frames:  83%|████████▎ | 50/60 [01:50<00:29,  2.91s/it]

the number of times the gaussian fitting worked was 3205 and the number of times the gaussian did not fit was 0


Processing frames:  85%|████████▌ | 51/60 [01:51<00:21,  2.37s/it]

the number of times the gaussian fitting worked was 3254 and the number of times the gaussian did not fit was 0


Processing frames:  87%|████████▋ | 52/60 [01:52<00:14,  1.84s/it]

the number of times the gaussian fitting worked was 3264 and the number of times the gaussian did not fit was 0


Processing frames:  88%|████████▊ | 53/60 [01:53<00:10,  1.51s/it]

the number of times the gaussian fitting worked was 3202 and the number of times the gaussian did not fit was 0


Processing frames:  90%|█████████ | 54/60 [01:54<00:09,  1.54s/it]

the number of times the gaussian fitting worked was 3203 and the number of times the gaussian did not fit was 0


Processing frames:  92%|█████████▏| 55/60 [01:56<00:08,  1.75s/it]

the number of times the gaussian fitting worked was 3205 and the number of times the gaussian did not fit was 0


Processing frames:  93%|█████████▎| 56/60 [01:57<00:05,  1.39s/it]

the number of times the gaussian fitting worked was 3215 and the number of times the gaussian did not fit was 0


Processing frames:  95%|█████████▌| 57/60 [01:59<00:04,  1.54s/it]

the number of times the gaussian fitting worked was 3211 and the number of times the gaussian did not fit was 0


Processing frames:  97%|█████████▋| 58/60 [02:00<00:02,  1.26s/it]

the number of times the gaussian fitting worked was 3211 and the number of times the gaussian did not fit was 0


Processing frames:  98%|█████████▊| 59/60 [02:01<00:01,  1.20s/it]

the number of times the gaussian fitting worked was 3198 and the number of times the gaussian did not fit was 1


Processing frames: 100%|██████████| 60/60 [02:02<00:00,  2.05s/it]

the number of times the gaussian fitting worked was 3232 and the number of times the gaussian did not fit was 0
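
After detection finishes, counting rows per frame is a quick way to see how the number of detections changes over the movie. The sketch below uses a tiny made-up table so that it runs on its own; only the 'frame' column is known from this notebook, and it is assumed here that the dataframe has one row per detected spot.

```python
import pandas as pd

# Hypothetical stand-in for the detections dataframe returned above.
demo_df = pd.DataFrame({"frame": [0, 0, 0, 1, 1, 2]})

# Number of rows (detections) for each frame index
spots_per_frame = demo_df.groupby("frame").size()
print(spots_per_frame)
```

Replacing `demo_df` with `df` gives the counts for the real movie. A sudden jump or drop between neighboring frames may be worth inspecting in napari before moving on.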





# Visualising the Output
## Labels are only for time frame 0, for all z slices 

## Below you can view the detected spots as masks over the original image, and adjust the detection parameters if the spots are not detected correctly

### Once you are in the napari viewer, adjust the contrast and the opacity so that both the masks and the raw movie are clearly visible.

In [7]:
z2.info

0,1
Type,zarr.core.Array
Data type,uint16
Shape,"(60, 3, 114, 744, 303)"
Chunk shape,"(1, 1, 114, 744, 303)"
Order,C
Read-only,True
Compressor,"Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)"
Store type,zarr.storage.DirectoryStore
No. bytes,9251729280 (8.6G)
No. bytes stored,1837785304 (1.7G)


In [8]:
# Make a mask of the detections at the first time point
masks = visualize_3D_gaussians(zarr_obj = z2, gaussians_df = df[df['frame'] == 0])
# masks = visualize_3D_gaussians(zarr_obj = z2, gaussians_df = df)

# Create a napari viewer
viewer = napari.Viewer()

# Wrap the zarr array in a lazy dask array (data is read only when it is needed)
dask_array = da.from_zarr(z2)

# Select the first time point of the channel used for detection.
# The axis order is (t, c, z, y, x); channels are numbered from 1, so subtract 1 to get the index.
dask_array_slice = dask_array[0, channel_to_detect - 1, :, :, :]

# Add the raw 3D stack to the viewer
layer_raw = viewer.add_image(dask_array_slice, name='fluorescence', interpolation3d = 'nearest', blending = 'additive', colormap = 'magenta')

# Add the detection masks on top of the raw data
# layer_mask = viewer.add_image(masks, name = 'detections mask')
layer_mask = viewer.add_image(masks, name = 'detections', interpolation3d = 'nearest', blending = 'additive', colormap = 'green')

# Other useful add_image parameters:
# colormap = name of the colormap to use
# contrast_limits = [min, max]

# Show the bounding box of the raw image
layer_raw.bounding_box.visible = True


If the detections don't line up well with the spots in the image:
* Mouse over the spots in napari to get a sense of the intensity of the spots versus the background, and use the value that separates the two as threshold_intensity.
* Vary dist_between_spots: if the detections are at a higher density than the visible spots, increase dist_between_spots. Conversely, if you see spots at a higher density than detections, lower dist_between_spots.
* If the detections are missing larger or smaller spots, try increasing or decreasing sigma_estimations.

If you see elongated detections, these will be filtered out in the next notebook.
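
When judging whether detections are too dense or too sparse, it can help to look at how close each detection is to its nearest neighbor. The sketch below is a rough illustration with made-up coordinates. `nearest_neighbor_distances` is a hypothetical helper, and the names of the coordinate columns in the detections dataframe are not shown in this notebook, so you would have to pass them in yourself as an (N, 3) array.

```python
import numpy as np

def nearest_neighbor_distances(coords):
    """Distance from each point to its closest other point, for an (N, 3) array."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise differences via broadcasting: shape (N, N, 3)
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Ignore the zero distance of each point to itself
    np.fill_diagonal(dists, np.inf)
    return dists.min(axis=1)

# Made-up (z, y, x) centers of three detections
demo_coords = [(0, 0, 0), (0, 0, 3), (0, 4, 0)]
nn = nearest_neighbor_distances(demo_coords)
print(nn)
print(nn.min())
```

If many nearest-neighbor distances sit right at dist_between_spots / 2, the spacing parameter rather than the image is probably limiting the detections. The pairwise matrix grows with the square of the number of points, so for a few thousand detections per frame it is best to run this on one frame at a time.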

# Move to 02.filtering_spots for the next steps