## To-do
- Now that import and metadata extraction work, the next steps are preprocessing (mostly interpolating timepoints for individual z-slices when the scope records them on a frame-by-frame basis) and a scheme for identifying a nuclear and a spot channel that is compatible with switching between the two channels (e.g. using mCherry to segment nuclei during the nuclear cycles but not at division).
- Makes sense to use dask for visualization (e.g. choosing a threshold).
- Write the DoG/segmentation function so that it can take either 2D or 3D data, with the option to segment off of a projection or off of raw 3D data.
    - Write in options for DoG and LoG segmentation algorithm with standard nuclear sizes vs box DoG/LoG vs watershed.
        - Actually, box filtering might not be very helpful if we're cutting off part of the nucleus in z: if we're not right in the middle of the nucleus, the BP filtering will project it into a distorted Gaussian, misplace the centroid and botch the diameter estimation from $\sigma$. For 3D segmentation, it might be better to use a single filter to find markers and then perform a watershed.
- 3D DoG notes:
    - $\sigma_{x, y} = 8$ works perfectly to segment out nuclei during nc 13.
    - $\sigma_z$ is BP-filtered (1, 9), where 9 is the z-sigma corresponding to the whole nucleus. This allows the BP to be very permissive in z while filtering out the nuclei in x and y.
- Proposed procedure for local peak finding:
    - Run box DoG as below with permissive BP in z and LoG approximation in (x, y), only varying $\sigma$ in the latter.
    - Peak-finding on standard image (e.g. $\sigma_{x, y} = 8$), then use coordinates as initial guess for next sigma values.
- Simple BP filter + peak finding does a good job finding markers. Then give the option to watershed segment directly off of the image, off of the distance-transformed Otsu-thresholded image, or off of edge-finding.
    - For data with the mid-nuclear plane on the boundary of our z-stack, might be useful to give the option to segment in 2D, then threshold each nuclear column locally to identify the nucleus.
    - Need to write loop over timepoints, clean up small objects at each step, then commit segmentation to file.
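
The marker-then-watershed idea from the list above can be sketched on synthetic data as follows. This is only a toy illustration, not the pipeline's implementation: the synthetic stack, the sigma values, `min_distance` and `threshold_rel` are all assumptions picked for this example, and the watershed is run off of the raw image (one of the three options listed).

```python
import numpy as np
from skimage.feature import peak_local_max
from skimage.filters import difference_of_gaussians, threshold_otsu
from skimage.segmentation import watershed

# Synthetic (z, y, x) stack with two Gaussian "nuclei"; sizes are illustrative only.
zz, yy, xx = np.indices((16, 64, 64))
stack = np.zeros((16, 64, 64))
for cz, cy, cx in [(8, 20, 20), (8, 44, 44)]:
    stack += np.exp(
        -((zz - cz) ** 2) / (2 * 3.0**2)
        - ((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 6.0**2)
    )

# Band-pass filter: permissive in z, LoG-like (sigma ratio 1.6) in y and x.
bandpassed = difference_of_gaussians(stack, low_sigma=[1, 5, 5], high_sigma=[9, 8, 8])

# Local maxima of the band-passed stack serve as one marker per nucleus.
peaks = peak_local_max(
    bandpassed, min_distance=5, threshold_rel=0.5, exclude_border=False
)
markers = np.zeros(stack.shape, dtype=int)
markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)

# Watershed directly off of the (inverted) image, restricted to an Otsu mask.
mask = stack > threshold_otsu(stack)
labels = watershed(-stack, markers, mask=mask)
```

Swapping `-stack` for a negated distance transform of `mask` would give the distance-transform variant of the same procedure.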

In [1]:
from preprocessing.import_data import import_save_dataset

# from nuclear_segmentation import segment_nuclei
import napari

trim_series = True
lif_test_name = "test_data/2021-06-14/p2pdpwt"
lsm_test_name = "test_data/2023-04-07/p2pdp_zld-sites-ctrl_fwd_1"

(
    channels_full_dataset,
    original_global_metadata,
    original_frame_metadata,
    export_global_metadata,
    export_frame_metadata,
) = import_save_dataset(lsm_test_name, trim_series=trim_series, mode="tiff")

  warn('Due to an issue with JPype 0.6.0, reading is slower. '
  imsave(collated_data_path, channel_data, plugin="tifffile")


In [3]:
nuclear_channel_metadata = export_frame_metadata[1]
nuclear_channel = channels_full_dataset[1]

In [4]:
viewer = napari.view_image(nuclear_channel, name="Nuclear Channel")
napari.run()

In [2]:
from nuclear_analysis import segmentation
from tracking import track_features, detect_mitosis

import numpy as np
from dask.distributed import LocalCluster, Client

In [3]:
cluster = LocalCluster(
    host="localhost",
    scheduler_port=8786,
    threads_per_worker=1,
    n_workers=12,
    memory_limit="4GB",
)

In [4]:
client = Client(cluster)

In [5]:
client

Client summary (condensed from the HTML repr):
Connection method: Cluster object, cluster type: distributed.LocalCluster
Status: running, using processes: True
Scheduler comm: tcp://127.0.0.1:8786
Dashboard: http://127.0.0.1:8787/status
Workers: 12, each with 1 thread and 3.73 GiB memory (12 threads and 44.70 GiB in total)
Worker local directories: /tmp/dask-scratch-space/worker-*


In [8]:
%%time

(
    denoised,
    denoised_futures,
    nuclear_channel_futures,
) = segmentation.denoise_movie_parallel(
    nuclear_channel,
    denoising="gaussian",
    denoising_sigma=3,
    client=client,
)

mask, mask_futures, _ = segmentation.binarize_movie_parallel(
    denoised_futures,
    thresholding="global_otsu",
    closing_footprint=segmentation.ellipsoid(3, 3),
    client=client,
    futures_in=False,
)

markers, markers_futures, _ = segmentation.mark_movie_parallel(
    *nuclear_channel_futures,  # Wrapped in list from previous parallel run, needs unpacking
    mask_futures,
    low_sigma=[3, 5.5, 5.5],
    high_sigma=[10, 14.5, 14.5],
    max_footprint=((1, 25), segmentation.ellipsoid(3, 3)),
    max_diff=1,
    client=client,
    futures_in=False,
)

marker_coords = np.array(np.nonzero(markers)).T

labels, labels_futures, _ = segmentation.segment_movie_parallel(
    denoised_futures,
    markers_futures,
    mask_futures,
    watershed_method="raw",
    min_size=200,
    client=client,
    futures_in=False,
)

segmentation_dataframe = track_features.segmentation_df(
    labels,
    nuclear_channel,
    nuclear_channel_metadata,
)

tracked_dataframe = track_features.link_df(
    segmentation_dataframe,
    search_range=15,
    # adaptive_stop=1,
    # adaptive_step=0.99,
    memory=0,
    pos_columns=["x", "y"],
    t_column="frame_reverse",
    velocity_predict=True,
    velocity_averaging=2,
)

centroids = np.unique(
    np.array(
        [
            [row["frame"] - 1, int(row["z"]), int(row["y"]), int(row["x"])]
            for _, row in tracked_dataframe.iterrows()
        ]
    ),
    axis=0,
)

mitosis_dataframe = detect_mitosis.construct_lineage(
    tracked_dataframe,
    pos_columns=["y", "x"],
    search_range_mitosis=35,
    # adaptive_stop=0.05,
    # adaptive_step=0.99,
    antiparallel_coordinate="collision",
    antiparallel_weight=None,
    min_track_length=3,
    image_dimensions=[256, 512],
    exclude_border=0.02,
    minimum_age=8,
)

reordered_labels, _, _ = track_features.reorder_labels_parallel(
    labels_futures,
    mitosis_dataframe,
    client=client,
    futures_in=False,
    futures_out=False,
)

Frame 2: 1 trajectories present.
CPU times: user 24.5 s, sys: 41.6 s, total: 1min 6s
Wall time: 3min 26s


Choosing the kernel widths for band-pass filtering with the rules of thumb $r \approx \sigma \sqrt{2}$ (2D) and $r \approx \sigma \sqrt{3}$ (3D) as rough bounds, with $r$ the radius of the object to be detected, seems to give an essentially perfect segmentation.
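
As a small worked example of this rule of thumb, the sketch below inverts it to get a sigma from an object radius. The helper name `blob_sigma` and the radii are hypothetical and only serve as illustration.

```python
import numpy as np


def blob_sigma(radius, ndim):
    """Sigma (in the same units as `radius`) from r ~ sigma * sqrt(ndim)."""
    return radius / np.sqrt(ndim)


# A 2D blob of radius 8 * sqrt(2) px corresponds to sigma of about 8 px.
sigma_2d = blob_sigma(8 * np.sqrt(2), ndim=2)

# The same radius in 3D corresponds to a smaller sigma, since sqrt(3) > sqrt(2).
sigma_3d = blob_sigma(8 * np.sqrt(2), ndim=3)
```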

In [9]:
viewer.add_labels(reordered_labels)

<Labels layer 'reordered_labels' at 0x7fbedc5db760>

In [10]:
_ = detect_mitosis.tracks_to_napari(
    viewer, mitosis_dataframe, name="nuclear_tracks", output=False
)

In [77]:
cluster.close()

In [6]:
transcription_channel_metadata = export_frame_metadata[0]
transcription_channel = channels_full_dataset[0]

In [10]:
viewer.add_image(transcription_channel)

<Image layer 'transcription_channel' at 0x7fc0a0475a50>

In [7]:
from skimage.filters import difference_of_gaussians, threshold_triangle
from skimage.measure import label
from skimage.morphology import remove_small_objects
import numpy as np
from functools import partial
from utils import parallel_computing


def _bandpass_movie(movie, low_sigma, high_sigma):
    """
    Runs bandpass filter using `skimage.filters.difference_of_gaussians` on each
    frame of a movie before collating the results in a bandpass-filtered movie of the
    same shape as input `movie`.
    """
    bandpassed_movie = np.empty_like(movie, dtype=float)

    num_timepoints = movie.shape[0]
    for i in range(num_timepoints):
        bandpassed_movie[i] = difference_of_gaussians(movie[i], low_sigma, high_sigma)

    return bandpassed_movie


def _make_spot_labels(movie, threshold, min_size, connectivity):
    """
    Thresholds `movie`, labels connected components in binarized mask as per specified
    connectivity using `skimage.measure.label` and removes objects below size `min_size`.
    """
    try:
        spot_labels = label(movie > threshold, connectivity=connectivity).astype(
            np.uint32
        )
    except TypeError as err:
        # Chain the original error so the offending comparison stays visible.
        raise TypeError("`threshold` option not supported.") from err

    filtered_labels = remove_small_objects(
        spot_labels, min_size=min_size, out=spot_labels
    )

    return filtered_labels


def _bandpass_movie_parallel(movie, low_sigma, high_sigma, client):
    """
    Runs `skimage.filters.difference_of_gaussians` parallelized across a Dask
    Client passed as `client` on a movie with time along the 0-th axis.
    """
    dog_func = partial(_bandpass_movie, low_sigma=low_sigma, high_sigma=high_sigma)

    bandpassed_movie, bandpassed_movie_futures, _ = parallel_computing.parallelize(
        [movie],
        dog_func,
        client,
        evaluate=True,
        futures_in=False,
        futures_out=True,
    )
    return bandpassed_movie, bandpassed_movie_futures


def _make_spot_labels_parallel(movie, threshold, min_size, connectivity, client):
    """
    Thresholds `movie`, labels connected components in binarized mask as per specified
    connectivity using `skimage.measure.label` and removes objects below size `min_size`,
    parallelized across a Dask Client passed as `client`.
    """
    spot_labels_func = partial(
        _make_spot_labels,
        threshold=threshold,
        min_size=min_size,
        connectivity=connectivity,
    )

    spot_labels, _, _ = parallel_computing.parallelize(
        [movie],
        spot_labels_func,
        client,
        evaluate=True,
        futures_in=False,
        futures_out=False,
    )

    return spot_labels


def detect_spots(
    spot_movie,
    *,
    low_sigma,
    high_sigma,
    threshold="triangle",
    min_size=0,
    connectivity=None,
    return_bandpass=False,
    client=None,
):
    """
    Constructs a labelled mask separating spots from background by bandpassing and
    thresholding the movie, then removing objects smaller than the specified size.
    If a Dask Client is passed as a `client` kwarg, the bandpass filtering and
    labelling will be parallelized across the client.

    :param spot_movie: Movie of the spot channel, with time along the 0-th axis.
    :type spot_movie: Numpy array.
    :param low_sigma: Sigma to use as the low-pass filter (mainly filters out
        noise). Can be given as float (assumes isotropic sigma) or as sequence/array
        (each element corresponding to the sigma along one of the image axes).
    :type low_sigma: scalar or tuple of scalars
    :param high_sigma: Sigma to use as the high-pass filter (removes structured
        background and dims down areas where nuclei are close together that might
        start to coalesce under other morphological operations). Can be given as float
        (assumes isotropic sigma) or as sequence/array (each element corresponding to
        the sigma along one of the image axes).
    :type high_sigma: scalar or tuple of scalars
    :param threshold: Threshold below which to clip `spot_movie` after bandpass filter.
        Note that bandpass filtering forces a conversion to normalized float, so the
        threshold should not exceed 1. Setting `threshold="triangle"` uses automatic
        thresholding using the triangle method.
    :type threshold: {"triangle", float}
    :param int min_size: The smallest allowable object size.
    :param int connectivity: The connectivity defining the neighborhood of a pixel
        during connected-component labelling (passed to `skimage.measure.label`).
    :param bool return_bandpass: If `True`, also returns the bandpass-filtered movie;
        otherwise `None` is returned in its place.
    :param client: Dask client to send the computation to.
    :type client: `dask.distributed.client.Client` object.
    :return: Tuple of the labelled mask of spots in `spot_movie` and the
        bandpass-filtered movie (or `None` if `return_bandpass` is `False`).
    :rtype: Tuple of Numpy array of integers and Numpy array of floats (or `None`).
    """
    if client is None:
        bandpassed_movie = _bandpass_movie(spot_movie, low_sigma, high_sigma)
        bandpassed_movie_futures = None
    else:
        bandpassed_movie, bandpassed_movie_futures = _bandpass_movie_parallel(
            spot_movie, low_sigma, high_sigma, client
        )

    if threshold == "triangle":
        spot_threshold = threshold_triangle(bandpassed_movie)
    else:
        spot_threshold = threshold

    if client is None:
        spot_labels = _make_spot_labels(
            bandpassed_movie, spot_threshold, min_size, connectivity
        )
    else:
        spot_labels = _make_spot_labels_parallel(
            bandpassed_movie_futures, spot_threshold, min_size, connectivity, client
        )

    if not return_bandpass:
        bandpassed_movie = None

    return spot_labels, bandpassed_movie

In [15]:
(((31.6 / 63) / 0.207) / 2) / np.sqrt(3)

0.6994965304191464

In [23]:
%%time

spot_mask, bandpassed_movie = detect_spots(
    transcription_channel,
    low_sigma=[0.1, 0.5, 0.5],
    high_sigma=[3, 1.5, 1.5],
    threshold="triangle",
    min_size=8,
    connectivity=1,
    return_bandpass=True,
    client=None,
)

CPU times: user 31.4 s, sys: 55.5 s, total: 1min 26s
Wall time: 1min 22s


In [8]:
%%time

spot_mask, bandpassed_movie = detect_spots(
    transcription_channel,
    low_sigma=[0.1, 0.5, 0.5],
    high_sigma=[3, 1.5, 1.5],
    threshold="triangle",
    min_size=8,
    connectivity=1,
    return_bandpass=True,
    client=client,
)

CPU times: user 9.65 s, sys: 35.3 s, total: 45 s
Wall time: 54 s


In [17]:
viewer.add_image(bandpassed_movie)

<Image layer 'bandpassed_movie' at 0x7fbf820a3e20>

In [18]:
viewer.add_labels(spot_mask)

<Labels layer 'spot_mask' at 0x7fbf813f2da0>

In [27]:
spot_dataframe_test = track_features.segmentation_df(
    spot_mask, transcription_channel, transcription_channel_metadata
)

In [33]:
spot_dataframe_test["raw_spot"] = None

In [37]:
spot_dataframe_test["raw_spot"] = spot_dataframe_test["raw_spot"].astype(object)

In [39]:
spot_dataframe_test.dtypes

label                int64
z                  float64
y                  float64
x                  float64
frame                int64
t_s                float64
t_frame              int64
frame_reverse        int64
t_frame_reverse      int64
raw_spot            object
dtype: object

In [147]:
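def extract_neighborhood(image, coordinates, span):
    """
    Extracts a box from `image` centered on the pixel containing `coordinates`, with
    side length `2 * floor(span / 2) + 1` along each axis. Returns the box (or `None`
    if it extends past the image boundary) and the coordinates of its first corner.
    """
    pixel_coordinates = np.floor(np.asarray(coordinates)).astype(int)
    pixel_span = np.floor(np.asarray(span) / 2).astype(int)
    coordinates_start = pixel_coordinates - pixel_span
    box_dimensions = pixel_span * 2 + 1
    box_indices = tuple((np.indices(box_dimensions).T + coordinates_start).T)

    # Negative indices would silently wrap around instead of raising IndexError.
    if np.any(coordinates_start < 0):
        return None, coordinates_start

    try:
        neighborhood = image[box_indices]
    except IndexError:
        neighborhood = None

    return neighborhood, coordinates_start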

In [142]:
# `extract_neighborhood` returns (neighborhood, coordinates_start); keep only the box.
test_neighborhood, _ = extract_neighborhood(image, [7, 106, 439], [5, 15, 15])

In [132]:
image = transcription_channel[41]

In [42]:
test_spot = transcription_channel[41, 7, 100:116, 430:446]

In [50]:
ybounds = np.array([100, 116], dtype=int)
xbounds = np.array([430, 446], dtype=int)

In [51]:
image = transcription_channel[41, 7]

In [52]:
image.shape

(256, 512)

In [69]:
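# Axis 0 of `image` (length 256) is y, so despite its name this mask selects rows 101-116.
x_mask = (np.arange(image.shape[0]) < 117) & (np.arange(image.shape[0]) > 100)

In [70]:
# Axis 1 of `image` (length 512) is x, so this mask selects columns 431-445.
y_mask = (np.arange(image.shape[1]) < 446) & (np.arange(image.shape[1]) > 430)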

In [73]:
test_spot2 = image[yx_mask]

In [41]:
import matplotlib.pyplot as plt

In [None]:
plt.imshow(test_spot2)

In [71]:
yy_mask, xx_mask = np.meshgrid(y_mask, x_mask)

In [72]:
yx_mask = yy_mask & xx_mask

In [None]:
test_spot2

In [100]:
indices = np.indices(np.array([40, 40]))

In [95]:
row, col = indices

In [127]:
list_indices = tuple(indices)

In [128]:
image[list_indices].shape

(40, 40)

In [98]:
plt.imshow(image[230 + row, 420 + col])

IndexError: index 256 is out of bounds for axis 0 with size 256
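
The error comes from the 40-pixel box starting at row 230 running past the 256-row image. One way to handle this is an explicit bounds check with slices, sketched below. The helper `extract_box` is hypothetical and not part of the pipeline; it also rejects negative start coordinates, which integer-array indexing would otherwise wrap around silently.

```python
import numpy as np


def extract_box(image, start, shape):
    """Return image[start:start + shape] along every axis, or None if out of bounds."""
    start = np.asarray(start)
    stop = start + np.asarray(shape)
    if np.any(start < 0) or np.any(stop > np.asarray(image.shape)):
        return None
    return image[tuple(slice(a, b) for a, b in zip(start, stop))]


# Stand-in for a single (256, 512) z-slice.
image = np.arange(256 * 512).reshape(256, 512)

inside = extract_box(image, (200, 420), (40, 40))   # fully inside the image
outside = extract_box(image, (230, 420), (40, 40))  # rows 230..269 exceed 256
wrapped = extract_box(image, (-5, 420), (40, 40))   # negative start is rejected
```

Basic slicing returns a view, so `inside` should be copied before being modified or stored alongside the original movie.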

In [146]:
row + 10

array([[10, 10, 10, ..., 10, 10, 10],
       [11, 11, 11, ..., 11, 11, 11],
       [12, 12, 12, ..., 12, 12, 12],
       ...,
       [47, 47, 47, ..., 47, 47, 47],
       [48, 48, 48, ..., 48, 48, 48],
       [49, 49, 49, ..., 49, 49, 49]])

In [103]:
test_start = np.array([5, 10])

In [106]:
indices * test_start

ValueError: operands could not be broadcast together with shapes (2,40,40) (2,) 

In [113]:
indices_roll = np.moveaxis(indices, 0, -1)
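
The broadcast error above arises because `indices` has shape `(2, 40, 40)` while the per-axis offset has shape `(2,)`, and broadcasting aligns trailing axes. A minimal sketch of two equivalent ways to add such an offset follows, assuming the intent is to shift the index grid (as in `extract_neighborhood`). The variable names and the stand-in image are illustrative only.

```python
import numpy as np

indices = np.indices((40, 40))  # shape (2, 40, 40)
start = np.array([5, 10])       # per-axis offset, shape (2,)

# Option 1: give the offset trailing singleton axes so it broadcasts over the grid.
shifted = indices + start[:, np.newaxis, np.newaxis]

# Option 2: move the coordinate axis last, add the offset, then move it back.
shifted_roll = np.moveaxis(np.moveaxis(indices, 0, -1) + start, -1, 0)

# Either result can index an image once converted to a tuple of index arrays.
image = np.arange(256 * 512).reshape(256, 512)
box = image[tuple(shifted)]
```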

In [143]:
test_neighborhood.shape

(5, 15, 15)