# Lineage guesser demo

## Prerequisites

Install `bread`: see https://ninivert.github.io/bread/readme.html#installation

Install packages:

```
# Optional: pandas is used in this notebook; matplotlib may be used for visualizations in the future
pip install pandas
```

Get example data: download and extract https://drive.google.com/file/d/1jJ0lyGsGzDBvqd-zdS485kXeUtrwqqOa/view?usp=sharing

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from bread.algo.lineage import LineageGuesserBudLum, LineageGuesserExpansionSpeed, accuracy
from bread.data import Lineage, Segmentation, Microscopy

Load the data

Note: the segmentations loaded here have been corrected manually. Expect lower accuracy for uncorrected segmentations obtained from YeaZ.

In [26]:
# using compressed numpy files (takes less space, but the data needs to be preprocessed to be stored in this format)
# from glob import glob
# segmentation = Segmentation.from_npzs(sorted(glob('data/segmentations/colony003/*.npz')))
# microscopy_budneck = Microscopy.from_npzs(sorted(glob('data/microscopy_budneck/colony003/*.npz')))

# using raw microscopy (takes more space, but can be exported straight from Fiji)
segmentation = Segmentation.from_h5('../data/colony003_segmentation.h5')
microscopy_budneck = Microscopy.from_tiff('../data/colony003_GFP.tif')
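
The commented-out `.npz` variant relies on `sorted(glob(...))` to put frames in order. Plain `sorted` is lexicographic, so it only matches frame order when the frame numbers in the file names are zero-padded. A minimal sketch of a numeric-aware sort key, assuming hypothetical file names (the real names depend on how the frames were exported) and a hypothetical helper `natural_key`:

```python
import re

def natural_key(path: str):
    """Split a path into text and integer chunks so that 'frame_10' sorts after 'frame_2'."""
    return [int(chunk) if chunk.isdigit() else chunk.lower()
            for chunk in re.split(r'(\d+)', path)]

# Hypothetical file names, used only to illustrate the ordering issue
paths = ['frame_10.npz', 'frame_2.npz', 'frame_1.npz']

lexicographic = sorted(paths)               # frame_10 lands before frame_2
numeric = sorted(paths, key=natural_key)    # frame_1, frame_2, frame_10

print(lexicographic)
print(numeric)
```

If the exported files are already zero-padded (for example `frame_001.npz`), plain `sorted` is enough and this helper is unnecessary.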

In [27]:
# HACK: YeaZ doesn't save changes to the last frame, so the tracking on that frame is essentially garbage.
# Uncomment the lines below to ignore the last frame
# segmentation = Segmentation(segmentation.data[:-1])
# microscopy_budneck = Microscopy(microscopy_budneck.data[:-1])

## Tip : generate an empty lineage file for manual completion

In [28]:
segmentation.find_buds().save_csv('outputs/colony003_lineage_empty.csv')
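
The call above writes into `outputs/`. Whether `save_csv` creates a missing directory is not shown in this notebook, so it may be safer to create the directory first. A minimal sketch using only the standard library (`ensure_parent_dir` is a hypothetical helper, not part of `bread`):

```python
from pathlib import Path

def ensure_parent_dir(csv_path: str) -> Path:
    """Create the parent directory of csv_path if it is missing, then return the path."""
    path = Path(csv_path)
    # parents=True creates intermediate directories; exist_ok=True makes the call idempotent
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

For example, `str(ensure_parent_dir('outputs/colony003_lineage_empty.csv'))` could be passed to `save_csv` in place of the bare string.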

## Lineage guesser using bud expansion velocities

This method needs only the segmentation file from YeaZ.

Expect ~75% accuracy.

In [5]:
?LineageGuesserExpansionSpeed

Init signature:
LineageGuesserExpansionSpeed(
    segmentation: bread.data._data.Segmentation,
    nn_threshold: float = 8,
    flexible_nn_threshold: bool = False,
    num_frames: int = 5,
    ignore_dist_nan: bool = True,
    bud_distance_max: float = 7,
) -> None
Docstring:
Guess lineage relations by maximizing the expansion velocity of the 

In [6]:
guesser_expspeed = LineageGuesserExpansionSpeed(
	segmentation=segmentation,
	# see docstring for more options
)
guesser_expspeed

LineageGuesserExpansionSpeed(segmentation=Segmentation(num_frames=181, frame_height=575, frame_width=625), nn_threshold=8, flexible_nn_threshold=False, num_frames=5, ignore_dist_nan=True, bud_distance_max=7)

In [7]:
lineage_expspeed = guesser_expspeed.guess_lineage()



## Lineage guesser using budneck marker

This method needs the segmentation file from YeaZ and the budneck movie (GFP marker).

Expect near perfect (>95%) accuracy.

In [8]:
?LineageGuesserBudLum

Init signature:
LineageGuesserBudLum(
    budneck_img: bread.data._data.Microscopy,
    segmentation: bread.data._data.Segmentation,
    nn_threshold: float = 8,
    flexible_nn_threshold: bool = False,
    kernel_N: int = 30,
    kernel_sigma: int = 1,
    offset_frames: int = 0,
    num_frames

In [9]:
guesser_budlum = LineageGuesserBudLum(
	segmentation=segmentation,
	budneck_img=microscopy_budneck,
	# see docstring for more options
)
guesser_budlum

LineageGuesserBudLum(budneck_img=Microscopy(num_frames=181, frame_height=575, frame_width=625), segmentation=Segmentation(num_frames=181, frame_height=575, frame_width=625), nn_threshold=8, flexible_nn_threshold=False, kernel_N=30, kernel_sigma=1, offset_frames=0, num_frames=5)

In [10]:
lineage_budlum = guesser_budlum.guess_lineage()



## Evaluating accuracy of the guesses

In [11]:
lineage_truth = Lineage.from_csv('../data/colony003_lineage.csv')

In [12]:
lineage_truth

Lineage(parent_ids=array([-1, -1,  1,  2,  4,  2,  1,  3,  6,  2,  4,  5,  6,  2, 10,  9,  4,
       11,  5,  3, 12,  8,  1,  7, 14,  2,  6, 10, 15, 13,  4, 20,  3,  9,
       11, 26, 17,  5, 18,  2, 14, 19,  6, 27, 12,  3, 22, 15,  4, 31,  8,
        1, 25, 33,  5, 11, 23, 38,  6, 10,  2, 24, 16, 20,  7, 19, 21, 42,
       12,  3, 18,  9, 32, 43, 26, 13, -2, 46, 23,  4, 45,  5, 36, 28, 25,
       55, 37, 49,  6, 15, 24, 22, 77, 47, 33, 27,  8]), bud_ids=array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,
       52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68,
       69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85,
       86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97]), time_ids=array([  0,   0,  19,  20,  39,  40,  42,  60,  61,  62,  63,  67,  81,
        82,  84,  85,  86,  

In [13]:
accuracy(lineage_truth, lineage_expspeed, strict=False)

0.7311827956989247

In [14]:
accuracy(lineage_truth, lineage_budlum, strict=False)

0.9894736842105263
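
To make the scores above easier to interpret, here is a rough sketch of a parent-agreement metric. It is not `bread`'s implementation of `accuracy`, and the exact meaning of `strict=False` is not documented in this notebook. The sketch assumes each lineage is reduced to a mapping from bud id to parent id, and that buds where the guesser gave up (code `-3`, `NO_GUESS` in `Lineage.SpecialParentIDs`) may optionally be left out of the count.

```python
NO_GUESS = -3  # assumption: same value as Lineage.SpecialParentIDs.NO_GUESS

def parent_agreement(truth: dict, predicted: dict, skip_no_guess: bool = True) -> float:
    """Fraction of buds whose predicted parent id equals the ground-truth parent id.

    Both arguments map bud id -> parent id. Buds missing from either mapping are ignored.
    """
    bud_ids = [bud for bud in truth if bud in predicted]
    if skip_no_guess:
        bud_ids = [bud for bud in bud_ids if predicted[bud] != NO_GUESS]
    if not bud_ids:
        return float('nan')
    hits = sum(truth[bud] == predicted[bud] for bud in bud_ids)
    return hits / len(bud_ids)

# First five buds of this colony: ground truth vs. the expansion-speed guess
truth = {1: -1, 2: -1, 3: 1, 4: 2, 5: 4}
predicted = {1: -1, 2: -1, 3: -2, 4: 2, 5: 4}
score = parent_agreement(truth, predicted)
print(score)  # 4 of the 5 parents agree
```

As a side observation, the two scores reported above equal 68/93 and 94/95. The differing denominators suggest, but do not prove, that the non-strict mode leaves some buds out of the count.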

## Visualize differences

Note: some ParentIDs are special codes documented in `bread.data.Lineage.SpecialParentIDs`

In [15]:
print(Lineage.SpecialParentIDs.__doc__)

Special parent IDs attributed in lineages to specify exceptions.
		
		Attributes
		----------
		PARENT_OF_ROOT : int = -1
			parent of a cell that already exists in first frame of colony
		PARENT_OF_EXTERNAL : int = -2
			parent of a cell that does not belong to the colony
		NO_GUESS : int = -3
			parent of cell for which the algorithm failed to guess
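
When post-processing an exported lineage outside of `bread`, it can help to give these codes names. The sketch below mirrors the three values from the docstring in a stand-in `IntEnum` (it does not import the real class) and counts how often each code occurs. `SpecialParentID` and `count_special` are hypothetical names.

```python
from collections import Counter
from enum import IntEnum

class SpecialParentID(IntEnum):
    """Stand-in mirroring the codes listed in the docstring; not imported from bread."""
    PARENT_OF_ROOT = -1
    PARENT_OF_EXTERNAL = -2
    NO_GUESS = -3

def count_special(parent_ids) -> dict:
    """Count how many parent ids fall into each special category."""
    special_values = {member.value for member in SpecialParentID}
    counts = Counter(
        SpecialParentID(parent).name for parent in parent_ids if parent in special_values
    )
    return dict(counts)

# First ten predicted parent ids from the expansion-speed comparison table in this notebook
parent_ids = [-1, -1, -2, 2, 4, 2, 1, 3, 6, 2]
special_counts = count_special(parent_ids)
print(special_counts)
```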
		


In [16]:
from visualize_lineages import visualize_lineages

In [17]:
visualize_lineages(lineage_truth, lineage_budlum)



ParentID (truth),ParentID (predicted),BudID,FrameID
-1,-1,1,0
-1,-1,2,0
1,1,3,19
2,2,4,20
4,4,5,39
2,2,6,40
1,1,7,42
3,3,8,60
6,6,9,61
2,2,10,62


In [18]:
visualize_lineages(lineage_truth, lineage_expspeed)



ParentID (truth),ParentID (predicted),BudID,FrameID
-1,-1,1,0
-1,-1,2,0
1,-2,3,19
2,2,4,20
4,4,5,39
2,2,6,40
1,1,7,42
3,3,8,60
6,6,9,61
2,2,10,62
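
For long lineages it is often quicker to list only the disagreements. A small sketch that filters a comparison table like the one above, assuming it has been exported as CSV text with the same column names (`mismatched_rows` is a hypothetical helper, independent of `visualize_lineages`):

```python
import csv
import io

# First rows of the truth vs. expansion-speed table, copied from the output above
table = """ParentID (truth),ParentID (predicted),BudID,FrameID
-1,-1,1,0
-1,-1,2,0
1,-2,3,19
2,2,4,20
4,4,5,39
"""

def mismatched_rows(csv_text: str) -> list:
    """Return the rows (as dicts of ints) where the predicted parent differs from the truth."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key: int(value) for key, value in row.items()}
        for row in reader
        if row['ParentID (truth)'] != row['ParentID (predicted)']
    ]

mismatches = mismatched_rows(table)
print(mismatches)
```

In this excerpt the only disagreement is bud 3 at frame 19, which the expansion-speed guesser assigns to `-2` (`PARENT_OF_EXTERNAL`) instead of cell 1.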


## Saving predicted lineages

In [19]:
lineage_expspeed.save_csv('outputs/colony003_lineage_expspeed.csv')
lineage_budlum.save_csv('outputs/colony003_lineage_budlum.csv')

## Rating accuracy of the algorithms

In [20]:
import warnings

colony_ids = ['001', '002', '003', '007']
accuracies = { 'budlum': [], 'expspeed': [] }

for colony_id in colony_ids:
	print(f'processing {colony_id}')
	lineage_truth = Lineage.from_csv(f'../data/colony{colony_id}_lineage.csv')

	with warnings.catch_warnings():
		warnings.simplefilter("ignore")
		
		segmentation = Segmentation.from_h5(f'../data/colony{colony_id}_segmentation.h5')
		microscopy_budneck = Microscopy.from_tiff(f'../data/colony{colony_id}_GFP.tif')

		# drop the last frame (see the HACK note earlier: YeaZ doesn't save changes to it)
		segmentation = Segmentation(segmentation.data[:-1])
		microscopy_budneck = Microscopy(microscopy_budneck.data[:-1])

		guesser_budlum = LineageGuesserBudLum(
			segmentation=segmentation,
			budneck_img=microscopy_budneck,
			# see docstring for more options
		)
		lineage_budlum = guesser_budlum.guess_lineage()

		guesser_expspeed = LineageGuesserExpansionSpeed(
			segmentation=segmentation,
			# see docstring for more options
		)
		lineage_expspeed = guesser_expspeed.guess_lineage()

		accuracies['budlum'].append(accuracy(lineage_truth, lineage_budlum, strict=False))
		accuracies['expspeed'].append(accuracy(lineage_truth, lineage_expspeed, strict=False))

processing 001
processing 002
processing 003
processing 007


In [21]:
import pandas as pd
pd.DataFrame(accuracies)

,budlum,expspeed
0,1.0,0.746835
1,0.986207,0.805556
2,1.0,0.717391
3,1.0,0.805556
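
The per-colony scores can be condensed into one line per method. A minimal sketch using only the standard library, with the values copied from the table above (rows follow the order of `colony_ids`):

```python
from statistics import mean

# Values copied from the table above (colonies 001, 002, 003, 007)
accuracies = {
    'budlum': [1.0, 0.986207, 1.0, 1.0],
    'expspeed': [0.746835, 0.805556, 0.717391, 0.805556],
}

# Mean, worst and best score for each guesser
summary = {
    method: {'mean': mean(scores), 'min': min(scores), 'max': max(scores)}
    for method, scores in accuracies.items()
}

for method, stats in summary.items():
    print(f"{method}: mean={stats['mean']:.3f} min={stats['min']:.3f} max={stats['max']:.3f}")
```

On these four colonies the budneck method averages about 0.997 and the expansion-speed method about 0.769. Four colonies is a small sample, so the averages are indicative only.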


## Tip : using the CLI

Get help by running

```
python -m bread.cli lineage --help
```

Or for the individual lineage subcommands

```
python -m bread.cli lineage budneck --help
python -m bread.cli lineage expansion_speed --help
```

Note: ``bread.cli tracker`` is a separate project of mine that is still a work in progress.

In [22]:
!cd ../.. && \
python -m bread.cli lineage \
	--segmentation-file="examples/data/colony003_segmentation.h5" \
	--output-file="examples/lineage/outputs/colony003_lineage_budlum_cli.csv" \
	budneck \
	--budneck-file="examples/data/colony003_GFP.tif"

INFO:bread.cli:Loading segmentation...
INFO:bread.cli:Loaded segmentation Segmentation(num_frames=181, frame_height=575, frame_width=625)
INFO:bread.cli:Loading budneck channel movie...
INFO:bread.cli:Loaded budneck channel movie Microscopy(num_frames=181, frame_height=575, frame_width=625)
INFO:bread.cli:Loading guesser...
INFO:bread.cli:Loaded guesser LineageGuesserBudLum(budneck_img=Microscopy(num_frames=181, frame_height=575, frame_width=625), segmentation=Segmentation(num_frames=181, frame_height=575, frame_width=625), nn_threshold=8, flexible_nn_threshold=False, kernel_N=30, kernel_sigma=1, offset_frames=0, num_frames=5)
INFO:bread.cli:Running guesser...
INFO:bread.cli:Saving lineage...


In [23]:
!diff outputs/colony003_lineage_budlum.csv outputs/colony003_lineage_budlum_cli.csv  # should be the same !
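
`diff` printing nothing means the two files are identical. Where `diff` is unavailable (for example on Windows), the same check can be sketched in Python with the standard library (`same_contents` is a hypothetical helper):

```python
import filecmp

def same_contents(path_a: str, path_b: str) -> bool:
    """Return True if both files have identical bytes.

    shallow=False forces a content comparison instead of trusting os.stat() signatures.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```

For this notebook, the call would be `same_contents('outputs/colony003_lineage_budlum.csv', 'outputs/colony003_lineage_budlum_cli.csv')`.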

In [24]:
!cd ../.. && \
python -m bread.cli lineage \
	--segmentation-file="examples/data/colony003_segmentation.h5" \
	--output-file="examples/lineage/outputs/colony003_lineage_expspeed_cli.csv" \
	expansion_speed

INFO:bread.cli:Loading segmentation...
INFO:bread.cli:Loaded segmentation Segmentation(num_frames=181, frame_height=575, frame_width=625)
INFO:bread.cli:Loading guesser...
INFO:bread.cli:Loaded guesser LineageGuesserExpansionSpeed(segmentation=Segmentation(num_frames=181, frame_height=575, frame_width=625), nn_threshold=8, flexible_nn_threshold=False, num_frames=5, ignore_dist_nan=True, bud_distance_max=7)
INFO:bread.cli:Running guesser...
INFO:bread.cli:Saving lineage...
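
To run the CLI over several colonies from a script, the two invocations above can be assembled programmatically. The sketch below only uses the flags and subcommands that appear in this notebook (`--segmentation-file`, `--output-file`, `budneck`, `--budneck-file`, `expansion_speed`). `build_lineage_command` is a hypothetical helper, not part of `bread`.

```python
import shlex
import sys

def build_lineage_command(segmentation_file, output_file, method, budneck_file=None):
    """Build the argument list for `python -m bread.cli lineage ...`."""
    if method not in ('budneck', 'expansion_speed'):
        raise ValueError(f'unknown method: {method!r}')
    command = [
        sys.executable, '-m', 'bread.cli', 'lineage',
        f'--segmentation-file={segmentation_file}',
        f'--output-file={output_file}',
        method,
    ]
    if method == 'budneck':
        if budneck_file is None:
            raise ValueError('the budneck method needs a budneck movie')
        # subcommand options go after the subcommand name, as in the cells above
        command.append(f'--budneck-file={budneck_file}')
    return command

budneck_cmd = build_lineage_command(
    'examples/data/colony003_segmentation.h5',
    'examples/lineage/outputs/colony003_lineage_budlum_cli.csv',
    'budneck',
    budneck_file='examples/data/colony003_GFP.tif',
)
expspeed_cmd = build_lineage_command(
    'examples/data/colony003_segmentation.h5',
    'examples/lineage/outputs/colony003_lineage_expspeed_cli.csv',
    'expansion_speed',
)
print(shlex.join(budneck_cmd))
print(shlex.join(expspeed_cmd))
```

Each list can then be passed to `subprocess.run(command, check=True, cwd='../..')`, where `cwd` matches the `cd ../..` used in the cells above.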
