Linajea Tracking Example
=====================


This example show all steps necessary to generate the final tracks, from training the network to finding the optimal hyperparameters on the validation data to computing the tracks on the test data.

- train network
- predict on validation data
- grid search hyperparameters for ILP
  - solve once per set of parameters
  - evaluate once per set of parameters
  - select set with fewest errors
- predict on test data
- solve on test data with optimal parameters
- evaluate on test data

In [1]:
import logging
import os
import sys
import time
import types

import numpy as np
import pandas as pd

from linajea.config import (dump_config,
                            maybe_fix_config_paths_to_machine_and_load,
                            TrackingConfig)
from linajea.utils import (getNextInferenceData,
                           print_time)
import linajea.evaluation
from linajea.process_blockwise import (extract_edges_blockwise,
                                       predict_blockwise,
                                       solve_blockwise)
from linajea.training import train

In [2]:
logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s %(name)s %(levelname)-8s %(message)s')

Configuration
--------------------

All parameters to control the pipeline (e.g. model architecture, data augmentation, training parameters, ILP hyperparameters) are contained in a configuration file (in the TOML format https://toml.io)

You can modify the `config_file` variable to point to the config file you would like to use. Make sure that the file paths contained in it point to the correct destination, for instance that they are adapted to your directory structure.

In [3]:
config_file = "config_example.toml"
config = maybe_fix_config_paths_to_machine_and_load(config_file)
config = TrackingConfig(**config)
#config = TrackingConfig.from_file(config)file
os.makedirs(config.general.setup_dir, exist_ok=True)

Training
------------

To start training simply pass the configuration object to the train function. Make sure that the training data and parameters such as the number of iterations/setps are set correctly.

To train until convergence will take from several hours to multiple days.

In [4]:
train(config)

2022-06-24 06:01:51,511 linajea.training.torch_model INFO     initializing model..
2022-06-24 06:01:54,366 linajea.training.torch_model INFO     getting train/test output shape by running model twice
2022-06-24 06:01:58,372 linajea.training.torch_model INFO     test done
2022-06-24 06:01:58,813 linajea.training.torch_model INFO     train done
2022-06-24 06:01:58,819 linajea.training.train INFO     Model: UnetModelWrapper(
  (unet_cell_ind): UNet(
    (l_conv): ModuleList(
      (0): ConvPass(
        (layers): ModuleList(
          (0): Conv4d(
            (conv3d_layers): ModuleList(
              (0): Conv3d(1, 12, kernel_size=(3, 3, 3), stride=(1, 1, 1))
              (1): Conv3d(1, 12, kernel_size=(3, 3, 3), stride=(1, 1, 1))
              (2): Conv3d(1, 12, kernel_size=(3, 3, 3), stride=(1, 1, 1))
            )
          )
          (1): ReLU()
          (2): Conv4d(
            (conv3d_layers): ModuleList(
              (0): Conv3d(12, 12, kernel_size=(3, 3, 3), stride=(1, 1, 1))

2022-06-24 06:01:58,821 linajea.training.train INFO     Center size: (2, 30, 52, 52)
2022-06-24 06:01:58,821 linajea.training.train INFO     Output size 1: (1, 40, 60, 60)
2022-06-24 06:01:58,821 linajea.training.train INFO     Voxel size: (1, 5, 1, 1)
2022-06-24 06:01:58,824 linajea.training.train INFO     REQUEST: 
	RAW: ROI: [0:7, 0:200, 0:148, 0:148] (7, 200, 148, 148), voxel size: None, interpolatable: None, non-spatial: False, dtype: None, placeholder: False
	CELL_INDICATOR: ROI: [3:4, 80:120, 44:104, 44:104] (1, 40, 60, 60), voxel size: None, interpolatable: None, non-spatial: False, dtype: None, placeholder: False
	CELL_CENTER: ROI: [3:4, 80:120, 44:104, 44:104] (1, 40, 60, 60), voxel size: None, interpolatable: None, non-spatial: False, dtype: None, placeholder: False
	ANCHOR: ROI: [3:4, 85:115, 48:100, 48:100] (1, 30, 52, 52), voxel size: None, interpolatable: None, non-spatial: False, dtype: None, placeholder: False
	RAW_CROPPED: ROI: [3:4, 85:115, 48:100, 48:100] (1, 30, 52

Inference/Tracking
---------------------------

After the training is completed we first have to determine the optimal ILP hyperparameters.
This is achieved by first creating the prediction on the validation data and then performing a grid search by solving the ILP and evaluating the results repeatedly.

`getNextInferenceData` can be used to loop over the samples in the respective dataset, it returns a generator.
If `validation` is set to `True` in `args` the validation data is used, otherwise the test data. Other details (e.g. which training checkpoint to use, which database to store the results in to use) are determined automatically based on the configuration file. Internally it adds an `inference_data` entry that is used by the postprocessing functions such as `*_blockwise` and `evaluate_setup`. This entry is updated automatically after each iteration to point to the correct sample.

MongoDB is used to store the computed results. A `mongod` server has to be running before executing the remaining cells.
See https://www.mongodb.com/docs/manual/administration/install-community/ for a guide on how to install it (Linux/Windows/MacOS)
Alternatively you might want to create a singularity image (https://github.com/singularityhub/mongo). This can be used locally but will be necessary if you want to run the code on an HPC cluster and there is no server installed already.

In [4]:
args = types.SimpleNamespace(
        config=config_file, validation=True, val_param_id=None, param_id=None)

### Predict Validation Data

To predict the `cell_indicator` and `movement_vectors` on the validation data make sure that `args.validation` is set to `True`, then execute the next cell.

Depending on the number of workers used (see config file) and the size of the data this can take a while.

In [6]:
args.validation = True
for inf_config in getNextInferenceData(args):
        predict_blockwise(inf_config)

2022-06-24 09:41:11,157 linajea.utils.check_or_create_db INFO     linajea_celegans_20220624_134111: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb', '_id': ObjectId('62b5bef738f50c4aea2b1c6a')} (created)
2022-06-24 09:41:11,168 linajea.process_blockwise.predict_blockwise INFO     Following ROIs in world units:
2022-06-24 09:41:11,169 linajea.process_blockwise.predict_blockwise INFO     Input ROI       = [45:65, -85:315, -50:690, -50:690] (20, 400, 740, 740)
2022-06-24 09:41:11,170 linajea.process_blockwise.predict_blockwise INFO     Block read  ROI = [-3:4, -85:315, -50:210, -50:210] (7, 400, 260, 260)
2022-06-24 09:41:11,170 linajea.process_blockwise.predict_blockwise INFO     Block write ROI = [0:1, 0:230, 0:160, 0:160] (1, 230, 160, 160)
2022-06-24 09:41:11,171 linajea.process_blockwise.predict_blockwise INFO     Output ROI      = [48:62, 0:230, 0:640, 0:640] (14, 230, 640, 640)
2022-06-24 09:41:11,171 linajea.process_blockwise.predic

2022-06-24 09:45:21,508 daisy.scheduler INFO     

2022-06-24 09:45:21,509 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		33 finished (0 skipped, 33 succeeded, 0 failed), 1 processing, 190 pending
		ETA: 0:13:34.285714
2022-06-24 09:45:31,519 daisy.scheduler INFO     

2022-06-24 09:45:31,520 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		37 finished (0 skipped, 37 succeeded, 0 failed), 2 processing, 185 pending
		ETA: 0:13:12.857143
2022-06-24 09:45:41,531 daisy.scheduler INFO     

2022-06-24 09:45:41,532 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		42 finished (0 skipped, 42 succeeded, 0 failed), 2 processing, 180 pending
		ETA: 0:12:51.428571
2022-06-24 09:45:51,542 daisy.scheduler INFO     

2022-06-24 09:45:51,543 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		47 finished (

2022-06-24 09:50:31,777 daisy.scheduler INFO     

2022-06-24 09:50:31,778 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		184 finished (0 skipped, 184 succeeded, 0 failed), 2 processing, 38 pending
		ETA: 0:01:17.288136
2022-06-24 09:50:41,788 daisy.scheduler INFO     

2022-06-24 09:50:41,790 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		189 finished (0 skipped, 189 succeeded, 0 failed), 1 processing, 34 pending
		ETA: 0:01:09.152542
2022-06-24 09:50:51,800 daisy.scheduler INFO     

2022-06-24 09:50:51,801 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		194 finished (0 skipped, 194 succeeded, 0 failed), 1 processing, 29 pending
		ETA: 0:00:58.983051
2022-06-24 09:51:01,812 daisy.scheduler INFO     

2022-06-24 09:51:01,813 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		199 finish

### Extract Edges Validation Data

For each detected cell, look for neighboring cells in the next time frame and insert an edge candidate for each into the database.

In [5]:
for inf_config in getNextInferenceData(args):
        extract_edges_blockwise(inf_config)

2022-06-24 10:01:13,456 linajea.utils.check_or_create_db INFO     linajea_celegans_20220624_134111: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb'} (accessed)
2022-06-24 10:01:13,462 linajea.process_blockwise.extract_edges_blockwise INFO     Following ROIs in world units:
2022-06-24 10:01:13,463 linajea.process_blockwise.extract_edges_blockwise INFO     Input ROI       = [47:62, -45:250, -45:557, -45:557] (15, 295, 602, 602)
2022-06-24 10:01:13,463 linajea.process_blockwise.extract_edges_blockwise INFO     Block read  ROI = [-1:5, -45:557, -45:557, -45:557] (6, 602, 602, 602)
2022-06-24 10:01:13,464 linajea.process_blockwise.extract_edges_blockwise INFO     Block write ROI = [0:5, 0:512, 0:512, 0:512] (5, 512, 512, 512)
2022-06-24 10:01:13,464 linajea.process_blockwise.extract_edges_blockwise INFO     Output ROI      = [48:62, 0:205, 0:512, 0:512] (14, 205, 512, 512)
2022-06-24 10:01:13,464 linajea.process_blockwise.extract_edges_blockw

### Hyperparameter Grid Search

#### Solve on Validation Data

Make sure that `solve.grid_search` is set to `True`. The parameter sets to try are generated automatically.

In [5]:
config.solve.grid_search = True
args.config = dump_config(config)

In [6]:
import importlib
importlib.reload(linajea.process_blockwise)
parameters_ids = None
for inf_config in getNextInferenceData(args, is_solve=True):
        parameters_ids = linajea.process_blockwise.solve_blockwise(inf_config)

2022-06-24 13:41:09,161 linajea.utils.check_or_create_db INFO     linajea_celegans_20220624_134111: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb'} (accessed)
2022-06-24 13:41:09,186 linajea.utils.candidate_database INFO     Querying ID for parameters {'track_cost': 7, 'weight_node_score': -21, 'selection_constant': 6, 'weight_division': -11, 'division_constant': 6.0, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'context': [2, 100, 100, 100], 'max_cell_move': 45, 'feature_func': 'noop', 'val': True, 'cell_cycle_key': {'$exists': False}, 'tag': {'$exists': False}}
2022-06-24 13:41:09,188 linajea.utils.candidate_database INFO     Parameters {'track_cost': 7, 'weight_node_score': -21, 'selection_constant': 6, 'weight_division': -11, 'division_constant': 6.0, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'cont

linajea_solving ▶:   0%|          | 0/1 [00:00<?, ?blocks/s]

2022-06-24 13:41:09,561 linajea.process_blockwise.solve_blockwise INFO     Solution roi: [50:60, 0:205, 0:512, 0:512] (10, 205, 512, 512)
2022-06-24 13:41:09,562 linajea.process_blockwise.solve_blockwise INFO     [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712) linajea_solving/1 with read ROI [48:67, -100:612, -100:612, -100:812] (19, 712, 712, 912) and write ROI [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712)
2022-06-24 13:41:09,600 linajea.process_blockwise.solve_blockwise INFO     Reading graph with 393 nodes and 414 edges took 0.011182308197021484 seconds
2022-06-24 13:41:09,647 linajea.tracking.track INFO     Solving for key selected_9
2022-06-24 13:41:09,662 linajea.tracking.solver INFO     b'Optimal solution found'
2022-06-24 13:41:09,664 linajea.tracking.solver INFO     costs of solution: -4696.562564
2022-06-24 13:41:09,666 linajea.tracking.track INFO     Solving ILP took 0.01714777946472168 seconds
2022-06-24 13:41:09,672 linajea.tracking.track INFO     Solving for key selec


Execution Summary
-----------------

  Task linajea_solving:

    num blocks : 1
    completed ✔: 1 (skipped 0)
    failed    ✗: 0
    orphaned  ∅: 0

    all blocks processed successfully


#### Evaluate on Validation Data

In [6]:
import importlib
importlib.reload(linajea.evaluation)
args.param_ids = [9, 6, 52, 17, 43]

print(args.param_ids)
#args.param_ids = parameters_ids
for inf_config in getNextInferenceData(args, is_evaluate=True):
        t = linajea.evaluation.evaluate_setup(inf_config)
        print(t)

[9, 6, 52, 17, 43]
[9, 6, 52, 17, 43] 9 6 [9, 6, 52, 17, 43]


2022-06-24 13:44:38,514 linajea.utils.check_or_create_db INFO     linajea_celegans_20220624_134111: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb'} (accessed)
2022-06-24 13:44:38,541 linajea.utils.get_next_inference_data INFO     getting params {'track_cost': 7, 'weight_node_score': -21, 'selection_constant': 6, 'weight_division': -11, 'division_constant': 6.0, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'context': [2, 100, 100, 100], 'max_cell_move': 45, 'roi': {'offset': [50, 0, 0, 0], 'shape': [10, 205, 512, 512]}, 'feature_func': 'noop', 'val': True} (id: 9) from database linajea_celegans_20220624_134111 (sample: /nrs/funke/hirschp/mskcc_emb)
2022-06-24 13:44:38,541 linajea.utils.get_next_inference_data INFO     getting params {'track_cost': 7, 'weight_node_score': -13, 'selection_constant': 9, 'weight_division': -8, 'division_constant': 2.5, 'weight_child': 2.0, 'w

2022-06-24 13:44:39,573 linajea.evaluation.evaluate INFO     Matching GT edges to REC edges...


[SolveParametersMinimalConfig(track_cost=7, weight_node_score=-21, selection_constant=6, weight_division=-11, division_constant=6.0, weight_child=1.0, weight_continuation=-1.0, weight_edge_score=0.35, cell_cycle_key=None, block_size=[15, 512, 512, 712], context=[2, 100, 100, 100], max_cell_move=45, roi=DataROIConfig(offset=[50, 0, 0, 0], shape=[10, 205, 512, 512]), feature_func='noop', val=True, tag=None), SolveParametersMinimalConfig(track_cost=7, weight_node_score=-13, selection_constant=9, weight_division=-8, division_constant=2.5, weight_child=2.0, weight_continuation=-1.0, weight_edge_score=0.35, cell_cycle_key=None, block_size=[15, 512, 512, 712], context=[2, 100, 100, 100], max_cell_move=45, roi=DataROIConfig(offset=[50, 0, 0, 0], shape=[10, 205, 512, 512]), feature_func='noop', val=True, tag=None), SolveParametersMinimalConfig(track_cost=7, weight_node_score=-13, selection_constant=12, weight_division=-11, division_constant=2.5, weight_child=2.0, weight_continuation=-1.0, weigh

2022-06-24 13:44:39,742 linajea.evaluation.match INFO     Done matching, found 233 matches and 0 edge fps
2022-06-24 13:44:39,742 linajea.evaluation.evaluate INFO     Done matching. Evaluating
2022-06-24 13:44:39,764 linajea.evaluation.evaluator INFO     Getting AEFTL and ERL
2022-06-24 13:44:39,775 linajea.evaluation.evaluator INFO     Getting perfect segments
2022-06-24 13:44:39,786 linajea.evaluation.evaluator INFO     track range 59 50
2022-06-24 13:44:39,789 linajea.evaluation.evaluator INFO     error free tracks: 27/30 0.9642857142857143
2022-06-24 13:44:39,813 linajea.evaluation.evaluate_setup INFO     Done evaluating results for 9. Saving results to mongo.
2022-06-24 13:44:39,814 linajea.evaluation.evaluate_setup INFO     Result summary: {'gt_tracks': 24, 'rec_tracks': 27, 'gt_matched_tracks': 24, 'rec_matched_tracks': 24, 'gt_edges': 234, 'rec_edges': 252, 'matched_edges': 233, 'gt_divisions': 4, 'rec_divisions': 4, 'fp_edges': 19, 'fn_edges': 0, 'identity_switches': 0, 'fp_di

2022-06-24 13:44:39,930 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,931 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 18
2022-06-24 13:44:39,932 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,932 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 18
2022-06-24 13:44:39,933 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,933 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,933 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,934 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:39,934 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 5

<linajea.evaluation.report.Report object at 0x1552aab26e20>


2022-06-24 13:44:40,123 linajea.evaluation.match INFO     Done matching, found 234 matches and 0 edge fps
2022-06-24 13:44:40,123 linajea.evaluation.evaluate INFO     Done matching. Evaluating
2022-06-24 13:44:40,145 linajea.evaluation.evaluator INFO     Getting AEFTL and ERL
2022-06-24 13:44:40,155 linajea.evaluation.evaluator INFO     Getting perfect segments
2022-06-24 13:44:40,165 linajea.evaluation.evaluator INFO     track range 59 50
2022-06-24 13:44:40,167 linajea.evaluation.evaluator INFO     error free tracks: 28/30 1.0
2022-06-24 13:44:40,177 linajea.evaluation.evaluate_setup INFO     Done evaluating results for 6. Saving results to mongo.
2022-06-24 13:44:40,178 linajea.evaluation.evaluate_setup INFO     Result summary: {'gt_tracks': 24, 'rec_tracks': 26, 'gt_matched_tracks': 24, 'rec_matched_tracks': 24, 'gt_edges': 234, 'rec_edges': 252, 'matched_edges': 234, 'gt_divisions': 4, 'rec_divisions': 4, 'fp_edges': 18, 'fn_edges': 0, 'identity_switches': 0, 'fp_divisions': 0, 'i

2022-06-24 13:44:40,306 linajea.utils.candidate_database INFO     Parameters {'track_cost': 7, 'weight_node_score': -13, 'selection_constant': 6, 'weight_division': -8, 'division_constant': 2.5, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'context': [2, 100, 100, 100], 'max_cell_move': 45, 'feature_func': 'noop', 'val': True, 'cell_cycle_key': {'$exists': False}, 'tag': {'$exists': False}} already in collection with id 17
2022-06-24 13:44:40,324 linajea.evaluation.evaluate_setup INFO     Evaluating mskcc_emb in [50:60, 0:205, 0:512, 0:512] (10, 205, 512, 512)
2022-06-24 13:44:40,344 linajea.evaluation.evaluate_setup INFO     Reading cells and edges in db linajea_celegans_20220624_134111 with parameter_id 17
2022-06-24 13:44:40,355 linajea.evaluation.evaluate_setup INFO     Read 305 cells and 279 edges in 0.009444236755371094 seconds
2022-06-24 13:44:40,364 linajea.evaluation.evaluate_setup INFO     track begin: 50, tra

<linajea.evaluation.report.Report object at 0x155262fd5a00>
False


2022-06-24 13:44:40,562 linajea.evaluation.match INFO     Done matching, found 233 matches and 0 edge fps
2022-06-24 13:44:40,562 linajea.evaluation.evaluate INFO     Done matching. Evaluating
2022-06-24 13:44:40,583 linajea.evaluation.evaluator INFO     Getting AEFTL and ERL
2022-06-24 13:44:40,593 linajea.evaluation.evaluator INFO     Getting perfect segments
2022-06-24 13:44:40,604 linajea.evaluation.evaluator INFO     track range 59 50
2022-06-24 13:44:40,607 linajea.evaluation.evaluator INFO     error free tracks: 27/30 0.9642857142857143
2022-06-24 13:44:40,616 linajea.evaluation.evaluate_setup INFO     Done evaluating results for 17. Saving results to mongo.
2022-06-24 13:44:40,617 linajea.evaluation.evaluate_setup INFO     Result summary: {'gt_tracks': 24, 'rec_tracks': 27, 'gt_matched_tracks': 24, 'rec_matched_tracks': 24, 'gt_edges': 234, 'rec_edges': 252, 'matched_edges': 233, 'gt_divisions': 4, 'rec_divisions': 4, 'fp_edges': 19, 'fn_edges': 0, 'identity_switches': 0, 'fp_d

2022-06-24 13:44:40,737 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,737 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 18
2022-06-24 13:44:40,738 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,738 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 18
2022-06-24 13:44:40,738 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,739 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,739 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,739 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 59, track len: 10
2022-06-24 13:44:40,739 linajea.evaluation.evaluate_setup INFO     track begin: 50, track end: 5

<linajea.evaluation.report.Report object at 0x1552a80941f0>


2022-06-24 13:44:40,931 linajea.evaluation.match INFO     Done matching, found 234 matches and 0 edge fps
2022-06-24 13:44:40,931 linajea.evaluation.evaluate INFO     Done matching. Evaluating
2022-06-24 13:44:40,953 linajea.evaluation.evaluator INFO     Getting AEFTL and ERL
2022-06-24 13:44:40,963 linajea.evaluation.evaluator INFO     Getting perfect segments
2022-06-24 13:44:40,973 linajea.evaluation.evaluator INFO     track range 59 50
2022-06-24 13:44:40,976 linajea.evaluation.evaluator INFO     error free tracks: 28/30 1.0
2022-06-24 13:44:40,985 linajea.evaluation.evaluate_setup INFO     Done evaluating results for 43. Saving results to mongo.
2022-06-24 13:44:40,986 linajea.evaluation.evaluate_setup INFO     Result summary: {'gt_tracks': 24, 'rec_tracks': 26, 'gt_matched_tracks': 24, 'rec_matched_tracks': 24, 'gt_edges': 234, 'rec_edges': 252, 'matched_edges': 234, 'gt_divisions': 4, 'rec_divisions': 4, 'fp_edges': 18, 'fn_edges': 0, 'identity_switches': 0, 'fp_divisions': 0, '

<linajea.evaluation.report.Report object at 0x1552a80940a0>


### Predict Test Data

Now that we know which ILP hyperparameters to use we can predict the `cell_indicator` and `movement_vectors` on the test data and compute the tracks. Make sure that `args.validation` is set to `False` and `solve.grid_search` and `solve.random_search` are set to `False`.

In [5]:
config.solve.grid_search = False
config.solve.random_search = False
args.config = dump_config(config)
args.validation = False

In [8]:
for inf_config in getNextInferenceData(args):
        predict_blockwise(inf_config)

2022-06-24 10:50:02,323 linajea.utils.check_or_create_db INFO     linajea_celegans_20220623_180937: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb3'} (accessed)
2022-06-24 10:50:02,331 linajea.process_blockwise.predict_blockwise INFO     Following ROIs in world units:
2022-06-24 10:50:02,331 linajea.process_blockwise.predict_blockwise INFO     Input ROI       = [45:65, -85:315, -50:690, -50:690] (20, 400, 740, 740)
2022-06-24 10:50:02,332 linajea.process_blockwise.predict_blockwise INFO     Block read  ROI = [-3:4, -85:315, -50:210, -50:210] (7, 400, 260, 260)
2022-06-24 10:50:02,332 linajea.process_blockwise.predict_blockwise INFO     Block write ROI = [0:1, 0:230, 0:160, 0:160] (1, 230, 160, 160)
2022-06-24 10:50:02,332 linajea.process_blockwise.predict_blockwise INFO     Output ROI      = [48:62, 0:230, 0:640, 0:640] (14, 230, 640, 640)
2022-06-24 10:50:02,332 linajea.process_blockwise.predict_blockwise INFO     Starting block-wise pr

2022-06-24 10:54:02,661 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		169 finished (143 skipped, 26 succeeded, 0 failed), 2 processing, 53 pending
		ETA: 0:04:04.615385
2022-06-24 10:54:12,671 daisy.scheduler INFO     

2022-06-24 10:54:12,672 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		174 finished (143 skipped, 31 succeeded, 0 failed), 2 processing, 48 pending
		ETA: 0:03:41.538462
2022-06-24 10:54:22,683 daisy.scheduler INFO     

2022-06-24 10:54:22,683 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		180 finished (144 skipped, 36 succeeded, 0 failed), 2 processing, 42 pending
		ETA: 0:03:13.846154
2022-06-24 10:54:32,694 daisy.scheduler INFO     

2022-06-24 10:54:32,695 daisy.scheduler INFO     
	BlockwiseTask processing 224 blocks with 1 workers (1 aliases online)
		185 finished (144 skipped, 41 succeeded, 0 failed), 2 proc

### Solve on Test Data

Then we can solve the ILP on the test data. We select the hyperparameters that resulted in the lowest overall number of errors on the validation data.

In [6]:
score_columns = ['fn_edges', 'identity_switches',
                 'fp_divisions', 'fn_divisions']
if not config.general.sparse:
    score_columns = ['fp_edges'] + score_columns

sort_by = "sum_errors"
results = {}
args.validation = True
for sample_idx, inf_config in enumerate(getNextInferenceData(args,
                                                             is_evaluate=True)):
    sample = inf_config.inference.data_source.datafile.filename
    print("getting results for:", sample)

    res = linajea.evaluation.get_results_sorted(
        inf_config,
        filter_params={"val": True},
        score_columns=score_columns,
        sort_by=sort_by)

    results[os.path.basename(sample)] = res.reset_index()
args.validation = False

results = pd.concat(list(results.values())).reset_index()
del results['param_id']
del results['_id']

by = [
    #"cell_cycle_key",
    #"filter_polar_bodies_key",
    "matching_threshold",
    "weight_node_score",
    "selection_constant",
    "track_cost",
    "weight_division",
    "division_constant",
    "weight_child",
    "weight_continuation",
    "weight_edge_score",
]
results = results.groupby(by, dropna=False, as_index=False).agg(
    lambda x: -1 if len(x) != sample_idx+1 else sum(x))

results = results[results.sum_errors != -1]
results.sort_values(sort_by, ascending=False, inplace=True)

#print(results)

config.solve.parameters = [config.solve.parameters[0]]
config.solve.parameters[0].weight_node_score = float(results.at[0, 'weight_node_score'])
config.solve.parameters[0].selection_constant = float(results.at[0, 'selection_constant'])
config.solve.parameters[0].track_cost = float(results.at[0, 'track_cost'])
config.solve.parameters[0].weight_edge_score = float(results.at[0, 'weight_edge_score'])
config.solve.parameters[0].weight_division = float(results.at[0, 'weight_division'])
config.solve.parameters[0].weight_child = float(results.at[0, 'weight_child'])
config.solve.parameters[0].weight_continuation = float(results.at[0, 'weight_continuation'])
print(config.solve.parameters[0], type(config.solve.parameters[0].weight_continuation))
args.config = dump_config(config)

2022-06-27 11:04:01,266 linajea.utils.check_or_create_db INFO     linajea_celegans_20220624_134111: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb'} (accessed)
2022-06-27 11:04:01,267 linajea.evaluation.analyze_results INFO     checking db: linajea_celegans_20220624_134111
2022-06-27 11:04:01,295 linajea.utils.candidate_database INFO     Query: {'val': True, 'matching_threshold': 15, 'validation_score': False, 'window_size': 50, 'filter_short_tracklets_len': -1, 'ignore_one_off_div_errors': False, 'fn_div_count_unconnected_parent': True, 'sparse': False}
2022-06-27 11:04:01,297 linajea.utils.candidate_database INFO     Found 7 scores


[SolveParametersMinimalConfig(track_cost=20.675083335777686, weight_node_score=-16.786125363574673, selection_constant=8.893522542354866, weight_division=-8.260920358667896, division_constant=5.9535746121046165, weight_child=1.003401421433546, weight_continuation=-1.0264581546174496, weight_edge_score=0.4254227576366175, cell_cycle_key=None, block_size=[15, 512, 512, 712], context=[2, 100, 100, 100], max_cell_move=45, roi=None, feature_func='noop', val=False, tag=None)] 1
getting results for: /nrs/funke/hirschp/mskcc_emb
SolveParametersMinimalConfig(track_cost=7.0, weight_node_score=-21.0, selection_constant=6.0, weight_division=-11.0, division_constant=5.9535746121046165, weight_child=1.0, weight_continuation=-1.0, weight_edge_score=0.35, cell_cycle_key=None, block_size=[15, 512, 512, 712], context=[2, 100, 100, 100], max_cell_move=None, roi=None, feature_func='noop', val=False, tag=None) <class 'float'>


  results = results.groupby(by, dropna=False, as_index=False).agg(


In [19]:
for inf_config in getNextInferenceData(args, is_solve=True):
        solve_blockwise(inf_config)

2022-06-24 13:51:58,318 linajea.utils.check_or_create_db INFO     linajea_celegans_20220623_180937: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb3'} (accessed)
2022-06-24 13:51:58,344 linajea.utils.candidate_database INFO     Querying ID for parameters {'track_cost': 7.0, 'weight_node_score': -21.0, 'selection_constant': 6.0, 'weight_division': -11.0, 'division_constant': 5.9535746121046165, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'context': [2, 100, 100, 100], 'max_cell_move': 45, 'feature_func': 'noop', 'val': False, 'cell_cycle_key': {'$exists': False}, 'tag': {'$exists': False}}
2022-06-24 13:51:58,348 linajea.utils.candidate_database INFO     Parameters {'track_cost': 7.0, 'weight_node_score': -21.0, 'selection_constant': 6.0, 'weight_division': -11.0, 'division_constant': 5.9535746121046165, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score'

linajea_solving ▶:   0%|          | 0/1 [00:00<?, ?blocks/s]

2022-06-24 13:51:58,561 linajea.process_blockwise.solve_blockwise INFO     Block write roi: [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712)
2022-06-24 13:51:58,562 linajea.process_blockwise.solve_blockwise INFO     Solution roi: [50:60, 0:205, 0:512, 0:512] (10, 205, 512, 512)
2022-06-24 13:51:58,562 linajea.process_blockwise.solve_blockwise INFO     [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712) linajea_solving/1 with read ROI [48:67, -100:612, -100:612, -100:812] (19, 712, 712, 912) and write ROI [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712)
2022-06-24 13:51:58,563 linajea.process_blockwise.solve_blockwise INFO     Write roi: [50:65, 0:512, 0:512, 0:712] (15, 512, 512, 712)
2022-06-24 13:51:58,600 linajea.process_blockwise.solve_blockwise INFO     Reading graph with 426 nodes and 292 edges took 0.012198925018310547 seconds
2022-06-24 13:51:58,649 linajea.tracking.track INFO     Solving for key selected_3
2022-06-24 13:51:58,664 linajea.tracking.solver INFO     b'Optimal solutio


Execution Summary
-----------------

  Task linajea_solving:

    num blocks : 1
    completed ✔: 1 (skipped 0)
    failed    ✗: 0
    orphaned  ∅: 0

    all blocks processed successfully


### Evaluate on Test Data

In [20]:
for inf_config in getNextInferenceData(args, is_evaluate=True):
        linajea.evaluation.evaluate_setup(inf_config)

2022-06-24 13:52:12,065 linajea.utils.check_or_create_db INFO     linajea_celegans_20220623_180937: {'setup_dir': 'mskc_test_1', 'iteration': 10, 'cell_score_threshold': 0.2, 'sample': 'mskcc_emb3'} (accessed)
2022-06-24 13:52:12,067 linajea.evaluation.evaluate_setup INFO     roi None DataROIConfig(offset=(50, 0, 0, 0), shape=(10, 205, 512, 512))
2022-06-24 13:52:12,087 linajea.utils.candidate_database INFO     Querying ID for parameters {'track_cost': 7.0, 'weight_node_score': -21.0, 'selection_constant': 6.0, 'weight_division': -11.0, 'division_constant': 5.9535746121046165, 'weight_child': 1.0, 'weight_continuation': -1.0, 'weight_edge_score': 0.35, 'block_size': [15, 512, 512, 712], 'context': [2, 100, 100, 100], 'max_cell_move': 45, 'feature_func': 'noop', 'val': False, 'cell_cycle_key': {'$exists': False}, 'tag': {'$exists': False}}
2022-06-24 13:52:12,090 linajea.utils.candidate_database INFO     Parameters {'track_cost': 7.0, 'weight_node_score': -21.0, 'selection_constant': 6.

[SolveParametersMinimalConfig(track_cost=7.0, weight_node_score=-21.0, selection_constant=6.0, weight_division=-11.0, division_constant=5.9535746121046165, weight_child=1.0, weight_continuation=-1.0, weight_edge_score=0.35, cell_cycle_key=None, block_size=[15, 512, 512, 712], context=[2, 100, 100, 100], max_cell_move=45, roi=None, feature_func='noop', val=False, tag=None)] 1


2022-06-24 13:52:12,316 linajea.evaluation.match INFO     Done matching, found 147 matches and 0 edge fps
2022-06-24 13:52:12,316 linajea.evaluation.evaluate INFO     Done matching. Evaluating
2022-06-24 13:52:12,335 linajea.evaluation.evaluator INFO     Getting AEFTL and ERL
2022-06-24 13:52:12,342 linajea.evaluation.evaluator INFO     Getting perfect segments
2022-06-24 13:52:12,351 linajea.evaluation.evaluator INFO     track range 59 50
2022-06-24 13:52:12,352 linajea.evaluation.evaluator INFO     error free tracks: 0/0 0.0
2022-06-24 13:52:12,360 linajea.evaluation.evaluate_setup INFO     Done evaluating results for 3. Saving results to mongo.
2022-06-24 13:52:12,361 linajea.evaluation.evaluate_setup INFO     Result summary: {'gt_tracks': 24, 'rec_tracks': 29, 'gt_matched_tracks': 24, 'rec_matched_tracks': 24, 'gt_edges': 226, 'rec_edges': 176, 'matched_edges': 147, 'gt_divisions': 2, 'rec_divisions': 3, 'fp_edges': 29, 'fn_edges': 79, 'identity_switches': 0, 'fp_divisions': 1, 'is