Training Dataset
For prediction we recommend stacks of 5 slices with a 0.2µm step, in the range [-0.6µm, -1.4µm] (relative to the focal plane). For the training procedure, however, we advise acquiring stacks of about 25 slices with a 0.1µm step in the range [-0.1µm, -2µm]. This improves robustness to loss of focus and to defects in the flatness of the agar pad, thanks to a data augmentation step that randomly selects 5 slices with a 0.2µm step within the 25 slices.
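The random 5-of-25 slice selection described above can be sketched as follows. This is a minimal illustration, not BACMMAN's actual code: `sample_training_substack` and its parameters are hypothetical names, and the 0.2µm step is obtained by taking every 2nd slice of the 0.1µm stack.

```python
import numpy as np

def sample_training_substack(stack_25, n_out=5, stride=2, rng=None):
    """From a stack acquired with a 0.1um step, randomly pick n_out
    slices with a 0.2um step (i.e. every `stride`-th slice).
    Illustrative sketch of the augmentation, not BACMMAN's API."""
    rng = np.random.default_rng() if rng is None else rng
    n_slices = stack_25.shape[0]
    span = (n_out - 1) * stride + 1               # slices covered by the sub-stack
    start = rng.integers(0, n_slices - span + 1)  # random offset relative to focus
    return stack_25[start:start + span:stride]

# example: a fake 25-slice stack of 64x64 images
stack = np.zeros((25, 64, 64))
sub = sample_training_substack(stack)
assert sub.shape == (5, 64, 64)
```

Because the starting slice is drawn at random for each training sample, the network sees many slightly defocused variants of the same field, which is what makes it robust to focus imprecision at prediction time.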
The TaLiSSman network is trained in a supervised way, which means labeled images (i.e. images with one distinct integer value per bacterium) are required. We recommend using a constitutive cytoplasmic fluorescent marker, which can be segmented using a rule-based algorithm (e.g. as in bacmman's example dataset2 configuration) and manually curated.
In this tutorial we will assume that this fluorescence channel is available, as well as a method to segment it.
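As a minimal illustration of the expected label format (one distinct integer per bacterium, 0 for background), the following toy example thresholds a fluorescence image and labels connected components. It only stands in for bacmman's much more elaborate rule-based segmentation pipeline; the function name and threshold are hypothetical.

```python
import numpy as np
from scipy import ndimage

def label_fluorescence(img, threshold):
    """Toy rule-based segmentation: threshold the cytoplasmic
    fluorescence channel and assign one distinct integer per
    connected component. Illustrates the label format only."""
    mask = img > threshold
    labels, n = ndimage.label(mask)  # 0 = background, 1..n = bacteria
    return labels, n

# two bright blobs on a dark background -> labels 1 and 2
img = np.zeros((10, 10))
img[1:4, 1:4] = 1.0
img[6:9, 6:9] = 1.0
labels, n = label_fluorescence(img, 0.5)
assert n == 2 and labels.max() == 2
```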
Note that all these steps are also detailed in the article Ollion et al., Nature Protocols, 2019. It is recommended to read it, in particular to get familiar with the notions used in bacmman (channel image, object class, pre-processing, processing pipeline, selections, etc.).
- Follow the instructions in this page to install BACMMAN.
- In the Deep Learning section, follow the instructions for Tensorflow 2.x with CPU support (GPU is for advanced users).
- Start BACMMAN from the menu `Plugins > BACMMAN > BACteria in Mother Machine ANalyzer`.
- If not already done, from the `Home` tab: right-click on `Working Directory` and choose a folder that will contain the data.
- Create a new dataset using the provided configuration template:
  - From the menu `Dataset` select `New dataset from Github`.
  - In the github credentials enter `jeanollion` as username (first text area) and click on `Connect`.
  - Choose the configuration: `TaLiSSman > training`.
  - Click `Ok`.
- This configuration includes a deep learning denoising step. To download the model weights:
  - From the `Import` menu choose `DL Model Library`.
  - If necessary: in the github credentials enter `jeanollion` as username and press `enter`.
  - Unfold the item `TaLiSSman > bacteria denoising` and double-click on the link. If a web browser does not open automatically, the link will be copied to the clipboard; paste it into a web browser.
  - Download and unzip the weights into a folder named `DLModels` located in the `Working Directory` previously set.
  - Close the `DL Model Library`.

Note: in this example the configuration template is stored on a github account, which allows management of configuration templates; for further details see the documentation on the github library.
- Download the example dataset and unzip it.
- Import images using the command `Run > Import/re-link Images` and select the folder containing the images. The 4 imported positions will be displayed in the `Position` panel of the `Home` tab.
- Run Pre-Processing and Segmentation: in the `Home` tab, select `Pre-Processing` and `Segmentation and Tracking` in the `Tasks` panel, and use the command `Run > Run Selected Tasks`.
Notes:
- If the image format differs, refer to the documentation to configure the import method.
- The configuration template includes a pre-processing step that selects a subset of slices, which should be configured according to your data.
  - This step selects 20 slices on one side of the focus. In the end the neural network will use only 5 slices with a 0.2µm step, but providing more slices at training time allows a data augmentation step that improves robustness to imprecision on the focus or to a lack of flatness of the sample.
  - Note that in this dataset the first slice is always brighter and should be removed.
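The slice selection described in the notes above can be sketched as follows. This is an illustration only: BACMMAN performs this step through its pre-processing configuration, not user code, and `select_slices` and `focus_index` are hypothetical names.

```python
import numpy as np

def select_slices(stack, focus_index, n_kept=20):
    """Drop the first (over-bright) slice, then keep n_kept slices
    on one side of the focal plane. Hypothetical helper mirroring
    the pre-processing step configured in BACMMAN."""
    stack = stack[1:]       # first slice is always brighter: remove it
    focus_index -= 1        # account for the removed slice
    return stack[focus_index:focus_index + n_kept]

# example: 30 acquired slices, focus at index 5 -> 20 kept slices
stack = np.zeros((30, 8, 8))
sel = select_slices(stack, focus_index=5)
assert sel.shape == (20, 8, 8)
```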
From the `Data Browsing` tab:
- Right-click on a position, select `Open Kymograph > BacteriaFluo`.
- The position will be displayed and segmented objects can be selected interactively.
- Press `ctrl + A` to display all segmented objects.
In this step, we will correct segmentation errors manually:
- False positives are erased
- False negatives are created
- Merged bacteria are split
- Over-segmented bacteria are merged
Resources:
- wiki page.
- screencast.
- Press `F1` to display all the shortcuts (`F1` again to hide them).
Selections are sub-populations of segmented objects. To export the training dataset, we will create two distinct selections, one for training and one for validation. Validation should represent around 25% of the total dataset, and the two selections should be mutually exclusive. These selections will contain the parent objects of the bacteria (i.e. the segmented objects that contain the bacteria), in this case the whole viewfield.
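In BACMMAN this split is done interactively through selections, as described below. Purely as an illustration of the two constraints (mutually exclusive subsets, ~25% reserved for validation), a sketch with hypothetical position names:

```python
import random

def split_positions(positions, eval_fraction=0.25, seed=0):
    """Split positions into mutually exclusive 'train' and 'eval'
    subsets, reserving ~eval_fraction of them for validation.
    Illustrative only; not part of BACMMAN."""
    positions = list(positions)
    random.Random(seed).shuffle(positions)
    n_eval = max(1, round(eval_fraction * len(positions)))
    return positions[n_eval:], positions[:n_eval]  # train, eval

train, eval_ = split_positions(["pos0", "pos1", "pos2", "pos3"])
assert len(eval_) == 1 and not set(train) & set(eval_)
```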
To do so, from the `Segmentation & Tracking Results` panel:
- Select all the positions that will be included in the training selection.
- From the right-click menu, choose `Create Selection > Viewfield`. A selection named `Viewfield` will be created (if a selection with the same name already exists, it will be overwritten) and displayed in the `Selections` panel.
- Right-click on the selection, choose `duplicate` and enter `train` as selection name (this name must match the name set in the parameter `training_selection_name` of the training notebook).
- Repeat steps 1, 2, and 3 for the validation selection (name it `eval`: this name must match the name set in the parameter `validation_selection_name` of the training notebook).
This step will generate a single hdf5 file containing both the transmitted-light stacks and the labeled images for all the positions included in the `train` and `eval` selections previously defined.
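The dataset names inside the generated file depend on the extraction settings, so rather than assuming a fixed layout, a safe way to check the export is to list every dataset in the file with `h5py` (assuming only that the output is a plain hdf5 file):

```python
import h5py

def list_datasets(path):
    """Return (name, shape) for every dataset in an hdf5 file,
    without assuming any particular internal layout."""
    entries = []
    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape))
    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return entries
```

Running this on the extracted file should show one stack dataset and one label dataset per selection, which is a quick sanity check before uploading the file for training.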
From the `Home` panel:
- Right-click in the `Tasks to execute` panel and choose `Add new dataset extraction Task to List`.
- Set the same settings as in the screenshot below.
  - Right-click on the different items of the right panel to modify them.
  - The first extracted feature corresponds to the transmitted-light stack, so the object class associated with the corresponding channel is selected.
  - The second one corresponds to the labeled object class, so the object class with segmented bacteria is selected.
  - Choose an output file that does not exist yet. Note that on macOS, one may need to create an empty file first and select it.
- Click `OK`.
- Right-click in the `Tasks to execute` panel and choose `Run all Tasks`.
- In this tutorial, training will be performed using google-colab, a service that provides free GPUs for a limited time; a google account is thus required.
- The previously generated dataset should be uploaded to a google drive and shared publicly.
- Follow this link to open the notebook.
- From the `File` menu choose `Save Copy in Drive`.
- Follow the instructions in the notebook to train a TaLiSSman network.
- In the `Export` section, the commands will export the trained weights to a zip file; download the zip file and extract it.
- See this tutorial to use TaLiSSman in BACMMAN.