Release v0.1.1 and DEMO dataset
Patch of previous issues
This patch adds SwinUNETR to the register and fixes some issues caused by the new DataHandler class.
The DEMO dataset of Biom3d
This release also contains a link to the demo dataset for experimenting with Biom3d: a set of 85 individual nuclei in 3D, split into different folders:
- 85nuclei_Colab.zip contains the raw data split into training/testing folders.
- 85nuclei_dataset.csv contains the split description.
The final subsection here describes how to use this DEMO dataset with Google Colab.
Image names, their usage (training, testing or fine-tuning) and their number of classes (1 or 2) are listed in the 85nuclei_dataset.csv file.
Dataset organization
Images are organized in four categories (training, testing1, testing2 and fine-tuning), each with a Raw and a Mask directory:
1-Training_Raw (58 raw nuclei),
1-Training_Mask (58 corresponding masks),
2-Test1_Raw (10 raw nuclei),
2-Test1_Mask (10 corresponding masks),
3-Test2_Raw (10 raw nuclei),
3-Test2_Mask (10 corresponding masks),
4-FineTuning_Raw (7 raw nuclei),
4-FineTuning_Mask (7 corresponding masks).
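After unzipping the archive, the folder layout above can be sanity-checked with a short script. This is only a sketch: it assumes the eight directories sit directly under one root folder with exactly the names listed above and that each contains one file per nucleus. The helper names `count_images` and `find_mismatches` are hypothetical and not part of Biom3d.

```python
from pathlib import Path

# Expected number of images per folder, copied from the list above.
EXPECTED_COUNTS = {
    "1-Training_Raw": 58,
    "1-Training_Mask": 58,
    "2-Test1_Raw": 10,
    "2-Test1_Mask": 10,
    "3-Test2_Raw": 10,
    "3-Test2_Mask": 10,
    "4-FineTuning_Raw": 7,
    "4-FineTuning_Mask": 7,
}


def count_images(root):
    """Count the files in each expected folder under `root` (0 if the folder is missing)."""
    root = Path(root)
    counts = {}
    for name in EXPECTED_COUNTS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = 0
    return counts


def find_mismatches(counts):
    """Return {folder: (found, expected)} for every folder whose count differs."""
    return {
        name: (counts.get(name, 0), expected)
        for name, expected in EXPECTED_COUNTS.items()
        if counts.get(name, 0) != expected
    }
```

Calling `find_mismatches(count_images("path/to/unzipped/dataset"))` returns an empty dictionary when every folder holds the expected number of files; the path shown is a placeholder.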
Dataset description
The dataset contains 85 nuclei from Arabidopsis thaliana stained with the intercalating agent DAPI. The 85_nuclei_dataset contains:
• 41 nuclei from embryonic cotyledon (from seed embryo),
• 11 nuclei from embryonic root (from seed embryo),
• 33 nuclei from cotyledon (plant seedling).
Root nuclei were chosen because they are usually rounded, with very large nucleoli, while cotyledon nuclei have a more variable morphology (rounded to elongated, small to large) with a small nucleolus. To maximize nuclear polymorphism/variability, nuclei were also chosen to have well-organized chromocenters (2-class nuclei), a complete absence of chromocenters (1-class nuclei), or an absence of chromocenters but large domains of condensed chromatin (1-class nuclei).
Image classes
Raw images were segmented in 3D with NucleusJ2.0 for nuclear segmentation (Dubos et al 2020) and with NODeJ for chromocenter segmentation (Dubos et al 2022). Segmentation masks were manually refined in Napari by two experts. The two masks were then combined with ImageJ/Fiji into a single 8-bit multiclass image, with the nucleus as class 1 and the chromocenters as class 2. All masks have been verified in Napari. Note that masks can be visualized in Napari, or in ImageJ/Fiji using Process/Math/Multiply with a factor of 125, which keeps the background at 0 and sets the nucleus to 125 and the chromocenters to 250.
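The same rescaling can be done outside ImageJ/Fiji. The sketch below assumes the mask has already been loaded as a NumPy array holding only the labels 0, 1 and 2; the function name `scale_mask_for_display` is hypothetical and file loading is left out.

```python
import numpy as np


def scale_mask_for_display(mask, factor=125):
    """Multiply class labels (0, 1, 2) by `factor` so they are visible as gray levels."""
    mask = np.asarray(mask)
    # Guard against unexpected labels: 2 * 125 = 250 still fits in 8 bits,
    # but any larger label would overflow uint8.
    if mask.min() < 0 or mask.max() > 2:
        raise ValueError("expected a mask containing only the labels 0, 1 and 2")
    return mask.astype(np.uint8) * np.uint8(factor)
```

With the default factor, the background stays at 0, the nucleus becomes 125 and the chromocenters become 250, matching the values given above.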
Google Colab procedure
Training is performed using the 58 nuclei contained in the Training directories. For a quick evaluation of the tool, we recommend running 2 epochs; relevant predictions are already obtained with 20 epochs and become optimal at 50 or 100 epochs. One epoch usually takes about 3 min on a T4 GPU in Google Colab. Predictions can be evaluated using the Test1 nuclei; the Dice index is about 0.8 to 0.9, depending on the number of epochs.
The dataset also contains a more divergent set of nuclei in the Test2 directories. Dice indexes are lower for this set, so the initial model needs to be fine-tuned. This can be performed using the FineTuning directories. The fine-tuned model can then be tested again on the Test2 nuclei to record the increase in Dice index after the fine-tuning step.
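For reference, the Dice index of one class compares a predicted mask with the ground-truth mask as twice the overlap divided by the sum of the two class volumes. The function below is a minimal NumPy sketch of that definition, assuming both masks are label arrays of the same shape; it is not necessarily how Biom3d computes its reported score, and the name `dice_index` is hypothetical.

```python
import numpy as np


def dice_index(pred, target, label):
    """Dice index for one class label: 2 * |P & T| / (|P| + |T|)."""
    pred_fg = np.asarray(pred) == label      # voxels predicted as `label`
    target_fg = np.asarray(target) == label  # voxels annotated as `label`
    denom = pred_fg.sum() + target_fg.sum()
    if denom == 0:
        # The class is absent from both masks: treat this as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred_fg, target_fg).sum()
    return float(2.0 * overlap / denom)
```

Computing the index separately for class 1 (nucleus) and class 2 (chromocenters), before and after fine-tuning, makes it easier to see which class benefits from the fine-tuning step.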