The original dataset used here has been taken from the following web site:
SegPC-2021-dataset
SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images
https://www.kaggle.com/datasets/sbilab/segpc2021dataset

Citation:
Anubha Gupta, Ritu Gupta, Shiv Gehlot, Shubham Goswami, April 29, 2021, "SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images", IEEE Dataport, doi: https://dx.doi.org/10.21227/7np1-2q42.

BibTex
@data{segpc2021,
  doi = {10.21227/7np1-2q42},
  url = {https://dx.doi.org/10.21227/7np1-2q42},
  author = {Anubha Gupta; Ritu Gupta; Shiv Gehlot; Shubham Goswami },
  publisher = {IEEE Dataport},
  title = {SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images},
  year = {2021}
}

IMPORTANT: If you use this dataset, please cite the publications below:
1. Anubha Gupta, Rahul Duggal, Shiv Gehlot, Ritu Gupta, Anvit Mangal, Lalit Kumar, Nisarg Thakkar, and Devprakash Satpathy, "GCTI-SN: Geometry-Inspired Chemical and Tissue Invariant Stain Normalization of Microscopic Medical Images," Medical Image Analysis, vol. 65, Oct 2020. DOI: (2020 IF: 11.148)
2. Shiv Gehlot, Anubha Gupta and Ritu Gupta, "EDNFC-Net: Convolutional Neural Network with Nested Feature Concatenation for Nuclei-Instance Segmentation," ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1389-1393.
3. Anubha Gupta, Pramit Mallick, Ojaswa Sharma, Ritu Gupta, and Rahul Duggal, "PCSeg: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma," PLoS ONE 13(12): e0207908, Dec 2018. DOI: 10.1371/journal.pone.0207908

License
CC BY-NC-SA 4.0
See also:
Systematic Evaluation of Image Tiling Adverse Effects on Deep Learning Semantic Segmentation
https://www.frontiersin.org/articles/10.3389/fnins.2020.00065/full
- 2023/06/22 Updated TensorflowUNet.py to back up copies of the previous eval_dir and model_dir.
- 2023/06/22 Modified TensorflowUNet.py to copy the configuration file to the model saving directory.
- 2023/06/22 Retrained the TensorflowUNet model using two model sizes, 256x256 and 512x512.
- 2023/06/30 Added a BatchNormalization flag to the [model] section of the config file to test its adverse effects on tiled-image-segmentation.
- 2023/07/03 Added Overlapped-Tiled-Image-Segmentation section.
We use Python 3.8.10 to run tensorflow 2.10.1 on Windows 11.
Please install Microsoft Visual Studio Community, which can be used to compile the source code of cocoapi for PythonAPI.
Please run the following commands to create a python virtualenv named py38-efficientdet.
>cd c:\
>python38\python.exe -m venv py38-efficientdet
>cd c:\py38-efficientdet
>./scripts/activate

Please create a working folder "c:\google" for your repository, and install the python packages.
>mkdir c:\google
>cd c:\google
>pip install cython
>git clone https://github.com/cocodataset/cocoapi
>cd cocoapi/PythonAPI
You have to modify extra_compile_args in setup.py in the following way:
extra_compile_args=[]
>python setup.py build_ext install
Please clone Tiled-Image-Segmentation-Multiple-Myeloma.git in the working folder c:\google.
>git clone https://github.com/atlan-antillia/Tiled-Image-Segmentation-Multiple-Myeloma.git

You can see the following folder structure in Tiled-Image-Segmentation-Multiple-Myeloma of the working folder.
Tiled-Image-Segmentation-Multiple-Myeloma
├─asset
└─projects
    └─MultipleMyeloma
        ├─4k_mini_test
        ├─4k_tiled_mini_test_output
        ├─4k_tiled_mini_test_output_512x512
        ├─eval
        ├─eval_512x512
        ├─generator
        ├─mini_test
        ├─mini_test_output
        ├─mini_test_output_512x512
        ├─models
        ├─models_512x512
        └─MultipleMyeloma
            ├─train
            │  ├─images
            │  └─masks
            └─valid
                ├─images
                └─masks
Please run the following command to install python packages for this project.
>cd ./Tiled-Image-Segmentation-Multiple-Myeloma
>pip install -r requirements.txt
Please download the original Multiple Myeloma Plasma Cells dataset from the following link.
SegPC-2021-dataset
SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images
https://www.kaggle.com/datasets/sbilab/segpc2021dataset

The folder structure of the dataset is the following.
TCIA_SegPC_dataset
├─test
│  └─x
├─train
│  ├─x
│  └─y
└─valid
    ├─x
    └─y

Each x folder of the dataset contains the ordinary image files of Multiple Myeloma Plasma Cells, and each y folder contains the mask files that identify each cell in those images. All files in x and y are 2560x1920 (2.5K) images, which is too large to use directly with our TensorflowUNet model.
Sample images in train/x:
Sample masks in train/y:
This script will perform the following image processing.
1 Resize all bmp files in the x and y folders to 256x256 square images.
2 Create clear black-and-white mask files from the original mask files.
3 Create cropped image files corresponding to each segmented region in the mask files in the y folders.
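Steps 1 and 2 above can be sketched with NumPy alone. This is only an illustration under stated assumptions: the function names `resize_nearest` and `to_binary_mask` are hypothetical, nearest-neighbor sampling is assumed for the resize, and the actual generator script may use a different interpolation and a different rule for cleaning the masks.

```python
import numpy as np

def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize an image to size x size by nearest-neighbor sampling (assumed method)."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size  # source row for each output row
    cols = np.arange(size) * width // size   # source column for each output column
    return image[rows][:, cols]

def to_binary_mask(mask: np.ndarray, threshold: int = 0) -> np.ndarray:
    """Map every pixel brighter than `threshold` to 255 and all others to 0."""
    if mask.ndim == 3:
        # Collapse color channels: a pixel is foreground if any channel is bright.
        mask = mask.max(axis=2)
    return np.where(mask > threshold, 255, 0).astype(np.uint8)

# A blank stand-in for one 2560x1920 dataset image (height 1920, width 2560).
original = np.zeros((1920, 2560, 3), dtype=np.uint8)
resized = resize_nearest(original, 256)

# A tiny grayscale "mask" with faint and bright label values.
sample_mask = np.array([[0, 20, 40], [0, 0, 255]], dtype=np.uint8)
binary = to_binary_mask(sample_mask)
```

Note that resizing a 2560x1920 image to a square changes its aspect ratio, so the image and its mask must go through the same resize to stay aligned.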
See also the following web-site on Generation of MultipleMyeloma Image Dataset.
Image-Segmentation-Multiple-Myeloma
└─projects
    └─MultipleMyeloma
        └─MultipleMyeloma
            ├─train
            │  ├─images
            │  └─masks
            └─valid
                ├─images
                └─masks

We have trained the MultipleMyeloma TensorflowUNet Model by using the following train_eval_infer.config file.
Please move to ./projects/MultipleMyeloma directory, and run the following bat file.
>1.train.bat
, which simply runs the following command.
>python ../../TensorflowUNetTrainer.py ./train_eval_infer.config

This Python script reads the following configuration file, builds a TensorflowUNetModel, and starts training the model.
; train_eval_infer.config
; 2023/6/22 antillia.com
; Modified to use loss and metric
; Specify loss as a function name
; loss = "bce_iou_loss"
; Specify metrics as a list of function names
; metrics = ["binary_accuracy"]
; Please see: https://www.tensorflow.org/api_docs/python/tf/keras/Model?version=stable#compile

[model]
image_width = 256
image_height = 256
image_channels = 3
num_classes = 1
base_filters = 16
num_layers = 6
dropout_rate = 0.08
learning_rate = 0.001
dilation = (1,1)
loss = "bce_iou_loss"
metrics = ["binary_accuracy"]
show_summary = False

[train]
epochs = 100
batch_size = 4
patience = 10
metrics = ["binary_accuracy", "val_binary_accuracy"]
model_dir = "./models"
eval_dir = "./eval"
image_datapath = "./MultipleMyeloma/train/images/"
mask_datapath = "./MultipleMyeloma/train/masks/"
;2023/06/22
create_backup = True

[eval]
image_datapath = "./MultipleMyeloma/valid/images/"
mask_datapath = "./MultipleMyeloma/valid/masks/"

[infer]
images_dir = "./mini_test"
output_dir = "./mini_test_output"

[tiledinfer]
images_dir = "./4k_mini_test"
output_dir = "./4k_tiled_mini_test_output"

[mask]
blur = True
binarize = True
threshold = 60
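Because the values in this file are written as Python literals (quoted strings, lists, tuples, booleans), a file in this style can be read with the standard library alone. The following is a minimal sketch, assuming nothing about the project's own config reader, which is not shown here; `load_config` and `CONFIG_TEXT` are hypothetical names used only for illustration.

```python
import ast
import configparser

# An excerpt of the config above, embedded so the example is self-contained.
CONFIG_TEXT = """
; train_eval_infer.config (excerpt)
[model]
image_width  = 256
image_height = 256
dilation     = (1,1)
loss         = "bce_iou_loss"
metrics      = ["binary_accuracy"]
show_summary = False

[mask]
blur      = True
binarize  = True
threshold = 60
"""

def load_config(text: str) -> dict:
    """Parse INI text and convert every value to a Python literal."""
    parser = configparser.ConfigParser()
    parser.read_string(text)  # lines starting with ';' are treated as comments
    return {
        section: {
            key: ast.literal_eval(value)
            for key, value in parser[section].items()
        }
        for section in parser.sections()
    }

config = load_config(CONFIG_TEXT)
image_size = (config["model"]["image_width"], config["model"]["image_height"])
```

`ast.literal_eval` only accepts literals, so a malformed or unexpected value raises an error instead of being executed.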
Since
loss = "bce_iou_loss"
and
metrics = ["binary_accuracy"]
are specified in the train_eval_infer.config file, the bce_iou_loss and binary_accuracy functions are used to compile our model as shown below.
# Read a loss function name from a config file, and eval it.
# loss = "binary_crossentropy"
self.loss = eval(self.config.get(MODEL, "loss"))

# Read a list of metrics function names from a config file, and eval each of the list,
# metrics = ["binary_accuracy"]
metrics = self.config.get(MODEL, "metrics")
self.metrics = []
for metric in metrics:
    self.metrics.append(eval(metric))

self.model.compile(optimizer = self.optimizer, loss= self.loss, metrics = self.metrics)
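Since eval executes whatever expression the config string contains, a common alternative is to look the name up in an explicit table of allowed functions. The sketch below is not how TensorflowUNet.py works; it only illustrates that alternative. The `REGISTRY` dictionary and `resolve` function are hypothetical, and the two functions are placeholders standing in for the real implementations.

```python
from typing import Callable, Dict, List

def bce_iou_loss(y_true, y_pred):
    """Placeholder standing in for the loss function defined in losses.py."""
    raise NotImplementedError

def binary_accuracy(y_true, y_pred):
    """Placeholder standing in for a metric function."""
    raise NotImplementedError

# Explicit table of names that are allowed to appear in the config file.
REGISTRY: Dict[str, Callable] = {
    "bce_iou_loss": bce_iou_loss,
    "binary_accuracy": binary_accuracy,
}

def resolve(name: str) -> Callable:
    """Return the function registered under `name`, or fail with a clear error."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown loss or metric name: {name!r}") from None

# Values as they would arrive from the [model] section of the config file.
loss = resolve("bce_iou_loss")
metrics: List[Callable] = [resolve(name) for name in ["binary_accuracy"]]
```

With this approach a typo in the config file produces a readable error message, and adding a new loss means adding one entry to the table.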
You can also specify other loss and metrics functions in the config file.
Example: basnet_hybrid_loss(https://arxiv.org/pdf/2101.04704.pdf)
loss = "basnet_hybrid_loss"
metrics = ["dice_coef", "sensitivity", "specificity"]

For details of these functions, please refer to losses.py and Semantic-Segmentation-Loss-Functions (SemSegLoss).
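The definition of bce_iou_loss in losses.py is not reproduced in this document. As a rough guide only, a loss with this name is commonly computed as binary cross-entropy plus one minus a soft IoU, and the NumPy sketch below assumes that formulation. The real implementation operates on TensorFlow tensors and may differ in its details.

```python
import numpy as np

def bce_iou_loss(y_true, y_pred, eps: float = 1e-7) -> float:
    """Binary cross-entropy plus (1 - soft IoU); an assumed formulation."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip predictions away from 0 and 1 so that log() stays finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)

    bce = -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    iou = (intersection + eps) / (union + eps)

    return float(bce + (1.0 - iou))

# A prediction that matches the mask scores close to 0; an inverted one scores high.
target = [1.0, 0.0, 1.0, 0.0]
good = bce_iou_loss(target, [1.0, 0.0, 1.0, 0.0])
bad = bce_iou_loss(target, [0.0, 1.0, 0.0, 1.0])
```

The cross-entropy term penalizes each pixel independently, while the IoU term rewards overlap of the predicted region as a whole, which is why the two are often combined for segmentation.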
The training process was stopped at epoch 68 by the early-stopping callback.
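For readers unfamiliar with early stopping, the effect of the patience = 10 setting in the [train] section can be illustrated in plain Python. This is a simplified sketch of the general idea, assuming the monitored value is the validation loss; `early_stop_epoch` is a hypothetical helper, not part of this project or of Keras.

```python
from typing import Optional, Sequence

def early_stop_epoch(val_losses: Sequence[float], patience: int) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # A new best value resets the waiting counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation losses: the best value is reached at epoch 3.
history = [0.9, 0.5, 0.4, 0.41, 0.42, 0.45, 0.43]
stop = early_stop_epoch(history, patience=3)
never = early_stop_epoch(history, patience=10)
```

In this made-up history the loss stops improving after epoch 3, so with a patience of 3 training would end at epoch 6, while a patience of 10 would let all seven epochs run.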
The val_binary_accuracy is very high from the beginning of the training, as shown below.
Train metrics line graph:
The val_loss is also very low from the beginning of the training, as shown below.
Train losses line graph:
We have evaluated the prediction accuracy of our pretrained MultipleMyeloma model by using the test dataset. Please move to the ./projects/MultipleMyeloma directory, and run the following bat file.
>2.evalute.bat
, which simply runs the following command.
>python ../../TensorflowUNetEvaluator.py train_eval_infer.config

The evaluation result this time is the following.
By using the Python script resize4k.py, we have created the 4k_mini_test dataset, a set of 4K images generated from the original 2.5K bmp images in the following x folder:
TCIA_SegPC_dataset
└─test
    └─x
Please move to ./projects/MultipleMyeloma directory, and run the following bat file.
>4.tiled_infer.bat
, which simply runs the following command.
>python ../../TensorflowUNetTiledInfer.py train_eval_infer.config
This Python script performs Tiled-Image-Inference based on the directory settings in the following [tiledinfer] section:
[tiledinfer]
images_dir = "./4k_mini_test"
output_dir = "./4k_tiled_mini_test_output"

The TensorflowUNetTiledInfer.py script performs the following steps for each 4K image file.
1 Read a 4K image file in the images_dir folder.
2 Split the image into multiple tiles of the model's image size.
3 Run inference on every tile.
4 Merge all inferred masks.
Currently, we don't support Seamless Smooth Stitching in the mask merging process.
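The split, infer, and merge steps can be sketched as follows. This is a minimal illustration and not the code of TensorflowUNetTiledInfer.py: the function names are hypothetical, `dummy_predict` is a stand-in for real model inference, and zero-padding the image up to a multiple of the tile size is an assumption about how partial tiles at the edges could be handled.

```python
import numpy as np
from typing import Callable

def tiled_infer(image: np.ndarray, tile_size: int,
                predict: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Split an H x W x C image into square tiles, predict each, and merge the masks."""
    height, width = image.shape[:2]
    # Pad the bottom and right edges so the image divides evenly into tiles.
    pad_h = (-height) % tile_size
    pad_w = (-width) % tile_size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")

    merged = np.zeros(padded.shape[:2], dtype=np.uint8)
    for top in range(0, padded.shape[0], tile_size):
        for left in range(0, padded.shape[1], tile_size):
            tile = padded[top:top + tile_size, left:left + tile_size]
            # Paste each predicted mask back at the position its tile came from.
            merged[top:top + tile_size, left:left + tile_size] = predict(tile)

    # Remove the padding so the mask matches the input size.
    return merged[:height, :width]

def dummy_predict(tile: np.ndarray) -> np.ndarray:
    """Stand-in for model inference: mark bright pixels as foreground."""
    return np.where(tile.mean(axis=2) > 127, 255, 0).astype(np.uint8)

# A small synthetic image with one bright square in place of a real 4K image.
image = np.zeros((600, 700, 3), dtype=np.uint8)
image[100:200, 300:400] = 255
mask = tiled_infer(image, tile_size=256, predict=dummy_predict)
```

Because each tile is predicted without seeing its neighbors, a real model can produce visible seams where tiles meet, which is the issue the stitching references in this section discuss.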
See also:
Tiling and stitching segmentation output for remote sensing: basic challenges and recommendations
https://arxiv.org/ftp/arxiv/papers/1805/1805.12219.pdf
For example, a 4K image file in 4k_mini_test will be split into many tiles, as shown below.
4K 405.jpg
Tiled split images
Input 4K images (4k_mini_test)
Inferred 4K images (4k_mini_test_output)
Detailed 4K images comparison:
4k_mini_test/405.jpg | Inferred_image |
4k_mini_test/605.jpg | Inferred_image |
4k_mini_test/1735.jpg | Inferred_image |
4k_mini_test/1923.jpg | Inferred_image |
4k_mini_test/2028.jpg | Inferred_image |
How to improve segmentation accuracy of our Tiled-Image-Segmentation Model?
At least, it is much better to increase the image size of our UNet Model from 256x256 to 512x512. We only have to change the configuration file train_eval_infer.config as shown below, and retrain that UNetModel.
; train_eval_infer_512x512.config
[model]
image_width = 512
image_height = 512

Please move to the ./projects/MultipleMyeloma directory, and run the following train bat file.
>201.train.bat, which simply runs the following command.
>python ../../TensorflowUNetTrainer.py ./train_eval_infer_512x512.config
>204.tiled_infer.bat, which simply runs the following command.
>python ../../TensorflowUNetTiledInfer.py ./train_eval_infer_512x512.config
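To get a feel for what the larger model size means for tiling, the number of tiles per image can be computed directly. The exact pixel size of the images in 4k_mini_test is not stated here, so 3840x2160 is an assumed value used only for illustration, and `tile_grid` is a hypothetical helper.

```python
import math
from typing import Tuple

def tile_grid(width: int, height: int, tile_size: int) -> Tuple[int, int]:
    """Return (columns, rows) of square tiles needed to cover the image."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)

# 3840x2160 is an assumed 4K size; substitute the real image size if it differs.
for tile_size in (256, 512):
    cols, rows = tile_grid(3840, 2160, tile_size)
    print(f"{tile_size}x{tile_size}: {cols} x {rows} = {cols * rows} tiles")
```

Under that assumption a 256x256 model needs 135 tiles per image and a 512x512 model needs 40, so each tile covers more context and there are fewer tile boundaries in the merged mask.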
We are able to get slightly clearer inference results, as shown below.
4k_mini_test/405.jpg | Inferred_image |
4k_mini_test/605.jpg | Inferred_image |
4k_mini_test/1735.jpg | Inferred_image |
4k_mini_test/1923.jpg | Inferred_image |
4k_mini_test/2028.jpg | Inferred_image |
>3.infer.bat, which simply runs the following command.
>python ../../TensorflowUNetInfer.py train_eval_infer.config

This Python script performs Non-Tiled-Image-Inference based on the directory settings in the following [infer] section:
[infer]
images_dir = "./mini_test"
output_dir = "./mini_test_output"

In this case, the images_dir in that section contains 2.5K image files taken from
TCIA_SegPC_dataset
└─test
    └─x
Input images (mini_test)
Inferred images (mini_test_output)
; train_eval_infer_normalized_512x512.config
[model]
normalization = True

Furthermore, we have updated the create method of the TensorflowUNet class to insert BatchNormalization layers when the normalization flag is True.
Please move to ./projects/MultipleMyeloma/, and run the following train bat file.
>301.train_normalized_512x512.bat, by which TensorflowUNet Model with BatchNormalization will be created.
Please run the following tiled_infer bat file.
>304.tiled_infer_normalized_512x512.bat
Input 4K images (4k_mini_test)
Output 4K Tiled-inferred (4k_mini_test_output_normalized_512x512)
; train_eval_infer_normalized_512x512.config
[tiledinfer]
;Please specify 0 if you don't need any overlapping.
overlapping = 64

The threshold in the [mask] section has also been modified.
[mask]
threshold = 128

This is the threshold used to binarize a mask image, and it affects the results obtained with BatchNormalization.
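How the overlapping value is applied inside TensorflowUNetTiledInfer.py is not described here. One common way to overlap tiles, sketched below as an assumption, is to step by the tile size minus the overlap, so that neighboring tiles share a band of pixels, and to align the last tile with the image edge. `tile_origins` is a hypothetical helper.

```python
from typing import List

def tile_origins(length: int, tile_size: int, overlap: int) -> List[int]:
    """Start offsets of tiles along one axis; neighbors share at least `overlap` pixels."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    if length <= tile_size:
        return [0]
    stride = tile_size - overlap
    origins = list(range(0, length - tile_size, stride))
    # The last tile is aligned to the image edge so that no pixels are left uncovered.
    origins.append(length - tile_size)
    return origins

# With overlapping = 0 the tiles sit side by side, as in plain tiling.
plain = tile_origins(1024, tile_size=512, overlap=0)
# With overlapping = 64 each tile starts 448 pixels after the previous one.
overlapped = tile_origins(960, tile_size=512, overlap=64)
```

Calling the function once for the width and once for the height gives the top-left corner of every tile. How the overlapping bands are combined when the masks are merged is a separate design choice.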
Please move to ./projects/MultipleMyeloma/, and run the following train bat file.
>301.train_normalized_512x512.bat
, by which TensorflowUNet Model with BatchNormalization will be created.
Please run the following tiled_infer bat file.
>304.tiled_infer_normalized_512x512.bat
Input 4K images (4k_mini_test)
Overlapped-Tiled-Image-Segmentation: Output 4K Tiled-inferred (4k_mini_test_output_normalized_overlapped_512x512)
Please see also:
TensorflowMultiResUNet-Segmentation-MultipleMyeloma
1. SegPC-2021-dataset
SegPC-2021: Segmentation of Multiple Myeloma Plasma Cells in Microscopic Images
https://www.kaggle.com/datasets/sbilab/segpc2021dataset
2. Deep Learning Based Approach For MultipleMyeloma Detection
Vyshnav M T, Sowmya V, Gopalakrishnan E A, Sajith Variyar V V, Vijay Krishna Menon, Soman K P
https://www.researchgate.net/publication/346238471_Deep_Learning_Based_Approach_for_Multiple_Myeloma_Detection
3. EfficientDet-Multiple-Myeloma
Toshiyuki Arai @antillia.com
https://github.com/sarah-antillia/EfficientDet-Multiple-Myeloma
4. Image-Segmentation-Multiple-Myeloma
Toshiyuki Arai @antillia.com
https://github.com/atlan-antillia/Image-Segmentation-Multiple-Myeloma
5. The revolutionary benefits of 4K in healthcare
SONY
https://pro.sony/en_CZ/healthcare/imaging-innovations-insights/operating-room-4k-visualisation
6. Three Reasons to Use 4K Endoscopy and Surgical Monitors
EIZO
https://www.eizo.com/library/healthcare/reasons-to-use-4k-surgical-and-endoscopy-monitors/
7. Systematic Evaluation of Image Tiling Adverse Effects on Deep Learning Semantic Segmentation
G. Anthony Reina, Ravi Panchumarthy, Siddhesh Pravin Thakur, Alexei Bastidas, and Spyridon Bakas
https://www.frontiersin.org/articles/10.3389/fnins.2020.00065/full
8. Tiling and stitching segmentation output for remote sensing: basic challenges and recommendations
Bohao Huang, Daniel Reichman, Leslie M. Collins, Kyle Bradbury, and Jordan M. Malof
https://arxiv.org/ftp/arxiv/papers/1805/1805.12219.pdf