CodeCamp #140 [Feature] Add synapse dataset and data augmentation in dev-1.x. #2372
Conversation
Hi, thanks for your nice PR. We will review it as soon as possible. Best wishes,
Codecov Report: Base: 83.33% // Head: 83.40% // Increases project coverage by +0.06%.
Additional details and impacted files:
@@ Coverage Diff @@
## dev-1.x #2372 +/- ##
===========================================
+ Coverage 83.33% 83.40% +0.06%
===========================================
Files 143 144 +1
Lines 8127 8178 +51
Branches 1211 1219 +8
===========================================
+ Hits 6773 6821 +48
- Misses 1165 1168 +3
Partials 189 189
tools/dataset_converters/synapse.py
Outdated
mkdir_or_exist(osp.join(save_path, 'img_dir'))
mkdir_or_exist(osp.join(save_path, 'ann_dir'))

if osp.exists(osp.join(dataset_path, 'train.txt')) \
Could you please provide train.txt and val.txt that match the TransUNet train/val split? With the current default setting, all data is placed into the training set.
The dataset split can be found here: https://github.com/Beckschen/TransUNet/tree/main/lists/lists_Synapse.
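For illustration, a minimal sketch of how such split files might be generated, assuming the case IDs are copied from the linked TransUNet lists (the IDs below are placeholders, not the real split):

```python
import os.path as osp

# Placeholder case IDs: the real values must be taken from the TransUNet
# lists_Synapse files linked above.
train_cases = ['0005', '0006', '0007']
val_cases = ['0001', '0002', '0003']


def write_split(case_ids, dataset_path, filename):
    """Write one case ID per line, e.g. 'case0005'."""
    with open(osp.join(dataset_path, filename), 'w') as f:
        f.write('\n'.join(f'case{cid}' for cid in case_ids) + '\n')


write_split(train_cases, 'data/synapse', 'train.txt')
write_split(val_cases, 'data/synapse', 'val.txt')
```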
mmseg/datasets/synapse.py
Outdated
def __init__(self,
             img_suffix='.jpg',
             seg_map_suffix='.png',
In the default setting, the label segmentation map would be .jpg rather than .png (see the linked code).
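As a hedged sketch, the suffixes could be exposed as constructor defaults of the dataset class so they stay consistent with whatever the conversion script actually writes (the default values below are assumptions, not the final choice):

```python
from mmseg.datasets import BaseSegDataset
from mmseg.registry import DATASETS


@DATASETS.register_module()
class SynapseDataset(BaseSegDataset):
    """Sketch of a Synapse dataset whose suffixes mirror the converted data."""

    def __init__(self,
                 img_suffix='.jpg',
                 seg_map_suffix='.png',
                 **kwargs) -> None:
        super().__init__(
            img_suffix=img_suffix, seg_map_suffix=seg_map_suffix, **kwargs)
```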
configs/_base_/datasets/synapse.py
Outdated
@@ -0,0 +1,41 @@
dataset_type = 'SynapseDataset'
The settings of this Synapse dataset config could be aligned with the TransUNet default settings: https://github.com/Beckschen/TransUNet/blob/main/train.py#L19-L33.
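A rough sketch of what such an alignment could look like; the 224x224 crop size and batch size of 24 follow the linked train.py, while the data paths and remaining pipeline steps are placeholders:

```python
dataset_type = 'SynapseDataset'
data_root = 'data/synapse/'
crop_size = (224, 224)  # TransUNet default img_size
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', scale=crop_size, keep_ratio=False),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackSegInputs')
]
train_dataloader = dict(
    batch_size=24,  # TransUNet default batch_size
    num_workers=2,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_prefix=dict(
            img_path='img_dir/train', seg_map_path='ann_dir/train'),
        pipeline=train_pipeline))
```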
mmseg/datasets/synapse.py
Outdated
classes=('background', 'spleen', 'right_kidney', 'left_kidney',
         'gallbladder', 'esophagus', 'liver', 'stomach', 'aorta',
         'inferior_vena_cava', 'portal_vein_and_splenic_vein',
         'pancreas', 'right_adrenal_gland', 'left_adrenal_gland'),
palette=[[0, 0, 0], [255, 127, 127], [224, 231, 161], [138, 204, 132],
         [64, 172, 136], [126, 152, 187], [140, 110, 160],
         [247, 88, 240], [202, 172, 161], [237, 213, 149],
         [139, 182, 139], [111, 192, 185], [82, 107, 163],
         [89, 54, 156]])
Perhaps we should keep the label set the same as TransUNet, i.e., handle only the 8 classes 'spleen', 'right_kidney', 'left_kidney', 'gallbladder', 'liver', 'stomach', 'aorta', and 'pancreas', because this split has been used as a benchmark in many medical image segmentation papers.
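A hedged sketch of the reduced metainfo (background plus the 8 TransUNet organs); the class order and palette colors are illustrative and would need to match the converted label indices:

```python
METAINFO = dict(
    classes=('background', 'aorta', 'gallbladder', 'left_kidney',
             'right_kidney', 'liver', 'pancreas', 'spleen', 'stomach'),
    palette=[[0, 0, 0], [255, 127, 127], [224, 231, 161],
             [138, 204, 132], [64, 172, 136], [126, 152, 187],
             [140, 110, 160], [247, 88, 240], [202, 172, 161]])
```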
configs/_base_/datasets/synapse.py
Outdated
    pipeline=test_pipeline))
test_dataloader = val_dataloader

val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU'])
For evaluation, TransUNet and its follow-up works all use DSC computed on 3D scans rather than on 2D slices, which is the MMSegmentation default. We may add evaluation based on 3D scans to make the Synapse dataset more convenient for MMSegmentation users.
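For illustration only (this is not an existing MMSegmentation metric), per-scan Dice could be computed by stacking the 2D slice predictions of one case back into a volume before comparing it with the ground truth:

```python
import numpy as np


def dice_per_class(pred_3d, gt_3d, num_classes):
    """Return the Dice Similarity Coefficient of each foreground class."""
    scores = []
    for cls in range(1, num_classes):
        pred_mask = pred_3d == cls
        gt_mask = gt_3d == cls
        inter = np.logical_and(pred_mask, gt_mask).sum()
        denom = pred_mask.sum() + gt_mask.sum()
        scores.append(2.0 * inter / denom if denom > 0 else 1.0)
    return scores


# slice_preds / slice_labels hold the ordered 2D results of one case:
# pred_volume = np.stack(slice_preds, axis=0)
# gt_volume = np.stack(slice_labels, axis=0)
# case_dsc = np.mean(dice_per_class(pred_volume, gt_volume, num_classes=9))
```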
First, we may use the dataset from the official repo, which can be downloaded via a public HTTP link, rather than the unofficial data processed by TransUNet. Then we should ensure that our dataset conversion handles the data the same way as TransUNet. Next, we should check the normalization of the TransUNet pretrained model and use its parameters in our backbone config, like here. Thanks for your nice PR again!
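A minimal sketch of where those normalization parameters would live in the config, assuming ImageNet statistics; the actual mean/std must be read from the TransUNet preprocessing code before being copied in:

```python
data_preprocessor = dict(
    type='SegDataPreProcessor',
    mean=[123.675, 116.28, 103.53],  # assumption: ImageNet mean
    std=[58.395, 57.12, 57.375],     # assumption: ImageNet std
    bgr_to_rgb=True,
    pad_val=0,
    seg_pad_val=255,
    size=(224, 224))
```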
Could you please add documentation to dataset_prepare.md on how to download the original data and how to convert it?
tools/dataset_converters/synapse.py
Outdated
label_3d = read_nii_file(
    osp.join(dataset_path, 'label', 'label' + idx + '.nii.gz'))

img_3d = np.clip(img_3d, -125, 275)
Please add a comment above this line with a link (https://github.com/Beckschen/TransUNet/tree/main/datasets) to tell users why we clip to (-125, 275). Thanks.
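A small sketch of what the commented clipping step could look like; the rescaling line after the clip illustrates the usual follow-up step and is not necessarily the script's exact code:

```python
import numpy as np


def window_and_normalize(img_3d: np.ndarray) -> np.ndarray:
    # Clip CT intensities to the abdominal soft-tissue window used by
    # TransUNet (https://github.com/Beckschen/TransUNet/tree/main/datasets),
    # then rescale to [0, 1].
    img_3d = np.clip(img_3d, -125, 275)
    return (img_3d + 125) / 400.0
```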
This PR was accidentally closed automatically when I pushed some configs for model training. @Dominic23331, could you please make a new PR again? Sorry for my wrong operation! Best,
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from the maintainers.
Motivation
Support the Synapse dataset and its data augmentation.
dataset link: https://www.synapse.org/#!Synapse:syn3193805/wiki/
TransUNet uses this dataset; paper link: https://arxiv.org/pdf/2102.04306.pdf
Modification
Add a Synapse dataset loader.
Add a Python script to convert the Synapse dataset format into the MMSegmentation dataset format (a rough outline is sketched after this list).
Add Synapse dataset augmentation.
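A rough outline of the conversion step, assuming nibabel is used to read the .nii.gz volumes; the helper names, intensity rescaling, and file naming scheme are illustrative rather than the script's actual code:

```python
import os.path as osp

import nibabel as nib
import numpy as np
from PIL import Image


def read_nii_file(path):
    """Load a .nii.gz volume as a (slice, H, W) numpy array."""
    return nib.load(path).get_fdata().transpose(2, 1, 0)


def convert_case(dataset_path, save_path, idx, split='train'):
    """Slice one 3D case into 2D images and annotation maps."""
    img_3d = read_nii_file(osp.join(dataset_path, 'img', f'img{idx}.nii.gz'))
    label_3d = read_nii_file(
        osp.join(dataset_path, 'label', f'label{idx}.nii.gz'))
    img_3d = (np.clip(img_3d, -125, 275) + 125) / 400.0
    for i in range(img_3d.shape[0]):
        name = f'case{idx}_slice{i:03d}'
        Image.fromarray((img_3d[i] * 255).astype(np.uint8)).save(
            osp.join(save_path, 'img_dir', split, name + '.jpg'))
        Image.fromarray(label_3d[i].astype(np.uint8)).save(
            osp.join(save_path, 'ann_dir', split, name + '.png'))
```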
Update 2022-12-13
The result when using 13 classes: