[Features] Add support for Kitti semantic segmentation dataset #1602
base: master
Conversation
Per the discussion in issue #1599.
Hi @AkideLiu, thanks for your nice PR. We will review it as soon as possible. Please fix the lint error.
Codecov Report. Attention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1602 +/- ##
==========================================
- Coverage 89.04% 89.04% -0.01%
==========================================
Files 144 145 +1
Lines 8636 8643 +7
Branches 1458 1459 +1
==========================================
+ Hits 7690 7696 +6
- Misses 706 707 +1
Partials 240 240
Flags with carried forward coverage won't be shown.
Do you have related baseline or SOTA results on the KITTI semantic segmentation dataset?
Hi @MengzhangLI, I do not have baseline or SOTA results on this dataset because the methods used in the relevant publications are not implemented in mmseg. However, this dataset can be directly evaluated with pre-trained models or trained from scratch in mmseg, and I have successfully performed training and evaluation with UNet on it. I can provide some example training configurations, but I do not have the resources to perform distributed training to obtain a pre-trained model. For baseline references, see: https://paperswithcode.com/sota/semantic-segmentation-on-kitti-semantic
OK, would you mind updating your training results (using mmseg) and attaching results from other repos/papers in this PR? We could polish up this PR together and train some semantic segmentation models on our side (using our own 4x or 8x V100 GPUs).
Hi @MengzhangLI, I am planning to run around 1000 epochs on a single GPU for three well-known networks, UNet, DeepLabV3+ and PSPNet, as a baseline for this dataset. I will update my local test results once the experiments are finalized. At this stage, I am appending the UNet 1000-epoch results as follows:
I will fix the lint errors and update the configs soon. Could you please provide an email address where I can send the training logs? Additionally, will you help modify the training config for multi-GPU setups?
Looking forward to the training and evaluation results for different network architectures on your end. The log of the previous UNet run is attached below. Could you please help fix the lint issues?
I do not quite understand why the lint check failed...
This seems to be caused by an unsuccessful installation. Try to follow: https://github.com/open-mmlab/mmsegmentation/blob/master/.github/CONTRIBUTING.md If your OS is Ubuntu/Linux, the installation should be easy. After successful installation, use
configs/_base_/datasets/kitti_seg.py
Outdated
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (864, 256)
Hi @AkideLiu,
Could you provide some references for the crop_size setting? This doesn't seem to be a commonly used value.
I didn't have a reference for this crop size because this dataset is not very commonly used... I just selected arbitrary values that are multiples of 8. If you have any better suggestions, I am happy to change this crop size.
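The multiple-of-8 rationale can be checked mechanically. Below is a minimal, illustrative sketch (the helper name is hypothetical, not part of mmseg), assuming the relevant constraint is divisibility by the backbone's downsampling factor:

```python
def is_valid_crop(crop_size, divisor=8):
    """Return True if both crop dimensions are multiples of `divisor`.

    Encoder-decoder backbones typically downsample by powers of two,
    so crops divisible by the output stride avoid padding surprises.
    """
    return all(dim % divisor == 0 for dim in crop_size)

print(is_valid_crop((864, 256)))  # True: 864 = 8 * 108, 256 = 8 * 32
print(is_valid_crop((368, 368)))  # True: 368 = 8 * 46
```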
This project is the top-ranked one on Papers with Code for this dataset; it uses the same reference settings, and the following commit modified this crop size: https://github.com/NVIDIA/semantic-segmentation/blob/7726b144c2cc0b8e09c67eabb78f027efdf3f0fa/train.py#L149-L150
Hi @AkideLiu,
Thanks for the link; I took a quick look at that repo. I found that the crop_size should be set to 360 according to https://github.com/NVIDIA/semantic-segmentation/blob/2f548ab30ab0d56e91de66a4dea4757a0c64e7e4/scripts/train_kitti_WideResNet38.sh#L16. And in their paper, the crop_size on KITTI is set to 368; you can check it in the 4.1 Implementation Details section. Besides, they test their model in slide mode with crop_size set to 368, see the test script.
Hi @xiexinch, I have changed the crop size to 368 and changed test_cfg to use slide mode.
The stride is calculated manually as stride = ceil(tile_size[0] * (1 - overlap)) -> ceil(368 * (1 - 1/3)) = 246.
The overlap is not specified in the test script, so the default (1/3) has been used.
I am not sure about the implementation differences in sliding inference between mmseg and the NVIDIA project; if there are any remaining problems, please point them out.
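The stride arithmetic above can be sketched as a small helper (hypothetical, for illustration only; mmseg's test_cfg takes the stride value directly):

```python
import math

def slide_stride(tile_size, overlap=1 / 3):
    """Stride between adjacent tiles in sliding-window inference,
    given the fraction of each tile that should overlap its neighbor."""
    return math.ceil(tile_size * (1 - overlap))

print(slide_stride(368))  # ceil(368 * 2/3) = 246
```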
configs/_base_/datasets/kitti_seg.py
Outdated
dict(type='LoadImageFromFile'),
dict(
    type='MultiScaleFlipAug',
    img_scale=(1232, 368),
The img_scale may be set to the largest scale. See mmsegmentation/docs/en/tutorials/config.md, line 131 in 46326f6:
img_scale=(2048, 1024), # Decides the largest scale for testing, used for the Resize pipeline
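To illustrate how a largest-scale img_scale interacts with keep-ratio resizing, here is a sketch of the usual max-long-edge / max-short-edge rule (an assumption about the general approach; the exact rounding inside mmseg's Resize may differ slightly):

```python
def keep_ratio_resize(img_size, img_scale):
    """Scale (w, h) to fit within img_scale = (max_long, max_short)
    while preserving the aspect ratio."""
    w, h = img_size
    max_long, max_short = img_scale
    scale = min(max_long / max(w, h), max_short / min(w, h))
    return int(w * scale + 0.5), int(h * scale + 0.5)

# A raw KITTI frame is roughly 1242x375; with img_scale=(1232, 368)
# the short edge (375 -> 368) is the binding constraint.
print(keep_ratio_resize((1242, 375), (1232, 368)))  # (1219, 368)
```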
@xiexinch A new commit responds to the review; if you have any suggestions, please let me know.
Hi @AkideLiu
I'm searching for baselines on the KITTI dataset, since we'll do some experiments on it.
If you have any suggestions, feel free to let me know.
configs/_base_/datasets/kitti_seg.py
Outdated
@@ -0,0 +1,54 @@
data_root = 'data/kitti-seg/'
dataset_type = 'KittiSegDataset'
Suggested change:
- dataset_type = 'KittiSegDataset'
+ dataset_type = 'KittiDataset'
May rename to kitti.py
Hi @xiexinch, the dataset name includes seg because KITTI contains many different benchmarks, such as object detection and depth estimation; using kittiseg clarifies that this dataset is specifically for segmentation.
See more on the official website: http://www.cvlibs.net/datasets/kitti/
Happy to change it if you think KITTI instead of KITTISEG is more reasonable in this project.
As we will not add depth/flow estimation or detection tasks to mmseg, I think there will be no misunderstanding about 'KITTI'. Besides, mmflow and mmdet3d both use 'KITTI' instead of 'KITTI flow' or 'KITTI det3d', so 'KITTI' is easy to understand.
ref:
https://github.com/open-mmlab/mmflow/blob/master/mmflow/datasets/kitti2015.py
https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/datasets/kitti_dataset.py
@@ -0,0 +1,8 @@
_base_ = [
    '../_base_/models/deeplabv3plus_r50-d8.py',
    '../_base_/datasets/kitti_seg.py', '../_base_/default_runtime.py',
Suggested change:
- '../_base_/datasets/kitti_seg.py', '../_base_/default_runtime.py',
+ '../_base_/datasets/kitti.py', '../_base_/default_runtime.py',
@@ -0,0 +1,11 @@
_base_ = [
    '../_base_/models/fcn_unet_s5-d16.py', '../_base_/datasets/kitti_seg.py',
Suggested change:
- '../_base_/models/fcn_unet_s5-d16.py', '../_base_/datasets/kitti_seg.py',
+ '../_base_/models/fcn_unet_s5-d16.py', '../_base_/datasets/kitti.py',
mmseg/datasets/kitti_seg.py
Outdated
@@ -0,0 +1,12 @@
# Copyright (c) OpenMMLab. All rights reserved.
May rename to kitti.py
I am also working on this dataset to find an optimal solution, and one suggestion is to use transfer learning with weights pre-trained on Cityscapes.
Hi @AkideLiu,
Hi @MengzhangLI @AkideLiu
Hi @AkideLiu,
Hi @AkideLiu,
Hi @xiexinch, the conversion of the label ids is not required for this dataset. The official format can easily be adapted from the Cityscapes label policy, and the rest of the dataset configuration is identical to Cityscapes. For a quick start: download the zipped data from the official website linked in the PR description, unzip it, run the directory-structure conversion scripts provided in this PR, and modify the local directory to match the configuration files. Afterwards, feel free to start training.
I have briefly gone through this paper; it's quite a good baseline, as it has a distinct performance report and the implementation is open-source for reference.
I know what you mean, but your conversion script just splits the dataset. I tried to start training, but the official annotations cannot be used directly for training, only if I convert the
I do not fully understand this problem; could you explain more about this case?
Image annotations provided by Cityscapes and KITTI are annotated with label ids. For training, we must convert the label ids to train ids. You can read the code in cityscapesscripts. I'd like to know how your training ran if you didn't do this conversion.
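The conversion described above can be sketched with a lookup table. Note the mapping below is only an illustrative subset of the Cityscapes table; the authoritative mapping lives in cityscapesscripts (labels.py):

```python
import numpy as np

# Illustrative subset of the Cityscapes label-id -> train-id mapping
# (road, sidewalk, building, car); unmapped ids go to the ignore index.
LABEL_TO_TRAIN = {7: 0, 8: 1, 11: 2, 26: 13}
IGNORE_INDEX = 255

def labelid_to_trainid(label_map):
    """Vectorized conversion of a label-id annotation map to train ids."""
    lut = np.full(256, IGNORE_INDEX, dtype=np.uint8)
    for label_id, train_id in LABEL_TO_TRAIN.items():
        lut[label_id] = train_id
    return lut[label_map]

ann = np.array([[7, 8], [26, 3]], dtype=np.uint8)
print(labelid_to_trainid(ann).tolist())  # [[0, 1], [13, 255]]
```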
Hi @xiexinch, I will try to reproduce the training in a fresh environment and provide an update soon.
Apologies for the delay in progress; I was previously taking part in a competition highly similar to (a subset of) this dataset.
@xiexinch A solution for converting labels has been provided; could you review this PR? Reference: https://github.com/navganti/kitti_scripts/blob/master/semantics/devkit/kitti_relabel.py
Thanks for updating this PR, we'll review it asap. :)
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Please describe the motivation of this PR and the goal you want to achieve through this PR.
The KITTI semantic segmentation dataset is a lightweight dataset for semantic segmentation that shares the same label policy as Cityscapes. It's an excellent starting point for segmentation and can employ weights pre-trained on Cityscapes for transfer learning. Would you consider supporting this dataset?
http://www.cvlibs.net/datasets/kitti/eval_semseg.php?benchmark=semantics2015
Modification
Please briefly describe what modification is made in this PR.
BC-breaking (Optional)
Does the modification introduce changes that break the backward-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
NO BC
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist