Commit 65b1633

Merge pull request #183 from CSAILVision/hang-dev
Use configuration files to store most options which were in argument parser
2 parents 15b333d + 4e5d35d commit 65b1633

16 files changed

+737
-493
lines changed

README.md

+22-49
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,8 @@ This is a PyTorch implementation of semantic segmentation models on MIT ADE20K s
55
ADE20K is the largest open source dataset for semantic segmentation and scene parsing, released by the MIT Computer Vision team. Follow the link below to find the repository for our dataset and implementations on Caffe and Torch7:
66
https://github.com/CSAILVision/sceneparsing
77

8+
If you simply want to play with our demo, please try this link: http://scenesegmentation.csail.mit.edu. You can upload your own photo and segment it!
9+
810
All pretrained models can be found at:
911
http://sceneparsing.csail.mit.edu/model/pytorch
1012

@@ -15,6 +17,10 @@ http://sceneparsing.csail.mit.edu/model/pytorch
1517
Color encoding of semantic categories can be found here:
1618
https://docs.google.com/spreadsheets/d/1se8YEtb2detS7OuPE86fXGyD269pMycAWe2mtKUj2W8/edit?usp=sharing
1719

20+
## Updates
21+
- We now use configuration files to store most options that were previously in the argument parser. The option definitions are detailed in ```config/defaults.py```.
22+
23+
1824
## Highlights
1925

2026
### Synchronized Batch Normalization on PyTorch
@@ -43,9 +49,6 @@ Encoder:
4349
- ResNet50dilated
4450
- ResNet101dilated
4551

46-
***Coming soon***:
47-
- ResNeXt101dilated
48-
4952
Decoder:
5053
- C1 (1 convolution module)
5154
- C1_deepsup (C1 + deep supervision trick)
@@ -144,12 +147,6 @@ IMPORTANT: We use our self-trained base model on ImageNet. The model takes the i
144147
<td>Yes</td><td>42.66</td><td>81.01</td><td>61.84</td>
145148
<td>2.3</td>
146149
</tr>
147-
<tr>
148-
<td>UPerNet-ResNext101 (coming soon!)</td>
149-
<td>-</td><td>-</td><td>-</td><td>-</td>
150-
<td>-</td>
151-
<td>-</td>
152-
</tr>
153150
</tbody></table>
154151

155152
The training is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB GPU memory), ***except for*** ResNet101dilated, which is benchmarked on a server with 8 NVIDIA Tesla P40 GPUs (22GB GPU memory), because of insufficient memory when using dilated conv on a very deep network. The inference speed is benchmarked on a single NVIDIA Pascal Titan Xp GPU, without visualization.
@@ -158,6 +155,7 @@ The training is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB
158155
The code is developed under the following configurations.
159156
- Hardware: 1-8 GPUs (with at least 12GB of GPU memory each) (change ```[--gpus GPUS]``` accordingly)
160157
- Software: Ubuntu 16.04.3 LTS, ***CUDA>=8.0, Python>=3.5, PyTorch>=0.4.0***
158+
- Dependencies: numpy, scipy, opencv, yacs, tqdm
161159

162160
## Quick start: Test on an image using our trained model
163161
1. Here is a simple demo to do inference on a single image:
@@ -167,86 +165,61 @@ chmod +x demo_test.sh
167165
```
168166
This script downloads a trained model (ResNet50dilated + PPM_deepsup) and a test image, runs the test script, and saves predicted segmentation (.png) to the working directory.
169167

170-
2. To test on multiple images or a folder of images, you can simply do something as the following (```$PATH_IMG1, $PATH_IMG2, $PATH_IMG3```are your image paths):
168+
2. To test on an image or a folder of images (```$PATH_IMG```), you can simply do the following:
171169
```
172-
python3 -u test.py \
173-
--model_path $MODEL_PATH \
174-
--test_imgs $PATH_IMG1 $PATH_IMG2 $PATH_IMG3 \
175-
--arch_encoder resnet50dilated \
176-
--arch_decoder ppm_deepsup
170+
python3 -u test.py --imgs $PATH_IMG --gpu $GPU --cfg $CFG
177171
```
178172

179-
3. See full input arguments via ```python3 test.py -h```.
180-
181173
## Training
182174
1. Download the ADE20K scene parsing dataset:
183175
```bash
184176
chmod +x download_ADE20K.sh
185177
./download_ADE20K.sh
186178
```
187-
2. Train a model (default: ResNet50dilated + PPM_deepsup). During training, checkpoints will be saved in folder ```ckpt```.
179+
2. Train a model by selecting the GPUs (```$GPUS```) and configuration file (```$CFG```) to use. During training, checkpoints are saved in the ```ckpt``` folder by default.
188180
```bash
189-
python3 train.py --gpus GPUS
181+
python3 train.py --gpus $GPUS --cfg $CFG
190182
```
191-
192183
- To choose which gpus to use, you can either do ```--gpus 0-7```, or ```--gpus 0,2,4,6```.
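
As a rough illustration of the two ```--gpus``` formats above, the sketch below expands a spec string into a list of device indices. The function name ```parse_gpu_spec``` is hypothetical and this is not necessarily how ```train.py``` handles the flag; it only shows what the range and comma forms are assumed to mean.

```python
def parse_gpu_spec(spec):
    """Expand a spec such as "0-7" or "0,2,4,6" into a list of integer GPU ids.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Range form: both endpoints are inclusive, so "0-7" covers 8 GPUs.
            start, end = part.split("-")
            ids.extend(range(int(start), int(end) + 1))
        else:
            # Single index form.
            ids.append(int(part))
    return ids


print(parse_gpu_spec("0-7"))
print(parse_gpu_spec("0,2,4,6"))
```

Under this reading, ```0-7``` selects eight consecutive devices and ```0,2,4,6``` selects every other one.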
193184

194-
For example:
185+
For example, you can start with our provided configurations:
195186

196187
* Train MobileNetV2dilated + C1_deepsup
197188
```bash
198-
python3 train.py --gpus GPUS \
199-
--arch_encoder mobilenetv2dilated --arch_decoder c1_deepsup \
200-
--fc_dim 320
189+
python3 train.py --gpus GPUS --cfg config/ade20k-mobilenetv2dilated-c1_deepsup.yaml
201190
```
202191

203-
* Train ResNet18dilated + PPM_deepsup
192+
* Train ResNet50dilated + PPM_deepsup
204193
```bash
205-
python3 train.py --gpus GPUS \
206-
--arch_encoder resnet18dilated --arch_decoder ppm_deepsup \
207-
--fc_dim 512
194+
python3 train.py --gpus GPUS --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
208195
```
209196

210197
* Train UPerNet101
211198
```bash
212-
python3 train.py --gpus GPUS \
213-
--arch_encoder resnet101 --arch_decoder upernet \
214-
--segm_downsampling_rate 4 --padding_constant 32
199+
python3 train.py --gpus GPUS --cfg config/ade20k-resnet101-upernet.yaml
215200
```
216201

217-
3. See full input arguments via ```python3 train.py -h ```.
202+
3. You can also override options on the command line, for example ```python3 train.py TRAIN.num_epoch 10 ```.
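
To make the ```KEY VALUE``` override style concrete, here is a minimal standard-library sketch of how dotted keys such as ```TRAIN.num_epoch``` could be applied on top of defaults. The helper ```apply_overrides``` and the nested-dict layout are assumptions made for illustration; the repository itself relies on yacs, whose behavior may differ in details such as type checking.

```python
import ast
import copy


def apply_overrides(config, opts):
    """Return a copy of `config` with ["SECTION.key", "value", ...] pairs applied.

    Hypothetical helper for illustration only; not the yacs implementation.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    result = copy.deepcopy(config)
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node[name]  # raises KeyError for an unknown section
        if leaf not in node:
            raise KeyError(f"unknown option: {key}")
        try:
            # Turn "10" into 10 and "True" into True.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # anything that is not a Python literal stays a string
        node[leaf] = value
    return result


defaults = {"TRAIN": {"num_epoch": 20}, "VAL": {"visualize": False}}
updated = apply_overrides(defaults, ["TRAIN.num_epoch", "10", "VAL.visualize", "True"])
print(updated["TRAIN"]["num_epoch"], updated["VAL"]["visualize"])
```

Rejecting unknown keys, as this sketch does, catches typos in option names early instead of silently adding a new option.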
218203

219204

220205
## Evaluation
221-
1. Evaluate a trained model on the validation set. ```--id``` is the folder name under ```ckpt``` directory. ```--suffix``` defines which checkpoint to use, for example ```_epoch_20.pth```. Add ```--visualize``` option to output visualizations as shown in teaser.
222-
```bash
223-
python3 eval_multipro.py --gpus GPUS --id MODEL_ID --suffix SUFFIX
224-
```
206+
1. Evaluate a trained model on the validation set. Add ```VAL.visualize True``` to the arguments to output visualizations as shown in the teaser.
225207

226208
For example:
227209

228210
* Evaluate MobileNetV2dilated + C1_deepsup
229211
```bash
230-
python3 eval_multipro.py --gpus GPUS \
231-
--id MODEL_ID --suffix SUFFIX --arch_encoder mobilenetv2dilated --arch_decoder c1_deepsup \
232-
--fc_dim 320
212+
python3 eval_multipro.py --gpus GPUS --cfg config/ade20k-mobilenetv2dilated-c1_deepsup.yaml
233213
```
234214

235-
* Evaluate ResNet18dilated + PPM_deepsup
215+
* Evaluate ResNet50dilated + PPM_deepsup
236216
```bash
237-
python3 eval_multipro.py --gpus GPUS \
238-
--id MODEL_ID --suffix SUFFIX --arch_encoder resnet18dilated --arch_decoder ppm_deepsup \
239-
--fc_dim 512
217+
python3 eval_multipro.py --gpus GPUS --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
240218
```
241219

242220
* Evaluate UPerNet101
243221
```bash
244-
python3 eval_multipro.py --gpus GPUS \
245-
--id MODEL_ID --suffix SUFFIX --arch_encoder resnet101 --arch_decoder upernet \
246-
--padding_constant 32
247-
```
248-
249-
2. See full input arguments via ```python3 eval_multipro.py -h ```.
222+
python3 eval_multipro.py --gpus GPUS --cfg config/ade20k-resnet101-upernet.yaml
250223

251224
## Reference
252225

config/__init__.py

+1
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
from .defaults import _C as cfg
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
DATASET:
2+
root_dataset: "./data/"
3+
list_train: "./data/training.odgt"
4+
list_val: "./data/validation.odgt"
5+
num_class: 150
6+
imgSizes: (300, 375, 450, 525, 600)
7+
imgMaxSize: 1000
8+
padding_constant: 8
9+
segm_downsampling_rate: 8
10+
random_flip: True
11+
12+
MODEL:
13+
arch_encoder: "mobilenetv2dilated"
14+
arch_decoder: "c1_deepsup"
15+
weights_encoder: ""
16+
weights_decoder: ""
17+
fc_dim: 320
18+
19+
TRAIN:
20+
batch_size_per_gpu: 2
21+
num_epoch: 20
22+
start_epoch: 0
23+
epoch_iters: 5000
24+
optim: "SGD"
25+
lr_encoder: 0.02
26+
lr_decoder: 0.02
27+
lr_pow: 0.9
28+
beta1: 0.9
29+
weight_decay: 1e-4
30+
deep_sup_scale: 0.4
31+
fix_bn: False
32+
workers: 16
33+
disp_iter: 20
34+
seed: 304
35+
36+
VAL:
37+
visualize: False
38+
suffix: "_epoch_20.pth"
39+
40+
TEST:
41+
suffix: "_epoch_20.pth"
42+
result: "./"
43+
44+
DIR: "ckpt/ade20k-mobilenetv2dilated-c1_deepsup"

config/ade20k-resnet101-upernet.yaml

+44
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
DATASET:
2+
root_dataset: "./data/"
3+
list_train: "./data/training.odgt"
4+
list_val: "./data/validation.odgt"
5+
num_class: 150
6+
imgSizes: (300, 375, 450, 525, 600)
7+
imgMaxSize: 1000
8+
padding_constant: 32
9+
segm_downsampling_rate: 4
10+
random_flip: True
11+
12+
MODEL:
13+
arch_encoder: "resnet101"
14+
arch_decoder: "upernet"
15+
weights_encoder: ""
16+
weights_decoder: ""
17+
fc_dim: 2048
18+
19+
TRAIN:
20+
batch_size_per_gpu: 2
21+
num_epoch: 40
22+
start_epoch: 0
23+
epoch_iters: 5000
24+
optim: "SGD"
25+
lr_encoder: 0.02
26+
lr_decoder: 0.02
27+
lr_pow: 0.9
28+
beta1: 0.9
29+
weight_decay: 1e-4
30+
deep_sup_scale: 0.4
31+
fix_bn: False
32+
workers: 16
33+
disp_iter: 20
34+
seed: 304
35+
36+
VAL:
37+
visualize: False
38+
suffix: "_epoch_40.pth"
39+
40+
TEST:
41+
suffix: "_epoch_40.pth"
42+
result: "./"
43+
44+
DIR: "ckpt/ade20k-resnet101-upernet"
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
DATASET:
2+
root_dataset: "./data/"
3+
list_train: "./data/training.odgt"
4+
list_val: "./data/validation.odgt"
5+
num_class: 150
6+
imgSizes: (300, 375, 450, 525, 600)
7+
imgMaxSize: 1000
8+
padding_constant: 8
9+
segm_downsampling_rate: 8
10+
random_flip: True
11+
12+
MODEL:
13+
arch_encoder: "resnet50dilated"
14+
arch_decoder: "ppm_deepsup"
15+
weights_encoder: ""
16+
weights_decoder: ""
17+
fc_dim: 2048
18+
19+
TRAIN:
20+
batch_size_per_gpu: 2
21+
num_epoch: 20
22+
start_epoch: 0
23+
epoch_iters: 5000
24+
optim: "SGD"
25+
lr_encoder: 0.02
26+
lr_decoder: 0.02
27+
lr_pow: 0.9
28+
beta1: 0.9
29+
weight_decay: 1e-4
30+
deep_sup_scale: 0.4
31+
fix_bn: False
32+
workers: 16
33+
disp_iter: 20
34+
seed: 304
35+
36+
VAL:
37+
visualize: False
38+
suffix: "_epoch_20.pth"
39+
40+
TEST:
41+
suffix: "_epoch_20.pth"
42+
result: "./"
43+
44+
DIR: "ckpt/ade20k-resnet50dilated-ppm_deepsup"
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
DATASET:
2+
root_dataset: "./data/"
3+
list_train: "./data/training.odgt"
4+
list_val: "./data/validation.odgt"
5+
num_class: 150
6+
imgSizes: (300, 375, 450, 525, 600)
7+
imgMaxSize: 1000
8+
padding_constant: 8
9+
segm_downsampling_rate: 8
10+
random_flip: True
11+
12+
MODEL:
13+
arch_encoder: "resnet18dilated"
14+
arch_decoder: "ppm_deepsup"
15+
weights_encoder: ""
16+
weights_decoder: ""
17+
fc_dim: 512
18+
19+
TRAIN:
20+
batch_size_per_gpu: 2
21+
num_epoch: 20
22+
start_epoch: 0
23+
epoch_iters: 5000
24+
optim: "SGD"
25+
lr_encoder: 0.02
26+
lr_decoder: 0.02
27+
lr_pow: 0.9
28+
beta1: 0.9
29+
weight_decay: 1e-4
30+
deep_sup_scale: 0.4
31+
fix_bn: False
32+
workers: 16
33+
disp_iter: 20
34+
seed: 304
35+
36+
VAL:
37+
visualize: False
38+
suffix: "_epoch_20.pth"
39+
40+
TEST:
41+
suffix: "_epoch_20.pth"
42+
result: "./"
43+
44+
DIR: "ckpt/ade20k-resnet18dilated-ppm_deepsup"
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
DATASET:
2+
root_dataset: "./data/"
3+
list_train: "./data/training.odgt"
4+
list_val: "./data/validation.odgt"
5+
num_class: 150
6+
imgSizes: (300, 375, 450, 525, 600)
7+
imgMaxSize: 1000
8+
padding_constant: 8
9+
segm_downsampling_rate: 8
10+
random_flip: True
11+
12+
MODEL:
13+
arch_encoder: "resnet50dilated"
14+
arch_decoder: "ppm_deepsup"
15+
weights_encoder: ""
16+
weights_decoder: ""
17+
fc_dim: 2048
18+
19+
TRAIN:
20+
batch_size_per_gpu: 2
21+
num_epoch: 20
22+
start_epoch: 0
23+
epoch_iters: 5000
24+
optim: "SGD"
25+
lr_encoder: 0.02
26+
lr_decoder: 0.02
27+
lr_pow: 0.9
28+
beta1: 0.9
29+
weight_decay: 1e-4
30+
deep_sup_scale: 0.4
31+
fix_bn: False
32+
workers: 16
33+
disp_iter: 20
34+
seed: 304
35+
36+
VAL:
37+
visualize: False
38+
suffix: "_epoch_20.pth"
39+
40+
TEST:
41+
suffix: "_epoch_20.pth"
42+
result: "./"
43+
44+
DIR: "ckpt/ade20k-resnet50dilated-ppm_deepsup"
