Fixes on top of Tramac's repository. Because training runs on Google Colab, where the session can be disconnected, I added a snippet that automatically uploads to Google Drive.
- \scripts\train.py :
- class Trainer(object):
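In the call self.criterion = get_segmentation_loss, pass one extra argument (nclass = xx).
Elsewhere, nclass can be obtained with nclass=datasets[dataset].NUM_CLASS, but that requires from dataloader import datasets.
That module is too far away from train.py, so for now the value is entered by hand.
- add function save_to_Gdrive:
I also added a package for the Colab upload; the modified colab+pydrive is included in this repository as well.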
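As a rough illustration of the idea behind save_to_Gdrive, here is a minimal sketch that copies a checkpoint file into a Drive folder. It is not the pydrive-based code in this repository: it assumes Google Drive is already available as a normal directory (for example a Drive mount on Colab), and the function name `copy_to_drive` and all paths are hypothetical.

```python
import shutil
from pathlib import Path


def copy_to_drive(checkpoint_path, drive_dir):
    """Copy one checkpoint file into a (mounted) Drive directory.

    Hypothetical helper: assumes `drive_dir` is a regular folder, e.g. a
    Google Drive mount on Colab. Returns the path of the copied file.
    """
    src = Path(checkpoint_path)
    dst_dir = Path(drive_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    dst = dst_dir / src.name
    shutil.copy2(src, dst)  # copy2 also preserves file timestamps
    return dst
```

Calling such a helper right after each checkpoint is written would keep a copy outside the Colab session, so a disconnect loses at most the work since the last save.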
- \core\data\dataloader\mydata.py:
Adapted from cityscapes.py. The label files in my own dataset are black-and-white PNGs, but they are stored with three channels. They need to be converted to grayscale; otherwise evaluation (score.py) fails with an error because the tensor sizes do not match.
- def __getitem__(self, index):
mask = Image.open(self.mask_paths[index]).convert('L')
Error without the conversion:
File "/content/awesome-semantic-segmentation-pytorch/core/utils/score.py", line 76, in batch_pix_accuracy
pixel_correct = torch.sum((predict == target) * (target > 0)).item()
RuntimeError: The size of tensor a (640) must match the size of tensor b (3) at non-singleton dimension 3
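The effect of convert('L') can be checked in isolation. The sketch below uses a synthetic image as a stand-in for a label file (the 640x480 size and the white fill are assumptions for the demo) and shows the image going from three bands to one, which is consistent with the trailing dimension of size 3 in the error above.

```python
from PIL import Image

# Stand-in for a label file: a black-and-white image stored as 3-channel RGB.
rgb_mask = Image.new("RGB", (640, 480), color=(255, 255, 255))
print(rgb_mask.mode, rgb_mask.getbands())    # RGB ('R', 'G', 'B')

# convert('L') collapses it to a single 8-bit channel, one value per pixel.
gray_mask = rgb_mask.convert("L")
print(gray_mask.mode, gray_mask.getbands())  # L ('L',)
```

The width and height are unchanged by the conversion; only the channel axis disappears.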
- \core\models\base_models\resnetv1b.py:
kwargs was not accepted, which causes a TypeError when using your own dataset.
class ResNetV1b(nn.Module): add **kwargs in __init__()
TypeError: __init__() got an unexpected keyword argument 'local_rank'
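The mechanism behind this TypeError can be reproduced without PyTorch. In the sketch below, `StrictNet` and `TolerantNet` are hypothetical stand-ins for a model class before and after adding **kwargs; they are not the actual ResNetV1b code.

```python
class StrictNet:
    """Hypothetical model whose __init__ accepts only the arguments it names."""

    def __init__(self, num_classes=19):
        self.num_classes = num_classes


class TolerantNet:
    """Same model, but extra keyword arguments are accepted and ignored."""

    def __init__(self, num_classes=19, **kwargs):
        self.num_classes = num_classes


extra = {"num_classes": 2, "local_rank": 0}

try:
    StrictNet(**extra)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)   # mentions the unexpected keyword 'local_rank'

net = TolerantNet(**extra)    # works: local_rank is absorbed by **kwargs
```

Accepting **kwargs lets options meant for other parts of the pipeline pass through the constructor without breaking it, at the cost of silently ignoring misspelled argument names.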
- \scripts\demo.py:
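The full set of parsed arguments was not passed to get_model; add **vars(args).
get_model reads its options from kwargs (dictionary-style lookup). If only args is passed, kwargs is empty, so the arguments are passed in dictionary form as shown here: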
model = get_model(args.model, **vars(args), pretrained=True, root=args.save_folder).to(device)
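To see why **vars(args) matters, here is a small self-contained sketch. The `get_model` below is a hypothetical stand-in that only reports what it received, and the two command-line options are chosen for the demo; the real function builds a network.

```python
import argparse


def get_model(name, **kwargs):
    """Hypothetical stand-in: reads its options from kwargs by key."""
    return {"name": name, "dataset": kwargs.get("dataset"), "n_kwargs": len(kwargs)}


parser = argparse.ArgumentParser()
parser.add_argument("--model", default="fcn32s_vgg16_voc")
parser.add_argument("--dataset", default="pascal_voc")
args = parser.parse_args([])          # use the defaults for this demo

without = get_model(args.model)                 # kwargs is empty
with_all = get_model(args.model, **vars(args))  # kwargs holds every parsed option
```

vars(args) returns the Namespace's attributes as a dictionary, and the ** operator spreads that dictionary into keyword arguments, so every parsed option becomes reachable through kwargs.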
- model_zoo.py:
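in function get_icnet_resnet50_citys:
kwargs.pop("dataset")
net = _models[name](**kwargs)
return net
Which model is used is decided by the initial --model argument.
If dataset is passed on to the model in _models, dataset ends up being specified twice, so it is removed here first (each model in _models sets dataset itself).
The argument cannot simply be dropped from the start, because demo.py needs it:
mask = get_color_pallete(pred, args.dataset)
PS. For some reason, --dataset citys could never be passed successfully, so I changed the default to citys.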
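The "specified twice" problem can be shown with plain functions. In this sketch, `build_icnet` and `get_icnet_citys` are hypothetical stand-ins for a model constructor and a wrapper that already fixes the dataset; they only mimic the calling pattern, not the real model_zoo.py code.

```python
def build_icnet(dataset="citys", **kwargs):
    """Hypothetical constructor registered in a _models-style table."""
    return {"dataset": dataset, "extra": kwargs}


def get_icnet_citys(**kwargs):
    """Hypothetical wrapper that already fixes the dataset itself."""
    return build_icnet(dataset="citys", **kwargs)


cli_kwargs = {"dataset": "citys", "local_rank": 0}

try:
    get_icnet_citys(**cli_kwargs)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = str(exc)   # reports multiple values for 'dataset'

cli_kwargs.pop("dataset")        # drop the duplicate before forwarding
net = get_icnet_citys(**cli_kwargs)
```

Popping the key at the point of forwarding keeps the value available to the caller (here, demo.py still has args.dataset) while the wrapped constructor sees it only once.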
This project aims at providing a concise, easy-to-use, modifiable reference implementation for semantic segmentation models using PyTorch.
# semantic-segmentation-pytorch dependencies
pip install ninja tqdm
# follow PyTorch installation in https://pytorch.org/get-started/locally/
conda install pytorch torchvision -c pytorch
# install PyTorch Segmentation
git clone https://github.com/Tramac/awesome-semantic-segmentation-pytorch.git
# the following will install the lib with symbolic links, so that you can modify
# the files if you want and won't need to re-build it
cd awesome-semantic-segmentation-pytorch/core/nn
python setup.py build develop
- Single GPU training
# for example, train fcn32_vgg16_pascal_voc:
python train.py --model fcn32s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 50
- Multi-GPU training
# for example, train fcn32_vgg16_pascal_voc with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --model fcn32s --backbone vgg16 --dataset pascal_voc --lr 0.0001 --epochs 50
- Single GPU evaluating
# for example, evaluate fcn32_vgg16_pascal_voc
python eval.py --model fcn32s --backbone vgg16 --dataset pascal_voc
- Multi-GPU evaluating
# for example, evaluate fcn32_vgg16_pascal_voc with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS eval.py --model fcn32s --backbone vgg16 --dataset pascal_voc
cd ./scripts
python demo.py --model fcn32s_vgg16_voc --input-pic ./datasets/test.jpg
.{SEG_ROOT}
├── scripts
│ ├── demo.py
│ ├── eval.py
│ └── train.py
- FCN
- ENet
- PSPNet
- ICNet
- DeepLabv3
- DeepLabv3+
- DenseASPP
- EncNet
- BiSeNet
- PSANet
- DANet
- OCNet
- CGNet
- ESPNetv2
- CCNet
- DUNet(DUpsampling)
- FastFCN(JPU)
- LEDNet
- Fast-SCNN
- LightSeg
- DFANet
DETAILS for model & backbone.
.{SEG_ROOT}
├── core
│ ├── models
│ │ ├── bisenet.py
│ │ ├── danet.py
│ │ ├── deeplabv3.py
│ │ ├── deeplabv3+.py
│ │ ├── denseaspp.py
│ │ ├── dunet.py
│ │ ├── encnet.py
│ │ ├── fcn.py
│ │ ├── pspnet.py
│ │ ├── icnet.py
│ │ ├── enet.py
│ │ ├── ocnet.py
│ │ ├── ccnet.py
│ │ ├── psanet.py
│ │ ├── cgnet.py
│ │ ├── espnet.py
│ │ ├── lednet.py
│ │ ├── dfanet.py
│ │ ├── ......
You can run script to download dataset, such as:
cd ./core/data/downloader
python ade20k.py --download-dir ../datasets/ade
Dataset | training set | validation set | testing set |
---|---|---|---|
VOC2012 | 1464 | 1449 | ✘ |
VOCAug | 11355 | 2857 | ✘ |
ADE20K | 20210 | 2000 | ✘ |
Cityscapes | 2975 | 500 | ✘ |
COCO | | | |
SBU-shadow | 4085 | 638 | ✘ |
LIP(Look into Person) | 30462 | 10000 | 10000 |
.{SEG_ROOT}
├── core
│ ├── data
│ │ ├── dataloader
│ │ │ ├── ade.py
│ │ │ ├── cityscapes.py
│ │ │ ├── mscoco.py
│ │ │ ├── pascal_aug.py
│ │ │ ├── pascal_voc.py
│ │ │ ├── sbu_shadow.py
│ │ └── downloader
│ │ ├── ade20k.py
│ │ ├── cityscapes.py
│ │ ├── mscoco.py
│ │ ├── pascal_voc.py
│ │ └── sbu_shadow.py
- PASCAL VOC 2012
Methods | Backbone | TrainSet | EvalSet | crops_size | epochs | JPU | Mean IoU | pixAcc |
---|---|---|---|---|---|---|---|---|
FCN32s | vgg16 | train | val | 480 | 60 | ✘ | 47.50 | 85.39 |
FCN16s | vgg16 | train | val | 480 | 60 | ✘ | 49.16 | 85.98 |
FCN8s | vgg16 | train | val | 480 | 60 | ✘ | 48.87 | 85.02 |
FCN32s | resnet50 | train | val | 480 | 50 | ✘ | 54.60 | 88.57 |
PSPNet | resnet50 | train | val | 480 | 60 | ✘ | 63.44 | 89.78 |
DeepLabv3 | resnet50 | train | val | 480 | 60 | ✘ | 60.15 | 88.36 |
Note: lr=1e-4, batch_size=4, epochs=80.
See TEST for details.
.{SEG_ROOT}
├── tests
│ └── test_model.py
- add train script
- remove syncbn
- train & evaluate
- test distributed training
- fix syncbn (Why SyncBN?)
- add distributed (How DIST?)