Add fp16 training for ResNeXt101 #4946

Open · wants to merge 61 commits into base: release/1.8

Commits (61)
ddeb925
Update API usage according to 1.8 recommendations (#4657)
FlyingQianMM May 22, 2020
edf1a87
add VideoTag to video (#4666)
huangjun12 May 28, 2020
b0239e3
change some model using data loader (#4595)
chenwhql May 28, 2020
856c428
Makeing script more flexible (#4681)
jczaja Jun 3, 2020
20d1e9b
update new reader for resnet, mobilenet; test=develop (#4685)
phlrain Jun 9, 2020
5c6dded
[dygraph] Polish the timing method and log of some dygraph models. (#…
Xreki Jun 16, 2020
c5bfe4b
Update docs (#4707)
ceci3 Jun 19, 2020
5c244bf
fix paddlers readme (#4717)
frankwhzhang Jun 23, 2020
09f7796
fix fconv in paddle 1.8 (#4705)
LDOUBLEV Jul 1, 2020
2293e33
Update VOT code: add SiamRPN and SiamMask (#4734)
xbsu Jul 2, 2020
365fe58
Update README.md (#4739)
anpark Jul 5, 2020
131a315
add JiebaTokenizer demo (#4747)
Jul 10, 2020
a70288d
Fix concat (#4755)
ceci3 Jul 16, 2020
a7fb45f
update the key word of mobilenet log (#4766)
hysunflower Jul 27, 2020
eb7eb9c
remove unused code in ml (#4781)
Jul 30, 2020
64cde5d
Update run_ernie_classifier.py (#4790)
ChinaLiuHao Aug 6, 2020
f9f0d30
Enable CPU training for DyGraph MNIST Resnet (#4824)
arlesniak Sep 1, 2020
096fa39
Fix logging in transformer dygraph (#4827)
qingqing01 Sep 1, 2020
2c8b76b
add slowfast model to video classification (#4815)
huangjun12 Sep 1, 2020
e320130
support data_parallel training and ucf101 dataset (#4819)
chajchaj Sep 1, 2020
6726ad5
Update Pix2pix_network.py (#4829)
DrRyanHuang Sep 2, 2020
12080a0
fix dygraph reader (#4832)
Sep 3, 2020
bc07a01
Transfer the value of stop_gradient for feeding data. (#4831)
Xreki Sep 3, 2020
7a36ec5
Fix random seed for language model in static mode (#4836)
LiuChiachi Sep 7, 2020
4257b82
add tsn model based on paddle 2.0 platform (#4837)
LiuChaoXD Sep 7, 2020
a00c8af
fix resnet50 usetime statistics (#4838)
wanghuancoder Sep 8, 2020
a33f081
update np.float16 usage (#4851)
luotao1 Sep 12, 2020
08f3c0b
add M3D-RPN model (#4822)
shuluoshu Sep 14, 2020
22cf383
fix slowfast interface bug caused by the movement of hapi dir (#4834)
huangjun12 Sep 14, 2020
bde994e
Refine some configurations in TSN model (#4853)
LiuChaoXD Sep 15, 2020
4d1187d
update tsn Reader using dataloader and pipline (#4856)
huangjun12 Sep 16, 2020
295c16b
Update mpii_reader.py (#4862)
a2824256 Sep 18, 2020
db6ce5e
fix language model time print (#4865)
wanghuancoder Sep 22, 2020
cf186f3
fix ptb_dy time print for benchmark, test=develop (#4866)
wanghuancoder Sep 22, 2020
e07327e
fix mobilenet model time print (#4867)
luotao1 Sep 22, 2020
ba9a787
fix resnet usetime bug (#4869)
wanghuancoder Sep 23, 2020
38ada7f
fix resnet dygraph model time print (#4868)
luotao1 Sep 23, 2020
0739cc7
use pre-commit formate code (#4870)
wanghuancoder Sep 23, 2020
b9b8c88
use pre-commit formate code ptb_dy.py (#4871)
wanghuancoder Sep 23, 2020
93c4daa
Calculate the average time for gan models when benchmarking. (#4873)
Xreki Sep 23, 2020
fa73c7f
add enable_static() (#4879)
zhiqiu Sep 24, 2020
f09c442
add ips for dygraph mobilenet and resnet models (#4883)
luotao1 Sep 24, 2020
00b7796
add sequece/sec; test=develop (#4877)
phlrain Sep 25, 2020
fd2ff20
add words/sec; test=develop (#4878)
phlrain Sep 25, 2020
8a31b1c
add tokens per sec; test=develop (#4875)
phlrain Sep 25, 2020
69557e4
add tokens per sec in transformer (#4874)
phlrain Sep 25, 2020
c91cb2c
add ips print for ptb_lm (#4886)
wanghuancoder Sep 25, 2020
58fe1a3
add ips print for language_model (#4887)
wanghuancoder Sep 25, 2020
4000dfb
refine benchmark log (#4888)
luotao1 Sep 27, 2020
afaf06e
refine resnet benchmard print (#4893)
wanghuancoder Sep 29, 2020
c4ff279
Delete PaddleRec model (#4872)
frankwhzhang Oct 10, 2020
16c1da5
upgrade to API2.0 (#4880)
shippingwang Oct 12, 2020
3fad507
revert PR4893 and use Xreki‘s Code (#4902)
wanghuancoder Oct 13, 2020
5f18785
Update2.0 model (#4905)
frankwhzhang Oct 14, 2020
294ff30
add fuse_bn_add_act_ops args (#4864)
zhangting2020 Oct 15, 2020
a25c065
fix permute api to transpose (#4913)
Oct 21, 2020
2392894
pad input to use tensor core (#4911)
zhangting2020 Oct 22, 2020
60d045d
support enable_addto (#4909)
zhiqiu Oct 22, 2020
b480de5
Revert "add fuse_bn_add_act_ops args" (#4914)
zhangting2020 Oct 27, 2020
8b769e2
Add fp16 training for ResNeXt101
huangxu96 Nov 11, 2020
b3509b8
Added training script
huangxu96 Nov 12, 2020
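
The commits above culminate in fp16 (mixed-precision) training for ResNeXt101 against the release/1.8 base. For orientation, here is a minimal, hedged sketch of how AMP is typically enabled with the Paddle 1.8 static-graph API; it is not necessarily the exact code added by these commits, and `resnext101`, the shapes, and the hyperparameters are placeholders.

```python
import paddle.fluid as fluid
from paddle.fluid.contrib.mixed_precision import decorate


def build_train_program(resnext101, use_fp16=True):
    # `resnext101` is a placeholder for the model-building function in this repo.
    image = fluid.data(name='image', shape=[None, 3, 224, 224], dtype='float32')
    label = fluid.data(name='label', shape=[None, 1], dtype='int64')
    logits = resnext101(image, class_dim=1000)
    loss = fluid.layers.mean(
        fluid.layers.softmax_with_cross_entropy(logits, label))

    optimizer = fluid.optimizer.Momentum(learning_rate=0.1, momentum=0.9)
    if use_fp16:
        # AMP wrapper: keeps float32 master weights and applies dynamic loss
        # scaling so small fp16 gradients do not underflow to zero.
        optimizer = decorate(optimizer,
                             init_loss_scaling=128.0,
                             use_dynamic_loss_scaling=True)
    optimizer.minimize(loss)
    return loss
```

The decorated optimizer inserts the float16 casts around whitelisted ops and keeps float32 master weights, so the model definition itself does not need to change for fp16 training.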
72 changes: 72 additions & 0 deletions PaddleCV/3d_vision/M3D-RPN/README.md
@@ -0,0 +1,72 @@
# M3D-RPN: Monocular 3D Region Proposal Network for Object Detection



## Introduction


M3D-RPN is a monocular 3D region proposal network for object detection, accepted to ICCV 2019 (Oral) and detailed in the [arXiv report](https://arxiv.org/abs/1907.06038).




## Setup

- **Cuda & Python**

In this project we use PaddlePaddle 1.8 with Python 3, CUDA 9, and a few Anaconda packages.

- **Data**

Download the full [KITTI](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d) detection dataset. Then place a softlink (or the actual data) at *M3D-RPN/dataset/kitti*, as in the command below.

```
cd M3D-RPN
ln -s /path/to/kitti dataset/kitti
```

Then use the following scripts to extract the data splits, which use softlinks to the above directory for efficient storage.

```
python dataset/kitti_split1/setup_split.py
python dataset/kitti_split2/setup_split.py
```

Next, build the KITTI devkit eval for each split.

```
sh dataset/kitti_split1/devkit/cpp/build.sh
sh dataset/kitti_split2/devkit/cpp/build.sh
```

Lastly, build the NMS modules:

```
cd lib/nms
make
```

## Training


Training is split into warmup and main configurations; review the configurations in *config* for details.

```
# First train the warmup (without depth-aware)
python train.py --config=kitti_3d_multi_warmup

# Then train the main experiment (with depth-aware)
python train.py --config=kitti_3d_multi_main
```



## Testing

We provide models for the main experiments on the val1 data split, available for download here: [M3D-RPN-release.tar](https://pan.baidu.com/s/1VQa5hGzIbauLOQi-0kR9Hg), password: ls39.

Testing requires the paths to the configuration file and the model weights, which are exposed as variables near the top of *test.py*. To test a configuration and model, simply update the variables and run the test script as below.

```
python test.py --conf_path M3D-RPN-release/conf.pkl --weights_path M3D-RPN-release/iter50000.0_params.pdparams
```
145 changes: 145 additions & 0 deletions PaddleCV/3d_vision/M3D-RPN/config/kitti_3d_multi_main.py
@@ -0,0 +1,145 @@
"""
config of main
"""
from easydict import EasyDict as edict
import numpy as np


def Config():
    """
    config
    """
    conf = edict()

    # ----------------------------------------
    # general
    # ----------------------------------------

    conf.model = 'model_3d_dilate_depth_aware'

    # solver settings
    conf.solver_type = 'sgd'
    conf.lr = 0.004
    conf.momentum = 0.9
    conf.weight_decay = 0.0005
    conf.max_iter = 50000
    conf.snapshot_iter = 10000
    conf.display = 20
    conf.do_test = True

    # sgd parameters
    conf.lr_policy = 'poly'
    conf.lr_steps = None
    conf.lr_target = conf.lr * 0.00001

    # random
    conf.rng_seed = 2
    conf.cuda_seed = 2

    # misc network
    conf.image_means = [0.485, 0.456, 0.406]
    conf.image_stds = [0.229, 0.224, 0.225]
    conf.feat_stride = 16

    conf.has_3d = True

    # ----------------------------------------
    # image sampling and datasets
    # ----------------------------------------

    # scale sampling
    conf.test_scale = 512
    conf.crop_size = [512, 1760]
    conf.mirror_prob = 0.50
    conf.distort_prob = -1

    # datasets
    conf.dataset_test = 'kitti_split1'
    conf.datasets_train = [{
        'name': 'kitti_split1',
        'anno_fmt': 'kitti_det',
        'im_ext': '.png',
        'scale': 1
    }]
    conf.use_3d_for_2d = True

    # percent expected height ranges based on test_scale
    # used for anchor selection
    conf.percent_anc_h = [0.0625, 0.75]

    # labels settings
    conf.min_gt_h = conf.test_scale * conf.percent_anc_h[0]
    conf.max_gt_h = conf.test_scale * conf.percent_anc_h[1]
    conf.min_gt_vis = 0.65
    conf.ilbls = ['Van', 'ignore']
    conf.lbls = ['Car', 'Pedestrian', 'Cyclist']

    # ----------------------------------------
    # detection sampling
    # ----------------------------------------

    # detection sampling
    conf.batch_size = 2
    conf.fg_image_ratio = 1.0
    conf.box_samples = 0.20
    conf.fg_fraction = 0.20
    conf.bg_thresh_lo = 0
    conf.bg_thresh_hi = 0.5
    conf.fg_thresh = 0.5
    conf.ign_thresh = 0.5
    conf.best_thresh = 0.35

    # ----------------------------------------
    # inference and testing
    # ----------------------------------------

    # nms
    conf.nms_topN_pre = 3000
    conf.nms_topN_post = 40
    conf.nms_thres = 0.4
    conf.clip_boxes = False

    conf.test_protocol = 'kitti'
    conf.test_db = 'kitti'
    conf.test_min_h = 0
    conf.min_det_scales = [0, 0]

    # ----------------------------------------
    # anchor settings
    # ----------------------------------------

    # clustering settings
    conf.cluster_anchors = 0
    conf.even_anchors = 0
    conf.expand_anchors = 0

    conf.anchors = None

    conf.bbox_means = None
    conf.bbox_stds = None

    # initialize anchors
    base = (conf.max_gt_h / conf.min_gt_h)**(1 / (12 - 1))
    conf.anchor_scales = np.array(
        [conf.min_gt_h * (base**i) for i in range(0, 12)])
    conf.anchor_ratios = np.array([0.5, 1.0, 1.5])

    # loss logic
    conf.hard_negatives = True
    conf.focal_loss = 0
    conf.cls_2d_lambda = 1
    conf.iou_2d_lambda = 1
    conf.bbox_2d_lambda = 0
    conf.bbox_3d_lambda = 1
    conf.bbox_3d_proj_lambda = 0.0

    conf.hill_climbing = True

    conf.bins = 32

    # visdom
    conf.visdom_port = 8100

    conf.pretrained = 'paddle.pdparams'

    return conf
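
As a worked illustration (not part of the diff) of the anchor initialization in the config above: with `test_scale = 512` and `percent_anc_h = [0.0625, 0.75]`, the 12 anchor scales are spaced geometrically between the minimum and maximum expected ground-truth heights.

```python
import numpy as np

# Reproduces the anchor-scale initialization from the config above.
test_scale = 512
min_gt_h = test_scale * 0.0625                   # 32.0 px
max_gt_h = test_scale * 0.75                     # 384.0 px
base = (max_gt_h / min_gt_h) ** (1 / (12 - 1))   # ~1.2535, geometric step
anchor_scales = np.array([min_gt_h * base ** i for i in range(12)])
# ~[32.0, 40.1, 50.3, 63.0, 79.0, 99.0, 124.1, 155.6, 195.0, 244.4, 306.4, 384.0]
```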
143 changes: 143 additions & 0 deletions PaddleCV/3d_vision/M3D-RPN/config/kitti_3d_multi_warmup.py
@@ -0,0 +1,143 @@
"""
config of warmup
"""

from easydict import EasyDict as edict
import numpy as np


def Config():
    """
    config
    """
    conf = edict()

    # ----------------------------------------
    # general
    # ----------------------------------------

    conf.model = 'model_3d_dilate'
    # solver settings
    conf.solver_type = 'sgd'
    conf.lr = 0.004
    conf.momentum = 0.9
    conf.weight_decay = 0.0005
    conf.max_iter = 50000
    conf.snapshot_iter = 10000
    conf.display = 20
    conf.do_test = True

    # sgd parameters
    conf.lr_policy = 'poly'
    conf.lr_steps = None
    conf.lr_target = conf.lr * 0.00001

    # random
    conf.rng_seed = 2
    conf.cuda_seed = 2

    # misc network
    conf.image_means = [0.485, 0.456, 0.406]
    conf.image_stds = [0.229, 0.224, 0.225]
    conf.feat_stride = 16

    conf.has_3d = True

    # ----------------------------------------
    # image sampling and datasets
    # ----------------------------------------

    # scale sampling
    conf.test_scale = 512
    conf.crop_size = [512, 1760]
    conf.mirror_prob = 0.50
    conf.distort_prob = -1

    # datasets
    conf.dataset_test = 'kitti_split1'
    conf.datasets_train = [{
        'name': 'kitti_split1',
        'anno_fmt': 'kitti_det',
        'im_ext': '.png',
        'scale': 1
    }]
    conf.use_3d_for_2d = True

    # percent expected height ranges based on test_scale
    # used for anchor selection
    conf.percent_anc_h = [0.0625, 0.75]

    # labels settings
    conf.min_gt_h = conf.test_scale * conf.percent_anc_h[0]
    conf.max_gt_h = conf.test_scale * conf.percent_anc_h[1]
    conf.min_gt_vis = 0.65
    conf.ilbls = ['Van', 'ignore']
    conf.lbls = ['Car', 'Pedestrian', 'Cyclist']

    # ----------------------------------------
    # detection sampling
    # ----------------------------------------

    # detection sampling
    conf.batch_size = 2
    conf.fg_image_ratio = 1.0
    conf.box_samples = 0.20
    conf.fg_fraction = 0.20
    conf.bg_thresh_lo = 0
    conf.bg_thresh_hi = 0.5
    conf.fg_thresh = 0.5
    conf.ign_thresh = 0.5
    conf.best_thresh = 0.35

    # ----------------------------------------
    # inference and testing
    # ----------------------------------------

    # nms
    conf.nms_topN_pre = 3000
    conf.nms_topN_post = 40
    conf.nms_thres = 0.4
    conf.clip_boxes = False

    conf.test_protocol = 'kitti'
    conf.test_db = 'kitti'
    conf.test_min_h = 0
    conf.min_det_scales = [0, 0]

    # ----------------------------------------
    # anchor settings
    # ----------------------------------------

    # clustering settings
    conf.cluster_anchors = 0
    conf.even_anchors = 0
    conf.expand_anchors = 0

    conf.anchors = None

    conf.bbox_means = None
    conf.bbox_stds = None

    # initialize anchors
    base = (conf.max_gt_h / conf.min_gt_h)**(1 / (12 - 1))
    conf.anchor_scales = np.array(
        [conf.min_gt_h * (base**i) for i in range(0, 12)])
    conf.anchor_ratios = np.array([0.5, 1.0, 1.5])

    # loss logic
    conf.hard_negatives = True
    conf.focal_loss = 0
    conf.cls_2d_lambda = 1
    conf.iou_2d_lambda = 1
    conf.bbox_2d_lambda = 0
    conf.bbox_3d_lambda = 1
    conf.bbox_3d_proj_lambda = 0.0

    conf.hill_climbing = True

    conf.pretrained = 'pretrained_model/densenet.pdparams'

    # visdom
    conf.visdom_port = 8100

    return conf
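
Both configuration files expose the same `Config()` entry point, so a training script can pick one by module name, as `train.py --config=...` suggests. The loader below is a hypothetical sketch for illustration only; the actual mechanism in `train.py` may differ.

```python
import importlib


def load_config(name):
    # Hypothetical helper: `name` is e.g. 'kitti_3d_multi_warmup' or
    # 'kitti_3d_multi_main', matching the module names under config/.
    module = importlib.import_module('config.' + name)
    return module.Config()


conf = load_config('kitti_3d_multi_warmup')
print(conf.model, conf.lr, conf.max_iter)  # model_3d_dilate 0.004 50000
```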
20 changes: 20 additions & 0 deletions PaddleCV/3d_vision/M3D-RPN/data/__init__.py
@@ -0,0 +1,20 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
init
"""
from . import m3drpn_reader
#from .m3drpn_reader import *

#__all__ = m3drpn_reader.__all__