This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Add vgg16 aws dist test #27

Closed
wants to merge 78 commits
b9c56e1
remove cifar30 shuffle (#19)
Superjomn Apr 13, 2018
b3f7661
evalute [494c262a26a1ff29143491fa60fd6ba546d3bebf]
Apr 16, 2018
e4a20c5
add model vgg16
kolinwei Apr 23, 2018
b510dd0
reset resnet30 train_duration kpi history
Superjomn Apr 23, 2018
f5de753
evalute [504e60a881fd7e72d744e256d90eaec4f52e5c7b]
Superjomn Apr 24, 2018
9145e56
add model seq2seq
kolinwei Apr 24, 2018
b6478aa
add lstm
kolinwei Apr 24, 2018
64d90a6
evalute [44fa823841549f0405f6f55aa8e51560fc0200ce]
Superjomn Apr 24, 2018
1b5a58d
add model image_classification
kolinwei Apr 24, 2018
b2c9714
change image_classification default cudaid
kolinwei Apr 24, 2018
64d0cf1
add object_detection
kolinwei Apr 24, 2018
a478b55
change gpu schedule time
kolinwei Apr 25, 2018
c0b7261
add model ocr_recognition
kolinwei Apr 25, 2018
3e10b30
change ocr model
kolinwei Apr 25, 2018
659ecd8
add transformer
kolinwei Apr 25, 2018
161321d
change diff ratio
kolinwei Apr 25, 2018
4ed9b44
Update continuous_evaluation.py
kolinwei Apr 26, 2018
1a9ed9e
Update flowers_64_gpu_memory_factor.txt
kolinwei Apr 26, 2018
5c1ff88
evalute [c02ba51de015cdfde510543a8cdacf66900f5ee9]
Superjomn Apr 26, 2018
b1305af
Update train_cost_factor.txt
kolinwei Apr 26, 2018
1bd0d96
change model gen gpu memory function
kolinwei Apr 26, 2018
23adafe
run.sh add FLAGS_fraction_of_gpu_memory_to_use=0.9
kolinwei Apr 26, 2018
981f225
change image_classification batch_size
kolinwei Apr 26, 2018
b86895e
evalute [6d934560c75f920ebb618cf71810a07c9dca8e8d]
Superjomn Apr 26, 2018
fe0a80e
change baseline
kolinwei Apr 26, 2018
f35aefb
change image_classification passnum
kolinwei Apr 26, 2018
f207856
evalute [c816121d11f7aed2939c5b859423883ce8bab050]
Superjomn Apr 26, 2018
ee4abc2
update ratio diff
kolinwei Apr 27, 2018
19d8124
change ocr_recognition/ctc_train.py
kolinwei Apr 27, 2018
5938e7e
disable model ocr_recognition
kolinwei Apr 27, 2018
a94a042
Merge branch 'master' into fast
kolinwei Apr 27, 2018
6e4072f
Merge pull request #1 from Superjomn/fast
kolinwei Apr 27, 2018
c6941e3
evalute [01da25845e2c0a45d5ab6ece400c980c199d4412]
Superjomn Apr 27, 2018
fd5ba68
add three NLP model to ce
kolinwei Apr 27, 2018
c1dc3c4
Merge branch 'fast' of https://github.com/Superjomn/paddle-ce-latest-…
kolinwei Apr 27, 2018
24792a1
Merge pull request #2 from Superjomn/fast
panyx0718 Apr 27, 2018
e72f46c
evalute [6e0b47b38c653a383ac2e7d16453536205e15f2d]
Superjomn Apr 27, 2018
50f18e0
update text_classification diff ratio
kolinwei Apr 28, 2018
6c52807
Merge pull request #3 from Superjomn/kolinwei-patch-1
kolinwei Apr 28, 2018
6e8eef4
evalute [a338c7d82a21fcce22af3e03fe6d7c33fe34d9e8]
Superjomn Apr 28, 2018
81253b9
evalute [c93a624b32b9d07298a04fd480686296a6d1229d]
Superjomn Apr 28, 2018
913eb61
add vgg16_aws_dist
putcn May 15, 2018
4e6525c
update run.xsh
putcn May 16, 2018
6803d39
update format and ag
putcn May 21, 2018
772013c
format update
putcn May 21, 2018
657b1f5
format update
putcn May 22, 2018
35277e1
add source dir existence check and more log
putcn May 24, 2018
1646670
switch to regualar bash script
putcn May 24, 2018
78c58a5
moving ce_runner to here
putcn May 24, 2018
f601567
adding base kpi
putcn May 24, 2018
fc313ee
update runner path
putcn May 24, 2018
4e700ae
Update ce_runner.py
guochaorong May 25, 2018
a1acf8a
find paddle path by current bash file path
putcn May 25, 2018
01507d2
Merge branch 'add-vgg16-aws-dist' of https://github.com/putcn/paddle-…
putcn May 25, 2018
4d0db6c
update paddle path
putcn May 26, 2018
c31d604
force start from current folder
putcn May 26, 2018
6b8c122
update all to paddle master (#28)
Superjomn May 28, 2018
0e2ba06
add multi card for text_classification
May 29, 2018
2d97d55
Update continuous_evaluation.py
guochaorong May 29, 2018
2f701dc
Merge pull request #30 from PaddlePaddle/text_classification
guochaorong May 29, 2018
28563ce
Merge pull request #31 from PaddlePaddle/guochaorong-patch-1
guochaorong May 29, 2018
d94b7c1
add cluster spec support
putcn May 30, 2018
b2a7afe
fixed log_processer; more logs; removed docker login
putcn May 30, 2018
ada36a8
move testing py to this repo; added chunk exec;
putcn May 30, 2018
b93e4c8
update cluster spec due to aws limit
putcn May 31, 2018
72785e5
Merge branch 'master' of https://github.com/PaddlePaddle/paddle-ce-la…
putcn May 31, 2018
34ca85e
add __init__ and tracking_kpis for CE
putcn May 31, 2018
6895384
Update model.py
guochaorong Jun 1, 2018
814d93f
Merge pull request #32 from PaddlePaddle/guochaorong-patch-2-1
guochaorong Jun 1, 2018
db4b971
Merge branch 'master' of https://github.com/PaddlePaddle/paddle-ce-la…
putcn Jun 1, 2018
b792339
switch to fluid_benchmark; add multi gpu support
putcn Jun 1, 2018
6c4fc0a
change model to resnet; update trainer count limit
putcn Jun 1, 2018
15be627
add base speed exception handling; switch to mnist
putcn Jun 1, 2018
7dd4b14
change test to vgg; update acc log handling
putcn Jun 2, 2018
38b066c
add cache back
putcn Jun 2, 2018
a918d78
update speedup formula; update training config
putcn Jun 2, 2018
813409a
make continous_eva python 3 complied
putcn Jun 2, 2018
0b07930
remove some kpi; add history data; remove unused model;
putcn Jun 2, 2018
13 changes: 13 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,13 @@
repos:
- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
hooks:
- id: yapf
files: (.*\.(py|bzl)|BUILD|.*\.BUILD|WORKSPACE)$
- repo: https://github.com/pre-commit/pre-commit-hooks
sha: 5bf6c09bfa1297d3692cadd621ef95f1284e33c0
hooks:
- id: check-added-large-files
- id: check-merge-conflict
- id: check-symlinks
- id: end-of-file-fixer
3 changes: 3 additions & 0 deletions README.md
@@ -2,6 +2,9 @@

## Howtos

### Contribute
- Run `pre-commit run -a` before submitting your PR; this formats the code automatically

### Add New Evaluation Task

Reference [mnist task](https://github.com/Superjomn/paddle-ce-latest-kpis/tree/master/mnist),
12 changes: 12 additions & 0 deletions __ocr_recognition/continuous_evaluation.py
@@ -0,0 +1,12 @@
import os
import sys
sys.path.append(os.environ['ceroot'])
from kpi import CostKpi, DurationKpi, AccKpi

train_avg_loss_kpi = CostKpi('train_avg_loss', 0.2, 0)
train_seq_err_kpi = CostKpi('train_seq_err', 0.2, 0)

tracking_kpis = [
train_avg_loss_kpi,
train_seq_err_kpi,
]
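For context, each KPI above pairs a metric name with a relative-regression threshold (`0.2` allows a 20% drift against the persisted baseline). The real `CostKpi` lives in the CE framework's `kpi` module; the following is only a self-contained sketch of that diff-ratio check, with class and method names assumed for illustration:

```python
# Illustrative sketch of a diff-ratio KPI check in the style of
# CostKpi('train_avg_loss', 0.2, 0). NOT the real `kpi` module;
# names and signatures here are assumptions.

class CostKpiSketch:
    """Lower-is-better metric with a relative-regression threshold."""

    def __init__(self, name, diff_threshold, baseline):
        self.name = name
        self.diff_threshold = diff_threshold  # e.g. 0.2 -> allow 20% regression
        self.baseline = baseline
        self.records = []

    def add_record(self, value):
        self.records.append(value)

    def evaluate(self):
        """True if the latest record stays within the allowed regression."""
        latest = self.records[-1]
        ratio = (latest - self.baseline) / abs(self.baseline)
        return ratio <= self.diff_threshold


kpi = CostKpiSketch('train_avg_loss', 0.2, baseline=1.0)
kpi.add_record(1.1)      # 10% worse than baseline -> within threshold
print(kpi.evaluate())    # True
```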
221 changes: 221 additions & 0 deletions __ocr_recognition/crnn_ctc_model.py
@@ -0,0 +1,221 @@
import paddle.fluid as fluid


def conv_bn_pool(input,
group,
out_ch,
act="relu",
param=None,
bias=None,
param_0=None,
is_test=False):
tmp = input
for i in xrange(group):
tmp = fluid.layers.conv2d(
input=tmp,
num_filters=out_ch[i],
filter_size=3,
padding=1,
param_attr=param if param_0 is None else param_0,
act=None, # LinearActivation
use_cudnn=True)
tmp = fluid.layers.batch_norm(
input=tmp,
act=act,
param_attr=param,
bias_attr=bias,
is_test=is_test)
tmp = fluid.layers.pool2d(
input=tmp,
pool_size=2,
pool_type='max',
pool_stride=2,
use_cudnn=True,
ceil_mode=True)

return tmp


def ocr_convs(input,
num,
with_bn,
regularizer=None,
gradient_clip=None,
is_test=False):
assert (num % 4 == 0)

b = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.0))
w0 = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.0005))
w1 = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.01))
tmp = input
tmp = conv_bn_pool(
tmp, 2, [16, 16], param=w1, bias=b, param_0=w0, is_test=is_test)

tmp = conv_bn_pool(tmp, 2, [32, 32], param=w1, bias=b, is_test=is_test)
tmp = conv_bn_pool(tmp, 2, [64, 64], param=w1, bias=b, is_test=is_test)
tmp = conv_bn_pool(tmp, 2, [128, 128], param=w1, bias=b, is_test=is_test)
return tmp


def encoder_net(images,
num_classes,
rnn_hidden_size=200,
regularizer=None,
gradient_clip=None,
is_test=False):
conv_features = ocr_convs(
images,
8,
True,
regularizer=regularizer,
gradient_clip=gradient_clip,
is_test=is_test)
sliced_feature = fluid.layers.im2sequence(
input=conv_features,
stride=[1, 1],
filter_size=[conv_features.shape[2], 1])

para_attr = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.02))
bias_attr = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.02),
learning_rate=2.0)
bias_attr_nobias = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.02))

fc_1 = fluid.layers.fc(input=sliced_feature,
size=rnn_hidden_size * 3,
param_attr=para_attr,
bias_attr=bias_attr_nobias)
fc_2 = fluid.layers.fc(input=sliced_feature,
size=rnn_hidden_size * 3,
param_attr=para_attr,
bias_attr=bias_attr_nobias)

gru_forward = fluid.layers.dynamic_gru(
input=fc_1,
size=rnn_hidden_size,
param_attr=para_attr,
bias_attr=bias_attr,
candidate_activation='relu')
gru_backward = fluid.layers.dynamic_gru(
input=fc_2,
size=rnn_hidden_size,
is_reverse=True,
param_attr=para_attr,
bias_attr=bias_attr,
candidate_activation='relu')

w_attr = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.02))
b_attr = fluid.ParamAttr(
regularizer=regularizer,
gradient_clip=gradient_clip,
initializer=fluid.initializer.Normal(0.0, 0.0))

fc_out = fluid.layers.fc(input=[gru_forward, gru_backward],
size=num_classes + 1,
param_attr=w_attr,
bias_attr=b_attr)

return fc_out


def ctc_train_net(images, label, args, num_classes):
regularizer = fluid.regularizer.L2Decay(args.l2)
gradient_clip = None
if args.parallel:
places = fluid.layers.get_places()
pd = fluid.layers.ParallelDo(places, use_nccl=True)
with pd.do():
images_ = pd.read_input(images)
label_ = pd.read_input(label)

fc_out = encoder_net(
images_,
num_classes,
regularizer=regularizer,
gradient_clip=gradient_clip)

cost = fluid.layers.warpctc(
input=fc_out,
label=label_,
blank=num_classes,
norm_by_times=True)
sum_cost = fluid.layers.reduce_sum(cost)

decoded_out = fluid.layers.ctc_greedy_decoder(
input=fc_out, blank=num_classes)

pd.write_output(sum_cost)
pd.write_output(decoded_out)

sum_cost, decoded_out = pd()
sum_cost = fluid.layers.reduce_sum(sum_cost)

else:
fc_out = encoder_net(
images,
num_classes,
regularizer=regularizer,
gradient_clip=gradient_clip)

cost = fluid.layers.warpctc(
input=fc_out, label=label, blank=num_classes, norm_by_times=True)
sum_cost = fluid.layers.reduce_sum(cost)
decoded_out = fluid.layers.ctc_greedy_decoder(
input=fc_out, blank=num_classes)

casted_label = fluid.layers.cast(x=label, dtype='int64')
error_evaluator = fluid.evaluator.EditDistance(
input=decoded_out, label=casted_label)

inference_program = fluid.default_main_program().clone(for_test=True)

optimizer = fluid.optimizer.Momentum(
learning_rate=args.learning_rate, momentum=args.momentum)
_, params_grads = optimizer.minimize(sum_cost)
model_average = fluid.optimizer.ModelAverage(
args.average_window,
params_grads,
min_average_window=args.min_average_window,
max_average_window=args.max_average_window)

return sum_cost, error_evaluator, inference_program, model_average


def ctc_infer(images, num_classes):
fc_out = encoder_net(images, num_classes, is_test=True)
return fluid.layers.ctc_greedy_decoder(input=fc_out, blank=num_classes)


def ctc_eval(images, label, num_classes):
fc_out = encoder_net(images, num_classes, is_test=True)
decoded_out = fluid.layers.ctc_greedy_decoder(
input=fc_out, blank=num_classes)

casted_label = fluid.layers.cast(x=label, dtype='int64')
error_evaluator = fluid.evaluator.EditDistance(
input=decoded_out, label=casted_label)

cost = fluid.layers.warpctc(
input=fc_out, label=label, blank=num_classes, norm_by_times=True)

return error_evaluator, cost