
Model update: Update the parameter initialization of FC layer of SE-ResNeXt and add a README #825

Merged (20 commits) on Apr 28, 2018

Conversation

@BigFishMaster (Contributor) commented on Apr 10, 2018

fix #826

@BigFishMaster changed the title from "Model update" to "Model update: Update the parameter initialization of FC layer of SE-ResNeXt and add a README" on Apr 10, 2018
The current code reproduces the reported result of SE-ResNeXt-50.
```
# prepare directory
mkdir ILSVRC2012/
tar zxf XXX
tar zxf YYY
```
Collaborator:

It would be better to give the download URL.

```
n01440764/n01440764_13602.JPEG 0
n01440764/n01440764_13625.JPEG 0
...
```
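The list format above is one relative image path and one integer label per line. A minimal parsing sketch (the function name is illustrative, not from the PR):

```python
# minimal sketch: parse an ImageNet-style list file whose lines are
# "<relative/path>.JPEG <integer label>", as in the format shown above
def parse_list(lines):
    pairs = []
    for line in lines:
        path, label = line.split()
        pairs.append((path, int(label)))
    return pairs
```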
Collaborator:

Can these file lists be downloaded from somewhere? For example: https://github.com/BVLC/caffe/blob/master/data/ilsvrc12/get_ilsvrc_aux.sh


```
python train.py --num_layers=50 --batch_size=256 --with_mem_opt=True --parallel_exe=True
```
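The flags in the command above could be parsed roughly as follows (a sketch only; the defaults and the `str2bool` helper are assumptions, not taken from train.py):

```python
import argparse

def str2bool(v):
    # argparse's type=bool treats any non-empty string as True,
    # so convert "True"/"False" strings explicitly
    return str(v).lower() in ('true', '1')

# hypothetical parser for the flags shown in the command above
parser = argparse.ArgumentParser()
parser.add_argument('--num_layers', type=int, default=50)
parser.add_argument('--batch_size', type=int, default=256)
parser.add_argument('--with_mem_opt', type=str2bool, default=False)
parser.add_argument('--parallel_exe', type=str2bool, default=False)
args = parser.parse_args(
    ['--num_layers=50', '--batch_size=256',
     '--with_mem_opt=True', '--parallel_exe=True'])
```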
Collaborator:

It would be better to include the convergence curve.

| - | :-: | :-: | -: |
| SE-ResNeXt-50 | 77.6%/- | 77.71%/93.63% | 77.42%/93.50% |

## Finetune a model
Collaborator:

Need to add usage here to show users how to fine-tune a model.

## Inference

The inference process is conducted after each training epoch.
Collaborator:

Need an infer.py to show users how to run inference.

@qingqing01 (Collaborator) left a comment:

Need scripts: infer.py and eval.py.
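As a hypothetical sketch of the top-1/top-5 metric such an eval.py could report (the README table above uses top1/top5; the function name and shapes here are assumptions):

```python
import numpy as np

def topk_accuracy(probs, labels, k=5):
    # probs: (batch, num_classes) class probabilities; labels: list of ints
    # take the indices of the k largest probabilities per row
    topk = np.argsort(probs, axis=1)[:, -k:]
    # fraction of samples whose true label appears among the top-k predictions
    return float(np.mean([label in row for label, row in zip(labels, topk)]))
```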

```
#if global_step % step_each_epoch == 0:
#    print("epoch={0}, global_step={1},decayed_lr={2} \
#        (step_each_epoch={3})".format( \
#        epoch,global_step,decayed_lr,step_each_epoch))
```
Collaborator:

Remove the unused code.

```
@@ -19,17 +50,28 @@ def conv_bn_layer(input, num_filters, filter_size, stride=1, groups=1,
def squeeze_excitation(input, num_channels, reduction_ratio):
    pool = fluid.layers.pool2d(
        input=input, pool_size=0, pool_type='avg', global_pooling=True)
    ### initializer parameter
    #print >> sys.stderr, "pool shape:", pool.shape
```
Collaborator:

Remove the unused code.

```
param_attr=fluid.param_attr.ParamAttr(
    initializer=fluid.initializer.Uniform(-stdv, stdv)))
#print >> sys.stderr, "squeeze shape:", squeeze.shape
```
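For context, the PR title says the FC layer is initialized uniformly in [-stdv, stdv]. A common way to derive such a bound is from the layer's fan-in; this exact formula is an assumption, not confirmed by the diff:

```python
import math

# hypothetical helper: fan-in-based uniform bound often used for FC layers,
# i.e. stdv = 1 / sqrt(fan_in)
def fc_uniform_bound(fan_in):
    return 1.0 / math.sqrt(fan_in)
```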
Collaborator:

Remove the unused code.

```
scale = fluid.layers.elementwise_mul(x=input, y=excitation, axis=0)
return scale
```
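The squeeze-excitation block assembled across these snippets (global average pool, FC reduce with ReLU, FC expand with sigmoid, channel-wise rescale) can be sketched in plain NumPy; the function name and explicit weight arguments are illustrative, not from the PR:

```python
import numpy as np

def squeeze_excitation_np(x, w1, w2):
    # x: (batch, channels, h, w); w1: (c, c // ratio); w2: (c // ratio, c)
    squeeze = x.mean(axis=(2, 3))                       # global average pool
    hidden = np.maximum(squeeze @ w1, 0)                # FC reduce + ReLU
    excitation = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # FC expand + sigmoid
    return x * excitation[:, :, None, None]             # channel-wise rescale
```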


```
def shortcut(input, ch_out, stride):
def shortcut_old(input, ch_out, stride):
```
Collaborator:

If shortcut_old is not used, please remove it.

```
else:
    drop = pool
out = fluid.layers.fc(input=drop, size=class_dim, act='softmax')
#print >> sys.stderr, "drop shape:", drop.shape
```
Collaborator:

Remove the unused code.

```
import math


def cosine_decay(learning_rate, step_each_epoch, epochs=120):
```
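The schedule this function likely computes can be sketched as follows; the per-step formula `lr * 0.5 * (1 + cos(pi * epoch / epochs))` is the common cosine-decay definition and an assumption here, not copied from the diff:

```python
import math

# sketch of a cosine learning-rate decay matching the signature above,
# with an extra global_step argument for illustration
def cosine_decay_value(learning_rate, step_each_epoch, epochs, global_step):
    cur_epoch = global_step / float(step_each_epoch)
    return learning_rate * 0.5 * (1 + math.cos(cur_epoch * math.pi / epochs))
```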
Collaborator:

Let's move this function into train.py.

```
@@ -314,12 +327,15 @@ def train_parallel_exe(args,
# layers: 50, 152
layers = args.num_layers
method = train_parallel_exe if args.parallel_exe else train_parallel_do
init_model = args.init_model if args.init_model else None
pretrained_model = args.pretrained_model if args.pretrained_model else None
```
Collaborator:

The learning-rate adjustment strategy above does not apply cosine_decay.

@qingqing01 (Collaborator) left a comment:

Approving for now to make the cloud verification easier, but the documentation and code still need further improvement in follow-ups.

@qingqing01 qingqing01 merged commit f60005c into PaddlePaddle:develop Apr 28, 2018