
optimize batch normalization and relu layer #3

Closed
wants to merge 4 commits

Conversation

@hjchen2 commented Sep 9, 2016

Merged the BatchNorm and Scale layers.
Cut most of the elapsed time of batch normalization by rewriting the forward and backward passes with OpenMP rather than BLAS, which avoids the additional multiply operations (a sketch follows below).
The elapsed time of ResNet-50 with batch size 128 fell from 32523.77 ms to 16982.93 ms on an Intel(R) Xeon(R) CPU E5-2450 0 @ 2.10GHz.
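
As a minimal sketch of the idea (not the actual patch): BatchNorm and the Scale layer can be folded into one per-channel affine transform and applied in a single OpenMP-parallel pass. The function name, the flat NCHW layout, and the use of precomputed per-channel statistics are assumptions made here for illustration.

```cpp
// Minimal sketch (not the actual patch) of fusing BatchNorm and Scale into
// one OpenMP pass over NCHW data. Names and layout are assumptions.
// Compile with -fopenmp.
#include <cmath>
#include <vector>

void fused_bn_scale_forward(const float* x, float* y,
                            const float* mean, const float* var,
                            const float* gamma, const float* beta,
                            int N, int C, int HW, float eps) {
  // Fold normalization and the Scale layer into one per-channel affine
  // transform y = a[c] * x + b[c], so each element needs a single
  // multiply-add instead of separate normalize and scale passes.
  std::vector<float> a(C), b(C);
  for (int c = 0; c < C; ++c) {
    a[c] = gamma[c] / std::sqrt(var[c] + eps);
    b[c] = beta[c] - mean[c] * a[c];
  }

#pragma omp parallel for collapse(2)
  for (int n = 0; n < N; ++n) {
    for (int c = 0; c < C; ++c) {
      const float ac = a[c], bc = b[c];
      const float* xp = x + (static_cast<long long>(n) * C + c) * HW;
      float* yp = y + (static_cast<long long>(n) * C + c) * HW;
      for (int i = 0; i < HW; ++i) {
        yp[i] = ac * xp[i] + bc;
      }
    }
  }
}
```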

@jczaja (Contributor) commented Sep 15, 2016

Thanks very much, we will have it reviewed shortly!

@tpatejko commented Oct 21, 2016

Thank you for your effort.

We reviewed the pull request and we are not going to merge it right now.

The code changes the functionality of Intel Caffe:

  • If I understand correctly, the pull request implements shift and scale inside batch normalization, whereas the original implementation separates these two operations;
  • The pull request removes the code guarded by the variable use_global_stats (a sketch of that branch follows this list); this might have a negative impact on the performance of a trained network.
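
For context, here is a simplified sketch of the kind of branch being referred to. It is not the exact Intel Caffe implementation; the function and parameter names are illustrative. When use_global_stats is set (typically for a trained network at inference time), the stored running statistics are used instead of the current mini-batch statistics.

```cpp
// Simplified sketch of a use_global_stats branch in a batch normalization
// forward pass; names are illustrative, not the exact Intel Caffe code.
#include <cmath>

void batch_norm_forward(const float* x, float* y, int N, int C, int HW,
                        bool use_global_stats,
                        const float* running_mean, const float* running_var,
                        const float* batch_mean, const float* batch_var,
                        float eps) {
  // Pick the statistics source depending on the flag: accumulated running
  // statistics for a trained network, mini-batch statistics during training.
  const float* mean = use_global_stats ? running_mean : batch_mean;
  const float* var  = use_global_stats ? running_var  : batch_var;

  const long long count = static_cast<long long>(N) * C * HW;
  for (long long i = 0; i < count; ++i) {
    const int c = static_cast<int>((i / HW) % C);  // channel index in NCHW layout
    y[i] = (x[i] - mean[c]) / std::sqrt(var[c] + eps);
  }
}
```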
