
Fold batch norm to weights and biases #779

Merged
7 commits merged into LeelaChessZero:master on Mar 6, 2019

Conversation (7 participants)

@Ttl (Member) commented Mar 4, 2019:

Handles negative and zero gammas. This is a proper fix for #778.
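
For reference, the standard batch-norm fold rescales each output channel's convolution weights by gamma / sqrt(variance + epsilon) and absorbs mean and beta into the bias, so a negative or zero gamma is simply a negative or zero scale. The sketch below is illustrative only; the function name, argument layout and epsilon value are assumptions, not the actual lc0 code.

// Illustrative sketch of per-channel batch-norm folding (not lc0 code).
#include <cmath>
#include <vector>

void FoldBatchNorm(std::vector<float>& weights,        // conv weights, [channels * k]
                   std::vector<float>& biases,         // conv biases, [channels]
                   const std::vector<float>& gamma,
                   const std::vector<float>& beta,
                   const std::vector<float>& mean,
                   const std::vector<float>& variance,
                   float epsilon = 1e-5f) {            // epsilon value assumed
  const size_t channels = biases.size();
  const size_t k = weights.size() / channels;          // weights per output channel
  for (size_t c = 0; c < channels; ++c) {
    // Multiplying the weights keeps the sign of gamma, and a zero gamma simply
    // zeroes the channel; folding gamma into a squared stddev would lose the sign.
    const float scale = gamma[c] / std::sqrt(variance[c] + epsilon);
    for (size_t i = 0; i < k; ++i) weights[c * k + i] *= scale;
    biases[c] = beta[c] + (biases[c] - mean[c]) * scale;
  }
}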

Ttl added some commits Mar 4, 2019

Quick patch for negative gamma
Still wrong, but at least it's consistently wrong.

@Ttl added the wip label Mar 4, 2019

Two resolved (outdated) review threads on src/neural/cuda/network_cudnn.cc.

@borg323 (Member) commented Mar 4, 2019:

Running ./lc0 -w net41306 --backend=check --backend-opts=mode=histo,cudnn,blas

Absolute error histogram for a batch of 65
      |                                                                                         |
      |                 #                                                                       |
      |                 #                                                                       |
      |                 ##                                                                      |
 0.15 +                 ##                                                                      +
      |                ###                                                                      |
      |                ###                                                                      |
      |                ####                                                                     |
      |                ####                                                                     |
  0.1 +                ####                                                                     +
      |               #####                                                                     |
      |               ######                                                                    |
      |               ######                                                                    |
      |              #######                                                                    |
 0.05 +              ########                                                                   +
      |              ########                                                                   |
      |             ##########                                                                  |
      |            #############                                                                |
      |# #######################################################                                |
      +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
   -inf  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1    0    1  +inf 

In contrast, with ./lc0 -w net41306 --backend=check --backend-opts=mode=histo,cudnn,opencl

Absolute error histogram for a batch of 78
      |                                                                                         |
  0.2 +                                        ##                                               +
      |                                        ##                                               |
      |                                        ##                                               |
      |                                        ##                                               |
      |                                        ##                                               |
 0.15 +                                        ###                                              +
      |                                        ###                                              |
      |                                        ###                                              |
      |                                       ####                                              |
      |                                       ####                                              |
  0.1 +                                       #####                                             +
      |                                       #####                                             |
      |                                       #####                                             |
      |                                       ######                                            |
      |                                       ######                                            |
 0.05 +                                      #######                                            +
      |                                      ########                                           |
      |                                      #########                                          |
      |                                     ############                                        |
      |                               ##################################################        |
      +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
   -inf  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1    0    1  +inf 

Unfortunately, there are also large differences between opencl and cudnn with old nets.
./lc0 -w net11248 --backend=check --backend-opts=mode=histo,cudnn,opencl

Absolute error histogram for a batch of 58
      |                                                                                         |
      |                                                           #                             |
      |                                                           #                             |
 0.25 +                                                           #                             +
      |                                                           #                             |
      |                                                           #                             |
      |                                                           #                             |
      |                                                           ##                            |
  0.2 +                                                          ###                            +
      |                                                          ###                            |
      |                                                          ###                            |
      |                                                          ###                            |
      |                                                          ###                            |
 0.15 +                                                          ###                            +
      |                                                          ####                           |
      |                                                          ####                           |
      |                                                          ####                           |
      |                                                          ####                           |
  0.1 +                                                          ####                           +
      |                                                          ####                           |
      |                                                          ####                           |
      |                                                          #####                          |
      |                                                          #####                          |
 0.05 +                                                          #####                          +
      |                                                          ######                         |
      |                                                         ########                        |
      |                                                         ##########                      |
      |                                                       ##########################        |
      +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
   -inf  -15  -14  -13  -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1    0    1  +inf 

Ttl added some commits Mar 5, 2019

Fold batch norm in network_legacy.
No need to do duplicate work in the backends.
Cleanup and small fixes
Remove unused functions. Fix blas. Disable tensorflow backend.

@Ttl removed the wip label Mar 5, 2019

@Ttl (Member, Author) commented Mar 5, 2019:

All backends except for Tensorflow should be working now. I disabled it for the moment since it wouldn't give correct output.

@Mardak (Contributor) commented Mar 5, 2019:

As a sanity check for mac/opencl, looks like this PR behaves the same way as #778 in my 7 piece value test:
[screenshot: 7 piece value test comparison, Mar 5, 2019]

To be clear, there were some very slight differences in ~5% of positions, but that might just be opencl:

< g6h6  (612 ) N:       1 (+ 0) (P: 39.90%) (Q: -0.94676) 
> g6h6  (612 ) N:       1 (+ 0) (P: 39.90%) (Q: -0.94675) 
< c6e8  (470 ) N:       1 (+ 0) (P: 25.18%) (Q:  0.45683) 
> c6e8  (470 ) N:       1 (+ 0) (P: 25.18%) (Q:  0.45682) 
< d5b5  (999 ) N:       1 (+ 0) (P: 29.16%) (Q: -0.49130) 
> d5b5  (999 ) N:       1 (+ 0) (P: 29.16%) (Q: -0.49131) 
< f5e4  (1063) N:       1 (+ 0) (P: 45.85%) (Q:  0.59645) 
> f5e4  (1063) N:       1 (+ 0) (P: 45.85%) (Q:  0.59644) 
< g8g6  (1756) N:       1 (+ 0) (P:  9.16%) (Q: -0.21372) 
> g8g6  (1756) N:       1 (+ 0) (P:  9.16%) (Q: -0.21371) 
< c5c3  (956 ) N:       1 (+ 0) (P: 77.93%) (Q:  0.05652) 
> c5c3  (956 ) N:       1 (+ 0) (P: 77.93%) (Q:  0.05653) 
< d4a7  (769 ) N:       1 (+ 0) (P: 46.78%) (Q:  0.03609) 
> d4a7  (769 ) N:       1 (+ 0) (P: 46.78%) (Q:  0.03610) 
< e5a1  (1018) N:       1 (+ 0) (P: 51.61%) (Q:  0.36652) 
> e5a1  (1018) N:       1 (+ 0) (P: 51.61%) (Q:  0.36653) 
< f4e4  (825 ) N:       1 (+ 0) (P: 11.14%) (Q: -0.68938) 
> f4e4  (825 ) N:       1 (+ 0) (P: 11.14%) (Q: -0.68937) 
< e7f7  (1510) N:       1 (+ 1) (P:  0.67%) (Q:  0.99900) 
> e7f7  (1510) N:       1 (+ 0) (P:  0.67%) (Q:  0.99900) 
< c4b3  (714 ) N:       1 (+ 0) (P: 39.17%) (Q:  0.56492) 
> c4b3  (714 ) N:       1 (+ 0) (P: 39.17%) (Q:  0.56491) 
< c7g7  (1453) N:       1 (+ 0) (P: 23.93%) (Q:  0.99614) 
> c7g7  (1453) N:       1 (+ 0) (P: 23.92%) (Q:  0.99614) 
< e3d3  (545 ) N:       1 (+ 0) (P: 25.81%) (Q: -0.24854) 
> e3d3  (545 ) N:       1 (+ 0) (P: 25.81%) (Q: -0.24853) 
< b4a6  (698 ) N:       1 (+ 0) (P: 36.33%) (Q:  0.66649) 
> b4a6  (698 ) N:       1 (+ 0) (P: 36.33%) (Q:  0.66650) 
< h6h7  (1376) N:       1 (+ 0) (P:  9.97%) (Q: -0.18643) 
> h6h7  (1376) N:       1 (+ 0) (P:  9.97%) (Q: -0.18642) 
< d3c3  (1245) N:       1 (+ 0) (P: 48.62%) (Q:  0.72213) 
> d3c3  (1245) N:       1 (+ 0) (P: 48.62%) (Q:  0.72212) 
< h7g6  (1586) N:       1 (+ 0) (P: 15.78%) (Q:  0.99578) 
> h7g6  (1586) N:       1 (+ 0) (P: 15.77%) (Q:  0.99578) 
< c3b3  (1211) N:       1 (+ 0) (P: 58.35%) (Q:  0.74017) 
> c3b3  (1211) N:       1 (+ 0) (P: 58.35%) (Q:  0.74016) 
< h7g7  (397 ) N:       1 (+ 0) (P: 86.55%) (Q: -0.55760) 
> h7g7  (397 ) N:       1 (+ 0) (P: 86.55%) (Q: -0.55759) 
< g2h2  (371 ) N:       1 (+ 0) (P: 54.44%) (Q: -0.64584) 
> g2h2  (371 ) N:       1 (+ 0) (P: 54.44%) (Q: -0.64583) 
< f3e3  (579 ) N:       1 (+ 0) (P: 56.96%) (Q:  0.34620) 
> f3e3  (579 ) N:       1 (+ 0) (P: 56.96%) (Q:  0.34621) 
< f4g5  (831 ) N:       1 (+ 0) (P: 16.60%) (Q: -0.99523) 
> f4g5  (831 ) N:       1 (+ 1) (P: 16.60%) (Q: -0.99523) 
< b3d3  (447 ) N:       1 (+ 0) (P: 56.42%) (Q: -0.63309) 
> b3d3  (447 ) N:       1 (+ 0) (P: 56.42%) (Q: -0.63308) 
< b7b2  (1406) N:       1 (+ 0) (P: 11.30%) (Q: -0.46790) 
> b7b2  (1406) N:       1 (+ 0) (P: 11.30%) (Q: -0.46789) 
< b8a8  (23  ) N:       1 (+ 0) (P: 83.08%) (Q: -0.89343) 
> b8a8  (23  ) N:       1 (+ 0) (P: 83.08%) (Q: -0.89344) 
@gsobala (Contributor) commented Mar 5, 2019:

I confirm that the OpenCL backend is working correctly on macOS, on both SE and non-SE nets, when compared to blas using backend=check.

@Tilps approved these changes Mar 5, 2019

if (biases) {
  const float bias = vload_net_t(o, biases);
  sum = sum + bias;

@Tilps (Contributor) commented Mar 5, 2019:

This changes whether relu is defined from being based on the existence of means to the existence of biases.
As far as I can tell this has no effect, since all convolve1s in nets past and present always have relu, but it could be worth someone double checking.
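
Schematically, the gating change being described looks like the following; this is a reconstruction for illustration only, with names and surrounding structure assumed rather than taken from the actual kernel.

// Illustrative reconstruction, not the literal diff.
// Before folding: relu gated on batch-norm means being present.
//   if (means) {
//     sum = (sum - vload_net_t(o, means)) * scale;
//     sum = sum > 0.0f ? sum : 0.0f;
//   }
// After folding: the same relu path is gated on biases instead.
//   if (biases) {
//     sum = sum + vload_net_t(o, biases);
//     sum = sum > 0.0f ? sum : 0.0f;
//   }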

} else {
processConvBlock(weights.policy, true, 1);
// 0. Check for SE.
if (weights.residual[0].has_se) {

@Tilps (Contributor) commented Mar 5, 2019:

I assume no one will ever try a net with 0 residual layers...
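
If that case ever needed guarding, a check along these lines would make the assumption explicit (illustrative suggestion only, not part of this PR):

// Hypothetical guard, assuming weights.residual is a std::vector.
if (!weights.residual.empty() && weights.residual[0].has_se) {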

@Tilps merged commit 226acf5 into LeelaChessZero:master on Mar 6, 2019

2 checks passed:
ci/circleci: Your tests passed on CircleCI!
continuous-integration/appveyor/pr: AppVeyor build succeeded
@oscardssmith (Contributor) commented Mar 6, 2019:

Time for version bump now?
