[MXNET-614] Adding Synchronized Batch Normalization #11502
Thanks @szha, I downgraded to cu90 because cu92 doesn't have clean support on my hardware yet, and it works.
However, while training on ADE20K with GluonCV, I get "socket.error: [Errno 111] Connection refused" after a number of iterations (at iteration 551). I have raised a separate issue for this. It happens both with and without SyncBatchNorm.
* sync batch norm
* global rank and barrier
* lint
* cpplint
* pylint
* doc
* add ref
* customized barrier
* cpplint
* get rid of pthread
* address comments
* warning
* pylint
* gpu unitest
* gpu 0
* mv to cpu test
* Revert "mv to cpu test" This reverts commit 24543c9.
* ndev = 2
* debuging
* sum prod
* lint
* contrib, ngpu
* code style
* code style
* forward backward
* test
* cpu test
* fix deconstruction
* doc indent
* doc
* doc
* address comments
* typo
* asnumpy
Hello, @RogerChern. I also met a deadlock issue while training PSPNet on
Hello, @zhanghang1989. Thank you for your reply. I will try it tomorrow morning and update the result with you.
Hello, @zhanghang1989. I am not quite sure whether you were suggesting that I explicitly set