libnd4j: Layer norm only supports largest axis, backprop doesn't support 4d? #8008
This broadcast will only work when the gain/bias axis is the last dimension? I.e., [mb,h,w,c] + [c] works, but [mb,c,h,w] + [c] doesn't. We need [mb,c,h,w] + [c,1,1] at least (or equivalently + [1,c,1,1]).
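To make the shape argument concrete, here is a minimal sketch in plain Python. It assumes NumPy-style trailing-aligned broadcasting, which matches the behaviour described above but is not taken from the libnd4j source. The helper names `broadcastable` and `param_shape_for_axis` are hypothetical and exist only for illustration.

```python
def broadcastable(shape, param_shape):
    """Return True if param_shape broadcasts onto shape.

    Assumes trailing-aligned (NumPy-style) rules: dimensions are compared
    from the right, and each parameter dimension must be 1 or equal to
    the matching input dimension.
    """
    if len(param_shape) > len(shape):
        return False
    offset = len(shape) - len(param_shape)
    return all(p == 1 or p == shape[offset + i]
               for i, p in enumerate(param_shape))


def param_shape_for_axis(rank, axis, size):
    """Build a [1, ..., size, ..., 1] shape that targets a single axis."""
    shape = [1] * rank
    shape[axis] = size
    return shape


# Example sizes (arbitrary, chosen so that c differs from h and w).
mb, c, h, w = 2, 3, 5, 6
nhwc = [mb, h, w, c]
nchw = [mb, c, h, w]

print(broadcastable(nhwc, [c]))                           # True
print(broadcastable(nchw, [c]))                           # False: c is compared against w
print(broadcastable(nchw, [c, 1, 1]))                     # True
print(broadcastable(nchw, param_shape_for_axis(4, 1, c)))  # True, shape is [1, c, 1, 1]
```

Note that [mb,c,h,w] + [c] can silently succeed when w happens to equal c, which would apply the gain/bias along the wrong axis rather than fail.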
Backprop looks off too for the 4D case:
Consider the [n,c,h,w] axis 1 case - we should be summing over axes [0,2,3] not just .
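As a rough illustration of the expected reduction, the sketch below computes the bias gradient for a per-channel bias by summing the upstream gradient over every axis except the channel axis. It uses plain nested lists rather than libnd4j or ND4J arrays, and `reduction_axes` and `bias_grad` are hypothetical helper names, not part of any library.

```python
from itertools import product


def reduction_axes(rank, param_axis):
    """Axes to sum over: every axis except the one the parameter lives on."""
    return [a for a in range(rank) if a != param_axis]


def bias_grad(dldy, shape, param_axis):
    """Sum a nested-list gradient of the given shape over all axes
    except param_axis, yielding one value per channel."""
    grad = [0.0] * shape[param_axis]
    for idx in product(*(range(s) for s in shape)):
        v = dldy
        for i in idx:          # walk down the nested lists to one element
            v = v[i]
        grad[idx[param_axis]] += v
    return grad


print(reduction_axes(4, 1))  # [0, 2, 3] for NCHW
print(reduction_axes(4, 3))  # [0, 1, 2] for NHWC

# Upstream gradient of all ones with shape [n=2, c=3, h=2, w=2].
shape = [2, 3, 2, 2]
ones = [[[[1.0] * 2 for _ in range(2)] for _ in range(3)] for _ in range(2)]
print(bias_grad(ones, shape, 1))  # [8.0, 8.0, 8.0], i.e. n*h*w per channel
```

The gain gradient follows the same reduction pattern, with each element of the upstream gradient first multiplied by the corresponding normalized input.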
And I can't find any 4d test cases in libnd4j either.
Layer norm 4d case forward pass confirmed working and correct for both NCHW and NHWC cases (java tests added here: SkymindIO#174)
However, backprop is throwing an exception for both NCHW and NHWC cases. Reproducible with the following test case: