Standard definitions:

$$c(x)=x-\mu(x).$$
$$u_\epsilon(x)=\frac{x}{\sqrt{||x||^2+\epsilon}}.$$
$$\mathrm{E}[x]=\mu(x)=\frac{1}{n}\sum_{i=1}^nx_i.$$
$$\mathrm{Var}[x]=\sigma^2(x)=\frac{1}{n}\sum_{i=1}^n (x_i-\mu(x))^2.$$

PyTorch Layer Norm:
$$LN = \frac{x-\textrm{E}[x]}{\sqrt{\textrm{Var}[x]+\epsilon}}*\gamma+\beta.$$

Reformulated Layer Norm:
$$LN = \sqrt{n} \cdot u_{n \epsilon}(c(x))*\gamma+\beta.$$
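
The normalized part of the two forms agrees because the variance is the squared norm of the centered vector divided by $n$, that is $\mathrm{Var}[x]=\frac{1}{n}||c(x)||^2$. A short derivation, using only $c$ and $u_\epsilon$ as defined above:

$$\frac{x-\mathrm{E}[x]}{\sqrt{\mathrm{Var}[x]+\epsilon}}=\frac{c(x)}{\sqrt{\frac{1}{n}||c(x)||^2+\epsilon}}=\frac{\sqrt{n}\,c(x)}{\sqrt{||c(x)||^2+n\epsilon}}=\sqrt{n}\,u_{n\epsilon}(c(x)).$$

Multiplying numerator and denominator by $\sqrt{n}$ is what turns $\epsilon$ into $n\epsilon$ in the subscript.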


In [1]:
using Symbolics
using SymbolicTransformer
v = [3,1,-2,5]
LN(v)

4-element Vector{Float64}:
  0.48336788312641876
 -0.29002072987585126
 -1.4501036493792563
  1.2567564961286888
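
As a cross-check that does not depend on the package, the two formulations can be compared in plain Julia. This is a minimal sketch, not the package's implementation: the names `layernorm_direct`, `layernorm_reformulated`, `center` and `unit` are hypothetical, $\gamma=1$ and $\beta=0$ are assumed, and the default `ϵ = 1e-5` is an assumption.

```julia
using Statistics  # mean, var

# Direct form: (x - E[x]) / sqrt(Var[x] + ϵ), assuming γ = 1 and β = 0.
function layernorm_direct(x; ϵ = 1e-5)
    μ = mean(x)
    σ² = var(x; corrected = false, mean = μ)  # biased variance: divides by n
    return (x .- μ) ./ sqrt(σ² + ϵ)
end

# Building blocks of the reformulated form (hypothetical helper names).
center(x) = x .- mean(x)                      # c(x)
unit(x, ϵ) = x ./ sqrt(sum(abs2, x) + ϵ)      # u_ϵ(x)

# Reformulated form: sqrt(n) * u_{nϵ}(c(x)).
function layernorm_reformulated(x; ϵ = 1e-5)
    n = length(x)
    return sqrt(n) .* unit(center(x), n * ϵ)
end

v = [3, 1, -2, 5]
direct = layernorm_direct(v)
reformulated = layernorm_reformulated(v)

# The two results should agree up to floating-point rounding.
@assert all(isapprox.(direct, reformulated; rtol = 1e-10))
```

Passing `n * ϵ` rather than `ϵ` to `unit` is what makes the two functions match.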

Writing these as operations between vectors rather than operations on Euclidean vector components:

$$c(\vec{x})=\vec{x}-\mu(\vec{x})\,\vec{1}=\vec{x}-\frac{1}{n} (\vec{x} \cdot \vec{1})\,\vec{1}$$

$$u_\epsilon(\vec{x})=\frac{\vec{x}}{\sqrt{||\vec{x}||^2+\epsilon}}$$
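
The vector form can be tried numerically with ordinary dot products. The sketch below assumes the mean term is read as a multiple of the all-ones vector, so that centering removes the component of $\vec{x}$ along $\vec{1}$. The names `center_vec` and `unit_vec` are hypothetical and are not part of any package used here.

```julia
using LinearAlgebra  # dot, norm

# Centering with vector operations only:
# subtract the part of x that lies along the all-ones vector.
function center_vec(x)
    n = length(x)
    one_vec = ones(n)
    return x - (dot(x, one_vec) / n) * one_vec
end

# Smoothed normalization u_ϵ, with ‖x‖² computed as dot(x, x).
unit_vec(x, ϵ) = x / sqrt(dot(x, x) + ϵ)

x = [3.0, 1.0, -2.0, 5.0]
cx = center_vec(x)
ux = unit_vec(cx, 1e-6)

# The centered vector is orthogonal to the all-ones vector.
@assert abs(dot(cx, ones(length(x)))) < 1e-12

# With ϵ > 0 the result is slightly shorter than a unit vector.
@assert norm(ux) < 1
```

Because `center_vec` uses only `dot`, scaling and subtraction, it carries over to any setting where an inner product is available.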

In [1]:
using Symbolics
using Grassmann

In [9]:
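# Symbolic stand-ins: `ones` plays the role of the all-ones vector, `n` the dimension.
@variables ones::Number, n::Integer, a::Number, a1::Number, a2::Number, b::Number
ϵ = 1e-6

# Symmetrized product, used here as the inner product.
function ⋅(x,y)
    return (1/2)*(x*y + y*x)
end

# Centering: subtract the mean, written as (x ⋅ ones) / n.
function c(x)
    return x - ((x ⋅ ones) * (1/n))
end

# Smoothed normalization: divide by sqrt(x ⋅ x + ϵ).
function u_ϵ(x)
    return x / sqrt(x ⋅ x + ϵ)
end

u_ϵ(c(a))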

(a + (-a*ones) / n) / sqrt(1.0e-6 + (a + (-a*ones) / n)^2)

In [10]:
(b) ⋅ (u_ϵ(c(a1 + a2)))

(b*(a1 + a2 + (-ones*(a1 + a2)) / n)) / sqrt(1.0e-6 + (a1 + a2 + (-ones*(a1 + a2)) / n)^2)