
Neural Network with ReLU and Sigmoid Activations

ZicolinPower edited this page Apr 30, 2023 · 2 revisions

# Problem:

(a) Let $\Phi_{W,b,V,b'}(x)$ be a neural network with ReLU activations and one hidden layer, and let $c > 0$. Prove that $\Phi_{cW,cb,V,cb'}(x) = c \cdot \Phi_{W,b,V,b'}(x)$ for every input $x$. What if $c < 0$? (b) Let $w, x \in \mathbb{R}^{d}$ and $b \in \mathbb{R}$ such that $w^{T}x + b \neq 0$. Let $\Phi_{w,b}(x) = S(w^{T}x + b)$ be a sigmoid neuron. Prove that $\lim_{c \to \infty} \Phi_{cw,cb}(x) = H(w^{T}x + b)$.

# Solution:

Question a: First consider the hidden layer. For $c > 0$,

\begin{align*}
ReLU(cW \cdot x + cb) &= \max(0, cW \cdot x + cb) = \max(0, c(W \cdot x + b))\\
&= c \times \max(0, W \cdot x + b), \ \text{since } c > 0\\
&= c \times ReLU(W \cdot x + b).
\end{align*}

Then

\begin{align*}
\Phi_{cW,cb,V,cb'}(x) &= ReLU(V \cdot ReLU(cW \cdot x + cb) + cb')\\
&= ReLU(V \cdot c \times ReLU(W \cdot x + b) + cb')\\
&= ReLU(c(V \cdot ReLU(W \cdot x + b) + b'))\\
&= \max(0, c(V \cdot ReLU(W \cdot x + b) + b'))\\
&= c \times \max(0, V \cdot ReLU(W \cdot x + b) + b'), \ \text{since } c > 0\\
&= c \times ReLU(V \cdot ReLU(W \cdot x + b) + b').
\end{align*}

Then we have

\begin{align*}
\Phi_{cW,cb,V,cb'}(x) &= c \times \Phi_{W,b,V,b'}(x).
\end{align*}
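The identity above is easy to check numerically. Below is a minimal sketch of the one-hidden-layer ReLU network; the function name `phi` and the layer shapes are illustrative choices, not part of the original problem.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def phi(W, b, V, b2, x):
    # One-hidden-layer network with ReLU on both layers, as in the derivation.
    return relu(V @ relu(W @ x + b) + b2)

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
V = rng.standard_normal((2, 5))
b2 = rng.standard_normal(2)
x = rng.standard_normal(3)

c = 3.7  # any c > 0
lhs = phi(c * W, c * b, V, c * b2, x)  # scaled parameters
rhs = c * phi(W, b, V, b2, x)          # scaled output
assert np.allclose(lhs, rhs)
```

Note that only $W$, $b$, and $b'$ are scaled by $c$; the second-layer weights $V$ are left unchanged, exactly as in the statement.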

If $c < 0$, then $-c > 0$, and

\begin{align*}
ReLU(cW \cdot x + cb) &= \max(0, cW \cdot x + cb) = \max(0, c(W \cdot x + b))\\
&= -c \times \max(0, -W \cdot x - b), \ \text{since } -c > 0\\
&= -c \times ReLU(-W \cdot x - b).
\end{align*}

Then

\begin{align*}
\Phi_{cW,cb,V,cb'}(x) &= ReLU(V \cdot ReLU(cW \cdot x + cb) + cb')\\
&= ReLU(V \cdot (-c) \times ReLU(-W \cdot x - b) + cb')\\
&= ReLU(-c(V \cdot ReLU(-W \cdot x - b) - b'))\\
&= \max(0, -c(V \cdot ReLU(-W \cdot x - b) - b'))\\
&= -c \times \max(0, V \cdot ReLU(-W \cdot x - b) - b'), \ \text{since } -c > 0\\
&= -c \times ReLU(V \cdot ReLU(-W \cdot x - b) - b').
\end{align*}

Then we have

\begin{align*}
\Phi_{cW,cb,V,cb'}(x) &= -c \times \Phi_{-W,-b,V,-b'}(x),
\end{align*}

so for $c < 0$ the identity $\Phi_{cW,cb,V,cb'}(x) = c \cdot \Phi_{W,b,V,b'}(x)$ no longer holds in general; instead the scaled network agrees with $-c$ times the network with negated parameters $-W, -b, -b'$.
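The $c < 0$ identity can be checked the same way. As before, this is a sketch: `phi` and the shapes are illustrative, and `rng` seeds an arbitrary random instance.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def phi(W, b, V, b2, x):
    # One-hidden-layer ReLU network, same form as in part (a).
    return relu(V @ relu(W @ x + b) + b2)

rng = np.random.default_rng(1)
W = rng.standard_normal((4, 3))
b = rng.standard_normal(4)
V = rng.standard_normal((2, 4))
b2 = rng.standard_normal(2)
x = rng.standard_normal(3)

c = -2.5  # any c < 0
lhs = phi(c * W, c * b, V, c * b2, x)
# The derivation says this equals (-c) times the network with negated W, b, b'.
rhs = -c * phi(-W, -b, V, -b2, x)
assert np.allclose(lhs, rhs)
```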

Question b: Using the sigmoid activation function $S(z) = \frac{1}{1+e^{-z}}$:

\begin{align*}
\Phi_{w,b}(x) &= S(w^{T}x + b) = \frac{1}{1+e^{-(w^{T}x + b)}}\\
\Phi_{cw,cb}(x) &= S(cw^{T}x + cb) = \frac{1}{1+e^{-c(w^{T}x + b)}}.
\end{align*}

If $w^{T}x + b > 0$, then $c(w^{T}x + b) \to \infty$ as $c \to \infty$, so $e^{-c(w^{T}x + b)} \to 0$ and

\begin{align*}
\lim_{c \to \infty} \Phi_{cw,cb}(x) &= \lim_{c \to \infty} \frac{1}{1+e^{-c(w^{T}x + b)}} = \frac{1}{1+0} = 1.
\end{align*}

If $w^{T}x + b < 0$, then $c(w^{T}x + b) \to -\infty$ as $c \to \infty$, so $e^{-c(w^{T}x + b)} \to \infty$ and

\begin{align*}
\lim_{c \to \infty} \Phi_{cw,cb}(x) &= \lim_{c \to \infty} \frac{1}{1+e^{-c(w^{T}x + b)}} = 0.
\end{align*}

So $\Phi_{cw,cb}(x)$ converges to $1$ or $0$ as $c \to \infty$, depending on the sign of $w^{T}x + b$ (which is nonzero by assumption). Therefore

\begin{align*}
\lim_{c \to \infty} \Phi_{cw,cb}(x) = H(w^{T}x + b) =
\begin{cases}
1 & w^{T}x + b > 0\\
0 & w^{T}x + b < 0
\end{cases}
\end{align*}
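The convergence can be seen numerically by scaling the pre-activation and watching the sigmoid saturate. The sample values of $z = w^{T}x + b$ below are arbitrary illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A few nonzero pre-activation values z = w^T x + b of both signs.
z = np.array([-0.5, 0.2, 1.0])

# As c grows, S(c*z) approaches the Heaviside step H(z) elementwise.
for c in (1, 10, 100):
    print(c, sigmoid(c * z))

heaviside = (z > 0).astype(float)
assert np.allclose(sigmoid(100 * z), heaviside)
```

Note the hypothesis $w^{T}x + b \neq 0$ matters: at $z = 0$ the sigmoid stays at $1/2$ for every $c$, so no limit to $0$ or $1$ exists there.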
