<h1>A Neuron</h1>

<h3>What's a Neuron?</h3>

<p>
    An <strong>artificial neuron</strong> (or <strong>node</strong>) is a simple object that takes input, performs some calculations on it, and produces an output. An example is shown below, where x1 and x2 are the two inputs and y1 is the output.
</p>

<p>Neurons can take any number of inputs and can also produce any number of outputs.</p>

<p>Just like a biological neuron, each neuron is only capable of a small computation, but when working together they become capable of solving large and complicated problems.</p>

In [1]:
from IPython.display import Image, display
display(Image(filename='./images/artificial_neuron.jpg'))

<IPython.core.display.Image object>

<h3>Neuron Computations</h3>

<p>
    To compute the output, we first combine the inputs into the following weighted sum (just like in logistic regression).
</p>

$${w1\:. x1 \: + w2\:. x2 \: +b}$$

<ul>
    <li>x1 and x2 are the inputs</li>
    <li>In neural networks, we refer to w1 and w2 as the weights, and b as the bias.</li>
    <li>We plug this value into what is called an <strong>activation function</strong>.<br/>The role of the activation function is to condense the value of the above expression into a fixed range (often between 0 and 1).</li>
</ul>

<p>A commonly used activation function is the sigmoid function shown below (which is also the function we often use in logistic regression).</p>

$$
p = \frac{1}{1 + e^{-x}}
$$

<p>The sigmoid has the following shape.</p>

In [2]:
display(Image(filename='./images/sigmoid_shape.jpg'))

<IPython.core.display.Image object>
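
<p>As a minimal sketch, the sigmoid formula above could be written in plain Python as follows. The function name <code>sigmoid</code> and the sample inputs are illustrative choices, not part of the original material. The two branches are only there to avoid overflow for very negative inputs; both compute the same formula.</p>

```python
import math


def sigmoid(x: float) -> float:
    """Map any real number into the interval (0, 1) using 1 / (1 + e^-x)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the algebraically equivalent form e^x / (1 + e^x)
    # so that math.exp never receives a large positive argument.
    z = math.exp(x)
    return z / (1.0 + z)


# Illustrative inputs (assumed for this sketch).
for x in (-4.0, 0.0, 4.0):
    print(f"sigmoid({x:+.1f}) = {sigmoid(x):.4f}")
```

<p>Large negative inputs land near 0, large positive inputs land near 1, and an input of 0 gives exactly 0.5.</p>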

<p>To get the output from the inputs we do the following computation.</p>
<ul>
    <li>The weights, w1 and w2, and the bias, b, control what the neuron does.</li>
    <li>We call these values (w1, w2, b) the parameters.</li>
    <li>The function f is the activation function (in this case the sigmoid function).</li>
    <li>The value y is the neuron’s output.</li>
</ul>

$$
y = f(w1\:. x1 \: + w2\:. x2 \: +b) = \frac{1}{1 + e^{-(w1\:. x1 \: + w2\:. x2 \: +b)}}
$$

<strong>This function can be generalized to have any number of inputs (xi) and thus the corresponding number of weights (wi).</strong>
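
<p>The generalization to any number of inputs can be sketched as below. This is one possible way to write it, not a definitive implementation: the names <code>neuron_output</code>, <code>inputs</code>, <code>weights</code>, <code>bias</code>, and <code>activation</code> are hypothetical, and the sample values are made up for illustration.</p>

```python
import math
from typing import Callable, Sequence


def sigmoid(x: float) -> float:
    """Sigmoid activation: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    activation: Callable[[float], float] = sigmoid,
) -> float:
    """Compute activation(w1*x1 + w2*x2 + ... + wn*xn + b)."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    # Weighted sum of the inputs plus the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Three inputs and three weights (illustrative values only).
print(neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.1], bias=0.0))
```

<p>Passing the activation as an argument keeps the weighted sum separate from the choice of activation function, so the same neuron code works with a different activation.</p>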

<h3>Activation Functions</h3>

<p>There are three commonly used <strong>activation functions</strong>: <strong>sigmoid</strong>, <strong>tanh</strong>, and <strong>ReLU</strong>.</p>

<strong>tanh</strong> has a similar shape to sigmoid, but ranges from -1 to 1 instead of 0 to 1. Tanh is the hyperbolic tangent function and is defined as follows:
$$
f(x) = \tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
$$
<p>And its graph looks like this:</p>

In [3]:
display(Image(filename='./images/tanh_shape.jpg'))

<IPython.core.display.Image object>
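
<p>As a quick check, the exponential form of tanh, (e^x - e^-x) / (e^x + e^-x), can be compared against Python's built-in <code>math.tanh</code>. The helper name <code>tanh_from_exponentials</code> is hypothetical and only used for this sketch; in practice one would call <code>math.tanh</code> directly, since the naive form overflows for large inputs.</p>

```python
import math


def tanh_from_exponentials(x: float) -> float:
    """Hyperbolic tangent written out as (e^x - e^-x) / (e^x + e^-x)."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


# Compare the hand-written form with the standard library on a few points.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x = {x:+.1f}  formula = {tanh_from_exponentials(x):+.6f}  math.tanh = {math.tanh(x):+.6f}")
```

<p>The two columns agree, and every value stays strictly between -1 and 1.</p>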

<p>ReLU stands for Rectified Linear Unit. It is the identity function for positive numbers and sends negative numbers to 0.</p>
$$
ReLU(x) = 
\begin{cases}
0 & \text{if } x \le 0 \\
x & \text{if } x > 0
\end{cases}
$$
<p>And its graph looks like this:</p>

In [4]:
display(Image(filename='./images/ReLU_shape.jpg'))

<IPython.core.display.Image object>
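
<p>The piecewise definition of ReLU translates almost directly into code. The sketch below uses a hypothetical function name <code>relu</code> and a few made-up sample values.</p>

```python
def relu(x: float) -> float:
    """Rectified Linear Unit: 0 for x <= 0, x itself for x > 0."""
    return max(0.0, x)


# Negative inputs are clipped to 0; positive inputs pass through unchanged.
for x in (-1.5, 0.0, 3.0):
    print(f"relu({x:+.1f}) = {relu(x)}")
```

<p>Unlike sigmoid and tanh, ReLU has no upper bound: its output grows with the input for positive values.</p>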

<strong>
    Any of these activation functions can work well.<br/>
    Which one to use depends on the specifics of our data.<br/>
    In practice, we figure out which one to use by comparing the performance of different neural networks.
</strong>

<h3>An Example</h3>

<p>
    Assume we have a neuron that takes 2 inputs and produces 1 output and whose activation function is the <strong>sigmoid</strong>. The parameters are:
</p>
<ul>
    <li>Weights (w1, w2) = (0, 1)</li>
    <li>Bias (b) = 2</li>
</ul>
<p>If we give the neuron input (1, 2) we get the following calculation.</p>

$$
y = f(w1\:. x1 \: + w2\:. x2 \: +b) = f(0\:. 1 \: + 1\:. 2 \: +2) = \frac{1}{1 + e^{-(0\:. 1 \: + 1\:. 2 \: +2)}} = \frac{1}{1 + e^{-4}} = 0.9820
$$

<strong>The neuron yields an output of 0.9820.</strong>

<p>Alternatively, if we give the neuron input (2, -2) we get the following calculation.</p>


$$
y = f(w1\:. x1 \: + w2\:. x2 \: +b) = f(0\:. 2 \: + 1\:. (-2) \: +2) = \frac{1}{1 + e^{-(0\:. 2 \: + 1\:. (-2) \: +2)}} = \frac{1}{1 + e^{0}} = 0.5
$$

<strong>The neuron with this input yields an output of 0.5.</strong>
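
<p>The two hand calculations above can be reproduced with a few lines of Python. This is only a sketch: the names <code>sigmoid</code>, <code>WEIGHTS</code>, <code>BIAS</code>, and <code>neuron</code> are hypothetical, while the parameter values and inputs are the ones from the example.</p>

```python
import math


def sigmoid(x: float) -> float:
    """Sigmoid activation: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


# Parameters from the example: weights (w1, w2) = (0, 1) and bias b = 2.
WEIGHTS = (0.0, 1.0)
BIAS = 2.0


def neuron(x1: float, x2: float) -> float:
    """Output of the example neuron: sigmoid(w1*x1 + w2*x2 + b)."""
    w1, w2 = WEIGHTS
    return sigmoid(w1 * x1 + w2 * x2 + BIAS)


print(f"{neuron(1, 2):.4f}")   # prints 0.9820
print(f"{neuron(2, -2):.4f}")  # prints 0.5000
```

<p>Because w1 is 0, the first input has no effect on the output here; only x2 and the bias matter.</p>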

<strong>
    A neuron by itself does not have much power, but when we build a network of neurons, we can see how powerful they are together.
</strong>