# Perceptrons
---

- A perceptron is a type of artificial neuron, developed in the 1950s by Frank Rosenblatt.
- Understanding perceptrons is a useful first step towards understanding **sigmoid neurons**.

## Working

- A perceptron takes several binary inputs and produces a single binary output.
- Rosenblatt introduced a simple rule to compute the output.
- **Weights** - w1, w2,... are real numbers expressing the importance of the respective inputs to the output.
- The neuron's output, 0 or 1, is determined by whether the weighted sum $\sum_j w_j x_j$ is at most, or greater than, some threshold value. The threshold is a real number and a parameter of the neuron.
That is:

$\begin{eqnarray}
  \mbox{output} & = & \left\{ \begin{array}{ll}
      0 & \mbox{if } \sum_j w_j x_j \leq \mbox{ threshold} \\
      1 & \mbox{if } \sum_j w_j x_j > \mbox{ threshold}
      \end{array} \right.
\tag{1}\end{eqnarray}
$

> It is a device that makes decisions by weighing up evidence.

E.g. Suppose we are deciding whether to go to the park, and a few conditions could influence the decision.
Such as: Is the weather good? Is the park closed? Are my friends going?
We can represent these binary factors as inputs x1, x2, x3.
If the weather is the most important factor in our decision, we give its input a large weight and give the other inputs smaller weights. We also choose a threshold: the lower the threshold, the more willing we are to go.
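
The park decision can be sketched in a few lines of Python. The weights and threshold below are hypothetical values chosen only for illustration, and x2 is assumed to be 1 when the park is open (not closed), so that every input equal to 1 counts in favour of going.

```python
def perceptron_threshold(inputs, weights, threshold):
    """Rule (1): output 1 if the weighted sum exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Hypothetical values, not taken from the notes.
# x1 = weather is good, x2 = park is open, x3 = friends are going.
weights = [6, 2, 2]   # the weather gets the largest weight
threshold = 5

# Good weather alone is enough: 6 > 5.
go_good_weather = perceptron_threshold([1, 0, 0], weights, threshold)

# Bad weather cannot be outvoted by the other two factors: 2 + 2 = 4 <= 5.
go_bad_weather = perceptron_threshold([0, 1, 1], weights, threshold)
```

With these values the weather alone decides the outcome. Lowering the threshold to 3 would let the other two factors together outweigh bad weather.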

### Simplifying the notation

- The weighted sum is written as a dot product, $w \cdot x \equiv \sum_j w_j x_j$. Here, $w$ and $x$ are vectors whose components are the weights and inputs, respectively.
- The threshold is moved to the other side of the inequality and replaced by the perceptron's **bias**, $b \equiv -\mbox{threshold}$.

The new rules are

$
   \begin{eqnarray}
  \mbox{output} = \left\{ 
    \begin{array}{ll} 
      0 & \mbox{if } w\cdot x + b \leq 0 \\
      1 & \mbox{if } w\cdot x + b > 0
    \end{array}
  \right.
\tag{2}\end{eqnarray}
$

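- Bias is a measure of how easy it is to get the perceptron to output a 1. With a large positive bias it is very easy to output a 1; with a very negative bias it is difficult.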
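
A minimal sketch of rule (2), with illustrative weights and biases that are assumptions rather than values from the notes. It shows the bias acting as "ease of firing": a large positive bias gives output 1 even with every input off, and a very negative bias gives output 0 even with every input on.

```python
def perceptron(inputs, weights, bias):
    """Rule (2): output 1 if w . x + b > 0, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0

weights = [1, 1]  # hypothetical weights

# Large positive bias: z = 0 + 3 = 3, so the output is 1 with all inputs at 0.
easy_to_fire = perceptron([0, 0], weights, bias=3)

# Very negative bias: z = 2 - 3 = -1, so the output is 0 with all inputs at 1.
hard_to_fire = perceptron([1, 1], weights, bias=-3)
```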

Perceptrons can also be used to compute elementary logical functions such as AND, OR and NAND.
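
As a sketch, each of these gates can be built from a single two-input perceptron. The weights and biases below are one possible choice among many (an assumption for illustration); any values that put the right input combinations on the right side of zero would work.

```python
def perceptron(inputs, weights, bias):
    """Output 1 if w . x + b > 0, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0

def and_gate(x1, x2):
    # Only 1 + 1 - 1.5 = 0.5 is positive.
    return perceptron([x1, x2], [1, 1], -1.5)

def or_gate(x1, x2):
    # Only 0 + 0 - 0.5 = -0.5 is non-positive.
    return perceptron([x1, x2], [1, 1], -0.5)

def nand_gate(x1, x2):
    # Only -2 - 2 + 3 = -1 is non-positive.
    return perceptron([x1, x2], [-2, -2], 3)

# Maps each input pair to (AND, OR, NAND).
truth_table = {
    (a, b): (and_gate(a, b), or_gate(a, b), nand_gate(a, b))
    for a in (0, 1)
    for b in (0, 1)
}
```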

The most interesting case is when a **learning algorithm** tunes the weights and biases automatically in response to external stimuli, without direct intervention. This is how a network of artificial neurons can learn to solve problems.

# Sigmoid Neurons
---

> Suppose a small change in some weight or bias in the network causes only a small change in the output. This property is what makes learning algorithms possible.

- We can use this property to adjust the weights and biases step by step, so that the network behaves more in the manner we want.
- The problem with perceptrons is that a small change in a weight or bias can completely flip the neuron's output, say from 0 to 1. That flip could have very different effects on the rest of the network. So, it is very difficult to gradually modify the weights and biases so that the network gets closer to the desired behavior.
- This problem is solved by using a new type of neuron called a **sigmoid neuron**.
- Sigmoid neurons are similar to perceptrons, but modified so that a small change in a weight or bias causes only a small change in the output.
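
The contrast can be sketched numerically. The example below nudges a single weight across zero and compares the perceptron's step output with the sigmoid function $\sigma(z) = 1/(1+e^{-z})$ defined in the Working section that follows. The input, bias, and weight values are hypothetical.

```python
import math

def perceptron_output(z):
    """Step rule: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def sigmoid(z):
    """Sigmoid function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical single-input neuron.
x, b = 1.0, 0.0
w_before, w_after = -0.01, 0.01   # a tiny change in the weight

z_before = w_before * x + b
z_after = w_after * x + b

# The perceptron flips from 0 to 1: a change of a full unit.
perceptron_change = perceptron_output(z_after) - perceptron_output(z_before)

# The sigmoid output moves only slightly (roughly 0.005 here).
sigmoid_change = sigmoid(z_after) - sigmoid(z_before)
```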

## Working

Like a perceptron, a sigmoid neuron has inputs x1, x2,... but instead of being restricted to 0 or 1, they can take any value between 0 and 1. It also has a weight for each input and an overall bias, b.
The output is likewise not restricted to 0 or 1. Instead it is $\sigma(w \cdot x+b)$, where $\sigma$ is called the sigmoid function and is defined by
$\begin{eqnarray} 
  \sigma(z) \equiv \frac{1}{1+e^{-z}}.
\end{eqnarray}$

Therefore, the output of a sigmoid neuron is $\begin{eqnarray} 
\frac{1}{1+\exp(-\sum_j w_j x_j-b)}.
\end{eqnarray}$
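
A minimal sketch of a sigmoid neuron following the formula above. The inputs, weights, and bias are made-up values for illustration, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Sigmoid function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(inputs, weights, bias):
    """Output sigma(w . x + b) for one neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical values: inputs may be any real numbers between 0 and 1.
inputs = [0.5, 0.2]
weights = [1.0, -2.0]
bias = 0.1

# z = 0.5 - 0.4 + 0.1 = 0.2, so the output lies strictly between 0 and 1.
output = sigmoid_neuron(inputs, weights, bias)
```

Note that `math.exp` raises `OverflowError` for very large arguments, so this simple form is only suitable when $z$ is not extremely negative.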

## Similarity to the perceptron model

Suppose $z \equiv w \cdot x + b$ is a large positive number. Then $e^{-z}\approx 0$ and so $\sigma(z) \approx 1$. In other words, when $z$ is large and positive, the output of the sigmoid neuron is approximately 1, just as it would be for a perceptron. Similarly, if $z$ is very negative, then $e^{-z} \rightarrow \infty$ and $\sigma(z) \approx 0$, again matching the perceptron.
It is only when $z$ is of modest size that there is much deviation from the perceptron model.
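
A quick numeric check of how closely $\sigma(z)$ tracks the perceptron's step output at different values of $z$. The sample values of $z$ are arbitrary choices for illustration.

```python
import math

def sigmoid(z):
    """Sigmoid function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def step(z):
    """Perceptron output for the same z: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

# Gap between the sigmoid output and the perceptron output for each sample z.
gaps = {z: abs(sigmoid(z) - step(z)) for z in (-10, -1, 1, 10)}

for z, gap in gaps.items():
    print(f"z = {z:>3}: sigmoid = {sigmoid(z):.5f}, step = {step(z)}, gap = {gap:.5f}")
```

At $z = \pm 10$ the gap is below $10^{-4}$, while at $z = \pm 1$ it is roughly 0.27, so the two models differ noticeably only near $z = 0$.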