## Derivative of the sigmoid function

### Summary
Simple derivation of the derivative of the sigmoid function.

### Why write this?
Machine learning and deep learning courses often introduce models that use the sigmoid function and its derivative (e.g., logistic regression, neural networks with sigmoid activation functions) but present the derivative as a given, glossing over how it is derived. This notebook is written for those interested in the details of that derivation.

Though other derivations exist elsewhere, this notebook aims to be as clear as possible and links to the related calculus rules for convenience. The end goal is a step-by-step derivation of the derivative of the sigmoid function that you can reference and use elsewhere.

### Derivation
First, define the [sigmoid function](https://en.wikipedia.org/wiki/Sigmoid_function) to be
<br>
<div align="center"><b> $g(z)=\frac{1}{1+e^{-z}}$ </b></div>
<br>

**Note:** Another form you may have seen, $g(z)=\frac{e^z}{1+e^z}$, is equivalent to the above: multiplying $\frac{e^z}{1+e^z}$ by $\frac{e^{-z}}{e^{-z}}$ (which equals $1$) gives $\frac{e^z e^{-z}}{e^{-z}+e^z e^{-z}}=\frac{1}{1+e^{-z}}$.
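
The equivalence of the two forms can also be checked numerically. The following is a minimal sketch using only Python's standard `math` module; the function names `sigmoid` and `sigmoid_alt` and the sample points are illustrative choices, not part of the derivation.

```python
import math


def sigmoid(z):
    """Sigmoid in the form 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_alt(z):
    """Equivalent form e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))


# Sample points chosen arbitrarily for illustration.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    # The two forms should agree up to floating-point rounding.
    assert math.isclose(sigmoid(z), sigmoid_alt(z), rel_tol=1e-12)
```

For very large positive $z$, `math.exp(z)` overflows, so the first form is usually the more convenient one to evaluate in that range.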

<br>
The derivative of the sigmoid function is then
<br>
<div align="center"><b> $g'(z)=\frac{d}{dz}\frac{1}{1+e^{-z}}$ </b></div>
<br>

The above can be rewritten using the [reciprocal rule](https://en.wikipedia.org/wiki/Differentiation_rules#The_reciprocal_rule), $\frac{d}{dz}\frac{1}{f(z)}=-\frac{f'(z)}{f(z)^2}$ with $f(z)=1+e^{-z}$, as
<br>
<div align="center"><b> $g'(z)=-\frac{\frac{d}{dz}(1+e^{-z})}{(1+e^{-z})^2}$ </b></div>
<br>

The numerator above can then be expanded using the [sum rule](https://en.wikipedia.org/wiki/Differentiation_rules#Differentiation_is_linear) as
<br>
<div align="center"><b> $g'(z)=-\frac{\frac{d}{dz}1+\frac{d}{dz}e^{-z}}{(1+e^{-z})^2}$ </b></div>
<br>

The numerator above can then be simplified. The first term, $\frac{d}{dz}1$, evaluates to $0$ by the [constant rule](https://en.wikipedia.org/wiki/Differentiation_rules#Constant_term_rule). The second term, $\frac{d}{dz}e^{-z}$, evaluates to $-e^{-z}$ by the [exponential rule](https://en.wikipedia.org/wiki/Differentiation_rules#Derivatives_of_exponential_and_logarithmic_functions) together with the [chain rule](https://en.wikipedia.org/wiki/Chain_rule), since the exponent $-z$ has derivative $-1$. This gives
<br>
<div align="center"><b> $g'(z)=-\frac{-e^{-z}}{(1+e^{-z})^2}$ </b></div>
<br>


The leading negative sign and the negative sign in the numerator cancel, giving
<br>
<div align="center"><b> $g'(z)=\frac{e^{-z}}{(1+e^{-z})^2}$ </b></div>
<br>
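
At this point the derivative is already in closed form, so it can be sanity-checked against a numerical approximation. The sketch below compares it with a central finite difference of the sigmoid function; the helper names, the step size `h`, the sample points, and the tolerance are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Sigmoid function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime_quotient(z):
    """Derivative in the intermediate form e^(-z) / (1 + e^(-z))^2."""
    return math.exp(-z) / (1.0 + math.exp(-z)) ** 2


def central_difference(f, z, h=1e-5):
    """Approximate f'(z) with a symmetric finite difference."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


# Sample points chosen arbitrarily for illustration.
for z in (-3.0, -0.5, 0.0, 0.5, 3.0):
    approx = central_difference(sigmoid, z)
    exact = sigmoid_prime_quotient(z)
    # Agreement is only approximate: the finite difference carries
    # truncation and rounding error, hence the loose tolerance.
    assert abs(approx - exact) < 1e-8
```

A numerical check like this does not prove the formula, but it would catch a sign error such as a missed cancellation of the two negatives.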

The denominator can then be expanded as
<br>
<div align="center"><b> $g'(z)=\frac{e^{-z}}{(1+e^{-z})(1+e^{-z})}$ </b></div>
<br>

The above can then be rewritten as the following product of two fractions
<br>
<div align="center"><b> $g'(z)=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}}$ </b></div>
<br>

The second fraction above can be expanded further, first by adding $0$, written as $1-1$, to its numerator
<br>
<div align="center"><b> $g'(z)=\frac{1}{1+e^{-z}}\cdot\frac{1+e^{-z}-1}{1+e^{-z}}$ </b></div>
<br>

The second fraction can then be split into the difference of two fractions with the common denominator $1+e^{-z}$
<br>
<div align="center"><b> $g'(z)=\frac{1}{1+e^{-z}}\Big(\frac{1+e^{-z}}{1+e^{-z}}-\frac{1}{1+e^{-z}}\Big)$ </b></div>
<br>

Finally, of the three fractions above, note that the first and last are the sigmoid function itself, $g(z)=\frac{1}{1+e^{-z}}$, while the second, $\frac{1+e^{-z}}{1+e^{-z}}$, is simply equal to $1$. The derivative of the sigmoid function can therefore be written as
<br>
<div align="center"><b> $g'(z)=g(z)\big(1-g(z)\big)$ </b></div>
<br>
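
The final result can be sketched in code as follows. This is a minimal illustration using only the standard `math` module, not a reference implementation; the function names and sample points are assumptions. It computes the derivative from the sigmoid value itself and checks it against the quotient form $\frac{e^{-z}}{(1+e^{-z})^2}$ from earlier in the derivation.

```python
import math


def sigmoid(z):
    """Sigmoid function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Derivative in the final form g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)


# Sample points chosen arbitrarily for illustration.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    quotient_form = math.exp(-z) / (1.0 + math.exp(-z)) ** 2
    # Both forms should agree up to floating-point rounding.
    assert math.isclose(sigmoid_prime(z), quotient_form, rel_tol=1e-9)

print(sigmoid_prime(0.0))  # 0.25, since g(0) = 0.5
```

One practical consequence of this form is that the derivative needs only the value $g(z)$, so code that has already computed the sigmoid can reuse it rather than evaluating the exponential again.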