### Derivative of the sigmoid function

#### First the math
As we saw in `sigmoid-function-on-array.ipynb`, the sigmoid function is given by
$$\sigma(x)=\frac{1}{1+e^{-x}}$$

- For the derivative, let $u=1+e^{-x}$, so that $\sigma(x)=u^{-1}$
- Then $\frac{d}{dx}\sigma(x)=\frac{d}{dx}\frac{1}{u}=\frac{d}{dx}u^{-1}$
- Using the chain rule, this gives us $\frac{d\sigma}{du}\cdot\frac{du}{dx}=-u^{-2}\cdot\frac{du}{dx}$
- Since $\frac{du}{dx}=-e^{-x}$, this expands to $-(1+e^{-x})^{-2}\cdot(-e^{-x})$
- Cancelling out the negative signs we get $(1+e^{-x})^{-2}\cdot{e^{-x}}$
- Now, we note that the power $-2$ puts two factors of $1+e^{-x}$ in the denominator, so the expression can be split into a product of two fractions
- Which gives us $\frac{1}{1+e^{-x}}\cdot\frac{e^{-x}}{1+e^{-x}}$
- We then note that $1-\sigma(x)=\frac{1+e^{-x}}{1+e^{-x}}-\frac{1}{1+e^{-x}}=\frac{e^{-x}}{1+e^{-x}}$
- Therefore, the final expression for the derivative of the sigmoid function is $\sigma(x)\cdot(1-\sigma(x))$
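
The result of the derivation can be sanity-checked numerically. The sketch below is only an illustration: the helper names `sigmoid` and `sigmoid_derivative`, the sample points, and the step size `h` are assumptions and do not come from the notebook. It compares the closed form $\sigma(x)\cdot(1-\sigma(x))$ against a central finite difference.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """Closed-form derivative, sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Sample points and step size are arbitrary choices for this check
x = np.linspace(-5.0, 5.0, 11)
h = 1e-5

# Central finite difference approximation of d(sigma)/dx
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
analytic = sigmoid_derivative(x)

# Largest disagreement between the two estimates
max_abs_error = np.max(np.abs(numeric - analytic))
print(max_abs_error)
```

If the derivation is right, the printed error should be tiny compared with the derivative itself, which peaks at $0.25$ when $x=0$.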
 
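We can therefore use this to simplify our code: once $\sigma(x)$ has been computed, the derivative needs only one subtraction and one multiplication.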

In [1]:
# Build a matrix from five arrays and apply the sigmoid function to it

# Import the libraries we'll use
import numpy as np
from sklearn.preprocessing import normalize

# Here we're going to build a matrix

# Create some arrays
_first_array = np.linspace(0, 10, 20)
_second_array = np.linspace(10, 20, 20)
_third_array = np.linspace(20, 30, 20)
_fourth_array = np.linspace(40, 50, 20)
_fifth_array = np.linspace(50, 60, 20)

# Stack the arrays together
new_matrix = np.stack((_first_array, _second_array, _third_array, _fourth_array, _fifth_array), axis=1)
print("Our newly created matrix: \n\n", new_matrix)

# Let's normalize the matrix
normal_matrix = normalize(new_matrix, axis=1, norm='l1')

# Flatten the normalized matrix into a 1-D vector
vector_matrix = normal_matrix.reshape(-1)
print("\n\nOur vectorized matrix: \n\n", vector_matrix)

# Compute the sigmoid function on the flattened original matrix
# (note: this uses new_matrix, not the normalized vector_matrix)
s = 1 / (1 + np.exp(-new_matrix.reshape(-1)))
print("\n\nFinally, a vectorized matrix after the sigmoid operation\n\n", s)

Our newly created matrix: 

 [[ 0.         10.         20.         40.         50.        ]
 [ 0.52631579 10.52631579 20.52631579 40.52631579 50.52631579]
 [ 1.05263158 11.05263158 21.05263158 41.05263158 51.05263158]
 [ 1.57894737 11.57894737 21.57894737 41.57894737 51.57894737]
 [ 2.10526316 12.10526316 22.10526316 42.10526316 52.10526316]
 [ 2.63157895 12.63157895 22.63157895 42.63157895 52.63157895]
 [ 3.15789474 13.15789474 23.15789474 43.15789474 53.15789474]
 [ 3.68421053 13.68421053 23.68421053 43.68421053 53.68421053]
 [ 4.21052632 14.21052632 24.21052632 44.21052632 54.21052632]
 [ 4.73684211 14.73684211 24.73684211 44.73684211 54.73684211]
 [ 5.26315789 15.26315789 25.26315789 45.26315789 55.26315789]
 [ 5.78947368 15.78947368 25.78947368 45.78947368 55.78947368]
 [ 6.31578947 16.31578947 26.31578947 46.31578947 56.31578947]
 [ 6.84210526 16.84210526 26.84210526 46.84210526 56.84210526]
 [ 7.36842105 17.36842105 27.36842105 47.36842105 57.36842105]
 [ 7.89473684 17.89473684 

In [2]:
# Compute the derivative from the sigmoid values: sigma(x) * (1 - sigma(x))
ds = s * (1 - s)
print(ds)

[2.50000000e-01 4.53958077e-05 2.06115369e-09 0.00000000e+00
 0.00000000e+00 2.33456016e-01 2.68198189e-05 1.21768329e-09
 0.00000000e+00 0.00000000e+00 1.91784003e-01 1.58448938e-05
 7.19379888e-10 0.00000000e+00 0.00000000e+00 1.41722552e-01
 9.36092834e-06 4.24993374e-10 0.00000000e+00 0.00000000e+00
 9.67953324e-02 5.53026834e-06 2.51076493e-10 0.00000000e+00
 0.00000000e+00 6.26265922e-02 3.26717297e-06 1.48330459e-10
 0.00000000e+00 0.00000000e+00 3.91182114e-02 1.93017749e-06
 8.76303474e-11 0.00000000e+00 0.00000000e+00 2.39012619e-02
 1.14030726e-06 5.17699217e-11 0.00000000e+00 0.00000000e+00
 1.44078022e-02 6.73668516e-07 3.05846459e-11 0.00000000e+00
 0.00000000e+00 8.61458765e-03 3.97988430e-07 1.80686577e-11
 0.00000000e+00 0.00000000e+00 5.12569571e-03 2.35122692e-07
 1.06745723e-11 0.00000000e+00 0.00000000e+00 3.04095543e-03
 1.38905226e-07 6.30628882e-12 0.00000000e+00 0.00000000e+00
 1.80102148e-03 8.20620936e-08 3.72568643e-12 0.00000000e+00
 0.00000000e+00 1.065575
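
The exact zeros in the output above are a floating-point effect, not a property of the derivative. For inputs such as 40 or 50, `s` rounds to exactly `1.0` in double precision, so `1 - s` becomes `0.0`. A minimal sketch of one way around this, assuming we evaluate the equivalent form $\frac{e^{-x}}{(1+e^{-x})^{2}}$ from the derivation instead; the helper name `sigmoid_derivative_stable` is hypothetical, and the use of $|x|$ relies on the derivative being an even function:

```python
import numpy as np

def sigmoid_derivative_stable(x):
    """Hypothetical helper: evaluates e^{-|x|} / (1 + e^{-|x|})^2.

    The sigmoid derivative is symmetric about x = 0, so using |x|
    gives the same value and keeps np.exp from overflowing.
    """
    z = np.exp(-np.abs(x))
    return z / (1.0 + z) ** 2

# First row of the notebook's matrix
x = np.array([0.0, 10.0, 20.0, 40.0, 50.0])

# Derivative computed from the sigmoid values, as in the notebook
s = 1.0 / (1.0 + np.exp(-x))
ds_naive = s * (1.0 - s)

# Derivative computed directly from the exponential
ds_stable = sigmoid_derivative_stable(x)

print(ds_naive)
print(ds_stable)
```

For small inputs the two versions should agree closely; for the large ones the direct form should return a tiny positive number where `s * (1 - s)` returns zero. Whether that difference matters depends on how the derivative is used downstream.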