# Softmax

The softmax function is typically the final step in a CNN classifier: it takes the outputs of the final layer of neurons and transforms them into probability values between 0 and 1 that sum to 1. Instead of a vector of voting weights, we end up with a vector of probabilities, with a single probability for each label/class. This is how our CNN makes its final classification predictions.

From Wikipedia: [Softmax Function](https://en.wikipedia.org/wiki/Softmax_function)

"The Softmax Function... is a generalization of the logistic function that "squashes" a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values, where each entry is in the range (0, 1), and all the entries add up to 1."

$S(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{K} e^{y_j}}$

---
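To see why the quote calls softmax a generalization of the logistic function, it may help to consider just two inputs, $y_1$ and $y_2$. This is a sketch: $\sigma$ denotes the logistic function $\sigma(z) = 1/(1 + e^{-z})$, a symbol introduced here for illustration and not used elsewhere in this notebook. Dividing the numerator and denominator by $e^{y_1}$ gives:

$S(y_1) = \frac{e^{y_1}}{e^{y_1} + e^{y_2}} = \frac{1}{1 + e^{-(y_1 - y_2)}} = \sigma(y_1 - y_2)$

So with two classes, softmax depends only on the difference between the two inputs and reduces to the logistic function applied to that difference.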





# Do It

Write your own softmax function from scratch that works on both 1-D and 2-D arrays.

The following inputs should yield the given outputs:

### 1-D input: 

$\begin{bmatrix}
  1 & 2 & 3 & 6 \\
\end{bmatrix}$

### 1-D output: 

Notice that the values in this output add up to 1 and are scaled exponentially: each increase of 1 in the input multiplies the resulting probability by $e \approx 2.718$.

$\begin{bmatrix}
  0.00626879 & 0.01704033 & 0.04632042 & 0.93037047 \\
\end{bmatrix}$
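
As a rough worked check of the 1-D case, the arithmetic can be written out by hand. Here $y_1, \dots, y_4$ stand for the inputs 1, 2, 3, 6, and all values are rounded to five decimals, so the last digit may differ slightly from the output above:

$\begin{aligned}
e^{1} &\approx 2.71828, \quad e^{2} \approx 7.38906, \quad e^{3} \approx 20.08554, \quad e^{6} \approx 403.42879 \\
\sum_{j} e^{y_j} &\approx 2.71828 + 7.38906 + 20.08554 + 403.42879 = 433.62167 \\
S(y_1) &\approx \frac{2.71828}{433.62167} \approx 0.00627, \qquad S(y_4) \approx \frac{403.42879}{433.62167} \approx 0.93037
\end{aligned}$

The input 6 is 3 larger than the input 3, so its probability is about $e^{3} \approx 20.09$ times larger (0.93037 versus 0.04632).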

### 2-D input:

$\begin{bmatrix}
  1 & 2 & 3 & 6 \\
  2 & 4 & 5 & 6 \\
  3 & 8 & 7 & 6 \\
\end{bmatrix}$

### 2-D output:

Notice that each row in the 2-D output adds up to 1. 

$\begin{bmatrix}
  0.00626879 & 0.01704033 & 0.04632042 & 0.93037047 \\
  0.01203764 & 0.08894682 & 0.24178252 & 0.65723302 \\
  0.00446236 & 0.66227241 & 0.24363641 & 0.08962882 \\
\end{bmatrix}$
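
The row-sum property can be checked directly. The sketch below hard-codes the expected 2-D output (rounded to eight decimals) in a variable named `expected`, which is a name chosen here for illustration, and sums along `axis=1` to get one total per row:

```python
import numpy as np

# Expected 2-D output, rounded to eight decimals (illustrative copy).
expected = np.array([
    [0.00626879, 0.01704033, 0.04632042, 0.93037047],
    [0.01203764, 0.08894682, 0.24178252, 0.65723302],
    [0.00446236, 0.66227241, 0.24363641, 0.08962882],
])

# axis=1 sums across the columns of each row, giving one total per row.
row_sums = expected.sum(axis=1)
# axis=0 sums down each column instead; those totals are not expected to be 1.
col_sums = expected.sum(axis=0)

# Allow a small tolerance because the values are rounded.
rows_ok = np.allclose(row_sums, 1.0, atol=1e-6)
```

This is why a 2-D softmax normalizes along `axis=1`: each row is treated as one set of class scores.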

In [0]:
import numpy as np

In [0]:
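def softmax(X):
    # Accept lists as well as arrays, and work in floating point.
    X = np.asarray(X, dtype=float)
    # Treat a 1-D input as a single row so one code path handles both cases.
    # Note: the result is always 2-D, so a 1-D input comes back with shape (1, K).
    if X.ndim == 1:
        X = X.reshape(1, -1)

    # Numerator: element-wise exponentials.
    top = np.exp(X)
    # Denominator: one sum per row, kept as a column so it broadcasts across the row.
    bottom = np.sum(top, axis=1, keepdims=True)

    return top / bottom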

In [3]:
X = [1, 2, 3, 6]
print('X:', X)
print('softmax(X):', softmax(X))

('X:', [1, 2, 3, 6])
('softmax(X):', array([[0.00626879, 0.01704033, 0.04632042, 0.93037047]]))


In [4]:
X = np.array([[1, 2, 3, 6], [2, 4, 5, 6], [3, 8, 7, 6]])
print('X:', X)
print('softmax(X):', softmax(X))

('X:', array([[1, 2, 3, 6],
       [2, 4, 5, 6],
       [3, 8, 7, 6]]))
('softmax(X):', array([[0.00626879, 0.01704033, 0.04632042, 0.93037047],
       [0.01203764, 0.08894682, 0.24178252, 0.65723302],
       [0.00446236, 0.66227241, 0.24363641, 0.08962882]]))
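
One caveat about the `softmax` function above: it exponentiates the raw inputs, and in 64-bit floats `np.exp` overflows to `inf` for inputs above roughly 709, which would turn the result into `nan`. A common workaround is to subtract each row's maximum before exponentiating. The sketch below is one possible variant, not part of the original exercise; `stable_softmax` is a hypothetical name introduced here.

```python
import numpy as np

def stable_softmax(X):
    """Row-wise softmax that shifts each row by its maximum first."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(1, -1)
    # softmax(x) equals softmax(x - c) for any constant c, so subtracting the
    # row maximum changes nothing mathematically but caps every exponent at 0.
    shifted = X - np.max(X, axis=1, keepdims=True)
    top = np.exp(shifted)
    return top / np.sum(top, axis=1, keepdims=True)

# Shifting every input by the same amount should not change the probabilities.
small = stable_softmax([1, 2, 3, 6])
large = stable_softmax([1001, 1002, 1003, 1006])
```

Both calls should give the same probabilities as the 1-D example, since only the differences between inputs matter.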
