## Softmax
(based on a tutorial by Python Engineer on YouTube)

**Softmax**

The softmax function, also known as softargmax or the normalized exponential function, is a generalization of the logistic function to multiple dimensions. It is used in multinomial logistic regression and often serves as the last activation function of a neural network, where it normalizes the network's output to a probability distribution over the predicted output classes.

The softmax function takes as input a vector z of K real numbers and normalizes it into a probability distribution consisting of K probabilities proportional to the exponentials of the input numbers. Before softmax is applied, some vector components may be negative or greater than one, and they need not sum to 1. After softmax is applied, each component lies in the interval (0, 1) and the components sum to 1, so they can be interpreted as probabilities. Furthermore, larger input components correspond to larger probabilities.
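The properties described above can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper `softmax_sketch` and an arbitrary example vector chosen to include one negative component and one component greater than one; neither comes from the tutorial.

```python
import numpy as np

def softmax_sketch(z):
    """Exponentiate each component, then divide by the sum of the exponentials."""
    exps = np.exp(z)
    return exps / exps.sum()

# Hypothetical input: one negative value, one value in (0, 1), one value greater than 1.
z = np.array([-1.0, 0.5, 3.0])
p = softmax_sketch(z)

print(p)                                             # three probabilities
print(np.all((p > 0) & (p < 1)))                     # True: every component lies in (0, 1)
print(np.isclose(p.sum(), 1.0))                      # True: the components sum to 1
print(np.array_equal(np.argsort(z), np.argsort(p)))  # True: larger inputs get larger probabilities
```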

The standard (unit) softmax function:
![title](img/softmax.jpg)

Source: Wikipedia
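In case the image does not render, the usual definition for a vector $z = (z_1, \dots, z_K)$ can be written out as follows. This is a transcription of the standard formula, not a copy of the image file.

$$
\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
$$

As a worked example, take $z = (2.0,\ 1.0,\ 0.1)$. The exponentials are approximately $e^{2.0} \approx 7.3891$, $e^{1.0} \approx 2.7183$, and $e^{0.1} \approx 1.1052$, with sum $\approx 11.2125$, so

$$
\sigma(z) \approx \left( \frac{7.3891}{11.2125},\ \frac{2.7183}{11.2125},\ \frac{1.1052}{11.2125} \right) \approx (0.6590,\ 0.2424,\ 0.0986)
$$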

**Softmax Implementation with NumPy**

In [1]:
import torch
import torch.nn as nn
import numpy as np

In [3]:
def softmax(x):
    # Exponentiate each element, then divide by the sum of the exponentials along axis 0
    return np.exp(x) / np.sum(np.exp(x), axis=0)

In [4]:
x = np.array([2.0, 1.0, 0.1])
outputs = softmax(x)
print('softmax numpy', outputs)


softmax numpy [0.65900114 0.24243297 0.09856589]
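The direct implementation above can overflow for large inputs, because `np.exp` of a large number returns `inf`. A common remedy, which is not part of the tutorial, is to subtract the maximum input before exponentiating. This leaves the result unchanged mathematically, since the shift cancels in the ratio. The sketch below assumes a hypothetical function name `stable_softmax`.

```python
import numpy as np

def stable_softmax(x):
    # Subtracting the maximum makes the largest exponent 0, so np.exp cannot overflow.
    shifted = x - np.max(x, axis=0)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=0)

x = np.array([2.0, 1.0, 0.1])
print('stable softmax numpy', stable_softmax(x))

# Hypothetical large inputs: the same vector shifted by 998.
# np.exp(1000.0) overflows to inf, so the unshifted formula would yield nan here.
big = np.array([1000.0, 999.0, 998.1])
print('stable softmax numpy (large inputs)', stable_softmax(big))
```

Because `big` is just `x` plus a constant, both calls should print the same probabilities up to floating-point rounding.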


**Softmax Implementation with PyTorch**

In [5]:
x = torch.tensor([2.0, 1.0, 0.1])
outputs = torch.softmax(x, dim=0)  # dim=0: normalize along the first (and only) axis of this 1-D tensor
print(outputs)

tensor([0.6590, 0.2424, 0.0986])
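The `dim` argument matters more once the input has a batch dimension. The sketch below uses a hypothetical 2-D tensor of shape (samples, classes), which is an assumption and not part of the tutorial, and normalizes each row with `dim=1`. It also shows the module form `nn.Softmax`, which can be placed inside a model.

```python
import torch
import torch.nn as nn

# Hypothetical batch: two samples with three class scores each.
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [1.0, 1.0, 1.0]])

# dim=1 normalizes across the classes of each sample (each row).
probs = torch.softmax(logits, dim=1)
print(probs)
print(probs.sum(dim=1))  # each row sums to 1

# Module form of the same operation.
softmax_layer = nn.Softmax(dim=1)
print(torch.allclose(softmax_layer(logits), probs))  # True
```

With `dim=0` on this tensor, softmax would instead normalize each column across the two samples, which is rarely what a classifier needs.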
