# Large Language Models (LLMs)
## The softmax function
The **softmax** is a function that converts a vector $\boldsymbol{z}=[z_0,z_1,\ldots,z_{q-1}]^T$ of real values into a vector of probabilities $\boldsymbol{p}=[p_0,p_1,\ldots,p_{q-1}]^T$ using the following formula:
<br> $\large p_i=\frac{\exp(z_i)}{\sum_{j=0}^{q-1}\exp(z_j)}$, for $i=0,1,\ldots,q-1$
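<br> **Hint:** For numerical stability, we subtract the largest component of $\boldsymbol{z}$ from every component before applying the softmax. This shift leaves the output unchanged but keeps every exponent at or below zero, so $\exp$ cannot overflow.
<br> **Reminder:** The softmax **amplifies differences**: the ratio $p_i/p_j$ equals $\exp(z_i-z_j)$, so components with larger values receive a disproportionately large share of the probability.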
<br> In the following, we implement the softmax formula for a given vector.
<br>The code is available at: https://github.com/ostad-ai/Large-Language-Models
<br>Explanation: https://www.pinterest.com/HamedShahHosseini/Deep-Learning/Large-Language-Models
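
The hint about subtracting the maximum relies on the softmax being unchanged when the same constant is subtracted from every component. A short derivation is sketched here; the symbol $c$ is introduced only for this illustration and does not appear in the original text.

```latex
% Shift invariance: subtracting a constant c from every component
\frac{\exp(z_i - c)}{\sum_{j=0}^{q-1}\exp(z_j - c)}
  = \frac{e^{-c}\,\exp(z_i)}{e^{-c}\sum_{j=0}^{q-1}\exp(z_j)}
  = \frac{\exp(z_i)}{\sum_{j=0}^{q-1}\exp(z_j)}
  = p_i
% Choosing c = \max_j z_j gives z_i - c \le 0 for all i,
% so every \exp(z_i - c) lies in (0, 1] and the denominator is at least 1.
```

With $c=\max_j z_j$, the largest term in the sum is exactly $\exp(0)=1$, which is why the shifted form cannot overflow.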

In [1]:
# Importing the required module
import numpy as np

In [2]:
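# Softmax function
# Input: a 1-D NumPy array of real values; output: a NumPy array of probabilities
def softmax(z, safe=True):
    if safe:
        # Subtract the maximum so the largest exponent is exp(0)=1 (avoids overflow)
        exp_z = np.exp(z - np.max(z))
    else:
        # Direct formula; may overflow when components are large
        exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)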
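
To see why the `safe` option matters, the following sketch compares the two branches on a vector with large components. The function is repeated so the block runs on its own, and the input values are hypothetical, chosen only because `exp(1000)` exceeds the float64 range.

```python
import numpy as np

def softmax(z, safe=True):
    # Same logic as the notebook's softmax, repeated so this block is self-contained
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z)) if safe else np.exp(z)
    return exp_z / np.sum(exp_z)

# Illustrative input (an assumption, not from the notebook): large enough to overflow
z_large = np.array([1000.0, 1001.0, 1002.0])

# Silence the overflow/invalid warnings that the naive branch would raise
with np.errstate(over='ignore', invalid='ignore'):
    naive = softmax(z_large, safe=False)  # exp overflows to inf, and inf/inf gives nan

stable = softmax(z_large)                      # shifted exponents are -2, -1, 0
shifted = softmax(np.array([0.0, 1.0, 2.0]))   # same vector with 1000 subtracted

print(f'Naive (safe=False): {naive}')
print(f'Stable (safe=True): {stable}')
print(f'Stable result matches shifted input: {np.allclose(stable, shifted)}')
```

The naive branch returns `nan` for every component, while the shifted version gives the same probabilities as the small vector `[0, 1, 2]`, since only the differences between components matter.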

In [3]:
# Example
z_in=np.array([1,3,.5])
probs=softmax(z_in)
print('Example with Softmax:')
print(f'Input vector: {z_in}')
print(f'output of softmax (Probabilities): {probs}')
print(f'Sum of probabilities (for checking): {np.sum(probs)}')

Example with Softmax:
Input vector: [1.  3.  0.5]
output of softmax (Probabilities): [0.11116562 0.82140902 0.06742536]
Sum of probabilities (for checking): 0.9999999999999999
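
The claim that the softmax amplifies differences can be checked on the same example vector. This is a minimal sketch: the comparison with plain normalization (dividing by the sum) and the doubled input are assumptions added for illustration and are not part of the original notebook.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax, repeated so this block is self-contained
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

z_in = np.array([1, 3, .5])
probs = softmax(z_in)

# The ratio of two probabilities depends only on the difference of their inputs
ratio = probs[1] / probs[0]
expected_ratio = np.exp(z_in[1] - z_in[0])
print(f'p_1/p_0 = {ratio:.4f}, exp(z_1 - z_0) = {expected_ratio:.4f}')

# For comparison only: plain normalization of the (positive) inputs
linear = z_in / np.sum(z_in)
print(f'Plain normalization: {linear}')
print(f'Softmax:             {probs}')

# Doubling the inputs doubles the differences, which sharpens the distribution
sharper = softmax(2 * z_in)
print(f'Softmax of 2*z:      {sharper}')
```

The largest component receives a bigger share under the softmax than under plain normalization, and that share grows further when the input differences are scaled up.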
