# B"H


## Softmax Activation Function

<br>

![](https://drive.google.com/uc?id=1OngR478lvQM7OWBvvKOo_5lX_J8TKf8U)

<br>
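
For reference, the standard softmax definition can be written as below. The symbols are choices made here rather than taken from the notebook: $z_i$ is the logit for class $i$, and $K$ is the number of classes.

$$
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
$$

Every output is positive and the outputs sum to 1, which is what allows them to be read as probabilities.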

### Logits

In [None]:
logits = [.9, .7]

In [None]:
cat_logit = logits[0]
dog_logit = logits[1]

### Cat/dog example: step by step

In [5]:
import math

In [None]:
denominator = math.pow(math.e, cat_logit) + math.pow(math.e, dog_logit)

In [None]:
denominator

4.473355818627425

In [None]:
cat_prob = math.pow(math.e, cat_logit) / denominator

cat_prob

0.549833997312478

In [None]:
dog_prob = math.pow(math.e, dog_logit) / denominator

dog_prob

0.4501660026875221

In [None]:
cat_prob + dog_prob

1.0
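
With only two classes, the cat probability depends only on the difference between the two logits: dividing the numerator and denominator by `e^cat_logit` gives `1 / (1 + e^-(cat_logit - dog_logit))`. A quick check of that identity with the same logits, as a sketch (the name `cat_prob_from_diff` is introduced here for illustration only):

```python
import math

cat_logit, dog_logit = 0.9, 0.7

# Softmax computed directly, as in the cells above
denominator = math.exp(cat_logit) + math.exp(dog_logit)
cat_prob = math.exp(cat_logit) / denominator

# The same value written in terms of the logit difference only
cat_prob_from_diff = 1 / (1 + math.exp(-(cat_logit - dog_logit)))

print(cat_prob, cat_prob_from_diff)
print(math.isclose(cat_prob, cat_prob_from_diff))
```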

### Create softmax function


In [4]:
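def softmax(x_array):
    """Return the softmax probabilities for a list of logits."""

    # Denominator: sum of e^x over all logits
    denominator = 0
    for x in x_array:
        denominator += math.pow(math.e, x)

    # Each probability is e^x divided by that sum
    probs = []
    for x in x_array:
        probs.append(
            math.pow(math.e, x) / denominator
        )

    return probs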


### Cat/dog example using the softmax function

In [None]:
logits = [.9, .7]

CAT_IDX = 0
DOG_IDX = 1

probs = softmax(logits)

print(f'Cat: {probs[CAT_IDX]}, Dog: {probs[DOG_IDX]}')

Cat: 0.549833997312478, Dog: 0.4501660026875221


### Example with 4 logits

In [None]:
logits = [.9, .7, .6, .0001]

HOUSE_IDX = 0
TREE_IDX = 1
LIZARD_IDX = 2
SKUNK_IDX = 3

probs = softmax(logits)

print(f'House: {probs[HOUSE_IDX]}, Tree: {probs[TREE_IDX]}, Lizard: {probs[LIZARD_IDX]}, Skunk: {probs[SKUNK_IDX]}')

House: 0.3371363104229755, Tree: 0.276023865322535, Lizard: 0.24975672161474802, Skunk: 0.13708310263974147


In [None]:
sum_val = 0

for prob in probs:
    sum_val += prob

sum_val

1.0

### Translational Invariance

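Adding the same constant to every logit leaves the softmax output unchanged (up to floating-point rounding): `[1, 4]` and `[101, 104]` below give the same probabilities. The third row, `[-101, -104]`, is not a shift of the first. It is the negated pair shifted by -100, so its two probabilities come out swapped.

See also (this link covers translation invariance in the computer-vision sense, which is a different use of the term): https://stats.stackexchange.com/questions/208936/what-is-translation-invariance-in-computer-vision-and-convolutional-neural-netwo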

In [None]:
list_of_logits = [
    [1, 4],
    [101, 104],
    [-101, -104],
]

In [None]:
for logits in list_of_logits:
    print(
        softmax(logits)
    )

[0.04742587317756678, 0.9525741268224331]
[0.04742587317756679, 0.9525741268224331]
[0.9525741268224333, 0.04742587317756679]
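
The first two rows match because a common shift `c` cancels out: `e^(x_i + c) / sum_j e^(x_j + c) = e^c * e^(x_i) / (e^c * sum_j e^(x_j))`. One common use of this property, sketched below, is to subtract the largest logit before exponentiating so that large inputs do not overflow. The names `naive_softmax` and `stable_softmax` are hypothetical helpers introduced here, not part of the notebook.

```python
import math


def naive_softmax(logits):
    # Equivalent to the softmax function above, written with math.exp
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(logits):
    # Shift so the largest logit becomes 0; the result is unchanged
    # because the common factor e^(-max) cancels in the ratio.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


print(stable_softmax([1, 4]))
print(stable_softmax([101, 104]))

# Large logits overflow the naive version but not the shifted one
try:
    naive_softmax([1000, 1003])
except OverflowError as err:
    print('naive softmax failed:', err)

print(stable_softmax([1000, 1003]))
```

After the shift, every exponent is at most 0, so each `math.exp` call returns a value between 0 and 1 and the largest term is exactly 1.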


### Note

Why use softmax as opposed to standard normalization?
- https://stackoverflow.com/questions/17187507/why-use-softmax-as-opposed-to-standard-normalization
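
One way to see part of the difference is to try plain normalization (each value divided by the sum) on a few inputs. This is only a sketch of some practical differences, not a summary of the linked answer, and `standard_normalize` is a hypothetical helper name.

```python
def standard_normalize(values):
    # Divide each value by the sum of all values
    total = sum(values)
    return [v / total for v in values]


print(standard_normalize([1, 4]))        # [0.2, 0.8]
print(standard_normalize([101, 104]))    # shifting the inputs changes the result
print(standard_normalize([-1, 2]))       # a negative "probability" appears

try:
    standard_normalize([-1, 1])          # the values sum to zero
except ZeroDivisionError as err:
    print('normalization failed:', err)
```

Softmax avoids both problems because `e^x` is always positive, so every output is positive and the denominator is never zero.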

### With a single logit, the probability is always 1 ... duh!

In [1]:
import numpy as np

In [14]:
for x in np.arange(-10, 10):
    print(softmax([x])[0])

1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
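
Since NumPy is already imported, the same check can be run without a Python loop. The sketch below assumes a hypothetical helper, `softmax_np`, that applies softmax along the last axis of an array. It subtracts the row maximum first, which does not change the result and avoids overflow for large logits.

```python
import numpy as np


def softmax_np(logits, axis=-1):
    # Hypothetical helper: softmax along `axis` of a NumPy array
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=axis, keepdims=True)


# One logit per row: shape (20, 1), so every row has a single class
single_logits = np.arange(-10, 10).reshape(-1, 1)
print(softmax_np(single_logits).ravel())

# Several logits per row: each row is normalized independently
batch = np.array([[.9, .7], [1, 4], [101, 104]])
probs = softmax_np(batch)
print(probs)
print(probs.sum(axis=-1))
```

With a single logit, the numerator and the denominator are the same number, so the ratio is 1 whatever the logit's value.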
