# Softmax and Sigmoid

Task: get more practice with the `softmax` function and connect it to the `sigmoid` function.

## Setup

In [1]:
import torch
from torch import tensor
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
def softmax(x):
    return torch.softmax(x, axis=0)

## Task

Try this example:

In [3]:
x1 = tensor([0.1, 0.2, 0.3])
x2 = tensor([0.1, 0.2, 100])

In [4]:
softmax(x1)

tensor([0.3006, 0.3322, 0.3672])

1. Write a block of code that assigns `p = softmax(x1)` then evaluates `p.sum()`. **Before you run it**, predict what the output will be.

In [6]:
p = softmax(x1)
p.sum()

tensor(1.0000)

2. Write a block of code that evaluates `p2 = softmax(x2)` and displays the result. **Before you run it**, predict what it will output.

In [21]:
p2 = softmax(x2)
p2

tensor([4.0638e-44, 4.4842e-44, 1.0000e+00])
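To see where these numbers come from, here is a minimal sketch of softmax written out by hand. The name `manual_softmax` is a hypothetical helper, not part of PyTorch, and subtracting the maximum before exponentiating is a common numerical-stability trick assumed here for illustration; it is not a claim about how `torch.softmax` is implemented internally.

```python
import torch
from torch import tensor

def manual_softmax(x):
    # Hypothetical helper: softmax(x)_i = exp(x_i) / sum_j exp(x_j).
    # Subtracting x.max() leaves the result unchanged mathematically
    # but keeps exp() from overflowing for large inputs.
    shifted = x - x.max()
    exps = shifted.exp()
    return exps / exps.sum()

x1 = tensor([0.1, 0.2, 0.3])
x2 = tensor([0.1, 0.2, 100.0])

p1_manual = manual_softmax(x1)
p2_manual = manual_softmax(x2)
print(p1_manual)
print(p2_manual)
```

Because the largest entry of `x2` exceeds the others by almost 100, the other two exponentials are scaled by roughly `exp(-100)`, which is why they come out vanishingly small.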


3. Evaluate `torch.sigmoid(tensor(0.1))`. Write an expression that uses `softmax` to get the same output. *Hint*: Give `softmax` a two-element `tensor([num1, num2])`, where one of the elements is 0.

In [16]:
torch.sigmoid(tensor(0.1))

tensor(0.5250)

In [18]:
x3 = tensor([0.1, 0.0])
p3 = softmax(x3)
# The first element of p3 equals sigmoid(0.1); the second equals 1 - sigmoid(0.1).
p3

tensor([0.5250, 0.4750])
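The match is not a coincidence. For a two-element input `[x, 0]`, the first softmax output is `exp(x) / (exp(x) + exp(0)) = 1 / (1 + exp(-x))`, which is the definition of the sigmoid. A short sketch that checks this numerically (the variable names are illustrative only):

```python
import torch
from torch import tensor

x = tensor(0.1)

# Sigmoid applied directly to the scalar.
via_sigmoid = torch.sigmoid(x)

# Softmax over the pair [x, 0]; keep only the first element.
pair = torch.stack([x, torch.zeros_like(x)])
via_softmax = torch.softmax(pair, dim=0)[0]

print(via_sigmoid, via_softmax)
```

The second softmax element is `1 - sigmoid(x)`, which is the same as `sigmoid(-x)`.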

## Analysis

1. A valid probability distribution has no negative numbers and sums to 1. Is `softmax(x)` a valid probability distribution? Why or why not?

Yes. `softmax(x)` exponentiates each input, which makes every value positive, and then divides by the sum of those exponentials, so the outputs sum to 1.
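Both properties can be checked numerically. The sketch below uses an arbitrary random input (the seed and scale are assumptions chosen only for illustration). It tests for non-negative rather than strictly positive values because very small probabilities can underflow to 0 in floating point.

```python
import torch

torch.manual_seed(0)           # arbitrary seed, only for repeatability
x = torch.randn(5) * 10        # arbitrary logits with a wide spread
p = torch.softmax(x, dim=0)

# Property 1: no negative entries.
all_nonnegative = bool((p >= 0).all())

# Property 2: entries sum to 1 (up to floating-point rounding).
sums_to_one = bool(torch.isclose(p.sum(), torch.tensor(1.0)))

print(all_nonnegative, sums_to_one)
```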

2. Jargon alert: sometimes `x` is called the "logits" and `x.softmax(axis=0).log()` (or `x.log_softmax(axis=0)`) is called the "logprobs", short for "log probabilities". Complete the following expressions for `x1` (from the example above).

In [23]:
logits = x1
logprobs = x1.log_softmax(axis=0)
probabilities = softmax(x1)
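One way to sanity-check these three quantities is to confirm that exponentiating the logprobs recovers the probabilities, and that the logprobs equal the logits shifted by a single constant, `logsumexp(logits)`. This is a sketch using `x1` from the example above; `shifted_logits` is an illustrative name.

```python
import torch
from torch import tensor

x1 = tensor([0.1, 0.2, 0.3])

logits = x1
logprobs = x1.log_softmax(dim=0)
probabilities = logprobs.exp()   # exp undoes the log

# log softmax(x)_i = x_i - log(sum_j exp(x_j))
shifted_logits = logits - torch.logsumexp(logits, dim=0)

print(logprobs)
print(probabilities)
```

Since every probability is at most 1, every logprob is at most 0.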

3. In light of your observations about the difference between `softmax(x1)` and `softmax(x2)`, why might `softmax` be an appropriate name for this function?

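`softmax` is an appropriate name because it is a smoothed, "soft" version of picking the maximum. For `x2`, where one input is far larger than the rest, nearly all of the probability goes to that largest element, much like a one-hot indicator of the maximum. For `x1`, where the inputs are close together, the probability is spread almost evenly.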
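The comparison can be made concrete by putting `softmax` next to a "hard" version that places all of the probability on the largest element. In this sketch, `hard_argmax` is a hypothetical helper built from `argmax` and `torch.nn.functional.one_hot`; it is not a PyTorch function.

```python
import torch
import torch.nn.functional as F
from torch import tensor

x1 = tensor([0.1, 0.2, 0.3])
x2 = tensor([0.1, 0.2, 100.0])

def hard_argmax(x):
    # Hypothetical helper: one-hot vector marking the largest entry of x.
    return F.one_hot(x.argmax(), num_classes=len(x)).float()

soft1, hard1 = torch.softmax(x1, dim=0), hard_argmax(x1)
soft2, hard2 = torch.softmax(x2, dim=0), hard_argmax(x2)

print(soft1, hard1)   # soft output is spread out; hard output is one-hot
print(soft2, hard2)   # the two nearly coincide
```

When the gap between the largest input and the rest is big, the soft and hard versions agree closely. When the gap is small, `softmax` keeps meaningful probability on every element.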