# Converting Outputs to Probabilities Using the Logistic Function

We don't always have to use the softmax function to create a probability distribution over the outputs.

A logistic function with normalization can also give us a good probability distribution from the output.
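As a sketch of what "logistic function with normalization" means here, assume the outputs are scores $c_1, \dots, c_n$. The symbols $c_i$, $p_i$ and $\sigma$ are notation introduced for illustration and are not taken from the code. Each score is passed through the logistic function, and the results are divided by their sum so that they add up to 1:

$$
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad p_i = \frac{\sigma(c_i)}{\sum_{j=1}^{n} \sigma(c_j)}
$$

Because every $\sigma(c_j)$ lies between 0 and 1, large positive scores all saturate near 1, so the normalized values stay closer together than the exponentially weighted values of the softmax.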

The probability distribution that can be obtained from the normalized logistic function (also called the sigmoid function) is less extreme than that of the softmax.

As a result, it might not work as well as the softmax when used in a loss function, but it often yields probabilities that are easier to threshold.

Here we compute the cross-entropy loss using the normalized sigmoid function instead of the softmax.

In [1]:
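import torch
import torch.nn.functional as F

# One input row with three features
data = torch.tensor([[6.0, 2.0, 1.9]])

# Weight matrix mapping the three features to two category scores
weights = torch.tensor([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

# Raw scores (logits) for the two categories
c = torch.mm(data, weights)
print("c1 and c2: " + str(c))

# Apply the logistic (sigmoid) function to each score
result = torch.sigmoid(c)
print("sigmoid: " + str(result))

# Normalize so that the values sum to 1
result = result / torch.sum(result)
print("normalized sigmoid: " + str(result))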


c1 and c2: tensor([[ 6.0000,  3.9000]])
sigmoid: tensor([[ 0.9975,  0.9802]])
normalized sigmoid: tensor([[ 0.5044,  0.4956]])


The rest of the code remains the same as before.

In [17]:
# nll_loss expects log-probabilities, so take the log of the normalized sigmoid
result = torch.log(result)
print("log(sigmoid): " + str(result))

# The correct category
target = torch.tensor([0])

loss = F.nll_loss(result, target)
print("Loss: " + str(loss.item()))


log(sigmoid): tensor([[-0.6844, -0.7020]])

Loss: 0.6844037175178528
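To make the loss above easier to verify by hand, here is a minimal dependency-free sketch that recomputes it with only the `math` module. The helper names `sigmoid` and `normalized_sigmoid` are introduced here for illustration and are not part of PyTorch. It assumes the scores 6.0 and 3.9 printed by the first cell.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def normalized_sigmoid(scores):
    """Apply the sigmoid to each score, then divide by the sum so the values add up to 1."""
    activations = [sigmoid(s) for s in scores]
    total = sum(activations)
    return [a / total for a in activations]

scores = [6.0, 3.9]  # c1 and c2 from the first cell
target = 0           # index of the correct category

probs = normalized_sigmoid(scores)

# Negative log-likelihood of the correct category
manual_loss = -math.log(probs[target])

print("normalized sigmoid:", probs)
print("manual loss:", manual_loss)
```

The printed loss should agree with the value reported above to about four decimal places, since the negative log-likelihood for a single example is just the negative log of the probability assigned to the correct category.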


Compare the loss obtained using the sigmoid function with the loss from the softmax function.

In [18]:
# cross_entropy applies log-softmax to the raw scores and then computes the NLL loss
loss = F.cross_entropy(c, target)
print("Loss: " + str(loss.item()))


Loss: 0.11551953107118607


As you can see, the loss is lower for the softmax function, reflecting the higher (more extreme) confidence of the classifier in the category it has correctly selected.
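The two losses can be reproduced by hand from the scores $c_1 = 6$ and $c_2 = 3.9$. The notation $p_1$ for the probability of the correct category is introduced here for illustration:

$$
\begin{aligned}
\text{softmax:} \quad & p_1 = \frac{e^{6}}{e^{6} + e^{3.9}} = \frac{1}{1 + e^{-2.1}} \approx 0.8909, & -\ln p_1 &\approx 0.1155 \\
\text{normalized sigmoid:} \quad & p_1 = \frac{0.9975}{0.9975 + 0.9802} \approx 0.5044, & -\ln p_1 &\approx 0.6844
\end{aligned}
$$

The softmax depends only on the difference between the scores, which it exponentiates, whereas both sigmoid outputs are already close to 1, so their normalized values end up close to 0.5.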

The confidence of the classifier and the loss are less extreme for the normalized sigmoid function, making it a rather useful function for thresholding.
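As a rough illustration of the thresholding point, the sketch below compares the two distributions for the same scores and applies a cutoff. The threshold of 0.4 and the helper names are hypothetical choices made for this example, not values from the original code.

```python
import math

def softmax(scores):
    """Exponentiate the scores (shifted by the max for stability) and normalize."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_sigmoid(scores):
    """Apply the logistic function to each score, then normalize to sum to 1."""
    activations = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    total = sum(activations)
    return [a / total for a in activations]

scores = [6.0, 3.9]  # c1 and c2 from the first cell
threshold = 0.4      # hypothetical cutoff, chosen only for illustration

softmax_probs = softmax(scores)
sigmoid_probs = normalized_sigmoid(scores)

# Indices of the categories whose probability reaches the threshold
softmax_kept = [i for i, p in enumerate(softmax_probs) if p >= threshold]
sigmoid_kept = [i for i, p in enumerate(sigmoid_probs) if p >= threshold]

print("softmax:", softmax_probs, "kept:", softmax_kept)
print("normalized sigmoid:", sigmoid_probs, "kept:", sigmoid_kept)
```

With these scores, the softmax keeps only the first category, while the normalized sigmoid keeps both, because both raw scores are strongly positive. Whether that behavior is desirable depends on the application and on how the threshold is chosen.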