This notebook explains what softmax and logits are, and how to calculate softmax cross entropy in TensorFlow.

#### The softmax function is defined as:
$
\text{softmax}(z)_i = \dfrac{e^{z_i}}{\sum_j e^{z_j}},
$
where $z$ is a $k$-dimensional vector.

In the context of the softmax, $z$ is known as the "logits".
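
The definition can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper name `stable_softmax` is illustrative and is not a TensorFlow function. It subtracts the largest logit before exponentiating, which leaves the result unchanged (the common factor cancels in the ratio) but avoids overflow for large inputs.

```python
import numpy as np

def stable_softmax(z):
    # Shifting by max(z) does not change the ratio, but keeps np.exp from overflowing.
    shifted = z - np.max(z)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z)

logits = np.array([1.0, 2.0, 3.0, 4.0])
probs = stable_softmax(logits)
print(probs)        # four positive values, largest for the largest logit
print(probs.sum())  # the probabilities sum to 1 (up to rounding)
```

The output is a probability vector: every entry is positive and the entries sum to 1.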


#### Softmax cross entropy
Given a probability vector `P` (the output of the softmax) and the corresponding label vector `y`, the softmax cross entropy is defined as:

$
H(y, P) = -\sum_i y^{(i)} \log P^{(i)}
$
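
As a small worked example of this formula, consider a one-hot label, where only the term for the true class survives the sum. The sketch below assumes NumPy; the probabilities in `p` are made-up values chosen for illustration, and the helper name `cross_entropy` is hypothetical.

```python
import numpy as np

def cross_entropy(y, p):
    # H(y, P) = -sum_i y_i * log(P_i)
    return -np.sum(y * np.log(p))

p = np.array([0.1, 0.2, 0.7])   # hypothetical predicted probabilities
y = np.array([0.0, 0.0, 1.0])   # one-hot label: the true class is index 2
loss = cross_entropy(y, p)
print(loss)  # equals -log(0.7), roughly 0.357
```

With a one-hot label, the cross entropy reduces to the negative log probability assigned to the true class, so a more confident correct prediction gives a smaller loss.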

#### softmax_cross_entropy_with_logits in TensorFlow
The definition of cross entropy above is stated in terms of the label `y` and the probability vector `P`.

However, `tensorflow.nn.softmax_cross_entropy_with_logits`, the function that calculates softmax cross entropy, takes `labels` and `logits` as its arguments.

This means that internally it applies the softmax to the logits to obtain the probability tensor, and then calculates the cross entropy.
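
One reason to accept logits rather than probabilities is that the two steps can be fused into a single, numerically safer computation using $\log \text{softmax}(z)_i = z_i - \log \sum_j e^{z_j}$. The following is a sketch of that idea in NumPy, under the assumption that a fused log-softmax is used; TensorFlow's actual kernel may be implemented differently, and the name `cross_entropy_from_logits` is illustrative.

```python
import numpy as np

def cross_entropy_from_logits(z, y):
    # log softmax(z)_i = z_i - log(sum_j exp(z_j)), computed on shifted logits for stability
    shifted = z - np.max(z)
    log_p = shifted - np.log(np.sum(np.exp(shifted)))
    return -np.sum(y * log_p)

z_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([0.0, 0.0, 0.0, 1.0])  # one-hot label for the last class
loss_demo = cross_entropy_from_logits(z_demo, y_demo)
print(loss_demo)  # roughly 0.44
```

Because the logarithm is never taken of a probability that has underflowed to zero, this form stays finite even for very large or very small logits.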

In [1]:
import numpy as np
import tensorflow as tf

In [2]:
def softmax(z):
    # Exponentiate each logit, then normalize so the outputs sum to 1.
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

In [3]:
def softmax_cross_entropy_with_logits(z, y):
    # Convert logits to probabilities, then apply H(y, P) = -sum(y * log(P)).
    p = softmax(z)
    return -np.sum(np.log(p) * y)

In [4]:
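# Note: these labels sum to 3, so y is not a probability distribution; the formula is still applied term by term.
y = np.array([1.0, 0.0, 1.0, 1.0])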
z = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
cross_entropy = softmax_cross_entropy_with_logits(z, y)
print(cross_entropy)

5.32056906819


In [5]:
# TensorFlow 1.x graph-mode API: build the graph first, then evaluate it in a session.
tf_y = tf.constant(y, dtype=tf.float32)
tf_z = tf.constant(z, dtype=tf.float32)
tf_cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=tf_y, logits=tf_z)

# The graph contains only constants, so no variable initializer is needed.
with tf.Session() as ss:
    result = ss.run(tf_cross_entropy)
    print(result)

5.32057
