## Sparse, Categorical and Binary Cross Entropy Examples

#### Important Things
1. Output shape (the number of neurons in the last layer, i.e. the one with the output activation)
2. Loss in model.compile
    a. sparse_categorical_crossentropy - y should be a vector of integer class labels with shape (n,) or (n,1)
    b. categorical_crossentropy - y should be one hot encoded (encoder.fit_transform(y).toarray()), with shape (n,noofclasses)
    c. binary_crossentropy - if y is a flattened vector, use 1 output neuron with sigmoid (softmax over a single neuron always outputs 1)
    d. binary_crossentropy - if y is one hot encoded, use 2 output neurons
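The difference between the sparse and the one hot label formats can be checked by hand. The sketch below uses NumPy only and made-up probabilities (`probs` is a hypothetical softmax output, not something produced by the models in this notebook), and it leaves out the small clipping Keras applies for numerical stability. It illustrates that both losses give the same per-sample value and differ only in how y is stored.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

# Sparse form: integer class ids, shape (n,).
y_sparse = np.array([0, 1, 2])

# One hot form of the same labels, shape (n, noofclasses).
y_onehot = np.eye(3)[y_sparse]

# Sparse categorical cross entropy: pick the probability of the true class.
sparse_loss = -np.log(probs[np.arange(len(y_sparse)), y_sparse])

# Categorical cross entropy: sum over classes, weighted by the one hot target.
categorical_loss = -np.sum(y_onehot * np.log(probs), axis=1)

print(y_sparse.shape, y_onehot.shape)               # (3,) (3, 3)
print(np.allclose(sparse_loss, categorical_loss))   # True
```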

In [1]:
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder(handle_unknown='ignore')

## Sparse Categorical CrossEntropy

In [2]:
model=tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10,input_shape=(100,)))
model.add(tf.keras.layers.Dense(5,activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.summary()


Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                1010      
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 55        
Total params: 1,065
Trainable params: 1,065
Non-trainable params: 0
_________________________________________________________________


In [3]:
X=np.random.random([500,100])
y=np.concatenate([np.array([1]*100),np.array([0]*100),np.array([3]*100),np.array([2]*100),np.array([4]*100)])
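y=y.reshape(-1,)   # shape (n,) works with sparse_categorical_crossentropy
y=y.reshape(-1,1)  # shape (n,1) works too; kept because OneHotEncoder (used later) needs 2-D input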
X.shape,y.shape

((500, 100), (500, 1))

In [4]:
model.fit(X,y,epochs=3) 

Train on 500 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x172c946a0>

In [5]:
onehot_y=enc.fit_transform(y).toarray()

In [6]:
onehot_y.shape

(500, 5)
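For intuition about what the encoder produces, the same kind of matrix can be built with plain NumPy. This is only a sketch under the assumption that the labels are integers 0..n_classes-1 (the variable `n_classes` and the five-element `y` are illustrative stand-ins for the 500 labels above); in that case column i corresponds to class i.

```python
import numpy as np

# Small stand-in for the (500, 1) label array used above.
y = np.array([1, 0, 3, 2, 4]).reshape(-1, 1)
n_classes = 5  # assumed to be known in advance

# Row k of the identity matrix is the one hot vector for class k.
onehot_y = np.eye(n_classes)[y.ravel()]

# argmax along the class axis recovers the original integer labels.
recovered = onehot_y.argmax(axis=1)

print(onehot_y.shape)   # (5, 5)
print(onehot_y[0])      # [0. 1. 0. 0. 0.]
print(recovered)        # [1 0 3 2 4]
```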

## Categorical CrossEntropy

In [7]:
model=tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10,input_shape=(100,)))
model.add(tf.keras.layers.Dense(5,activation='softmax'))
model.compile(loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 10)                1010      
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 55        
Total params: 1,065
Trainable params: 1,065
Non-trainable params: 0
_________________________________________________________________


In [8]:
X=np.random.random([500,100])
y=np.concatenate([np.array([1]*100),np.array([0]*100),np.array([3]*100),np.array([2]*100),np.array([4]*100)])
y=y.reshape(-1,1)  # OneHotEncoder only accepts 2-D input, so reshape to (n,1)
onehot_y=enc.fit_transform(y).toarray()
X.shape,y.shape,onehot_y.shape

((500, 100), (500, 1), (500, 5))

In [9]:
model.fit(X,onehot_y,epochs=3) 

Train on 500 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x172f23c18>

## Binary CrossEntropy

### Since we are using 2 output neurons, y has to be one hot encoded

In [10]:
model=tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10,input_shape=(100,)))
model.add(tf.keras.layers.Dense(2,activation='softmax'))
model.compile(loss='binary_crossentropy',metrics=['accuracy'])
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 10)                1010      
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 22        
Total params: 1,032
Trainable params: 1,032
Non-trainable params: 0
_________________________________________________________________


In [11]:
X=np.random.random([500,100])
y=np.concatenate([np.array([1]*250),np.array([0]*250)])
y=y.reshape(-1,1)  # OneHotEncoder only accepts 2-D input, so reshape to (n,1)
onehot_y=enc.fit_transform(y).toarray()
X.shape,y.shape,onehot_y.shape

((500, 100), (500, 1), (500, 2))

In [12]:
model.fit(X,onehot_y,epochs=3) 

Train on 500 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x173bad6d8>
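It may help to see why binary cross entropy is usable with 2 softmax neurons and a one hot y. Binary cross entropy is computed element by element and averaged over the last axis; because the two softmax outputs sum to 1, both elements contribute the same term, and the result matches categorical cross entropy. The NumPy sketch below uses hypothetical probabilities and omits the clipping Keras applies for numerical stability.

```python
import numpy as np

# Hypothetical 2-neuron softmax outputs (each row sums to 1).
probs = np.array([[0.9, 0.1],
                  [0.3, 0.7],
                  [0.6, 0.4]])

# One hot targets for the classes 0, 1, 1.
y_onehot = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 1.0]])

# Binary cross entropy: elementwise, then averaged over the two outputs.
bce = -np.mean(y_onehot * np.log(probs)
               + (1 - y_onehot) * np.log(1 - probs), axis=1)

# Categorical cross entropy on the same data.
cce = -np.sum(y_onehot * np.log(probs), axis=1)

print(np.allclose(bce, cce))   # True
```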

### Since we are using 1 output neuron, y has to be flat, not one hot encoded

In [13]:
model=tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10,input_shape=(100,)))
model.add(tf.keras.layers.Dense(1,activation='sigmoid'))  # sigmoid, not softmax: softmax over a single neuron always outputs 1
model.compile(loss='binary_crossentropy',metrics=['accuracy'])
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_6 (Dense)              (None, 10)                1010      
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 11        
Total params: 1,021
Trainable params: 1,021
Non-trainable params: 0
_________________________________________________________________


In [14]:
X=np.random.random([500,100])
y=np.concatenate([np.array([1]*250),np.array([0]*250)])
y=y.reshape(-1,1)  # no one hot encoding here; shape (n,1) matches the single output neuron
X.shape,y.shape

((500, 100), (500, 1))

In [15]:
model.fit(X,y,epochs=3) 

Train on 500 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x174319128>
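With a single output neuron the choice of activation matters. Softmax normalizes across the neurons of the layer, so with only one neuron it divides a value by itself and returns 1 for every input, which leaves nothing to learn. Sigmoid maps the single value into (0, 1) and can be read as the probability of class 1. The helper functions below are a minimal NumPy sketch written for illustration, not the Keras implementations.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize per row.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-activation values of a 1-neuron output layer, shape (n, 1).
logits = np.array([[-3.0], [0.0], [2.5]])

print(softmax(logits).ravel())   # [1. 1. 1.]
print(sigmoid(logits).ravel())   # three different values, 0.5 in the middle
```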