# Multiclass Classification and Implementation of Softmax Activation 

### Activation Functions
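1. <b>Sigmoid</b> - use for binary classification (logistic regression) outputs
2. <b>Rectified linear unit (ReLU)</b> - most commonly used, especially in hidden layers
3. <b>Linear</b> - no activation function; the layer outputs its raw weighted sum
4. <b>Softmax</b> - for multiclass classification; outputs one probability per class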
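The four activations listed above can be sketched in plain Python. This is only an illustration of the formulas, not how Keras implements them; the function names and the sample input are hypothetical.

```python
import math

def sigmoid(z):
    """Squash a real number into (0, 1). Fine for moderate z in this sketch."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through unchanged, clip negatives to 0."""
    return max(0.0, z)

def linear(z):
    """'No activation': return the input unchanged."""
    return z

def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)                            # subtract the max to avoid overflow
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a 3-class problem
probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, and the entries of `probs` add up to 1, which is why softmax suits the output layer of a multiclass classifier.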

<b>1. Specify the architecture of the neural network</b>

In [10]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(units=25, activation='relu'),     # hidden layer 1
    Dense(units=15, activation='relu'),     # hidden layer 2
    Dense(units=10, activation='softmax'),  # output layer: one probability per class (10 classes)
])

<b>2. Compile the model, specify the loss and cost</b>

In [12]:
from tensorflow.keras.losses import SparseCategoricalCrossentropy

model.compile(loss=SparseCategoricalCrossentropy())
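To make the loss and the cost concrete, here is a minimal sketch of what sparse categorical cross-entropy computes, assuming each label is an integer class index and each prediction is a list of class probabilities. The helper names and the sample values are hypothetical, and Keras applies extra safeguards (such as clipping probabilities) that this sketch leaves out.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one example: -log of the probability given to the true class."""
    return -math.log(probs[label])

def cost(batch_probs, labels):
    """Cost: the average loss over all examples in the batch."""
    losses = [sparse_categorical_crossentropy(p, y)
              for p, y in zip(batch_probs, labels)]
    return sum(losses) / len(losses)

# Hypothetical predictions for two examples of a 3-class problem
batch_probs = [[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1]]
labels = [0, 1]  # integer class indices, not one-hot vectors
average_cost = cost(batch_probs, labels)
```

The loss is small when the model gives the true class a probability close to 1 and grows without bound as that probability approaches 0.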

<b>3. Train on data to minimize cost</b>

In [None]:
# y holds integer class labels (0-9), as SparseCategoricalCrossentropy expects
model.fit(x, y, epochs=100)

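### A more numerically accurate implementation of softmax
Use a linear output layer and pass from_logits=True to the loss, so that TensorFlow computes the softmax and the cross-entropy together instead of rounding the intermediate probabilities.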

In [13]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(units=25, activation='relu'),
    Dense(units=15, activation='relu'),
    Dense(units=10, activation='linear'),  # output raw logits instead of probabilities
])

In [14]:
from tensorflow.keras.losses import SparseCategoricalCrossentropy

model.compile(loss=SparseCategoricalCrossentropy(from_logits=True))

In [None]:
model.fit(x, y, epochs=100)
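The reason the logits version is more accurate can be illustrated in plain Python. The sketch below compares a naive two-step loss (softmax first, then the log) with a loss computed directly from the logits using the log-sum-exp trick. It shows the general idea only and is not TensorFlow's actual implementation; the function names and the sample logits are hypothetical.

```python
import math

def loss_via_probabilities(logits, label):
    """Naive version: compute softmax probabilities, then take -log."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return -math.log(exps[label] / total)

def loss_from_logits(logits, label):
    """Stable version: -log(softmax) rearranged as log-sum-exp minus the true logit."""
    m = max(logits)  # shifting by the max keeps every exponent <= 0
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Hypothetical extreme logits: the true class (index 1) has a very low score
logits = [0.0, -800.0]

stable = loss_from_logits(logits, 1)

try:
    naive = loss_via_probabilities(logits, 1)
except ValueError:
    # exp(-800) underflows to 0.0, so the naive version ends up taking log(0)
    naive = None
```

For moderate logits both functions agree, but with extreme values the intermediate probability underflows to zero and the naive version fails, while the rearranged version still returns a finite loss. Note that a model with a linear output layer returns logits, so a softmax (for example tf.nn.softmax) still has to be applied to its outputs to obtain probabilities at prediction time.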

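### For advanced optimization, use the Adam algorithm
Adam (Adaptive Moment Estimation) adjusts the learning rate automatically, using a separate rate for each parameter. It increases the rate when successive gradient descent steps keep moving in the same direction and decreases it when the steps oscillate back and forth. The learning_rate argument sets the initial rate.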

In [21]:
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=1e-3),
              loss=SparseCategoricalCrossentropy(from_logits=True))
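As a rough illustration of how Adam adapts its step size, here is a minimal sketch of the standard Adam update rule applied to a single parameter. The function name, the toy objective, and the hyperparameter values (beta1, beta2, eps, and the larger learning rate) are assumptions chosen for the example, not settings taken from the notebook or from Keras.

```python
import math

def adam_minimize(grad_fn, w, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a one-parameter function given its gradient, using Adam updates."""
    m = 0.0  # running average of gradients (direction and momentum)
    v = 0.0  # running average of squared gradients (scale)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        # Effective step: larger when gradients agree, smaller when they flip sign
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0), w=0.0)
```

When consecutive gradients point the same way, the averaged gradient m stays large relative to sqrt(v), so the steps stay big; when gradients alternate in sign, m shrinks and the steps get smaller. On this toy problem w_final ends up close to the minimum at 3.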