## Multi-Class Classification

Multi-class classification is a type of supervised machine learning problem where the goal is to categorize input data into one of three or more classes or categories. Unlike binary classification, which deals with only two possible outcomes, multi-class classification involves multiple classes that are mutually exclusive.

### Key Characteristics of Multi-Class Classification:
- **Multiple Classes**: The model predicts one class label from a set of multiple possible classes.
- **Single Output**: For each input instance, only one class label can be assigned, meaning the classes are typically exclusive (e.g., an email can be classified as either "spam," "promotional," or "important," but not more than one at a time).
- **Common Applications**: Multi-class classification is widely used in various applications, including:
  - Image recognition (e.g., classifying objects in images)
  - Document classification (e.g., categorizing news articles)
  - Sentiment analysis (e.g., classifying reviews as positive, negative, or neutral)

### Evaluation Metrics:
To evaluate the performance of multi-class classification models, several metrics can be used, including:
- **Accuracy**: The proportion of correctly predicted instances over the total number of instances.
- **Confusion Matrix**: A table that summarizes the performance of the model by comparing predicted and actual class labels.
- **F1 Score**: The harmonic mean of precision and recall, useful for understanding the balance between false positives and false negatives.

In multi-class classification tasks, models often use techniques like Softmax activation in the output layer to predict probabilities for each class, allowing the selection of the most likely class based on the computed scores.

### Building a 2-Layer Neural Network for Multi-Class Classification

Constructed a simple 2-layer neural network using TensorFlow and Keras for multi-class classification. The main components of this implementation are as follows:

### 1. Model Definition
- We defined a function `create_model` that constructs a neural network with one hidden layer and one output layer. 
- The hidden layer consists of 128 neurons with ReLU (Rectified Linear Unit) activation, allowing the network to learn non-linear relationships in the data.
- The output layer has a linear activation function, producing raw scores (logits) for each class without applying a softmax function. This design choice is made for numerical stability during training.

### 2. Input and Output Specifications
- The model is designed to accept input samples with 10 features, which is indicated by the input shape `(10,)`.
- The output layer is configured to predict probabilities for four classes, making this network suitable for multi-class classification tasks.

### 3. Model Compilation
- The model is compiled using the Adam optimizer, a popular choice for training neural networks due to its adaptive learning rate properties.
- We used `sparse_categorical_crossentropy` as the loss function, which is appropriate for multi-class classification problems where class labels are provided as integers.

### 4. Data Generation and Training
- Random training data is generated to simulate a dataset for training the model. The dataset consists of 1000 samples, each with 10 features, along with random class labels from 0 to 3.
- The model is trained for 10 epochs with a batch size of 32, allowing it to learn from the training data.

### 5. Making Predictions
- After training, the model is used to make predictions on a new input example. 
- The linear outputs are processed using the Softmax function to convert them into probabilities, representing the model's confidence in each class.
- The predicted class is determined by selecting the class with the highest probability.

In [2]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

In [5]:
# Define the model
def create_model(input_shape, num_classes):
    model = keras.Sequential([
        layers.Dense(128, activation='relu', input_shape=input_shape),  # Hidden layer
        layers.Dense(num_classes)  # Output layer with linear activation
    ])
    return model

# Set parameters
input_shape = (10,)  # Correct shape for 10 features
num_classes = 4  # Number of output classes

# Create the model
model = create_model(input_shape, num_classes)

# Compile the model
model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',  # Suitable loss for multi-class classification
              metrics=['accuracy'])

# Print the model summary
model.summary()

In [6]:
# Example: Generate some random data for training

# Random training data (1000 samples, 10 features)
X_train = np.random.random((1000, 10))
y_train = np.random.randint(num_classes, size=(1000,))  # Random class labels

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Example of making predictions
example_input = np.random.random((1, 10))  # A new input example
linear_output = model.predict(example_input)  # Get linear outputs

# Apply softmax to get probabilities
probabilities = tf.nn.softmax(linear_output)

# Get the predicted class
predicted_class = tf.argmax(probabilities, axis=1)

print("Linear Output:", linear_output)
print("Softmax Probabilities:", probabilities.numpy())
print("Predicted Class:", predicted_class.numpy())

Epoch 1/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 601us/step - accuracy: 0.2560 - loss: 1.7734
Epoch 2/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 401us/step - accuracy: 0.2266 - loss: 1.3863
Epoch 3/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 465us/step - accuracy: 0.2347 - loss: 1.3863
Epoch 4/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 441us/step - accuracy: 0.2558 - loss: 1.3863
Epoch 5/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step - accuracy: 0.2136 - loss: 1.3863
Epoch 6/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 435us/step - accuracy: 0.2309 - loss: 1.3863
Epoch 7/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 448us/step - accuracy: 0.2397 - loss: 1.3863
Epoch 8/10
[1m32/32[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 464us/step - accuracy: 0.2484 - loss: 1.3863
Epoch 9/10
[1m32/32[0m [32m━━━━━━━━━━━━