In [1]:
import tensorflow as tf  # Imports the TensorFlow library to access its deep learning functionality.
from tensorflow.keras.models import Sequential  # Imports the Sequential model, a linear stack of layers in Keras.
from tensorflow.keras.layers import Dense, Dropout  # Imports the Dense and Dropout layers to build the neural network.
from tensorflow.keras.optimizers import SGD  # Imports the Stochastic Gradient Descent (SGD) optimizer for training the model.
import matplotlib.pyplot as plt  # Imports matplotlib for data visualization (plotting graphs).
import pandas as pd  # Imports the pandas library for data manipulation (reading and processing data).
import numpy as np  # Imports the NumPy library for numerical operations.

In [2]:
test = pd.read_csv("./test_data.csv")
train = pd.read_csv("./train_data.csv")

In [None]:
x_train = train.drop("label", axis=1).values
x_test = test.drop("label", axis=1).values

In [None]:
shape = x_train.shape[1]

In [None]:
x_train = x_train.reshape((-1, shape))
x_test = x_test.reshape((-1, shape))


In [None]:
x_train = x_train / 255.0
x_test = x_test / 255.0


In [None]:
y_train = train["label"].values
y_test = test["label"].values

In [None]:
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)

# Converts the integer labels into one-hot vectors, the format expected by the categorical_crossentropy loss in multi-class classification.

In [None]:

# c. Define the network architecture using Keras
model = Sequential([
    Dense(shape, activation="relu"),
    Dense(64, activation="relu"),
    Dense(10, activation="softmax")
])


In [None]:
model.compile(optimizer="SGD", loss="categorical_crossentropy", metrics=["accuracy"])


In [None]:
M = model.fit(x_train, y_train, batch_size=128, epochs=10)


In [None]:
# e. Evaluate the network
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
print('Test Loss:', test_loss)

In [None]:
plt.plot(M.history["accuracy"])

In [None]:
plt.plot(M.history["loss"])


In [None]:
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


In [None]:
n = 15
plt.imshow(x_test[n].reshape((32,32,3)))

predictions = model.predict(x_test)

print("actual: ", classes[np.argmax(y_test[n])])
print("predicted: ", classes[np.argmax(predictions[n])])

In [None]:
'''
### Line-by-line Explanation of the Code

#### Imports and Setup
```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD
import matplotlib.pyplot as plt

import pandas as pd
import numpy as np
```
- `import tensorflow as tf`: Imports the TensorFlow library to access its deep learning functionalities.
- `from tensorflow.keras.models import Sequential`: Imports the `Sequential` model, a linear stack of layers in Keras.
- `from tensorflow.keras.layers import Dense, Dropout`: Imports the `Dense` and `Dropout` layers to build the neural network.
- `from tensorflow.keras.optimizers import SGD`: Imports the Stochastic Gradient Descent (SGD) optimizer for training the model.
- `import matplotlib.pyplot as plt`: Imports `matplotlib` for data visualization (for plotting graphs).
- `import pandas as pd`: Imports the `pandas` library for data manipulation (reading and processing data).
- `import numpy as np`: Imports the `numpy` library for numerical operations.

#### Data Loading and Preprocessing
```python
test = pd.read_csv("./test_data.csv")
train = pd.read_csv("./train_data.csv")
```
- `pd.read_csv("./test_data.csv")`: Reads the CSV file `test_data.csv` into a pandas DataFrame.
- `pd.read_csv("./train_data.csv")`: Reads the CSV file `train_data.csv` into a pandas DataFrame.

```python
x_train = train.drop("label", axis=1).values
x_test = test.drop("label", axis=1).values
```
- `train.drop("label", axis=1)`: Drops the "label" column (which is the target label) from the `train` dataset.
- `.values`: Converts the DataFrame into a NumPy array, which is easier to work with in machine learning.
- Similarly for the `test` dataset, we remove the "label" column and extract the features into `x_test`.

```python
shape = x_train.shape[1]
x_train = x_train.reshape((-1, shape))
x_test = x_test.reshape((-1, shape))
```
- `x_train.shape[1]`: Gets the number of features (columns) in the `x_train` dataset.
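- `x_train.reshape((-1, shape))`: Reshapes `x_train` into a 2D array in which each row is one flattened image. The `-1` tells NumPy to infer the number of rows. Because `x_train` already has this layout after `.values`, the call leaves the data unchanged and mainly acts as a safeguard.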
- The same reshaping process is applied to `x_test`.
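
As a small illustration of how `-1` is resolved, the sketch below uses a made-up array of 4 samples with 6 features each. The array `features` and its sizes are assumptions for illustration, not the real dataset.

```python
import numpy as np

# Hypothetical stand-in for the feature matrix: 4 samples, 6 features each.
features = np.arange(24).reshape(4, 6)
n_features = features.shape[1]

# -1 tells NumPy to infer the row count from the total number of elements.
reshaped = features.reshape((-1, n_features))

print(reshaped.shape)
# The array already had this 2D layout, so the values are unchanged.
print(np.array_equal(features, reshaped))
```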

```python
x_train = x_train / 255.0
x_test = x_test / 255.0
```
- This normalizes the pixel values of the images (assuming pixel values are in the range 0-255) by dividing by 255.0 to bring the values into the range [0, 1].
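
The effect of the division can be checked on a tiny example. The `pixels` array below is invented for illustration and assumes 8-bit pixel values, as the notebook does.

```python
import numpy as np

# Hypothetical pixel data: 2 flattened "images" with 4 byte-valued pixels each.
pixels = np.array([[0, 64, 128, 255],
                   [255, 32, 16, 0]], dtype=np.uint8)

# Dividing an integer array by a float produces a floating-point array.
scaled = pixels / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

If the CSV held values outside 0-255, the result would fall outside [0, 1], so it can be worth printing the minimum and maximum once after loading.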

```python
y_train = train["label"].values
y_test = test["label"].values
```
- Extracts the labels (target variable) from both the `train` and `test` datasets.

```python
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)
```
- Converts the label data into categorical format using one-hot encoding. This is required for multi-class classification problems.
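
To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding produces. The helper `one_hot` is a hypothetical stand-in written for this note, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Put a 1 in the column given by each row's label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

labels = [0, 3, 9]
encoded = one_hot(labels, num_classes=10)
print(encoded.shape)
print(encoded[1])
```

When `num_classes` is not passed, `to_categorical` infers the number of columns from the largest label it sees. Passing `num_classes=10` explicitly would keep the train and test encodings the same width even if one split happened to lack the highest class.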

#### Building the Model
```python
model = Sequential([
    Dense(shape, "relu"),
    Dense(64, "relu"),
    Dense(10, "softmax")
])
```
- `Sequential()`: Initializes a sequential model where layers are added one after the other.
- `Dense(shape, "relu")`: Adds the first dense (fully connected) hidden layer. It has `shape` neurons, that is, as many units as there are input features, and uses the ReLU activation function (`"relu"`).
- `Dense(64, "relu")`: Adds another dense layer with 64 neurons and ReLU activation.
- `Dense(10, "softmax")`: Adds the output layer with 10 neurons (corresponding to the 10 classes in the classification task) and uses the softmax activation function to output probabilities for each class.
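
What the stacked layers compute can be sketched in plain NumPy. This is a simplified forward pass with toy layer sizes and random weights (all names and sizes here are assumptions for illustration), not how Keras implements `Dense` internally.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum first for numerical stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_classes = 8, 4, 3  # toy sizes, not the notebook's

# Each dense layer owns a weight matrix and a bias vector.
w1, b1 = rng.normal(size=(n_inputs, n_hidden)), np.zeros(n_hidden)
w2, b2 = rng.normal(size=(n_hidden, n_classes)), np.zeros(n_classes)

batch = rng.normal(size=(5, n_inputs))   # 5 hypothetical samples
hidden = relu(batch @ w1 + b1)           # hidden layer with ReLU
probs = softmax(hidden @ w2 + b2)        # output layer with softmax

n_params = w1.size + b1.size + w2.size + b2.size
print(probs.shape, n_params)
```

By the same counting, and assuming 3072 input features (32 x 32 x 3, as implied by the later `reshape((32,32,3))`), the notebook's three layers would hold 9,440,256 + 196,672 + 650 = 9,637,578 trainable parameters, almost all of them in the first layer.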

#### Compiling the Model
```python
model.compile(optimizer="SGD", loss="categorical_crossentropy", metrics=["accuracy"])
```
- `optimizer="SGD"`: Selects Stochastic Gradient Descent by name, which uses the optimizer's default settings (in Keras, a learning rate of 0.01 and no momentum).
- `loss="categorical_crossentropy"`: Specifies the loss function, categorical crossentropy, used for multi-class classification.
- `metrics=["accuracy"]`: Tracks the accuracy metric during training.
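
The loss can be written out by hand to see why confident, correct predictions score lower. The function below is a simplified NumPy sketch, and the example probabilities are invented for illustration.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
unsure = np.array([[0.4, 0.3, 0.3],
                   [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Both sets of predictions pick the right class, so accuracy is identical, but the loss is lower for the confident one. This is why loss and accuracy can move differently during training.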

#### Training the Model
```python
M = model.fit(x_train, y_train, batch_size=128, epochs=10)
```
- `model.fit(x_train, y_train, batch_size=128, epochs=10)`: Trains the model on the training data. It uses a batch size of 128 and trains for 10 epochs.
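- `M`: Stores the `History` object returned by `fit`. Its `history` dictionary holds the training loss and accuracy for each epoch. No validation data is passed, so only training metrics are recorded.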
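
To get a feel for what `batch_size` and `epochs` mean in terms of weight updates, the arithmetic can be sketched as follows. The sample count of 50,000 is an assumption for illustration and is not taken from the notebook. The helper `steps_per_epoch` is hypothetical.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)

n_samples = 50_000   # assumed row count of the training CSV
batch_size = 128
epochs = 10

per_epoch = steps_per_epoch(n_samples, batch_size)
total_updates = per_epoch * epochs
print(per_epoch, total_updates)
```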

#### Evaluating the Model
```python
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
print('Test Loss:', test_loss)
```
- `model.evaluate(x_test, y_test)`: Evaluates the model on the test dataset and returns the loss and accuracy.
- Prints the test accuracy and test loss.
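
The accuracy reported here is the fraction of samples whose most probable predicted class equals the true class. A minimal NumPy sketch of that calculation, using invented labels and probabilities, might look like this:

```python
import numpy as np

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of rows whose most probable class matches the true class."""
    true_idx = np.argmax(y_true_one_hot, axis=1)
    pred_idx = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_idx == pred_idx))

y_true = np.eye(3)[[0, 1, 2, 1]]          # one-hot labels for 4 samples
y_pred = np.array([[0.7, 0.2, 0.1],       # correct
                   [0.1, 0.6, 0.3],       # correct
                   [0.5, 0.3, 0.2],       # wrong (true class is 2)
                   [0.2, 0.5, 0.3]])      # correct
acc = accuracy(y_true, y_pred)
print(acc)
```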

#### Plotting Training History
```python
plt.plot(M.history["accuracy"])
plt.plot(M.history["loss"])
```
- Plots the training accuracy and loss from the history object returned by `model.fit()`.

#### Class Names and Predictions
```python
classes=['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck']
n = 15
plt.imshow(x_test[n].reshape((32,32,3)))
```
- `classes`: Defines the names of the 10 classes in the dataset.
- `n = 15`: Sets the index of the test image to be displayed.
- `plt.imshow(x_test[n].reshape((32,32,3)))`: Reshapes the selected flattened test image back into 32x32 pixels with 3 color channels and displays it.
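
The reshape only yields a correct picture if the 3072 values of a row are stored row by row with the three color channels of each pixel next to each other. The sketch below shows that index mapping on a hypothetical flattened image. Whether the CSV actually uses this order is an assumption.

```python
import numpy as np

height, width, channels = 32, 32, 3

# Hypothetical flattened image: 3072 values in row-major, channels-last order.
flat = np.arange(height * width * channels, dtype=np.float32)
image = flat.reshape((height, width, channels))

# Pixel (row, col), channel c comes from flat index (row * width + col) * channels + c.
row, col, c = 5, 7, 2
print(image.shape)
print(image[row, col, c] == flat[(row * width + col) * channels + c])
```

If the columns were instead stored channel by channel (all red values, then all green, then all blue), `reshape((3, 32, 32)).transpose(1, 2, 0)` would be needed to get a displayable image.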

```python
predictions = model.predict(x_test)
```
- `model.predict(x_test)`: Makes predictions on the test dataset and stores the output in `predictions`.

#### Displaying Actual and Predicted Results
```python
print("actual: ", classes[np.argmax(y_test[n])])
print("predicted: ", classes[np.argmax(predictions[n])])
```
- `np.argmax(y_test[n])`: Gets the index of the actual class for the `n`-th test image.
- `np.argmax(predictions[n])`: Gets the predicted class index for the `n`-th test image.
- Prints the actual and predicted class names.
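
The decoding step can be tried without a trained model. The probability vector below is invented for illustration. The top-3 lookup is an optional extra, not something the notebook does.

```python
import numpy as np

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']

# Hypothetical softmax output for one image (values invented for illustration).
probs = np.array([0.02, 0.01, 0.05, 0.60, 0.04, 0.20, 0.03, 0.02, 0.02, 0.01])

predicted_idx = int(np.argmax(probs))
# Indices of the three most probable classes, best first.
top3 = [classes[i] for i in np.argsort(probs)[::-1][:3]]
print(classes[predicted_idx], top3)
```

Looking at the runner-up classes can show whether a wrong prediction was a near miss or a confident mistake.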

---

### Possible Questions with Answers

1. **What is the purpose of normalizing the images (`x_train` and `x_test`) by dividing by 255.0?**
   - **Answer**: Scaling the pixel values to the range [0, 1] keeps the inputs small and on a common scale, which usually makes gradient descent more stable and speeds up convergence.
   With raw values of up to 255, activations and gradients in the first layer can become very large, so training may be slow or may diverge at the same learning rate.

2. **Why do we use one-hot encoding for the labels?**
   - **Answer**: One-hot encoding turns each integer label into a binary vector with a single 1 at the index of the true class.
   This matches the shape of the softmax output, so `categorical_crossentropy` can compare the two directly,
   and it treats each class as a distinct entity instead of implying an order between class numbers.

3. **What is the significance of the `softmax` activation function in the output layer?**
   - **Answer**: The `softmax` function converts raw output scores (logits) into probabilities that sum to 1. 
   This is essential for multi-class classification problems, where the output represents the probability of each class.

4. **What is the purpose of using the Stochastic Gradient Descent (SGD) optimizer?**
   - **Answer**: SGD updates the model weights during training.
   After each batch, it moves every weight a small step in the direction that reduces the loss on that batch.
   Plain SGD often converges more slowly than adaptive optimizers such as Adam, but each update is cheap and needs no extra per-weight state, which suits large datasets.

5. **Explain the role of the `Dense` layers in the model.**
   - **Answer**: `Dense` layers are fully connected layers,
   meaning each neuron is connected to every neuron in the previous layer. These layers allow the model to learn complex 
   relationships between input features and output classes.

6. **What is the significance of the loss function `categorical_crossentropy`?**
   - **Answer**: `categorical_crossentropy` is the appropriate loss function for multi-class classification problems,
   where each instance belongs to exactly one class. It measures the difference between the predicted probability
   distribution and the true distribution (one-hot encoded labels).

7. **Why are we using `plt.imshow()` to display the test images?**
   - **Answer**: `plt.imshow()` is used to visualize the image in the test dataset to help confirm the model’s 
   predictions by comparing the predicted label to the actual image.

8. **What does `model.evaluate()` do?**
   - **Answer**: `model.evaluate()` computes the loss and accuracy of the model on a given dataset (in this case, the test data). It helps to assess the model's performance after training.

9. **What is the role of the activation function 'relu' in the hidden layers?**
   - **Answer**: The `relu` (Rectified Linear Unit) activation function introduces non-linearity to the model,
   enabling it to learn complex patterns in the data. It is computationally efficient and widely used in hidden layers.

10. **What is the significance of `np.argmax()` in the final prediction comparison?**
    - **Answer**: `np.argmax()` returns the index of the maximum value in an array, which corresponds to the predicted class
    with the highest probability. It is used to determine the final predicted class for a sample.
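
As a supplement to the answer on SGD, the update rule can be illustrated on a problem much smaller than the notebook's network. The sketch below fits a single weight to toy data with mini-batch SGD. The data, learning rate, and batch size are assumptions chosen for illustration, and Keras applies the same kind of step to every weight of the model.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=256)
y = 3.0 * x                      # toy data: the true weight is 3.0

w = 0.0                          # single trainable weight
learning_rate = 0.1
batch_size = 32

for epoch in range(20):
    order = rng.permutation(x.size)           # reshuffle every epoch
    for start in range(0, x.size, batch_size):
        idx = order[start:start + batch_size]
        error = w * x[idx] - y[idx]
        grad = 2.0 * np.mean(error * x[idx])  # d/dw of the mean squared error
        w -= learning_rate * grad             # the SGD update step

print(w)
```

Each inner iteration uses only one batch to estimate the gradient, which is what makes the method "stochastic".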


'''