## Change Log 

#### Dataset & Model Overview
- After getting a decent understanding of how a supervised neural network works, from predicting wine quality with a feed-forward neural network (FFNN), I wanted to try working with image recognition and detection.
- I am using the CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html), which is a labelled subset of the 80 million tiny images dataset.
- This dataset has 60,000 32x32 colour images across 10 classes, with 6,000 images per class.
- I want this model to correctly predict which class each image belongs to.
- I am planning to use a Convolutional Neural Network (CNN) for this. I chose a CNN because they are very good at recognising patterns in images and classifying them, which is exactly what this task needs.

#### Learnings & Findings
- I am extracting the dataset in the way specified here (https://www.cs.toronto.edu/~kriz/cifar.html), with extra code to combine the batches into a single CSV file.
- I am going to try a CNN structure similar to the one found here (https://www.tensorflow.org/tutorials/images/cnn). I used TensorFlow for my last model and it worked well, so I am going to use it again.
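
As a rough idea of what that structure could look like, here is a minimal sketch adapted from the linked TensorFlow tutorial. It assumes 32x32 RGB inputs and 10 output classes; the helper name `build_cnn` and the layer sizes are assumptions for illustration, not a final design.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Hypothetical helper: a small CNN in the style of the TensorFlow tutorial."""
    model = models.Sequential([
        tf.keras.Input(shape=input_shape),
        # Convolution + pooling blocks pick up local patterns in the image.
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        # Dense layers turn the extracted features into class scores.
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(num_classes),  # raw logits, one per class
    ])
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy'],
    )
    return model


model = build_cnn()
model.summary()
```

The sparse categorical cross-entropy loss is used here because the labels are plain integers from 0 to 9 rather than one-hot vectors.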

In [None]:
import tensorflow as tf
import numpy as np
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import pandas as pd

Here I am importing the libraries that I may need to build this model.

In [11]:
import pickle

def unpickle(file):
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch

data_list = []
labels_list = []

# Loop through all batch files
for i in range(1, 6):
    file = f'data_batch_{i}'  # Modify if needed
    batch = unpickle(file)
    
    data_list.append(batch[b'data'])  # Image data
    labels_list.append(batch[b'labels'])  # Labels

# Convert lists to NumPy arrays
data = np.vstack(data_list)
labels = np.hstack(labels_list)

# Convert to DataFrame
columns = [f'pixel_{i}' for i in range(data.shape[1])]  # Column names for pixels
df = pd.DataFrame(data, columns=columns)
df.insert(0, 'label', labels)  # Insert labels as the first column

# Save to CSV
df.to_csv('cifar10_data.csv', index=False)
print("CSV file saved successfully!")


CSV file saved successfully!


ChatGPT generated this for me, as I wanted to store the data as a single CSV file rather than in the pickled batch files it originally comes in.

It loops through each of the five training batch files, each of which holds 10,000 images along with a label saying what each image shows, so the model can begin to build up an idea of what belongs in each class. It then stacks the image data and labels from all the batches, stores them in a pandas DataFrame (one row per image, with the label in the first column), and writes that out as a single CSV file.
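
Because each row of the CSV is a flat list of 3,072 pixel values, it would need to be turned back into a 32x32x3 image before a CNN can use it. Below is a minimal sketch of that step, assuming the layout described on the CIFAR page (1,024 red values, then 1,024 green, then 1,024 blue, each stored row by row). The function name `rows_to_images` is my own and the data is synthetic, not real CIFAR images.

```python
import numpy as np


def rows_to_images(flat_rows):
    """Hypothetical helper: turn (N, 3072) CIFAR-style rows into (N, 32, 32, 3) images."""
    flat_rows = np.asarray(flat_rows)
    # Each row is channel-first: 1024 red, 1024 green, 1024 blue values.
    channel_first = flat_rows.reshape(-1, 3, 32, 32)
    # Move the channel axis last, the default layout for Keras Conv2D layers.
    channel_last = channel_first.transpose(0, 2, 3, 1)
    # Scale 0-255 pixel values into the 0-1 range.
    return channel_last.astype(np.float32) / 255.0


# Synthetic stand-in for two rows of pixel data (not real CIFAR images).
fake_rows = np.zeros((2, 3072), dtype=np.uint8)
fake_rows[0, 1024 + 5 * 32 + 7] = 255  # green channel, row 5, column 7

images = rows_to_images(fake_rows)
print(images.shape)        # (2, 32, 32, 3)
print(images[0, 5, 7, 1])  # 1.0
```

With the DataFrame from the CSV, the same function could be applied to the pixel columns, for example `rows_to_images(df.drop(columns='label').to_numpy())`.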

In [None]:
df = pd.read_csv('cifar10_data.csv')
