## Project: Machine Learning Classifier for Sorting Lego Pieces  
## Description:  
Developing and training algorithms for image-based classification of four different Lego categories in a conveyor system.

## Authors:  
- Amer Alhamwi 
- Wael Hamid 

University of British Columbia Okanagan (UBCO)  

## Date:  
November 9, 2024  


In [50]:
# Import necessary libraries for image processing,
# logistic regression modeling, and evaluating with confusion matrix and accuracy score
import os
import numpy as np
from PIL import Image  
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score
import matplotlib.pyplot as plt


In [51]:
# The following code loads images from the specified folder, applies image cropping 
# to enhance training accuracy, and assigns classification labels based on the first 
# letter of each image's name. The processed images and labels are stored in arrays 
# for subsequent use in the training process.


# Project root that contains the "training" and "testing" image folders
path = "C:\\Users\\Admin\\Desktop\\ENGR 418\\Assignments\\Project_Stage_1"

def test_function(path):
    folder_training = os.path.join(path, "training")
    folder_testing = os.path.join(path, "testing")
    
    return folder_training, folder_testing

folder_training, folder_testing = test_function(path)

# Load every image in a folder and return flattened pixel vectors (x) and class labels (y)
def get_image_data(folder):
    x = []
    y = []

    for pic in os.listdir(folder):
        # Open the image with PIL and convert it to grayscale ("L" mode)
        image = Image.open(os.path.join(folder, pic)).convert("L")
        # Keep only the central region (middle half of each dimension) to improve accuracy
        top = int(np.floor(image.height / 4))
        bottom = int(np.ceil(image.height * (3 / 4)))
        left = int(np.floor(image.width / 4))
        right = int(np.ceil(image.width * (3 / 4)))
        #print(image.size)

        # The cropped image is about 500x500 pixels
        cropped_image = image.crop((left, top, right, bottom))

        # Resize to 64x64 pixels (4096 features per image)
        resized_image = cropped_image.resize((64, 64))

        data = np.asarray(resized_image)  # Convert the image to a 2D array

        vec = data.flatten()  # Flatten the 2D image into a 1D vector

        x.append(vec)  # Append the flattened image data to the list

        # Assign the class label from the first character of the file name
        first_char = pic[0].lower()
        if first_char == "c":
            y.append(0)  # Cir
        elif first_char == "r":
            y.append(1)  # rec
        elif first_char == "2":
            y.append(2)  # 2b1
        else:
            y.append(3)  # Squ (any other first character)

    x = np.array(x)  # Convert to numpy array
    y = np.array(y)  # Convert to numpy array
 
    return x, y
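
The cropping, flattening, and labelling steps inside `get_image_data` can be checked in isolation on synthetic inputs. The sketch below is illustrative only: `center_crop_box`, `label_from_name`, `PREFIX_TO_CLASS`, and the file names are hypothetical and not part of the notebook, and the 1000x1000 image size is an assumption chosen to give round numbers.

```python
import numpy as np

def center_crop_box(width, height):
    """Return (left, top, right, bottom) for the middle half of each dimension."""
    left = int(np.floor(width / 4))
    top = int(np.floor(height / 4))
    right = int(np.ceil(width * (3 / 4)))
    bottom = int(np.ceil(height * (3 / 4)))
    return left, top, right, bottom

# Hypothetical lookup table mirroring the if/elif chain in get_image_data
PREFIX_TO_CLASS = {"c": 0, "r": 1, "2": 2}

def label_from_name(filename):
    """Map the first character of a file name to a class; anything else becomes 3."""
    return PREFIX_TO_CLASS.get(filename[0].lower(), 3)

# Crop box for an assumed 1000x1000 image
box = center_crop_box(1000, 1000)
print(box)  # (250, 250, 750, 750)

# Flattening: np.hstack on a 2D array concatenates its rows, like ravel()
data = np.arange(12).reshape(3, 4)
vec = np.hstack(data)
print(vec.shape)                          # (12,)
print(np.array_equal(vec, data.ravel()))  # True

# Assumed file names, one per class prefix
names = ["Cir_01.png", "rec_02.png", "2b1_03.png", "Squ_04.png"]
labels = [label_from_name(n) for n in names]
print(labels)  # [0, 1, 2, 3]
```

Note that the fallback to class 3 means any file that does not start with `c`, `r`, or `2` is treated as a square, so the folders are assumed to contain only the four expected kinds of image.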



In [52]:
# This cell loads the training data, trains a Logistic 
# Regression model with the training set, and then
# evaluates the model by predicting on the same training data.

# Set paths for the training and testing folders based on the project path
folder_training, folder_testing = test_function(path)

x1_train, y1_train = get_image_data(folder_training)

# Create the Logistic Regression model; max_iter is raised from the default of 100
# to 1000 to give the lbfgs solver more iterations to converge
model = LogisticRegression(solver='lbfgs', max_iter=1000)
model.fit(x1_train, y1_train)

y_pred1 = model.predict(x1_train)

print("Accuracy:", np.round(accuracy_score(y1_train, y_pred1),3))
print("Confusion Matrix:\n", confusion_matrix(y1_train, y_pred1))

Accuracy: 1.0
Confusion Matrix:
 [[36  0  0  0]
 [ 0 36  0  0]
 [ 0  0 36  0]
 [ 0  0  0 36]]
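
As a cross-check, the accuracy printed above can be recomputed directly from the confusion matrix: the diagonal holds the correct predictions and the row sums give the number of images per true class. This is a minimal sketch that re-enters the printed matrix by hand; `cm_train` is a name introduced here for illustration.

```python
import numpy as np

# Training confusion matrix copied from the output above (rows: true class, columns: predicted)
cm_train = np.array([
    [36, 0, 0, 0],
    [0, 36, 0, 0],
    [0, 0, 36, 0],
    [0, 0, 0, 36],
])

images_per_class = cm_train.sum(axis=1)          # row sums = true class counts
accuracy = np.trace(cm_train) / cm_train.sum()   # correct predictions / all predictions

print(images_per_class)  # [36 36 36 36]
print(accuracy)          # 1.0
```

Because this score is measured on the same images the model was fitted to, it says how well the model fits the training set rather than how it behaves on unseen images.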


In [53]:
# This code block loads the testing data, uses the trained Logistic Regression model 
# to make predictions on the test set, and evaluates the model's performance by 
# displaying the accuracy score and a confusion matrix.

# Load the testing data
x1_test, y1_test = get_image_data(folder_testing)

# Predict on the test set
y_pred2 = model.predict(x1_test)

# Print the accuracy and confusion matrix
print("Accuracy:", np.round(accuracy_score(y1_test, y_pred2), 3))
print("Confusion Matrix:\n", confusion_matrix(y1_test, y_pred2))

Accuracy: 0.972
Confusion Matrix:
 [[17  0  0  1]
 [ 0 18  0  0]
 [ 0  0 18  0]
 [ 1  0  0 17]]
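
The test confusion matrix can also be broken down per class, which shows where the two errors fall. The sketch below re-enters the printed matrix by hand; `cm_test` and `class_names` are names introduced here, with the class abbreviations taken from the comments in `get_image_data`.

```python
import numpy as np

# Test confusion matrix copied from the output above (rows: true class, columns: predicted)
cm_test = np.array([
    [17, 0, 0, 1],
    [0, 18, 0, 0],
    [0, 0, 18, 0],
    [1, 0, 0, 17],
])
class_names = ["Cir", "rec", "2b1", "Squ"]

correct = np.diag(cm_test)
recall = correct / cm_test.sum(axis=1)      # share of each true class that was found
precision = correct / cm_test.sum(axis=0)   # share of each predicted class that was right
accuracy = correct.sum() / cm_test.sum()

for name, p, r in zip(class_names, precision, recall):
    print(f"{name}: precision={p:.3f}, recall={r:.3f}")
print(f"accuracy={accuracy:.3f}")
```

Both misclassifications involve classes 0 (Cir) and 3 (Squ): one circle is predicted as a square and one square as a circle, while the `rec` and `2b1` classes are classified without error.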


- The model achieved **perfect accuracy (1.0) on the 144 training images** and **97.2% accuracy on the 72 test images** (70 classified correctly).
- The only two test errors are confusions between class 0 (Cir) and class 3 (Squ), one in each direction. This suggests the model **generalizes well** to new images of these four categories, although the test set is small.
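
To put the test accuracy in context, a confidence interval gives a sense of how much a score from 72 images could vary. The sketch below uses the Wilson score interval, which is a swapped-in technique and not something the notebook computes; `wilson_interval` is a hypothetical helper, and the 95% level (z = 1.96) is an assumption.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly a 95% level)."""
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width

# 70 of the 72 test images were classified correctly
low, high = wilson_interval(70, 72)
print(f"95% interval for test accuracy: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 0.90 to 0.99, so the result is consistent with high accuracy on new images, with some uncertainty remaining because of the small test set.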
