# Applied Machine Learning 
## Project – Stage 1 (Sorting Lego Pieces Using Raw Images)

We have been hired by a company to develop machine learning algorithms for a sorting facility. The requirement is that the sorting device takes images of items on a conveyor belt and then uses a machine learning algorithm to classify the items into classes. The items are then routed along different paths on the conveyor belt depending on their class.

For the sake of this project, we are given RGB images, and our focus will be on developing the classification algorithms. Also, for simplicity, it is assumed that items are Lego pieces of three different types with the following shapes (top view): Rectangles (2x4), squares (2x2), and circles (2x2). Examples are shown below.

<img align="center" src="Lego.JPG">

This algorithm classifies these three classes with an acceptable level of accuracy. It uses the raw image (grayscale conversion and scaling/cropping are acceptable) as input to a single-neuron classifier with fewer than 4097 weights (trainable parameters). We are given two datasets to achieve this goal, each containing multiple images of each class: the dataset in the folder ‘training’ is used for training, and the one in the folder ‘testing’ is used for testing. The names of the folders and files should not be changed.
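
To make the weight budget concrete: a neuron fed a flattened grayscale image has one weight per pixel, so a square input of side `s` needs `s * s` weights. The short sketch below works out the largest square input that fits the limit. The helper names `n_weights` and `max_square_side` are hypothetical, and it assumes the bias term does not count toward the limit.

```python
import math

WEIGHT_LIMIT = 4097  # the classifier must have fewer weights than this


def n_weights(height: int, width: int) -> int:
    """One weight per pixel of the flattened grayscale image (bias not counted, by assumption)."""
    return height * width


def max_square_side(limit: int = WEIGHT_LIMIT) -> int:
    """Largest side s such that s * s < limit."""
    return math.isqrt(limit - 1)


print(max_square_side())   # 64
print(n_weights(64, 64))   # 4096
print(n_weights(50, 50))   # 2500
```

So a 64x64 input sits exactly at the limit, while the 50x50 input used for the reported runs leaves some headroom. Note that scikit-learn reports `coef_` with one row per class, so the `(3, 2500)` shape printed later corresponds to 2500 weights for each of the three classes.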


In [1]:
# The required libraries are imported here
import os
from pathlib import Path
import numpy as np
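import PIL.Image  # import the submodule explicitly; a bare "import PIL" does not guarantee PIL.Image is loaded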
from matplotlib import pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.model_selection import train_test_split
%matplotlib inline

In [2]:
# This cell contains the functions defined to read the images from the given directories,
# separate them into classes, and preprocess them so they can be used by the model.

def image_cropper(img: np.ndarray, height_percentage: int = 40, width_percentage: int = 40) -> np.ndarray:
    """
    the image is cropped from both sides of the height and width
    :param img: the images as an np.array
    :param width_percentage: the percentage of the pixels from the width of the image we would like to crop
    :param height_percentage: the percentage of the pixels from the height of the image we would like to crop
    :return: cropped image as np.array
    """
    height, width = img.shape[0], img.shape[1]
    ### print(img.shape)
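    # multiply before dividing so the percentage is not applied to a truncated size,
    # and derive the width crop from the width (not the height)
    cropped_height_amount = height * height_percentage // 100
    cropped_width_amount = width * width_percentage // 100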
    img = img[cropped_height_amount // 2:height - cropped_height_amount // 2,
          cropped_width_amount // 2:width - cropped_width_amount // 2]
    return img


def image_scaler(img, new_height=20, new_width=20):
    """
    changes the image scale
    :param img: the images as an np.array
    :param new_height: the new height of the scaled image
    :param new_width: the new width of the scaled image
    :return: scaled image as np.array
    """
    img = PIL.Image.fromarray(img)
    img = img.resize((new_width, new_height))
    img = np.array(img)
    return img


def image_standardization(img, max_value=255):
    """
    changes the values of the image pixels to be between 0 and max_value
    :param max_value: max_value of the new pixels
    :param img: the images as an np.array
    :return: standardized image
    """
    ### print(type(img))
    img = (img - np.min(img)) / (np.max(img) - np.min(img)) * max_value
    ### print(type(img))
    # NumPy arithmetic always yields a float 'numpy.ndarray', even if a 'PIL.Image.Image' was passed in
    ### print(img.shape)
    return img


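def separate_classes(img_files, name_separation_chars=3, test_flag=False, classes_dict=None):
    """
    separates the images into classes based on their file names
    :param img_files: list of image file names
    :param name_separation_chars: number of leading characters of the file name that identify the class
    :param test_flag: if True, reuse the given classes_dict instead of building a new one
    :param classes_dict: dict mapping class name prefixes to class numbers (required when test_flag is True)
    :return: a list of class numbers for the images, a dict mapping each class name prefix to its class number
    """
    # sorted so that the class numbering is the same on every run (set order is not guaranteed)
    names = sorted(set(img_name[:name_separation_chars] for img_name in img_files))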
    ### print(names); print(len(names))
    if not test_flag:
        classes_dict = dict(enumerate(names))
        ### print(enumerate(names)); print(classes_dict)
        # changing the order of keys and items in the dict
        classes_dict = {v: k for k, v in classes_dict.items()}
        ### print(classes_dict)
    classes = []
    for img_name in img_files:
        classes.append(classes_dict[img_name[:name_separation_chars]])    
    ### print(classes); print(len(classes))
    return classes, classes_dict


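def get_files_list(data_dir, name_separation_chars=3, test_flag=False, classes_dict=None):
    """
    gets the list of image names from the directory
    :param data_dir: path to the directory containing the images
    :param name_separation_chars: number of leading characters of the file name that identify the class
    :param test_flag: if True, reuse the given classes_dict instead of building a new one
    :param classes_dict: dict mapping class name prefixes to class numbers
    :return: list of image names, their classes and the dict for classes
    """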
    data = os.listdir(data_dir)
    ### print(data, len(data))
    classes, classes_dict = separate_classes(data, name_separation_chars=name_separation_chars, test_flag=test_flag,
                                             classes_dict=classes_dict)
    return data, classes, classes_dict


def image_reader(img_dir):
    """
    reads image in grayscale
    :param img_dir: path to the image file
    :return: img as np.array
    """
    img = PIL.Image.open(img_dir)
    gray_img = img.convert("L")
    # convert to an array so cropping and flattening also work when standardization is disabled
    return np.array(gray_img)


def get_data(path_to_training_data,
             class_name_chars=3,
             test_flag=False,
             classes_dict=None,
             standardization_flag=True,
             standardization_max_value=255,
             crop_flag=True,
             crop_height_percentage=50,
             crop_width_percentage=50,
             scaling_flag=True,
             scaling_height=64,
             scaling_width=64):
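    """
    reads every image in the directory, preprocesses it and flattens it into one feature row
    :param path_to_training_data: path to the directory containing the images (training or testing)
    :return: data array of shape (n_images, n_pixels), classes as a column array, and the classes dict
    """
    data = []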
    image_names, classes, classes_dict = get_files_list(path_to_training_data, class_name_chars, test_flag=test_flag,
                                                        classes_dict=classes_dict)
    for name in image_names:
        img = image_reader(str(Path(path_to_training_data) / name))
        ### print(str(Path(path_to_training_data) / name))
        ### img.show()
        if standardization_flag:
            img = image_standardization(img, standardization_max_value)
        if crop_flag:
            img = image_cropper(img, crop_height_percentage, crop_width_percentage)
        if scaling_flag:
            img = image_scaler(img, scaling_height, scaling_width)
        ### print(img)
        data.append(img.flatten())
    # converting classes list to a column array
    classes = np.array(classes).reshape(-1, 1)
    return np.array(data), classes, classes_dict
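
The preprocessing steps above can be checked on a synthetic array without touching the dataset. The following is a minimal sketch, not the notebook's own code: `center_crop` and `min_max_scale` are hypothetical stand-ins for the cropping and standardization steps, the 100x200 random array is an assumed placeholder for a real photo, and the crop for each axis is computed from that axis's own size.

```python
import numpy as np


def center_crop(img: np.ndarray, height_percentage: int, width_percentage: int) -> np.ndarray:
    """Remove the given percentage of rows and columns, split evenly between opposite sides."""
    height, width = img.shape[:2]
    crop_h = height * height_percentage // 100
    crop_w = width * width_percentage // 100
    return img[crop_h // 2: height - crop_h // 2, crop_w // 2: width - crop_w // 2]


def min_max_scale(img: np.ndarray, max_value: float = 255) -> np.ndarray:
    """Rescale pixel values linearly to [0, max_value]; assumes the image is not constant."""
    span = img.max() - img.min()
    return (img - img.min()) / span * max_value


# Synthetic 100x200 grayscale "image" (assumption: random pixels instead of a Lego photo)
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(100, 200)).astype(np.float64)

cropped = center_crop(img, 40, 40)   # drops 40 rows and 80 columns in total
scaled = min_max_scale(cropped)
features = scaled.flatten()          # one row of the design matrix

print(cropped.shape)    # (60, 120)
print(features.shape)   # (7200,)
```

The length of `features` is the number of weights the classifier needs per class, which is why the notebook also rescales the cropped image to a small fixed size before flattening.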

In [14]:
# The train function uses the get_data function as a base to read all the images from the given directory,
# create the classes, prepare the dataset, and then train a logistic regression model to classify the images.
# In the end, it returns the trained model and classes_dict to be passed to the test function.

def train_function(path_to_data,
                   class_name_chars=3,
                   test_flag=False,
                   classes_dict=None,
                   standardization_flag=True,
                   standardization_max_value=255,
                   crop_flag=True,
                   crop_height_percentage=50,
                   crop_width_percentage=50,
                   scaling_flag=True,
                   scaling_height=50,
                   scaling_width=50, results=None):
    X_train, y_train, classes_dict = get_data(path_to_data, class_name_chars, test_flag, classes_dict,
                                              standardization_flag, standardization_max_value, crop_flag,
                                              crop_height_percentage, crop_width_percentage, scaling_flag,
                                              scaling_height, scaling_width)
    # train LogisticRegression model
    model = LogisticRegression(max_iter=10000)
    model.fit(X_train, y_train.ravel())

    print(f'\nX_train.shape = {X_train.shape}, y_train.shape = {y_train.shape}')
    y_pred = model.predict(X_train)

    # Reporting the stats from the model
    print(f'\nConfusion Matrix = \n{confusion_matrix(y_train, y_pred)}')
    print(f'\nAccuracy Score = {accuracy_score(y_train, y_pred)}')
    print(f'Model Coefficient Shape = {model.coef_.shape}')
    if results is not None:
        results.append(accuracy_score(y_train, y_pred))
    return model, classes_dict, results


model, classes_dict, results = train_function('data/Lego_dataset_1/training/')


X_train.shape = (108, 2500), y_train.shape = (108, 1)

Confusion Matrix = 
[[36  0  0]
 [ 0 36  0]
 [ 0  0 36]]

Accuracy Score = 1.0
Model Coefficient Shape = (3, 2500)
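
The coefficient shape `(3, 2500)` means the model holds one weight vector of length 2500 per class. As a rough illustration of how such a model turns a flattened image into a prediction, the sketch below computes one linear score per class and picks the largest. It assumes the multinomial (softmax) formulation, and the arrays `W`, `b` and `X` are random, hypothetical stand-ins for `model.coef_`, `model.intercept_` and real image data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features = 3, 2500                 # matches the reported coefficient shape

W = rng.normal(size=(n_classes, n_features))    # hypothetical stand-in for model.coef_
b = rng.normal(size=n_classes)                  # hypothetical stand-in for model.intercept_
X = rng.uniform(0, 255, size=(5, n_features))   # five fake flattened 50x50 images

scores = X @ W.T + b                            # shape (5, 3): one linear score per class

# Softmax turns the scores into probabilities; subtracting the row maximum avoids overflow
shifted = scores - scores.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

pred = probs.argmax(axis=1)                     # predicted class index for each image
print(scores.shape, pred)
```

Because softmax preserves the ordering of the scores, the predicted class is simply the row of `W` whose score is highest.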


In [15]:
# The test function uses the get_data function as a base to read all the images from the given directory,
# get their classes based on the classes_dict from the training function, prepare the dataset,
# and then predict the classes of the images.

def test_function(path_to_data,
                  model,
                  class_name_chars=3,
                  test_flag=False,
                  classes_dict=None,
                  standardization_flag=True,
                  standardization_max_value=255,
                  crop_flag=True,
                  crop_height_percentage=50,
                  crop_width_percentage=50,
                  scaling_flag=True,
                  scaling_height=50,
                  scaling_width=50, results=None):
    X_test, y_test, classes_dict = get_data(path_to_data, class_name_chars, test_flag, classes_dict,
                                            standardization_flag, standardization_max_value, crop_flag,
                                            crop_height_percentage, crop_width_percentage, scaling_flag, scaling_height,
                                            scaling_width)
    # predict with the already trained model (no fitting happens here)
    print(f'\nX_test.shape = {X_test.shape}, y_test.shape = {y_test.shape}')
    y_pred = model.predict(X_test)

    # Reporting the stats from the model
    print(f'\nConfusion Matrix = \n{confusion_matrix(y_test, y_pred)}')
    print(f'\nAccuracy Score = {accuracy_score(y_test, y_pred)}')
    print(f'Model Coefficient Shape = {model.coef_.shape}')
    if results is not None:
        results.append(accuracy_score(y_test, y_pred))
    return results


results = test_function('data/Lego_dataset_1/testing/', model, test_flag=True, classes_dict=classes_dict)


X_test.shape = (54, 2500), y_test.shape = (54, 1)

Confusion Matrix = 
[[17  0  1]
 [ 0 17  1]
 [ 0  1 17]]

Accuracy Score = 0.9444444444444444
Model Coefficient Shape = (3, 2500)
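
The test confusion matrix above contains more detail than the single accuracy figure. The sketch below recomputes the accuracy from it and adds per-class recall and precision, using the scikit-learn convention that rows are true classes and columns are predicted classes. The variable names are illustrative only.

```python
import numpy as np

# Test confusion matrix as printed above (rows: true class, columns: predicted class)
cm = np.array([[17, 0, 1],
               [0, 17, 1],
               [0, 1, 17]])

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions = 51 / 54
recall = np.diag(cm) / cm.sum(axis=1)       # share of each true class that was found
precision = np.diag(cm) / cm.sum(axis=0)    # share of each predicted class that was right

print(f"accuracy  = {accuracy:.4f}")        # 0.9444
print(f"recall    = {np.round(recall, 4)}")
print(f"precision = {np.round(precision, 4)}")
```

Every class has the same recall (17 of 18), while precision differs: the third class attracts two of the three misclassified images, so its precision is the lowest at 17 of 19.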


In [None]:
# Training and testing the model for different crop percentages and input sizes (number of inputs)
train_results = []
test_results = []
parameters = []
for scale in range(5, 64, 5):
    for crop in range(0, 80, 5):
        print(f'crop: {crop}, scale: {scale}')
        parameters.append([crop, scale])
        model, classes_dict, train_results = train_function('data/Lego_dataset_1/training/',
                                                            crop_height_percentage=crop, crop_width_percentage=crop,
                                                            scaling_width=scale, scaling_height=scale,
                                                            results=train_results)
        test_results = test_function('data/Lego_dataset_1/testing/', model, test_flag=True, classes_dict=classes_dict,
                                     crop_height_percentage=crop, crop_width_percentage=crop,
                                     scaling_width=scale, scaling_height=scale, results=test_results)
        print('----------------------------------------------------------------------')

In [None]:
# This cell collects the training and testing results into arrays, which were used to create the Excel file

train_results_arr = np.array(train_results).reshape(-1,1)
test_results_arr = np.array(test_results).reshape(-1,1)
parameters_arr = np.array(parameters)
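
One possible way to turn these arrays into a table for a spreadsheet is sketched below. The numbers are made-up placeholders, not results from the sweep, and the column names are assumptions. The CSV text is written to an in-memory buffer here; passing a file name to `np.savetxt` instead would write a file that Excel can open.

```python
import io

import numpy as np

# Hypothetical values standing in for the lists filled by the sweep
parameters = [[0, 5], [5, 5], [0, 10], [5, 10]]   # [crop %, scale]
train_results = [0.90, 0.92, 1.00, 1.00]
test_results = [0.80, 0.85, 0.93, 0.91]

# One row per run: crop, scale, training accuracy, testing accuracy
table = np.column_stack([np.array(parameters), train_results, test_results])

# Row with the highest testing accuracy
best = table[np.argmax(table[:, 3])]
print(f"best crop={int(best[0])}%, scale={int(best[1])}, test accuracy={best[3]:.2f}")

# Write the table as CSV text
buffer = io.StringIO()
np.savetxt(buffer, table, delimiter=",", fmt="%g",
           header="crop,scale,train_accuracy,test_accuracy", comments="")
print(buffer.getvalue())
```

Keep in mind that picking the crop and scale by testing accuracy makes the reported testing score optimistic, since the testing set has then influenced the choice of model.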