## Loading Libraries
All Python capabilities are not loaded to our working environment by default (even they are already installed in your system). So, we import each and every library that we want to use.

We chose alias names for our libraries for the sake of our convenience (numpy --> np and pandas --> pd, tensorlow --> tf).

Note: You can import all the libraries that you think will be required or can import it as you go along.

In [13]:
import pandas as pd                                     # Data analysis and manipulation
import numpy as np                                      # Fundamental package for linear algebra and multi-dimensional arrays
import os                                               # OS module in Python provides a way of using operating system dependent
import cv2 # Library for image processing
from sklearn.model_selection import train_test_split    # For splitting the data into train and validation set
from sklearn.metrics import f1_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
import matplotlib.pyplot as plt                         # For plotting the graph
%matplotlib inline

## Loading and preparing training data


In [14]:
labels = pd.read_csv(r"C:\Users\Mr.Hassan\DataspellProjects\eye_gender_data\Training_set.csv")   # loading the labels
file_paths = [[fname, 'C:/Users/Mr.Hassan/DataspellProjects/eye_gender_data/train/' + fname] for fname in labels['filename']]
images = pd.DataFrame(file_paths, columns=['filename', 'filepaths'])
train_data = pd.merge(images, labels, how = 'inner', on = 'filename')
train_data

Unnamed: 0,filename,filepaths,label
0,Image_1.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male
1,Image_2.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,female
2,Image_3.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,female
3,Image_4.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,female
4,Image_5.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male
...,...,...,...
9215,Image_9216.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male
9216,Image_9217.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male
9217,Image_9218.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male
9218,Image_9219.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...,male


In [15]:
# Initialize data and labels lists
num_classes = 2
image_size = 64
data = []
labels = []

# Iterate through rows of the train data file

for index, row in train_data.iterrows():

    # Read image file and convert to grayscale
    image = cv2.imread(row['filepaths'], cv2.IMREAD_GRAYSCALE)

    # Resize image
    image = cv2.resize(image, (image_size, image_size))
    
    # Append image and label to data and labels lists
    data.append(image)
    labels.append(row['label'])

## Data Pre-processing
It is necessary to bring all the images in the same shape and size, also convert them to their pixel values because all machine learning or deep learning models accepts only the numerical data. Also we need to convert all the labels from categorical to numerical values.

In [16]:
# Convert data and labels lists to numpy arrays
data = np.array(data)
labels = np.array(labels)

In [17]:
# Normalize data
data = data / 255.0

## Building Model & Hyperparameter tuning
Now we are finally ready, and we can train the model.


In [18]:
# Split data into train and validation sets
x_train, x_val, y_train, y_val = train_test_split(data, labels, test_size=0.2)

In [19]:
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

# Create a label encoder
label_encoder = LabelEncoder()

# Fit the label encoder to the labels
label_encoder.fit(y_train)

# Encode the labels
y_train_encoded = label_encoder.transform(y_train)
y_val_encoded = label_encoder.transform(y_val)

# Convert the labels to one-hot encoded vectors
y_train_one_hot = to_categorical(y_train_encoded)
y_val_one_hot = to_categorical(y_val_encoded)

In [20]:
# Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(image_size, image_size, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

## Validate the model


In [21]:
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model to the training data
model.fit(x_train, y_train_one_hot, epochs=10, batch_size=32, validation_data=(x_val, y_val_one_hot))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1b1af2d10d0>

## Predict The Output For Testing Dataset 😅
We have trained our model, evaluated it and now finally we will predict the output/target for the testing data (i.e. Test.csv).

#### Load Test Set
Load the test data on which final submission is to be made.

In [35]:
test_labels = pd.read_csv(r"C:\Users\Mr.Hassan\DataspellProjects\eye_gender_data\Testing_set.csv")   # loading the labels
test_file_paths = [[fname, 'C:/Users/Mr.Hassan/DataspellProjects/eye_gender_data/test/' + fname] for fname in test_labels['filename']]
test_images = pd.DataFrame(test_file_paths, columns=['filename', 'filepaths'])
test_data = pd.merge(test_images, test_labels, how = 'inner', on = 'filename')
test_data

Unnamed: 0,filename,filepaths
0,Image_1.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
1,Image_2.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
2,Image_3.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
3,Image_4.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
4,Image_5.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
...,...,...
2300,Image_2301.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
2301,Image_2302.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
2302,Image_2303.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...
2303,Image_2304.jpg,C:/Users/Mr.Hassan/DataspellProjects/eye_gende...


## Data Pre-processing on test_data


In [37]:
# Repeat the same process for the test data
testdata = []
for index, row in test_data.iterrows():
    image = cv2.imread(row['filepaths'], cv2.IMREAD_GRAYSCALE)
    image = cv2.resize(image, (image_size, image_size))
    testdata.append(image)

In [38]:
testdata = np.array(testdata)
testdata = testdata / 255.0

### Make Prediction on Test Dataset
Time to make a submission!!!

In [56]:
# Make predictions on the test data
predictions = model.predict(testdata)



In [None]:
# Classify predictions as male or female
predicted_classes = []
for prediction in predictions:
    if prediction[0] > prediction[1]:
        predicted_classes.append('male')
    else:
        predicted_classes.append('female')

In [67]:
#save predictions to a dataframe
predictions_df = pd.DataFrame(predicted_classes, columns=['label'])
predictions_df.head()

Unnamed: 0,label
0,female
1,male
2,female
3,female
4,female


## **How to save prediciton results locally via jupyter notebook?**
If you are working on Jupyter notebook, execute below block of codes. A file named 'submission.csv' will be created in your current working directory.

In [71]:
# prediction is nothing but the final predictions of your model on input features of your new unseen test data
res = pd.concat([test_data['filename'], predictions_df['label']], axis=1)

# Save the new dataframe to a CSV file
predictions_df.to_csv("submission.csv", index = False) 

# **Well Done! 👍**
You are all set to make a submission. Let's head to the **[challenge page](https://dphi.tech/challenges/4-week-deep-learning-online-bootcamp-final-assignment-sex-determination-by-morphometry-of-eyes/144/submit)** to make the submission.