# Personality Prediction System (Model Testing)

This notebook will perform the following tasks:

1. <b>Preprocessing</b> the testing data
2. <b>Deserializing</b> the trained model
3. <b>Making predictions</b> using the testing data and the model
4. <b>Evaluating</b> the performance of the model

Depending on how the model performs, its hyperparameters can then be fine-tuned to improve accuracy, precision and recall.
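For reference, the three metrics named above can be computed per class from the true and predicted labels. The sketch below uses plain Python so the arithmetic stays visible. `per_class_metrics` and the sample labels are hypothetical names introduced here for illustration, not part of this notebook, and an actual evaluation would more likely rely on a library such as scikit-learn.

```python
def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision and recall.

    Assumes y_true and y_pred are non-empty and of equal length.
    """
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        # Fall back to 0.0 when a class was never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        report[label] = {'precision': precision, 'recall': recall}
    return accuracy, report


# Illustrative labels only; these are not results from the trained model.
y_true = ['serious', 'serious', 'lively', 'dependable']
y_pred = ['serious', 'lively', 'lively', 'dependable']

accuracy, report = per_class_metrics(y_true, y_pred)
print(f'Accuracy: {accuracy}')
for label, scores in report.items():
    print(label, scores)
```

Reporting precision and recall per class, rather than as a single number, makes it easier to spot a personality class that the model systematically confuses with another.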

This notebook continues from `model_training.ipynb`.

---

## Initialization

In [1]:
# General imports

import numpy as np
import pandas as pd

In [2]:
# Creating a Pandas dataframe using the testing data

df_test = pd.read_csv('./data/test.csv')

df_test

Unnamed: 0,Gender,Age,openness,neuroticism,conscientiousness,agreeableness,extraversion,Personality (class label)
0,Female,20,7,9,9,5,5,dependable
1,Male,17,5,4,5,2,4,serious
2,Female,25,5,5,7,2,4,serious
3,Female,18,6,2,7,4,7,serious
4,Female,19,2,4,7,1,3,responsible
...,...,...,...,...,...,...,...,...
310,Female,19,6,5,6,4,3,extraverted
311,Male,18,2,5,8,3,7,dependable
312,Male,18,7,5,6,2,7,serious
313,Male,23,6,7,5,4,3,extraverted


## Exploratory Data Analysis

In [3]:
# Checking whether the dataset has missing values:

missing_values = df_test.isnull().sum()

missing_values = missing_values.apply(lambda x: f'{x} missing values' if x > 0 else 'No missing values')

missing_values

Gender                       No missing values
Age                          No missing values
openness                     No missing values
neuroticism                  No missing values
conscientiousness            No missing values
agreeableness                No missing values
extraversion                 No missing values
Personality (class label)    No missing values
dtype: object

In [7]:
# Checking how many unique targets are in the dataframe:

targets = df_test.iloc[:, -1]
unique_targets = np.unique(targets)
print(unique_targets)
print(f'There are {len(unique_targets)} unique targets in the dataframe.')

['dependable' 'extraverted' 'lively' 'responsible' 'serious']
There are 5 unique targets in the dataframe.


The training data and the testing data have the same number of unique targets, so the target classes raise no issues.
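A matching count does not by itself guarantee that the label names match, so a slightly stricter check is to compare the two label sets directly. The following is a minimal sketch under stated assumptions: `compare_label_sets` is a hypothetical helper, and the two label lists are stand-ins for the last column of the training and testing dataframes.

```python
def compare_label_sets(train_labels, test_labels):
    """Return the labels that appear in only one of the two collections."""
    train_set = set(train_labels)
    test_set = set(test_labels)
    return {
        'only_in_train': sorted(train_set - test_set),
        'only_in_test': sorted(test_set - train_set),
    }


# Stand-in data (assumption); in the notebook the test labels would come
# from df_test.iloc[:, -1] and the training labels from the training set.
train_labels = ['dependable', 'extraverted', 'lively', 'responsible', 'serious']
test_labels = ['serious', 'dependable', 'lively', 'extraverted', 'responsible', 'serious']

diff = compare_label_sets(train_labels, test_labels)
if not diff['only_in_train'] and not diff['only_in_test']:
    print('Training and testing data share the same set of targets.')
else:
    print(f'Label mismatch: {diff}')
```

A label that appears only in the testing data would be one the model has never seen, so it is worth catching before any predictions are made.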