- To find the most capable classification algorithm, models such as Random Forests, SVM, and gradient-boosted
algorithms such as XGBoost will be tested to see which gives the highest accuracy.

- Ensemble techniques such as bagging and pasting, voting classification, and adaptive boosting (AdaBoost) will also be tested
- Models will be evaluated using metrics such as the confusion matrix and accuracy
- The most capable model will be selected and fine-tuned for improved performance
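
As a rough illustration of the ensemble techniques and metrics listed above, the sketch below sets up bagging, pasting, hard voting, and AdaBoost in scikit-learn and scores each one with accuracy and a confusion matrix. It is only a sketch under assumptions: the small 8x8 digits dataset from `load_digits` stands in for the MNIST CSV files, and the base estimators, `n_estimators` values, and the `ensembles` and `results` names are illustrative choices, not the settings used later in this notebook.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: the 8x8 digits dataset is a small stand-in for the MNIST CSV files.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

ensembles = {
    # Bagging: every tree is trained on a sample drawn WITH replacement.
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0),
        n_estimators=25, max_samples=0.8, bootstrap=True, random_state=0,
    ),
    # Pasting: same idea, but the samples are drawn WITHOUT replacement.
    "pasting": BaggingClassifier(
        DecisionTreeClassifier(random_state=0),
        n_estimators=25, max_samples=0.8, bootstrap=False, random_state=0,
    ),
    # Hard voting: the predicted class is the majority vote of the members.
    "voting": VotingClassifier(
        estimators=[
            ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
            ("tree", DecisionTreeClassifier(random_state=0)),
            ("knn", KNeighborsClassifier()),
        ],
        voting="hard",
    ),
    # AdaBoost: shallow trees fitted in sequence, each one focusing on the
    # samples the previous ones misclassified.
    "adaboost": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=2, random_state=0),
        n_estimators=50, random_state=0,
    ),
}

results = {}
for name, model in ensembles.items():
    y_pred = model.fit(X_tr, y_tr).predict(X_te)
    results[name] = {
        "accuracy": accuracy_score(y_te, y_pred),      # fraction of correct predictions
        "confusion": confusion_matrix(y_te, y_pred),   # rows: true class, columns: predicted class
    }
    print(f"{name}: accuracy = {results[name]['accuracy']:.3f}")
```

The only difference between the bagging and pasting entries is the `bootstrap` flag, which makes the two sampling schemes easy to compare under otherwise identical settings.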

### IMPORTING LIBRARIES

In [8]:
import numpy as np
import pandas as pd

### SPLITTING THE DATASETS

In [9]:
# The training and test sets are stored as separate CSV files
train_data = pd.read_csv('../datasets/mnist_train.csv')
test_data = pd.read_csv('../datasets/mnist_test.csv')

# Column 0 holds the label; the remaining columns hold the features
X_train, y_train = train_data.iloc[:, 1:].values, train_data.iloc[:, 0].values
X_test, y_test = test_data.iloc[:, 1:].values, test_data.iloc[:, 0].values

# TRAINING & EVALUATING CLASSIFICATION ALGORITHMS

- ### IMPORTING CLASSIFICATION ALGORITHMS

In [14]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

forestClassifier = RandomForestClassifier() # The Random Forests Classifier  
vectorClassifier = SVC(kernel='rbf') # A Support Vector Machine with rbf kernel

# A list of models
raw_models = [forestClassifier, vectorClassifier]

- ### MODEL TRAINING FUNCTION
Takes a list of models and the training dataset, fits each model, and returns the list of trained models

In [15]:
def trainer(model_list, X_train_, y_train_):
    """Fit every model in model_list on the training data and return them as a list."""
    _trained_models = []
    for model in model_list:
        _trained_models.append(model.fit(X_train_, y_train_))

    return _trained_models


trained_models = trainer(raw_models, X_train, y_train)
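
One detail of the training function worth keeping in mind: in scikit-learn, `fit` returns the estimator itself, so the list returned by `trainer` holds the same objects that were passed in rather than copies. The sketch below checks this on a small synthetic dataset; `make_classification`, the `*_demo` names, and the use of `sklearn.base.clone` to obtain an unfitted copy are assumptions added for illustration, not part of the original workflow.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC


def trainer(model_list, X_train_, y_train_):
    # Same behaviour as the function above, written as a list comprehension.
    return [model.fit(X_train_, y_train_) for model in model_list]


# Assumption: a small synthetic dataset, used only to keep the demo fast.
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

raw_demo = [RandomForestClassifier(n_estimators=10, random_state=0), SVC(kernel="rbf")]
trained_demo = trainer(raw_demo, X_demo, y_demo)

# fit() returns self, so each trained model is the very object that was passed in.
same_objects = all(t is r for t, r in zip(trained_demo, raw_demo))
print("trained models are the original objects:", same_objects)

# clone() builds a new, unfitted estimator with the same hyperparameters,
# which is useful when the untrained configuration must be kept around.
fresh_copy = clone(raw_demo[0])
print("clone is fitted:", hasattr(fresh_copy, "estimators_"))
```

In practice this means `raw_models` and `trained_models` refer to the same fitted estimators after the call, so retraining with different data overwrites the earlier fit unless clones are used.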

- ### MODEL PREDICTION AND EVALUATION ON TEST DATA
A class that takes a list of trained models, makes predictions on the test data, and evaluates their performance

In [ ]:
from sklearn.metrics import accuracy_score, confusion_matrix

class model_eval:
    def __init__(self, X_test_, y_test_, trained_models_):
        self.X_test_ = X_test_
        self.y_test_ = y_test_
        self.trained_models_ = trained_models_
        self.scores_ = []
        self.accuracies_ = []
    
    def model_predict(self):
        for model in self.trained_models_:
            # accuracy_score expects the true labels first, then the predictions
            self.accuracies_.append(accuracy_score(self.y_test_, model.predict(self.X_test_)))