# Finding the best meta-learner

This document shows how to find the best meta-learner by iterating over multiple models.

### Load the dataset

First, let's load the dataset into a pandas data frame and display the first rows.
The feature names are prefixed with `v1_` or `v2_`. Features prefixed with `v1_` are mel-frequency cepstral coefficients extracted from audio signals. Features prefixed with `v2_` are summary statistics extracted from accelerometer signals. Column names can be anything; in this case a prefix was added to make it easy to get the column indices of each view.


In [None]:
import pandas as pd
import numpy as np
from multiviewstacking import load_example_data

(X_train, y_train, X_test, y_test, ind_v1, ind_v2, le) = load_example_data()

X_train.head()
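
The prefix convention described above is what makes it easy to recover each view's column indices. As a minimal sketch of the idea, the snippet below builds a toy data frame (the column names and values are made up for illustration and are not the example dataset) and derives positional indices from the prefixes. The names `toy`, `toy_ind_v1` and `toy_ind_v2` are hypothetical, and the exact type of the `ind_v1` and `ind_v2` objects returned by `load_example_data()` may differ.

```python
import numpy as np
import pandas as pd

# Toy data frame with made-up columns that follow the v1_/v2_ prefix convention.
toy = pd.DataFrame({
    "v1_mfcc1": [0.1, 0.2],
    "v1_mfcc2": [0.3, 0.4],
    "v2_meanX": [1.0, 1.1],
    "v2_sdX": [0.5, 0.6],
    "v2_meanY": [0.9, 0.8],
})

# Positional indices of the columns that belong to each view.
toy_ind_v1 = np.where(toy.columns.str.startswith("v1_"))[0]
toy_ind_v2 = np.where(toy.columns.str.startswith("v2_"))[0]

print(toy_ind_v1, toy_ind_v2)
```

The same pattern should work for any number of views as long as each view has a distinct prefix.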

### Define the first-level learners

Let's define the first-level learners for each of the views. The `multiviewstacking` library supports most `scikit-learn` classifiers. A `MultiViewStacking` model is not limited to a single type of model but supports heterogeneous types of models. For example, if you know that a KNN classifier is more suitable for audio classification and Gaussian Naive Bayes is better for the accelerometer view, you can specify a different model for each view.

In [None]:
from multiviewstacking import MultiViewStacking
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Define the first-level-learner for the audio view.
# In this case, a KNN classifier with k=3. 
m_v1 = KNeighborsClassifier(n_neighbors=3)

# Define the first-level-learner for the accelerometer view.
# In this case, a Naive Bayes classifier.
m_v2 = GaussianNB()
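
To give some intuition for how view-specific learners and a meta-learner fit together, here is a rough sketch of the general stacked-generalization recipe using only `scikit-learn`. It is an illustration under assumptions, not the `multiviewstacking` implementation: the data is synthetic, the split of columns into two views is arbitrary, and the names `X_demo`, `y_demo`, `views`, `learners` and `meta_features` are hypothetical. Each first-level learner produces out-of-fold class probabilities for its own view, and those probabilities become the input features of the meta-learner.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: 6 features, arbitrarily split into two "views".
X_demo, y_demo = make_classification(
    n_samples=120, n_features=6, n_informative=4, random_state=0
)
views = [np.array([0, 1, 2]), np.array([3, 4, 5])]
learners = [KNeighborsClassifier(n_neighbors=3), GaussianNB()]

# Out-of-fold class probabilities from each view-specific learner,
# so the meta-learner never sees predictions made on training folds.
oof = [
    cross_val_predict(est, X_demo[:, idx], y_demo, cv=5, method="predict_proba")
    for est, idx in zip(learners, views)
]

# Concatenate the per-view probabilities and train the meta-learner on them.
meta_features = np.hstack(oof)
meta = RandomForestClassifier(n_estimators=50, random_state=123)
meta.fit(meta_features, y_demo)

print(meta_features.shape)
```

With two classes and two views, the meta-learner here is trained on four probability columns, two per view.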

### Define the list of possible meta-learners

We will store the set of possible meta-learners as a list of tuples. Each tuple stores the name and the meta-learner itself.

In [None]:
list_meta_learners = [
    ("RandomForest", RandomForestClassifier(n_estimators=50, random_state=123)),
    ("Naive Bayes", GaussianNB()),
    ("SVM", SVC(probability=True, random_state=123))
]

### Iterate through the list of meta-learners to find the best one
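
Each candidate meta-learner is plugged into a `MultiViewStacking` model that uses the same first-level learners. The model is trained on the training set and its accuracy on the test set is printed.
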
In [None]:
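for name, meta_learner in list_meta_learners:

    # Build a multi-view stacking model with the current candidate as the meta-learner.
    model = MultiViewStacking(views_indices = [ind_v1, ind_v2],
                      first_level_learners = [m_v1, m_v2],
                      meta_learner = meta_learner,
                      k = 10,
                      random_state = 100)

    # Train on the training set and compute accuracy on the test set.
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    accuracy = np.sum(y_test == predictions) / len(y_test)
    print(f"{name} accuracy {accuracy}")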
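
The loop above prints one accuracy per candidate and leaves the comparison to the reader. As a possible extension, the sketch below collects the scores and picks the highest one programmatically. The helper `rank_candidates` and its `build_model` argument are hypothetical names introduced for this example. To keep the snippet self-contained it runs on synthetic data and fits the candidates directly, without `MultiViewStacking`.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def rank_candidates(candidates, build_model, X_tr, y_tr, X_te, y_te):
    """Fit one model per candidate and return the best name and all scores."""
    scores = {}
    for name, estimator in candidates:
        # clone() gives each run a fresh, unfitted copy of the estimator.
        model = build_model(clone(estimator))
        model.fit(X_tr, y_tr)
        scores[name] = accuracy_score(y_te, model.predict(X_te))
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Synthetic stand-in data, not the example dataset used in this notebook.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

candidates = [
    ("RandomForest", RandomForestClassifier(n_estimators=50, random_state=123)),
    ("Naive Bayes", GaussianNB()),
]

# Here build_model returns the estimator unchanged.
best_name, scores = rank_candidates(
    candidates, lambda est: est, X_tr, y_tr, X_te, y_te
)
print(best_name, scores)
```

In this notebook, `build_model` could instead wrap each candidate in `MultiViewStacking(...)` with the same arguments used in the loop above, and `candidates` would be `list_meta_learners`.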