<a href="https://colab.research.google.com/github/andreacini/ml-19-20/blob/master/06_other_methods.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine Learning

Prof. Cesare Alippi  
Andrea Cini ([`andrea.cini@usi.ch`](mailto:andrea.cini@usi.ch))  
Daniele Zambon ([`daniele.zambon@usi.ch`](mailto:daniele.zambon@usi.ch)  )

---
# Lab 06: Other methods

In this lab we will use some of the more advanced methods covered in the last lectures: decision trees and support vector machines.


---
# Trees

Let's start by defining our usual helper functions.

In [0]:
# first we define some helper functions to generate data and plot results
import numpy as np
from sklearn.datasets import make_classification, make_circles, make_moons
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


# function to generate classification problems
def get_data(n, ctype='simple'):
  if ctype == 'simple':
    x, y = make_classification(n_samples=n,  # use the requested number of samples
                               n_features=2,
                               n_redundant=0,
                               n_informative=2,
                               n_clusters_per_class=1)
    x += np.random.uniform(size=x.shape) # add some noise
  elif ctype == 'circles':
    x, y = make_circles(n, noise=0.1, factor=0.5)
  
  elif ctype == 'moons':
    x, y = make_moons(n, noise=0.1)
  else:
    raise ValueError(f"unknown ctype: {ctype!r}")
  return x, y

# function to plot decision boundaries
def plot_decision_surface(model, x, y, transform=lambda x:x):    
  #init figure
  fig = plt.figure()

  # Create mesh
  h = .01  # step size in the mesh
  x_min, x_max = x[:, 0].min() - .5, x[:, 0].max() + .5
  y_min, y_max = x[:, 1].min() - .5, x[:, 1].max() + .5
  xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                        np.arange(y_min, y_max, h))

  # plot train data
  plt.scatter(x[:, 0], x[:, 1], c=y, edgecolors='k')

  plt.xlim(xx.min(), xx.max())
  plt.ylim(yy.min(), yy.max())

  plt.xlabel(r'$x_1$')
  plt.ylabel(r'$x_2$');

  y_pred = model.predict(transform(np.c_[xx.ravel(), yy.ravel()]))

  y_pred = y_pred.reshape(xx.shape)
  plt.contourf(xx, yy, y_pred, alpha=.5)

We are going to use the [Iris](https://archive.ics.uci.edu/ml/datasets/iris) dataset, where the objective is to classify iris flowers into three species based on four features:

1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm

In [0]:
from sklearn.datasets import load_iris

 
iris = load_iris()

x = iris.data
y = iris.target

# import pandas as pd
# data = pd.DataFrame(np.c_[iris.data, iris.target], columns=iris.feature_names + ['Class',])

# data.sample(10)

Let's take a look at the data, using only the first two features (sepal length and sepal width) so that we can plot it in 2D.

In [0]:
x_prime = x[:, :2]

plt.scatter(x_prime[:, 0], x_prime[:, 1], c=y)
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')

Again, we can use `scikit-learn` to build Decision Trees pretty easily in Python.

In [0]:
from sklearn.tree import DecisionTreeClassifier

classifier = DecisionTreeClassifier() # create an instance of the model
classifier.fit(x_prime, y)            # fit the data

plot_decision_surface(classifier, x_prime, y)

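Watch out for overfitting! With the default settings the tree keeps splitting until its leaves are pure (or cannot be split further), so it can carve out tiny regions around individual training points.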
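One way to check for overfitting is to hold out part of the data and compare training and test accuracy. The cell below is a minimal sketch, not part of the original lab: the 70/30 split, `max_depth=3`, and the names `x_sepal`, `deep_tree` and `shallow_tree` are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same two sepal features used in the plot above (names are illustrative).
iris_data = load_iris()
x_sepal = iris_data.data[:, :2]
y_all = iris_data.target

# Hold out 30% of the samples to estimate generalization (assumed split).
x_train, x_test, y_train, y_test = train_test_split(
    x_sepal, y_all, test_size=0.3, random_state=0, stratify=y_all)

# An unconstrained tree versus a depth-limited one (max_depth=3 is an assumption).
deep_tree = DecisionTreeClassifier(random_state=0).fit(x_train, y_train)
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(x_train, y_train)

for name, tree in [('unconstrained', deep_tree), ('max_depth=3', shallow_tree)]:
    print(f'{name:>13}: depth={tree.get_depth()}, '
          f'train acc={tree.score(x_train, y_train):.2f}, '
          f'test acc={tree.score(x_test, y_test):.2f}')
```

A large gap between training and test accuracy suggests that the tree is memorizing the training set. Parameters such as `max_depth` or `min_samples_leaf` limit how far the tree can grow.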

Now let's try to build a tree using all the features. To visualize the result we'll look directly at the tree.


In [0]:
from sklearn.tree import plot_tree

classifier = DecisionTreeClassifier(criterion='entropy')
classifier.fit(x, y)  

plt.figure(figsize=(16,8))
plot_tree(classifier, filled=True, feature_names=iris.feature_names, rounded=True, class_names=iris.target_names);

Decision Trees are easy to interpret, and that's why they are really popular in financial applications.
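To illustrate what "easy to interpret" means in practice, a fitted tree can be printed as a list of if/else rules with `export_text` from `sklearn.tree`. This is a small sketch under assumptions: the tree is capped at `max_depth=2` only to keep the output short, and `small_tree` and `rules` are illustrative names.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris_data = load_iris()

# A deliberately small tree so that the printed rules stay readable (assumption).
small_tree = DecisionTreeClassifier(criterion='entropy', max_depth=2, random_state=0)
small_tree.fit(iris_data.data, iris_data.target)

# Each line of the output is a threshold test on one feature or a leaf prediction.
rules = export_text(small_tree, feature_names=list(iris_data.feature_names))
print(rules)
```

Every prediction can be traced back to a short chain of threshold comparisons on the input features.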

# Support Vector Machines

Let's try to use Support Vector Machines to solve our problem, and check how the results change using different kernels.

Remember, a kernel (oversimplifying a lot) is a function that gives us a particular measure of affinity between two points. We can use kernels in the dual formulation of the SVM problem to implicitly map the input space into a high (possibly infinite) dimensional space.
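As a concrete example of a kernel acting as an affinity measure, the sketch below computes the RBF kernel $k(a, b) = \exp(-\gamma \lVert a - b \rVert^2)$ by hand and compares it with `rbf_kernel` from `sklearn.metrics.pairwise`. The helper `rbf`, the three sample points and the value of `gamma` are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def rbf(a, b, gamma=1.0):
    """RBF kernel between two points: exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))


# Three toy points: the first two are close together, the third is far away.
points = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]])
gamma = 0.5  # illustrative value

# Gram matrix computed by hand and with scikit-learn.
manual_gram = np.array([[rbf(p, q, gamma) for q in points] for p in points])
sklearn_gram = rbf_kernel(points, gamma=gamma)

print(np.round(manual_gram, 4))
print('matches scikit-learn:', np.allclose(manual_gram, sklearn_gram))
```

Nearby points get a kernel value close to 1 and distant points a value close to 0. The SVM only ever needs these pairwise values, never the explicit high-dimensional coordinates.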

In [0]:
from sklearn.svm import SVC

classifier = SVC(kernel='linear')

classifier.fit(x_prime, y)

plot_decision_surface(classifier, x_prime, y)

Tuning the hyperparameters correctly is fundamental (check [here]()).
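One common way to tune SVM hyperparameters is a cross-validated grid search. The following is a minimal sketch, assuming an RBF kernel and an arbitrary illustrative grid over `C` and `gamma`. It is not necessarily the procedure the lab intends.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris_data = load_iris()
x_sepal, y_all = iris_data.data[:, :2], iris_data.target  # sepal features only

# Candidate values are assumptions chosen for illustration.
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.01, 0.1, 1, 10]}

# 5-fold cross-validation over every (C, gamma) combination.
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
search.fit(x_sepal, y_all)

print('best parameters:', search.best_params_)
print(f'best cross-validated accuracy: {search.best_score_:.3f}')
```

`C` controls how strongly misclassified training points are penalized, while `gamma` controls how quickly the RBF kernel decays with distance. After fitting, `search.best_estimator_` holds a model refit on all the data with the selected values.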

Let's use the XOR problem to get a better intuition of the decision boundaries found by SVMs.

In [0]:
np.random.seed(10)

x = np.random.randn(150, 2) # sample some points from a bivariate diagonal gaussian
y = np.logical_xor(x[:, 0] > 0., x[:, 1] > 0.)

plt.scatter(x[:, 0], x[:, 1], c=y)
plt.xlabel(r'$x_1$')
plt.ylabel(r'$x_2$')

In [0]:
from sklearn.svm import SVC

classifier = SVC(kernel='rbf')

classifier.fit(x, y)

plot_decision_surface(classifier, x, y)
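To see why the kernel matters here, we can compare a linear and an RBF kernel on the same XOR data. This is a small sketch: it regenerates the data with its own `RandomState`, and the names `x_xor`, `y_xor` and `scores` are illustrative. Accuracy is measured on the training points only, so it shows separability rather than generalization.

```python
import numpy as np
from sklearn.svm import SVC

# Regenerate XOR-like data (own random state, so the notebook's x and y are untouched).
rng = np.random.RandomState(10)
x_xor = rng.randn(150, 2)
y_xor = np.logical_xor(x_xor[:, 0] > 0., x_xor[:, 1] > 0.)

# Fit one SVM per kernel and record the training accuracy.
scores = {}
for kernel in ['linear', 'rbf']:
    model = SVC(kernel=kernel).fit(x_xor, y_xor)
    scores[kernel] = model.score(x_xor, y_xor)
    print(f'{kernel:>6} kernel: training accuracy = {scores[kernel]:.2f}')
```

No single straight line separates the four XOR quadrants, so the linear kernel is expected to do poorly, while the RBF kernel can bend the decision boundary around each quadrant.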