<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/marco-canas/didactica_ciencia_datos/blob/main/2_referentes/geron/part_1/c_5/chap_5_svm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
</table>

# Chapter 5. Support Vector Machines

# Guido van Rossum, Creator of Python
<img src = 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/66/Guido_van_Rossum_OSCON_2006.jpg/300px-Guido_van_Rossum_OSCON_2006.jpg'>

A Support Vector Machine (SVM) is a powerful and versatile Machine Learning model, capable of performing linear or nonlinear classification, regression, and even outlier detection. It is one of the most popular models in Machine Learning, and anyone interested in Machine Learning should have it in their toolbox. SVMs are particularly well suited for classification of complex small- or medium-sized datasets.
This chapter will explain the core concepts of SVMs, how to use them, and how they work.


## Linear SVM Classification


The fundamental idea behind SVMs is best explained with some pictures. 

Figure 5-1 shows part of the iris dataset that was introduced at the end of Chapter 4. 

The two classes can clearly be separated easily with a straight line (they are linearly separable). 

The left plot shows the decision boundaries of three possible linear classifiers. The model whose decision boundary is represented by the dashed line is so bad that it does not even separate the classes properly.

The other two models work perfectly on this training set, but their decision boundaries come so close to the instances that these models will probably not perform as well on new instances. 

In contrast, the solid line in the plot on the right represents the decision boundary of an SVM classifier; this line not only separates the two classes but also stays as far away from the closest training instances as possible. 

You can think of an SVM classifier as fitting the widest possible street (represented by the parallel dashed lines) between the classes. This is called large margin classification.


<img src = 'https://github.com/marco-canas/didactica_ciencia_datos/blob/main/4_images/images_of_referents/geron/5_chapter/fig_5_1.png?raw=true'>


Notice that adding more training instances “off the street” will not affect the decision boundary at all: it is fully determined (or “supported”) by the instances located on the edge of the street. 

These instances are called the support vectors (they are circled in Figure 5-1).
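The claim that off-the-street instances do not move the boundary can be checked with a small experiment. The following is only a sketch under stated assumptions: it uses `SVC(kernel="linear")` with a very large `C` as a stand-in for a hard margin, keeps only the two linearly separable classes (Iris setosa and Iris versicolor), and adds two hypothetical instances (`X_extra`) chosen to lie well outside the street.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]

# Keep only Iris setosa (0) and Iris versicolor (1): linearly separable classes
setosa_or_versicolor = (y == 0) | (y == 1)
X, y = X[setosa_or_versicolor], y[setosa_or_versicolor]

# Assumption: a very large C stands in for a hard margin
svm_clf = SVC(kernel="linear", C=1e6)
svm_clf.fit(X, y)
w_before, b_before = svm_clf.coef_[0].copy(), svm_clf.intercept_[0]
print("Support vectors:\n", svm_clf.support_vectors_)

# Hypothetical extra instances placed well off the street, one per class
X_extra = np.array([[1.0, 0.1], [5.0, 1.7]])
y_extra = np.array([0, 1])
X_more = np.vstack([X, X_extra])
y_more = np.concatenate([y, y_extra])

svm_clf_more = SVC(kernel="linear", C=1e6)
svm_clf_more.fit(X_more, y_more)
w_after, b_after = svm_clf_more.coef_[0], svm_clf_more.intercept_[0]

# The two fits should agree up to the solver's numerical tolerance
print("w before:", w_before, "b before:", b_before)
print("w after: ", w_after, "b after: ", b_after)
```

The fitted `support_vectors_` attribute holds the instances on the edge of the street; the two added instances should not appear among them.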

![Sensitivity to feature scales](https://github.com/marco-canas/didactica_ciencia_datos/blob/main/4_images/images_of_referents/geron/5_chapter/fig_5_2.png?raw=true)

## WARNING  

SVMs are sensitive to the feature scales, as you can see in Figure 5-2: in the left plot, the vertical scale is much larger than the horizontal scale, so the widest possible street is close to horizontal. 

After feature scaling (e.g., using Scikit-Learn’s `StandardScaler`), the decision boundary in the right plot looks much better.
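A rough way to see this effect in code is to fit the same linear SVM with and without scaling. The sketch below uses a hypothetical four-instance toy dataset whose second feature spans a much larger range than the first; the data, the value `C=100`, and the variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical toy data: x1 spans a range about 15 times larger than x0
X = np.array([[1, 50], [5, 20], [3, 80], [5, 60]], dtype=np.float64)
y = np.array([0, 0, 1, 1])

# Linear SVM trained directly on the raw features
raw_clf = SVC(kernel="linear", C=100)
raw_clf.fit(X, y)

# Same model, but the features are standardized first
scaled_clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=100))
scaled_clf.fit(X, y)
scaled_svc = scaled_clf.named_steps["svc"]

# Compare which training instances support the street in each case
print("Raw features:    support indices", raw_clf.support_,
      "weights", raw_clf.coef_[0])
print("Scaled features: support indices", scaled_svc.support_,
      "weights", scaled_svc.coef_[0])
```

Wrapping the scaler and the classifier in one pipeline means the same scaling is applied automatically at prediction time.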

## Soft Margin Classification



If we strictly impose that all instances must be off the street and on the correct side of it, this is called hard margin classification.

There are two main issues with hard margin classification.
* First, it only works if the data is linearly separable.
* Second, it is sensitive to outliers.

Figure 5-3 shows the iris dataset with just one additional outlier: on the left, it is impossible to find a hard margin; on the right, the decision boundary ends up very different from the one we saw in Figure 5-1 without the outlier, and it will probably not generalize as well.

![figura 5.3](https://github.com/marco-canas/didactica_ciencia_datos/blob/main/4_images/images_of_referents/geron/5_chapter/fig_5_3.png?raw=true)

To avoid these issues, use a more flexible model. 

The objective is to find a good balance between keeping the street as large as possible and limiting the margin violations (i.e., instances that end up in the middle of the street or even on the wrong side). 

This is called soft margin classification.
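One common way to write this trade-off down is the standard soft margin objective, sketched here for reference. The notation is an assumption and does not come from this section: $\mathbf{w}$ and $b$ are the weights and bias of the linear model, $t^{(i)} \in \{-1, +1\}$ is the label of instance $\mathbf{x}^{(i)}$, $\zeta^{(i)} \ge 0$ is a slack variable measuring how much that instance violates the margin, $m$ is the number of training instances, and $C$ is a positive hyperparameter.

$$
\min_{\mathbf{w},\, b,\, \zeta} \quad \frac{1}{2}\mathbf{w}^\top\mathbf{w} + C \sum_{i=1}^{m} \zeta^{(i)}
\quad \text{subject to} \quad
t^{(i)}\left(\mathbf{w}^\top\mathbf{x}^{(i)} + b\right) \ge 1 - \zeta^{(i)}
\ \text{ and } \
\zeta^{(i)} \ge 0 \ \text{ for } i = 1, \dots, m
$$

The first term favors a wide street (a small $\lVert\mathbf{w}\rVert$), while the second term penalizes margin violations; $C$ sets the balance between the two.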


When creating an SVM model using Scikit-Learn, we can specify a number of hyperparameters. 

C is one of those hyperparameters. 

If we set it to a low value, then we end up with the model on the left of Figure 5-4. With a high value, we get the model on the right. 

Margin violations are bad. It’s usually better to have few of them. However, in this case the model on the left has a lot of margin violations but will probably generalize better.

![figura 5.4](https://github.com/marco-canas/didactica_ciencia_datos/blob/main/4_images/images_of_referents/geron/5_chapter/fig_5_4.png?raw=true)

## TIP
If your SVM model is overfitting, you can try regularizing it by reducing C.
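To make the effect of C more concrete, the sketch below trains the same scaled `LinearSVC` with a low and a high value of C and reports the street width and the number of margin violations on the training set. The values `C=1` and `C=100`, the small numerical tolerance, and the explicit `dual`, `max_iter`, and `random_state` settings are assumptions made for this illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 if Iris virginica
t = 2 * y - 1  # same labels encoded as -1 / +1

results = {}
for C in (1, 100):
    clf = make_pipeline(
        StandardScaler(),
        LinearSVC(C=C, loss="hinge", dual=True, max_iter=100_000,
                  random_state=42),
    )
    clf.fit(X, y)
    w = clf.named_steps["linearsvc"].coef_[0]
    # Width of the street, measured in scaled feature units
    street_width = 2 / np.linalg.norm(w)
    # Margin violations: instances inside the street or on the wrong side,
    # i.e. t * f(x) < 1 (with a small tolerance for solver precision)
    n_violations = int(np.sum(t * clf.decision_function(X) < 1 - 1e-3))
    results[C] = (street_width, n_violations)
    print(f"C={C}: street width={street_width:.3f}, "
          f"margin violations={n_violations}")
```

A lower C should give a wider street with more violations, which corresponds to stronger regularization.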

The following Scikit-Learn code loads the iris dataset, scales the features, and then trains a linear SVM model (using the LinearSVC class with C=1 and the hinge loss function, described shortly) to detect Iris virginica flowers:

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Load the iris dataset and keep two features
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 if Iris virginica, else 0.0

# Scale the features, then train a linear SVM with the hinge loss
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])
svm_clf.fit(X, y)
```
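Once trained, the pipeline can be used to classify new flowers. The following sketch rebuilds the same model so that it runs on its own; the sample measurements in `X_new` are hypothetical, and `dual=True` and `random_state=42` are added assumptions to keep the run reproducible.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)  # 1.0 if Iris virginica

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge", dual=True, random_state=42)),
])
svm_clf.fit(X, y)

# Hypothetical new flower: petal length 5.5 cm, petal width 1.7 cm
X_new = [[5.5, 1.7]]
prediction = svm_clf.predict(X_new)        # predicted class (1.0 = virginica)
score = svm_clf.decision_function(X_new)   # signed score relative to the boundary
print("Predicted class:", prediction)
print("Decision score:", score)
```

`LinearSVC` does not provide a `predict_proba` method, so the signed score from `decision_function` is the closest available measure of how far an instance lies from the decision boundary.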



## References

* The Support Vector Machine model (`SVC`) in Scikit-Learn: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

* Géron's notebook on Support Vector Machines: https://github.com/ageron/handson-ml2/blob/master/05_support_vector_machines.ipynb

* Mathematical formulation of SVC: https://scikit-learn.org/stable/modules/svm.html#svm-mathematical-formulation

* Duval, R. (2004). Semiosis y pensamiento humano: 