
# Introduction to Scikit-learn

[Scikit-learn](https://scikit-learn.org/stable/) will be our main library for data processing, model building (machine learning), and model evaluation.

## Origins

The tool started in 2007 as a project by David Cournapeau for *Google Summer of Code* (a summer program in which students contribute to open-source projects). Later that year Matthieu Brucher joined and made this Python module the central topic of his thesis. In 2010 other developers came aboard, and Scikit-learn has kept sailing ever since, taking us to varied and interesting ports.

It is now an essential tool in the *machine learning* community and, since it is an open-source project, we are free to use it in our research work.

## Main modules

We can distinguish six groups of tools:

1. Preprocessing
1. Dimensionality reduction
1. Classification
1. Regression
1. Clustering
1. Model selection

Most of the classes that Scikit-learn provides implement several key methods:

* **Constructor**: where the hyperparameters are defined.
* **fit()**: fits (trains) the model; it can receive further parameters.
* **transform()**: applies the fitted model to the data in order to transform them.
* **predict()**: generates a prediction (class or value) for a set of data.
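As a rough illustration of these methods, the sketch below uses a tiny made-up dataset (the values and the variable names `X_toy`, `y_toy` and `new_samples` are assumptions for illustration only). A `StandardScaler` exposes `fit()` and `transform()`, while a `KNeighborsClassifier` exposes `fit()` and `predict()`.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration: two features, two well-separated classes
X_toy = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 11.0],
                  [8.0, 30.0], [9.0, 32.0], [10.0, 31.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# Constructor: hyperparameters are set here
scaler = StandardScaler()
knn = KNeighborsClassifier(n_neighbors=3)

# fit() learns the column means and standard deviations;
# transform() applies them to the data
scaler.fit(X_toy)
X_scaled = scaler.transform(X_toy)

# fit() trains the classifier; predict() returns one class per new sample
knn.fit(X_scaled, y_toy)
new_samples = scaler.transform(np.array([[2.5, 11.0], [9.5, 31.0]]))
predictions = knn.predict(new_samples)
print(predictions)
```

Note that the new samples go through the same fitted scaler before `predict()` is called, so they are expressed on the scale the classifier was trained on.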

In this introduction we will walk through an example that trains a classification model.

# Classification

Supervised learning with a classification algorithm follows the same steps as in the regression case:
```
modelo = AlgoritmoAprendizaje(hiperparámetros)
modelo.fit(X_train, y_train)
y_pred = modelo.predict(X_eval)
evaluacion(y_pred, y_eval)
```
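One possible concrete instance of this template is sketched below. The synthetic data from `make_classification`, the choice of `LogisticRegression` and the names ending in `_demo` are assumptions made for the example; any other classifier and evaluation measure would fit the same four steps.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption: stands in for a real dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)
X_train_demo, X_eval_demo, y_train_demo, y_eval_demo = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0)

# 1. Build the model with its hyperparameters
model = LogisticRegression(C=1.0, max_iter=1000)
# 2. Fit on the training split
model.fit(X_train_demo, y_train_demo)
# 3. Predict on the held-out evaluation split
y_pred_demo = model.predict(X_eval_demo)
# 4. Compare the predictions with the reference labels
acc = accuracy_score(y_eval_demo, y_pred_demo)
print(f"accuracy: {acc:.2f}")
```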



## Evaluation

The difference lies in how we evaluate the model and interpret that evaluation. The most common measures are *Accuracy*, *Precision*, *Recall* and *F-score*.

Recall the confusion matrix used to evaluate a classifier:

|  | pred_P | pred_N |
| --- | --- | --- |
| **ref_P** |  TP  | FN |
| **ref_N** | FP | TN  |

*Accuracy = (TP+TN) / (TP+FN+FP+TN)*

*Precision = TP / (TP+FP)*

*Recall = TP / (TP+FN)*

*F-score = 2 · Precision · Recall / (Precision + Recall)*

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/26/Precisionrecall.svg/800px-Precisionrecall.svg.png" width="300">
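The formulas above can be checked by hand on a small case. The counts below are hypothetical, chosen only to keep the arithmetic simple; the second half rebuilds label vectors with those counts and compares against the functions in `sklearn.metrics`.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical confusion-matrix counts (not taken from any real experiment)
tp, fn, fp, tn = 40, 10, 20, 30

accuracy = (tp + tn) / (tp + fn + fp + tn)               # 70 / 100
precision = tp / (tp + fp)                               # 40 / 60
recall = tp / (tp + fn)                                  # 40 / 50
f_score = 2 * precision * recall / (precision + recall)  # 80 / 110

# Rebuild label vectors that produce exactly those counts
y_true = [1] * (tp + fn) + [0] * (fp + tn)
y_pred = [1] * tp + [0] * fn + [1] * fp + [0] * tn

print("accuracy  %.4f %.4f" % (accuracy, accuracy_score(y_true, y_pred)))
print("precision %.4f %.4f" % (precision, precision_score(y_true, y_pred)))
print("recall    %.4f %.4f" % (recall, recall_score(y_true, y_pred)))
print("f-score   %.4f %.4f" % (f_score, f1_score(y_true, y_pred)))
```

Each line should print the same value twice: once from the formula and once from Scikit-learn.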

# Baseline

Remember that it is good practice to obtain a first estimate as a baseline, working with the **raw data**, so that we can later check whether the transformations and filtering applied during preprocessing actually help.

This time we will try to predict whether a person has diabetes using a well-known dataset, the [Pima Indians](https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.names) dataset.

In [13]:
import pandas as pd
import numpy as np
import sklearn

DATA_PATH=""

Load the data:

In [14]:
df = pd.read_csv(DATA_PATH + 'pima-indians-diabetes.csv')
df.columns

Index(['pregnant_times', 'glucose', 'blood_pressure', 'tst', 'insulin', 'bmi',
       'dpf', 'age', 'is_diabetic'],
      dtype='object')

In [17]:
# Select the feature columns
feature_cols = ['pregnant_times', 'glucose', 'blood_pressure', 'tst', 'insulin', 'bmi', 'dpf', 'age']
X = df[feature_cols]

# Select the target column
y = df['is_diabetic']

We will train a KNN algorithm. Because the dataset is small, we evaluate with cross-validation:

In [19]:
# We will use cross-validation to evaluate
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Hold out 20% for the final validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print('train: %d, test %d' % (X_train.shape[0], X_test.shape[0]))

# Evaluate the model
scoring = ('accuracy', 'balanced_accuracy', 'precision', 'recall', 'f1')
clf = KNeighborsClassifier()
scores = cross_validate(clf, X_train, y_train, cv=10, scoring=scoring) # stratified by default for classifiers
for m in scoring:
  print(m, "%0.2f (+/- %0.2f)" % (scores['test_'+m].mean(), scores['test_'+m].std() * 2))

train: 614, test 154
accuracy 0.73 (+/- 0.13)
balanced_accuracy 0.69 (+/- 0.13)
precision 0.65 (+/- 0.22)
recall 0.56 (+/- 0.25)
f1 0.59 (+/- 0.17)


**Question**

How do the results change when you modify the number of folds (the `cv` parameter), and therefore the size of each partition?

Justify your answer.
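One way to explore this question empirically is to repeat the evaluation for several values of `cv` and compare the mean and spread of the fold scores. The sketch below uses synthetic data from `make_classification` as a stand-in (an assumption; replace `X_syn`, `y_syn` with `X_train`, `y_train` to run it on the diabetes data), and the list of fold counts is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the training data (assumption for this sketch)
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)

cv_summary = {}
for n_folds in (3, 5, 10, 20):
    # More folds: each model trains on more data but is scored on fewer samples
    fold_scores = cross_val_score(KNeighborsClassifier(), X_syn, y_syn,
                                  cv=n_folds, scoring='f1')
    cv_summary[n_folds] = (fold_scores.mean(), fold_scores.std())
    print("cv=%2d  f1 %0.2f (+/- %0.2f)"
          % (n_folds, fold_scores.mean(), fold_scores.std() * 2))
```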

### Trying several classifiers

We will build an evaluator with the following classification algorithms provided by Scikit-learn:

* [Logistic regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression)
* [Support Vector Classification (SVC)](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC)
* [Stochastic Gradient Descent](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html#sklearn.linear_model.SGDClassifier)
* [Decision tree](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier)
* [Random forest](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier)
* [KNN](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html#sklearn.neighbors.KNeighborsClassifier)
* [Multi-Layer Perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)

In [20]:
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

def eval_classifiers(X_train, y_train):
  clfs = [('Logistic regression', LogisticRegression(max_iter=1000)),
          ('SVM', SVC()),
          ('Decision tree', DecisionTreeClassifier()),
          ('RandomForest', RandomForestClassifier(n_estimators=20, random_state=45)),
          ('SGD', SGDClassifier(max_iter=1000, tol=1e-4, random_state=45)),
          ('KNN', KNeighborsClassifier()),
          ('MLP', MLPClassifier(max_iter=1000))
          ]

  # Return the results as a table:
  # one row per algorithm, one column per metric.
  # `scoring` is the global tuple defined in the baseline cell above.
  results = pd.DataFrame(columns=['accuracy', 'balanced_accuracy', 'precision', 'recall', 'f-score'])
  for alg, clf in clfs:
    scores = cross_validate(clf, X_train, y_train, cv=10, scoring=scoring) # stratified by default for classifiers
    results.loc[alg,:] = [scores['test_'+m].mean() for m in scoring]
  return results.sort_values(by='f-score', ascending=False)
  

In [21]:
# Show 4 decimal places for every float value in pandas
pd.options.display.float_format = '{:,.4f}'.format
  
eval_classifiers(X_train, y_train)



|  | accuracy | balanced_accuracy | precision | recall | f-score |
| --- | --- | --- | --- | --- | --- |
| Logistic regression | 0.7735 | 0.7307 | 0.7327 | 0.5827 | 0.6412 |
| RandomForest | 0.754 | 0.7137 | 0.6962 | 0.5738 | 0.6234 |
| SVM | 0.7669 | 0.7144 | 0.7426 | 0.5325 | 0.6134 |
| KNN | 0.7263 | 0.6904 | 0.6465 | 0.5649 | 0.5919 |
| Decision tree | 0.6725 | 0.6438 | 0.5375 | 0.5455 | 0.5402 |
| MLP | 0.697 | 0.6527 | 0.5858 | 0.4998 | 0.5379 |
| SGD | 0.6529 | 0.566 | 0.4359 | 0.2686 | 0.2994 |


# Preprocessing

## Scaling

In [22]:
from sklearn import preprocessing

scaler = preprocessing.StandardScaler()
eval_classifiers(scaler.fit_transform(X_train), y_train)



|  | accuracy | balanced_accuracy | precision | recall | f-score |
| --- | --- | --- | --- | --- | --- |
| Logistic regression | 0.7736 | 0.7307 | 0.7338 | 0.5827 | 0.6422 |
| RandomForest | 0.7557 | 0.716 | 0.695 | 0.5784 | 0.6266 |
| SVM | 0.7555 | 0.7108 | 0.7058 | 0.5554 | 0.611 |
| MLP | 0.7215 | 0.6906 | 0.6279 | 0.5829 | 0.5978 |
| KNN | 0.7346 | 0.6962 | 0.6609 | 0.5641 | 0.5976 |
| SGD | 0.723 | 0.6864 | 0.6412 | 0.5597 | 0.5808 |
| Decision tree | 0.684 | 0.6571 | 0.5529 | 0.5643 | 0.5563 |
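In the scaling cell above, the scaler is fitted on all of `X_train` before cross-validation starts, so every validation fold has already contributed to the scaling statistics. One common alternative, sketched here on synthetic data (an assumption; `X_syn` and `y_syn` stand in for `X_train` and `y_train`), is to wrap the scaler and the classifier in a pipeline so that the scaler is refitted on the training part of each fold.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (assumption for this sketch)
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)

# The pipeline behaves like a single estimator: fit() fits the scaler and then
# the classifier; predict() scales with the fitted statistics and then predicts
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())

metrics = ('accuracy', 'f1')
pipe_scores = cross_validate(pipe, X_syn, y_syn, cv=10, scoring=metrics)
for metric in metrics:
    print(metric, "%0.2f" % pipe_scores['test_' + metric].mean())
```

The same pipeline object could be placed in the list of classifiers passed to the evaluator, since it exposes `fit()` and `predict()` like any other estimator.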


## Interactions

In [23]:
from sklearn.preprocessing import PolynomialFeatures
poly = PolynomialFeatures(degree=2, interaction_only=True)
eval_classifiers(poly.fit_transform(X_train), y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression

|  | accuracy | balanced_accuracy | precision | recall | f-score |
| --- | --- | --- | --- | --- | --- |
| Logistic regression | 0.7556 | 0.7077 | 0.7074 | 0.5418 | 0.609 |
| RandomForest | 0.7312 | 0.6942 | 0.6481 | 0.5652 | 0.5987 |
| KNN | 0.7198 | 0.6937 | 0.6114 | 0.6017 | 0.5979 |
| SVM | 0.7506 | 0.6864 | 0.74 | 0.4636 | 0.5592 |
| Decision tree | 0.6726 | 0.6453 | 0.5407 | 0.5502 | 0.5402 |
| MLP | 0.6873 | 0.6468 | 0.5812 | 0.5082 | 0.5189 |
| SGD | 0.5684 | 0.5458 | 0.3957 | 0.4643 | 0.376 |


In [None]:
eval_classifiers(scaler.fit_transform(poly.fit_transform(X_train)), y_train)

# Hyperparameter search

Scikit-learn offers tools to help us search for the hyperparameters of the training algorithm.

In [24]:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Candidate hyperparameters to explore (6 combinations in total)
tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
                     'C': [1, 10]},
                    {'kernel': ['linear'], 'C': [1, 10]}]

clf = GridSearchCV(SVC(), tuned_parameters, scoring='f1_macro')
clf.fit(X_train, y_train)

print("Best hyperparameters:")
print(clf.best_params_)
print("Results for each combination:")
means = clf.cv_results_['mean_test_score']
stds = clf.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, clf.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))


Best hyperparameters:
{'C': 1, 'kernel': 'linear'}
Results for each combination:
0.703 (+/-0.101) for {'C': 1, 'gamma': 0.001, 'kernel': 'rbf'}
0.716 (+/-0.105) for {'C': 1, 'gamma': 0.0001, 'kernel': 'rbf'}
0.671 (+/-0.066) for {'C': 10, 'gamma': 0.001, 'kernel': 'rbf'}
0.712 (+/-0.098) for {'C': 10, 'gamma': 0.0001, 'kernel': 'rbf'}
0.738 (+/-0.055) for {'C': 1, 'kernel': 'linear'}
0.738 (+/-0.051) for {'C': 10, 'kernel': 'linear'}
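The 20% of the data held out earlier as `X_test` is the natural place for a final check of the selected model. A minimal sketch of that last step follows, again on synthetic data (an assumption; the names `X_tr`, `X_te`, `y_tr`, `y_te` stand in for the notebook's train and test splits). It relies on `GridSearchCV` refitting the best configuration on the whole training split, which is its default behaviour (`refit=True`).

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the dataset (assumption for this sketch)
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=0, stratify=y_syn)

param_grid = [{'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10]},
              {'kernel': ['linear'], 'C': [1, 10]}]

# The search only ever sees the training split
search = GridSearchCV(SVC(), param_grid, scoring='f1_macro', cv=5)
search.fit(X_tr, y_tr)

# best_estimator_ is the best configuration retrained on all of X_tr
y_te_pred = search.best_estimator_.predict(X_te)
test_f1 = f1_score(y_te, y_te_pred, average='macro')
print(search.best_params_, "test f1_macro: %0.3f" % test_f1)
```

Because the test split takes no part in the search, its score is a less optimistic estimate than `best_score_`.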


# Exercises

1. Solve the Iris flower classification task using the following:

* The [PCA](https://scikit-learn.org/stable/modules/decomposition.html#principal-component-analysis-pca) algorithm to reduce the dimensionality to two features.

* The [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) algorithm to train the model.

* [Cross-validation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_validate.html) over the whole dataset.

In [4]:
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_validate
from sklearn.linear_model import LogisticRegression

# Load the data
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [88]:
# Reduce the dimensionality (use random_state=42 to reproduce the results)
pca = PCA(n_components=2, random_state=42)
pca.fit(X) 

Xr = pca.transform(X) 
Xr.shape

(150, 2)

Expected result:

```
(150, 2)
```
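To judge how much information survives the reduction, one option (a side check, not part of the exercise statement) is to inspect the `explained_variance_ratio_` attribute of the fitted PCA. The variable names `X_iris`, `pca_demo` and `ratios` are chosen for this sketch only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X_iris = load_iris().data

# Fit a two-component PCA on the four original Iris features
pca_demo = PCA(n_components=2, random_state=42).fit(X_iris)

# Fraction of the total variance captured by each retained component
ratios = pca_demo.explained_variance_ratio_
print("per component:", ratios)
print("total retained: %0.3f" % ratios.sum())
```

A total close to 1 suggests that the two components keep most of the variability of the original features.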

In [5]:
# Create the classifier (use random_state=42 to reproduce the results)
clf = LogisticRegression(random_state=42)
clf.get_params()

{'C': 1.0,
 'class_weight': None,
 'dual': False,
 'fit_intercept': True,
 'intercept_scaling': 1,
 'l1_ratio': None,
 'max_iter': 100,
 'multi_class': 'auto',
 'n_jobs': None,
 'penalty': 'l2',
 'random_state': 42,
 'solver': 'lbfgs',
 'tol': 0.0001,
 'verbose': 0,
 'warm_start': False}

Expected result:
```
{'C': 1.0,
 'class_weight': None,
 'dual': False,
 'fit_intercept': True,
 'intercept_scaling': 1,
 'l1_ratio': None,
 'max_iter': 100,
 'multi_class': 'auto',
 'n_jobs': None,
 'penalty': 'l2',
 'random_state': 42,
 'solver': 'lbfgs',
 'tol': 0.0001,
 'verbose': 0,
 'warm_start': False}
 ```

In [44]:
import warnings
warnings.filterwarnings('ignore') # silence warnings

# Run 10-fold cross-validation and show the results
scoring = ['precision_macro', 'recall_macro', 'f1_macro']
skf = StratifiedKFold(n_splits=10, random_state=42, shuffle=True) # fixed CV splits, to reproduce the results

# X holds the four original features; pass Xr here to evaluate on the two PCA components
scores = cross_validate(clf, X, y, scoring=scoring, cv=skf)
for m in scoring:
  print(m, "%.4f" % (scores['test_'+m].mean()))

precision_macro 0.9771
recall_macro 0.9733
f1_macro 0.9728


Expected result:
```
precision_macro 0.9683
recall_macro 0.9600
f1_macro 0.9592
```

2. Repeat the previous process, this time with a hyperparameter search for the logistic regression.

Values to try for each hyperparameter:

``` 
[{'solver': ['newton-cg', 'lbfgs', 'sag'], 
  'penalty': ['l2'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]},
 {'solver': ['saga'],
  'penalty': ['l1', 'l2', 'elasticnet'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]},
 {'solver': ['liblinear'],
  'penalty': ['l1'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]}
  ]
```
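Before launching the search it can help to know how many candidates it will fit. A small sketch using `ParameterGrid` from `sklearn.model_selection` on the grid above (the variable names `param_grid` and `n_candidates` are chosen for this example):

```python
from sklearn.model_selection import ParameterGrid

tols = [1e-3, 1e-4, 1e-5]
Cs = [1, 10, 50]
param_grid = [
    {'solver': ['newton-cg', 'lbfgs', 'sag'], 'penalty': ['l2'],
     'tol': tols, 'C': Cs, 'max_iter': [1000]},
    {'solver': ['saga'], 'penalty': ['l1', 'l2', 'elasticnet'],
     'tol': tols, 'C': Cs, 'max_iter': [1000]},
    {'solver': ['liblinear'], 'penalty': ['l1'],
     'tol': tols, 'C': Cs, 'max_iter': [1000]},
]

# Each dictionary expands to the product of its value lists:
# 3*1*3*3 + 1*3*3*3 + 1*1*3*3 = 27 + 27 + 9
n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)  # 63
```

With the default 5-fold cross-validation of `GridSearchCV`, each candidate is fitted five times, plus one final refit of the best one.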

In [38]:
import warnings
warnings.filterwarnings('ignore') # silence warnings

from sklearn.model_selection import GridSearchCV

# Candidate hyperparameters to explore (27 + 27 + 9 = 63 combinations in total).
# 'elasticnet' is listed without an l1_ratio, so those fits fail and score nan.
tuned_parameters = [{'solver': ['newton-cg', 'lbfgs', 'sag'], 
  'penalty': ['l2'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]},
 {'solver': ['saga'],
  'penalty': ['l1', 'l2', 'elasticnet'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]},
 {'solver': ['liblinear'],
  'penalty': ['l1'],
  'tol': [1e-3, 1e-4, 1e-5],
  'C': [1, 10, 50],
  'max_iter': [1000]}
  ]

# we use 'f1_macro' as the target metric in the grid search
estimate = LogisticRegression(random_state=42)

clf = GridSearchCV(estimator=estimate, param_grid=tuned_parameters, scoring='f1_macro')
clf.fit(X, y)

print("Best hyperparameters:")
print(clf.best_params_)
print("Results for each combination:")
means = clf.cv_results_['mean_test_score']
stds = clf.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, clf.cv_results_['params']):
    print("%0.6f (+/-%0.6f) for %r" % (mean, std * 2, params))

Expected result:

```
Best hyperparameters:
{'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.001}
Results for each combination:
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.0001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 1e-05}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.0001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.0001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 1e-05}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.001}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.0001}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 1e-05}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.001}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.0001}
0.973300 (+/-0.049907) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'newton-cg', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'lbfgs', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'sag', 'tol': 1e-05}
0.986633 (+/-0.032742) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 1e-05}
0.986633 (+/-0.032742) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.0001}
0.973165 (+/-0.050339) for {'C': 1, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 1e-05}
nan (+/-nan) for {'C': 1, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.001}
nan (+/-nan) for {'C': 1, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.0001}
nan (+/-nan) for {'C': 1, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 1e-05}
nan (+/-nan) for {'C': 10, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.001}
nan (+/-nan) for {'C': 10, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.0001}
nan (+/-nan) for {'C': 10, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'saga', 'tol': 1e-05}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 50, 'max_iter': 1000, 'penalty': 'l2', 'solver': 'saga', 'tol': 1e-05}
nan (+/-nan) for {'C': 50, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.001}
nan (+/-nan) for {'C': 50, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 0.0001}
nan (+/-nan) for {'C': 50, 'max_iter': 1000, 'penalty': 'elasticnet', 'solver': 'saga', 'tol': 1e-05}
0.959933 (+/-0.077895) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.001}
0.959933 (+/-0.077895) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.0001}
0.959933 (+/-0.077895) for {'C': 1, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 1e-05}
0.973199 (+/-0.065651) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.0001}
0.979983 (+/-0.053350) for {'C': 10, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 1e-05}
0.966515 (+/-0.059931) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.001}
0.966515 (+/-0.059931) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 0.0001}
0.966515 (+/-0.059931) for {'C': 50, 'max_iter': 1000, 'penalty': 'l1', 'solver': 'liblinear', 'tol': 1e-05}

```

# References

* [Scikit-learn User Guide](https://scikit-learn.org/stable/user_guide.html)