# Exercise 6

## SVM & Regularization


For this homework we consider a set of observations on a number of red and white wine varieties involving their chemical properties and ranking by tasters. Wine industry shows a recent growth spurt as social drinking is on the rise. The price of wine depends on a rather abstract concept of wine appreciation by wine tasters, opinion among whom may have a high degree of variability. Pricing of wine depends on such a volatile factor to some extent. Another key factor in wine certification and quality assessment is physicochemical tests which are laboratory-based and takes into account factors like acidity, pH level, presence of sugar and other chemical properties. For the wine market, it would be of interest if human quality of tasting can be related to the chemical properties of wine so that certification and quality assessment and assurance process is more controlled.

Two datasets are available of which one dataset is on red wine and have 1599 different varieties and the other is on white wine and have 4898 varieties. All wines are produced in a particular area of Portugal. Data are collected on 12 different properties of the wines one of which is Quality, based on sensory data, and the rest are on chemical properties of the wines including density, acidity, alcohol content etc. All chemical properties of wines are continuous variables. Quality is an ordinal variable with possible ranking from 1 (worst) to 10 (best). Each variety of wine is tasted by three independent tasters and the final rank assigned is the median rank given by the tasters.

A predictive model developed on this data is expected to provide guidance to vineyards regarding quality and price expected on their produce without heavy reliance on volatility of wine tasters.

In [31]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [10]:
data_r = pd.read_csv('https://github.com/albahnsen/PracticalMachineLearningClass/raw/master/datasets/Wine_data_red.csv')
data_w = pd.read_csv('https://github.com/albahnsen/PracticalMachineLearningClass/raw/master/datasets/Wine_data_white.csv')

In [11]:
data = data_w.assign(type = 'white')

data = data.append(data_r.assign(type = 'red'), ignore_index=True)
data.sample(5)

Unnamed: 0,fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality,type
698,7.5,0.31,0.51,14.8,0.039,62.0,204.0,0.9982,3.06,0.6,9.5,5,white
6284,7.4,0.64,0.07,1.8,0.1,8.0,23.0,0.9961,3.3,0.58,9.6,5,red
2979,6.7,0.36,0.26,7.9,0.034,39.0,123.0,0.99119,2.99,0.3,12.2,7,white
960,7.3,0.24,0.34,15.4,0.05,38.0,174.0,0.9983,3.03,0.42,9.0,6,white
4736,6.0,0.16,0.36,1.6,0.042,13.0,61.0,0.99143,3.22,0.54,10.8,6,white


# Exercise 6.1

Show the frecuency table of the quality by type of wine

In [12]:
import numpy as np
from sklearn.model_selection import train_test_split
pd.pivot_table(data[['quality','type','alcohol']],index=['quality'],columns=['type'], aggfunc='count')

Unnamed: 0_level_0,alcohol,alcohol
type,red,white
quality,Unnamed: 1_level_2,Unnamed: 2_level_2
3,10.0,20.0
4,53.0,163.0
5,681.0,1457.0
6,638.0,2198.0
7,199.0,880.0
8,18.0,175.0
9,,5.0


# SVM

# Exercise 6.2
Lo que se quiere clasificar es si el vinoes bueno o malo. Variable Calidad. Convertirla en binario.
Una vez se tenga esavariable se hacen dos modelos, uno para pronosticar con vinos blancos y otro para pronosticar con los rojos.
Se haran dos modelos de clasificación. El target de cada modelo es si la calidad del vino es buena o mala.
* Standarized the features (not the quality). Normalizar los features, las variables
* Create a binary target for each type of wine. Se crean dos vectores soporte para separar vinos blancos de vinos rosados
* Create two Linear SVM's for the white and red wines, repectively.


In [13]:
from sklearn import preprocessing
import matplotlib.pyplot as plt

data['Q1']=data['quality']>5
data['Q1']=data['Q1'].astype(int)
data.sample(5)

Unnamed: 0,fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality,type,Q1
4192,6.7,0.2,0.24,6.5,0.044,28.0,100.0,0.99348,3.12,0.33,10.2,6,white,1
1538,6.0,0.17,0.49,1.0,0.034,26.0,106.0,0.992,3.21,0.42,9.8,6,white,1
5895,5.6,0.66,0.0,2.2,0.087,3.0,11.0,0.99378,3.71,0.63,12.8,7,red,1
3601,6.6,0.295,0.24,1.6,0.039,29.0,140.0,0.99304,3.35,0.61,10.4,7,white,1
4789,5.5,0.62,0.33,1.7,0.037,24.0,118.0,0.98758,3.15,0.39,13.55,6,white,1


In [16]:
from sklearn.svm import SVC # "Support Vector Classifier"
from sklearn import preprocessing
scaler = preprocessing.StandardScaler()
vino_blanco=data.loc[data['type']=='white']
vino_rojo=data.loc[data['type']=='red']
ya = vino_blanco['Q1'].values
yb = vino_rojo['Q1'].values

xa1 = vino_blanco['fixed acidity'].values
xa2 = vino_blanco['volatile acidity'].values
xa3 = vino_blanco['citric acid'].values
xa4 = vino_blanco['residual sugar'].values
xa5 = vino_blanco['chlorides'].values
xa6 = vino_blanco['free sulfur dioxide'].values
xa7 = vino_blanco['total sulfur dioxide'].values
xa8 = vino_blanco['density'].values
xa9 = vino_blanco['pH'].values
xa10 = vino_blanco['sulphates'].values
xa11 = vino_blanco['alcohol'].values

xb1 = vino_rojo['fixed acidity'].values
xb2 = vino_rojo['volatile acidity'].values
xb3 = vino_rojo['citric acid'].values
xb4 = vino_rojo['residual sugar'].values
xb5 = vino_rojo['chlorides'].values
xb6 = vino_rojo['free sulfur dioxide'].values
xb7 = vino_rojo['total sulfur dioxide'].values
xb8 = vino_rojo['density'].values
xb9 = vino_rojo['pH'].values
xb10 = vino_rojo['sulphates'].values
xb11 = vino_rojo['alcohol'].values

Xa1_std = preprocessing.scale(xa1)
Xa2_std = preprocessing.scale(xa2)
Xa3_std = preprocessing.scale(xa3)
Xa4_std = preprocessing.scale(xa4)
Xa5_std = preprocessing.scale(xa5)
Xa6_std = preprocessing.scale(xa6)
Xa7_std = preprocessing.scale(xa7)
Xa8_std = preprocessing.scale(xa8)
Xa9_std = preprocessing.scale(xa9)
Xa10_std = preprocessing.scale(xa10)
Xa11_std = preprocessing.scale(xa11)

Xb1_std = preprocessing.scale(xb1)
Xb2_std = preprocessing.scale(xb2)
Xb3_std = preprocessing.scale(xb3)
Xb4_std = preprocessing.scale(xb4)
Xb5_std = preprocessing.scale(xb5)
Xb6_std = preprocessing.scale(xb6)
Xb7_std = preprocessing.scale(xb7)
Xb8_std = preprocessing.scale(xb8)
Xb9_std = preprocessing.scale(xb9)
Xb10_std = preprocessing.scale(xb10)
Xb11_std = preprocessing.scale(xb11)

md1 = np.c_[Xa1_std,Xa2_std,Xa3_std,Xa4_std,Xa5_std,Xa6_std,Xa7_std,Xa8_std,Xa9_std,Xa10_std,Xa11_std]
md2 = np.c_[Xb1_std,Xb2_std,Xb3_std,Xb4_std,Xb5_std,Xb6_std,Xb7_std,Xb8_std,Xb9_std,Xb10_std,Xb11_std]

clf = SVC(kernel='linear',gamma='auto').fit(md1, ya)
aw = clf.score(md1, ya)
clf = SVC(kernel='linear',gamma='auto').fit(md1, ya)
ar = clf.score(md2, yb)

print('Accuracy vino blanco regresion lineal',aw)
print('Accuracy vino rojo regresion lineal',ar)

Accuracy vino blanco regresion lineal 0.7539812168231932
Accuracy vino rojo regresion lineal 0.6898061288305191


# Exercise 6.3

Test the two SVM's using the different kernels (‘poly’, ‘rbf’, ‘sigmoid’).

In [17]:
Xa_train, Xa_test, ya_train, ya_test = train_test_split(md1, ya, test_size=0.3)
Xb_train, Xb_test, yb_train, yb_test = train_test_split(md2, yb, test_size=0.3)

for kernel in ('sigmoid', 'poly', 'rbf'):
    clf = SVC(kernel=kernel,gamma='auto').fit(Xa_train, ya_train)
    aw = clf.score(Xa_test, ya_test)
    clf = SVC(kernel=kernel,gamma='auto').fit(Xb_train, yb_train)
    ar = clf.score(Xb_test, yb_test)
    print("Accuracy para ",kernel," de vino blanco es:",aw)
    print("Accuracy para ",kernel," de vino rojo es:",ar)

Accuracy para  sigmoid  de vino blanco es: 0.6585034013605442
Accuracy para  sigmoid  de vino rojo es: 0.6395833333333333
Accuracy para  poly  de vino blanco es: 0.7591836734693878
Accuracy para  poly  de vino rojo es: 0.7
Accuracy para  rbf  de vino blanco es: 0.773469387755102
Accuracy para  rbf  de vino rojo es: 0.73125


El mejor kernel con accuracy mas alto es para rbf para los dos vinos.

# Exercise 6.4
Using the best SVM find the parameters that gives the best performance

'C': [0.1, 1, 10, 100, 1000], 'gamma': [0.01, 0.001, 0.0001]

In [18]:
for gamma in (0.01,0.001,0.00001):
    for C in (0.1, 1, 10, 100, 1000):
        clf = SVC(kernel='rbf',gamma=gamma,C=C).fit(Xa_train, ya_train)
        aw = clf.score(Xa_test, ya_test)
        clf = SVC(kernel='rbf',gamma=gamma,C=C).fit(Xb_train, yb_train)
        ar = clf.score(Xb_test, yb_test)
        print("Accuracy para ",kernel," con gamma de ",gamma," y un C de ",C," el accuracy de vino blanco es:",aw)
        print("Accuracy para ",kernel," con gamma de ",gamma," y un C de ",C," el accuracy de vino rojo es:",ar)

Accuracy para  rbf  con gamma de  0.01  y un C de  0.1  el accuracy de vino blanco es: 0.710204081632653
Accuracy para  rbf  con gamma de  0.01  y un C de  0.1  el accuracy de vino rojo es: 0.7104166666666667
Accuracy para  rbf  con gamma de  0.01  y un C de  1  el accuracy de vino blanco es: 0.7564625850340136
Accuracy para  rbf  con gamma de  0.01  y un C de  1  el accuracy de vino rojo es: 0.71875
Accuracy para  rbf  con gamma de  0.01  y un C de  10  el accuracy de vino blanco es: 0.7564625850340136
Accuracy para  rbf  con gamma de  0.01  y un C de  10  el accuracy de vino rojo es: 0.7291666666666666
Accuracy para  rbf  con gamma de  0.01  y un C de  100  el accuracy de vino blanco es: 0.7564625850340136
Accuracy para  rbf  con gamma de  0.01  y un C de  100  el accuracy de vino rojo es: 0.7270833333333333
Accuracy para  rbf  con gamma de  0.01  y un C de  1000  el accuracy de vino blanco es: 0.7721088435374149
Accuracy para  rbf  con gamma de  0.01  y un C de  1000  el accuracy de

# Exercise 6.5

Compare the results with other methods

In [19]:
from sklearn.linear_model import LogisticRegression
logreg1 = LogisticRegression(solver='liblinear',C=1e9)
logreg2 = LogisticRegression(solver='liblinear',C=1e9)
logreg1.fit(Xa_train, ya_train)
logreg2.fit(Xb_train, yb_train)
al1=logreg1.score(Xa_test, ya_test)
al2=logreg2.score(Xb_test, yb_test)
print('Accuracy vino blanco regresion logistica',al1)
print('Accuracy vino rojo regresion logistica',al2)
print('Accuracy vino blanco regresion lineal',aw)
print('Accuracy vino rojo regresion lineal',ar)

Accuracy vino blanco regresion logistica 0.7448979591836735
Accuracy vino rojo regresion logistica 0.7104166666666667
Accuracy vino blanco regresion lineal 0.7503401360544217
Accuracy vino rojo regresion lineal 0.7083333333333334


En terminos generales con regresión logistica mejoran los resultados respecto a la regresión lineal. Los resultados de la regresión logistica son muy similares a la regresión con kernel rbf con los valores de gamma y C que hacen el accuracy mayor.

# Regularization

# Exercise 6.6


* Train a linear regression to predict wine quality (Continous).

* Analyze the coefficients

* Evaluate the RMSE

In [20]:
ya = vino_blanco['quality'].values
yb = vino_rojo['quality'].values

Xa_train, Xa_test, ya_train, ya_test = train_test_split(md1, ya, test_size=0.3)
Xb_train, Xb_test, yb_train, yb_test = train_test_split(md2, yb, test_size=0.3)

from sklearn.linear_model import LinearRegression
linreg1 = LinearRegression()
linreg2 = LinearRegression()
linreg1.fit(Xa_train,ya_train)
linreg2.fit(Xb_train,yb_train)
PredictW = linreg1.predict(Xa_test)
PredictR = linreg2.predict(Xb_test)
alm1 = linreg1.score(Xa_test,ya_test)
alm2 = linreg2.score(Xb_test,yb_test)

print("Accuracy vino blanco ",alm1)
print("Accuracy vino rojo ",alm2)

Accuracy vino blanco  0.2826686678192136
Accuracy vino rojo  0.3436350272682721


In [21]:
print(linreg1.coef_)
print(linreg2.coef_)

[ 0.04290461 -0.1734983   0.0009931   0.37857245 -0.00363547  0.06594748
 -0.02579    -0.37609639  0.08866284  0.06303131  0.27381265]
[ 0.08711108 -0.18741723 -0.03614418  0.01990296 -0.06028506  0.05717189
 -0.11326955 -0.03702806 -0.02250662  0.14149377  0.29992496]


In [22]:
# calculate RMSE
from sklearn import metrics
import numpy as np
print(np.sqrt(metrics.mean_squared_error(ya_test, PredictW)))
print(np.sqrt(metrics.mean_squared_error(yb_test, PredictR)))

0.7726914691078659
0.6676599549334559


La significancia de los coeficientes entre los dos grupos de vinos son muy similares aśi como su impacto sobre la variación de la variable dependiente.

# Exercise 6.7

* Estimate a ridge regression with alpha equals 0.1 and 1.
* Compare the coefficients with the linear regression
* Evaluate the RMSE

In [23]:
# alpha=0.1 y 1
from sklearn.linear_model import Ridge
for alpha in (0.1,1):
    ridgereg1 = Ridge(alpha=alpha, normalize=True)
    ridgereg1.fit(Xa_train, ya_train)
    ridgereg2 = Ridge(alpha=alpha, normalize=True)
    ridgereg2.fit(Xb_train, yb_train)
    PredictW = ridgereg1.predict(Xa_test)
    PredictR = ridgereg2.predict(Xb_test)

In [24]:
print("Coeficientes regresión lineal vino blanco: ")
print(linreg1.coef_)
print("Coeficientes regresión ridge vino blanco: ")
print(ridgereg1.coef_)
print("Coeficientes regresión lineal vino rojo: ")
print(linreg1.coef_)
print("Coeficientes regresión ridge vino rojo: ")
print(ridgereg2.coef_)

Coeficientes regresión lineal vino blanco: 
[ 0.04290461 -0.1734983   0.0009931   0.37857245 -0.00363547  0.06594748
 -0.02579    -0.37609639  0.08866284  0.06303131  0.27381265]
Coeficientes regresión ridge vino blanco: 
[-0.02483302 -0.07539287  0.00690314  0.03358716 -0.04865522  0.03341322
 -0.035096   -0.07056633  0.02129613  0.02162906  0.15969884]
Coeficientes regresión lineal vino rojo: 
[ 0.04290461 -0.1734983   0.0009931   0.37857245 -0.00363547  0.06594748
 -0.02579    -0.37609639  0.08866284  0.06303131  0.27381265]
Coeficientes regresión ridge vino rojo: 
[ 0.03661064 -0.10689763  0.04108219  0.00881858 -0.0390305   0.00700692
 -0.05982622 -0.04858785 -0.00396759  0.08093045  0.15853218]


In [25]:
print("Vino blanco ",np.sqrt(metrics.mean_squared_error(ya_test, PredictW)))
print("Vino rojo ",np.sqrt(metrics.mean_squared_error(yb_test, PredictR)))

Vino blanco  0.8176967789087581
Vino rojo  0.694789419431097


El menor error cuadratico medio se presenta con vino blanco.

# Exercise 6.8

* Estimate a lasso regression with alpha equals 0.01, 0.1 and 1.
* Compare the coefficients with the linear regression
* Evaluate the RMSE

In [26]:
from sklearn.linear_model import Lasso
for alpha in (0.01,0.1,1):
    lassoreg1 = Lasso(alpha=alpha, normalize=True)
    lassoreg2 = Lasso(alpha=alpha, normalize=True)
    lassoreg1.fit(Xa_train, ya_train)
    lassoreg2.fit(Xb_train, yb_train)
    a1=lassoreg1.score(Xa_test,ya_test)
    a2=lassoreg2.score(Xb_test,yb_test)
    print("Accuracy para regresion lasso, con un valor de alpha de ",alpha," es:",a1," para vino blanco.")
    print("Accuracy para regresion lasso, con un valor de alpha de ",alpha," es:",a2," para vino rojo.")


Accuracy para regresion lasso, con un valor de alpha de  0.01  es: -0.0005259998116662334  para vino blanco.
Accuracy para regresion lasso, con un valor de alpha de  0.01  es: 0.054830603719757565  para vino rojo.
Accuracy para regresion lasso, con un valor de alpha de  0.1  es: -0.0005259998116662334  para vino blanco.
Accuracy para regresion lasso, con un valor de alpha de  0.1  es: -0.0036433423519339754  para vino rojo.
Accuracy para regresion lasso, con un valor de alpha de  1  es: -0.0005259998116662334  para vino blanco.
Accuracy para regresion lasso, con un valor de alpha de  1  es: -0.0036433423519339754  para vino rojo.


In [27]:
for alpha in (0.01,0.1,1):
    lassoreg1 = Lasso(alpha=alpha, normalize=True)
    lassoreg2 = Lasso(alpha=alpha, normalize=True)
    lassoreg1.fit(Xa_train, ya_train)
    lassoreg2.fit(Xb_train, yb_train)
    print('Coeficientes vino blanco con alpha de ',alpha)
    print(lassoreg1.coef_)
    print('Coeficientes vino rojo con alpha de ',alpha)
    print(lassoreg2.coef_)


Coeficientes vino blanco con alpha de  0.01
[-0. -0. -0. -0. -0.  0. -0. -0.  0.  0.  0.]
Coeficientes vino rojo con alpha de  0.01
[ 0.         -0.          0.         -0.         -0.         -0.
 -0.         -0.         -0.          0.          0.05732882]
Coeficientes vino blanco con alpha de  0.1
[-0. -0. -0. -0. -0.  0. -0. -0.  0.  0.  0.]
Coeficientes vino rojo con alpha de  0.1
[ 0. -0.  0. -0. -0. -0. -0. -0. -0.  0.  0.]
Coeficientes vino blanco con alpha de  1
[-0. -0. -0. -0. -0.  0. -0. -0.  0.  0.  0.]
Coeficientes vino rojo con alpha de  1
[ 0. -0.  0. -0. -0. -0. -0. -0. -0.  0.  0.]


In [28]:
for alpha in (0.01,0.1,1):
    lassoreg1 = Lasso(alpha=alpha, normalize=True)
    lassoreg2 = Lasso(alpha=alpha, normalize=True)
    lassoreg1.fit(Xa_train, ya_train)
    lassoreg2.fit(Xb_train, yb_train)
    ya_pred = lassoreg1.predict(Xa_test)
    yb_pred = lassoreg2.predict(Xb_test)
    print("RMSE para vino blanco con alpha de ",alpha,' ',np.sqrt(metrics.mean_squared_error(ya_test, ya_pred)))
    print("RMSE para vino rojo con alpha de ",alpha,' ',np.sqrt(metrics.mean_squared_error(yb_test, yb_pred)))

RMSE para vino blanco con alpha de  0.01   0.9125578523783424
RMSE para vino rojo con alpha de  0.01   0.8011935715643821
RMSE para vino blanco con alpha de  0.1   0.9125578523783424
RMSE para vino rojo con alpha de  0.1   0.82560503812887
RMSE para vino blanco con alpha de  1   0.9125578523783424
RMSE para vino rojo con alpha de  1   0.82560503812887


El RMSE es bastatne alto para los dos vinos. Con cualquier alor de alpha no hay una variación significativa.

# Exercise 6.9

* Create a binary target

* Train a logistic regression to predict wine quality (binary)

* Analyze the coefficients

* Evaluate the f1score

In [29]:
ya = vino_blanco['Q1'].values
yb = vino_rojo['Q1'].values

Xa_train, Xa_test, ya_train, ya_test = train_test_split(md1, ya, test_size=0.3)
Xb_train, Xb_test, yb_train, yb_test = train_test_split(md2, yb, test_size=0.3)

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
logreg1 = LogisticRegression(solver='liblinear',C=1e9)
logreg2 = LogisticRegression(solver='liblinear',C=1e9)
logreg1.fit(Xa_train, ya_train)
logreg2.fit(Xb_train, yb_train)
al1=logreg1.score(Xa_test, ya_test)
al2=logreg2.score(Xb_test, yb_test)
ce1 = logreg1.coef_
ce2 = logreg2.coef_
ya_pred = logreg1.predict(Xa_test)
yb_pred = logreg2.predict(Xb_test)
fa11=f1_score(ya_test, ya_pred, average='macro')  
fa12=f1_score(ya_test, ya_pred, average='micro')  
fa13=f1_score(ya_test, ya_pred, average='weighted')  
fb11=f1_score(yb_test, yb_pred, average='macro')  
fb12=f1_score(yb_test, yb_pred, average='micro')  
fb13=f1_score(yb_test, yb_pred, average='weighted')  

print('Accuracy vino blanco regresion logistica',al1)
print('Accuracy vino rojo regresion logistica',al2)
print('Coeficientes vino blanco regresion logistica')
print(ce1)
print('Coeficientes vino rojo regresion logistica')
print(ce2)
print("F1 vino blanco promedio macro: ",fa11)
print("F1 vino blanco promedio micro: ",fa12)
print("F1 vino blanco promedio weighted: ",fa13)
print("F1 vino rojo promedio macro: ",fb11)
print("F1 vino rojo promedio micro: ",fb12)
print("F1 vino rojo promedio weighted: ",fb13)

Accuracy vino blanco regresion logistica 0.7333333333333333
Accuracy vino rojo regresion logistica 0.7479166666666667
Coeficientes vino blanco regresion logistica
[[-0.00830468 -0.64259758  0.02575139  0.80065959 -0.00485021  0.13593535
  -0.05493487 -0.6315655   0.11245233  0.17692136  1.01697489]]
Coeficientes vino rojo regresion logistica
[[ 0.28527521 -0.49442883 -0.2106724   0.00973286 -0.20448886  0.19829353
  -0.48107056 -0.25583964 -0.19193918  0.46997561  0.93043955]]
F1 vino blanco promedio macro:  0.6800266533399967
F1 vino blanco promedio micro:  0.7333333333333333
F1 vino blanco promedio weighted:  0.7192959076017547
F1 vino rojo promedio macro:  0.7475210725137913
F1 vino rojo promedio micro:  0.7479166666666667
F1 vino rojo promedio weighted:  0.7481873363502132


En terminos generales tanto para vino blanco como vino rojo se tiene una precisión del 70% en promedio

# Exercise 6.10

* Estimate a regularized logistic regression using:
* C = 0.01, 0.1 & 1.0
* penalty = ['l1, 'l2']
* Compare the coefficients and the f1score

In [30]:
#REgularizado desde el comienzo del ejercicio
for C in (0.01, 0.1 , 1.0):
    for penalty in ('l1','l2'):
        logreg1 = LogisticRegression(C=C, penalty=penalty,solver='liblinear',multi_class='auto')
        logreg2 = LogisticRegression(C=C, penalty=penalty,solver='liblinear',multi_class='auto')
        logreg1.fit(Xa_train, ya_train)
        logreg2.fit(Xb_train, yb_train)
        ya_pred=logreg1.predict(Xa_test)
        yb_pred=logreg2.predict(Xb_test)
        f1a = f1_score(ya_test,ya_pred)
        f1b = f1_score(yb_test,yb_pred)
        print('F1 Score vino blanco con C: ',C,' ,penalty de: ',penalty,' es: ',f1a)
        print('F1 Score vino rojo con C: ',C,' ,penalty de: ',penalty,' es: ',f1b)


F1 Score vino blanco con C:  0.01  ,penalty de:  l1  es:  0.8149951314508277
F1 Score vino rojo con C:  0.01  ,penalty de:  l1  es:  0.703862660944206
F1 Score vino blanco con C:  0.01  ,penalty de:  l2  es:  0.8153846153846153
F1 Score vino rojo con C:  0.01  ,penalty de:  l2  es:  0.7398373983739838
F1 Score vino blanco con C:  0.1  ,penalty de:  l1  es:  0.8136735676456428
F1 Score vino rojo con C:  0.1  ,penalty de:  l1  es:  0.7479674796747967
F1 Score vino blanco con C:  0.1  ,penalty de:  l2  es:  0.8094321462945139
F1 Score vino rojo con C:  0.1  ,penalty de:  l2  es:  0.7590361445783131
F1 Score vino blanco con C:  1.0  ,penalty de:  l1  es:  0.8088803088803089
F1 Score vino rojo con C:  1.0  ,penalty de:  l1  es:  0.7569721115537849
F1 Score vino blanco con C:  1.0  ,penalty de:  l2  es:  0.8094548962855764
F1 Score vino rojo con C:  1.0  ,penalty de:  l2  es:  0.7575150300601203


El mejor modelo para vino blanco es con C:0.1 y penalty de l1 con F1 de 0.8383119962067329
El mejor modelo para vino rojo es es con C:0.1 y penalty de l2 con F1 de 0.7490196078431373