# Reading the dataset

Notebook for testing how to read the Wine data from the [UC Irvine ML Repository](https://archive.ics.uci.edu/dataset/109/wine).

In [1]:
import numpy as np
import pandas as pd
from myqp import myqp_intpoint_proy

In [2]:
# Assumes this copy of wine.data has a header row that includes a 'class' column
df = pd.read_csv('wine.data')
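If the file is the raw download from the UCI repository, it has no header row, so `df['class']` would not exist after a plain `read_csv`. A minimal sketch of loading such a file with explicit column names follows. Only `class` is a name this notebook uses; the other labels, the helper `load_wine`, and the two inlined rows are illustrative assumptions made to keep the example self-contained.

```python
import io
import pandas as pd

# Hypothetical column labels: only 'class' is referenced by this notebook.
COLUMNS = ['class', 'alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash',
           'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols',
           'proanthocyanins', 'color_intensity', 'hue', 'od280_od315', 'proline']

def load_wine(source):
    """Read a headerless wine.data-style CSV and label its 14 columns."""
    return pd.read_csv(source, header=None, names=COLUMNS)

# Illustrative rows in the wine.data layout: class label first, then 13 features.
sample = io.StringIO(
    "1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065\n"
    "2,12.37,.94,1.36,10.6,88,1.98,.57,.28,.42,1.95,1.05,1.82,520\n"
)
df_sample = load_wine(sample)
print(df_sample.shape)
```

With the real file, `load_wine('wine.data')` would take the place of the inlined sample.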

## Function to minimize

The goal is to minimize
$$\frac{1}{2} w^{\tau}w$$
subject to
$$
\begin{align*}
w^{\tau}x_i + \beta &\leq -1, \quad i = 1, 2, \ldots, s \\
w^{\tau}x_i + \beta &\geq 1, \quad i = s+1, s+2, \ldots, s+r,
\end{align*}
$$
over the variables $w \in \mathbb{R}^n$ and $\beta \in \mathbb{R}$. Here $x_1, \ldots, x_s$ are the rows of class 1 and $x_{s+1}, \ldots, x_{s+r}$ are the rows of class 2.
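The two families of constraints can be checked directly for any candidate pair $(w, \beta)$. The sketch below does this on a hypothetical 2-D toy set; the helper name `satisfies_margin`, the tolerance, and the data are assumptions made for illustration and are not part of `myqp`.

```python
import numpy as np

def satisfies_margin(w, beta, X_neg, X_pos, tol=1e-6):
    """True if w'x + beta <= -1 on X_neg and w'x + beta >= 1 on X_pos."""
    neg_ok = np.all(X_neg @ w + beta <= -1 + tol)
    pos_ok = np.all(X_pos @ w + beta >= 1 - tol)
    return bool(neg_ok and pos_ok)

# Toy data (hypothetical): two points per class, separated along the first axis.
X_neg = np.array([[0.0, 0.0], [0.0, 1.0]])
X_pos = np.array([[2.0, 0.0], [2.0, 1.0]])

w = np.array([1.0, 0.0])
beta = -1.0
feasible = satisfies_margin(w, beta, X_neg, X_pos)              # hyperplane x1 = 1
shrunk = satisfies_margin(0.5 * w, 0.5 * beta, X_neg, X_pos)    # same hyperplane, values scaled to +-0.5
print(feasible, shrunk)
```

Scaling $(w, \beta)$ down keeps the same separating hyperplane but breaks the $\pm 1$ bounds, which is why minimizing $w^{\tau}w$ under these constraints has a well-defined solution.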

In [3]:
A = df[df['class'] == 1]
B = df[df['class'] == 2]
C = df[df['class'] == 3]

In [4]:
X_A = A.drop(columns=['class'])
y_A = A['class']
X_B = B.drop(columns=['class'])
y_B = B['class']
X_C = C.drop(columns=['class'])
y_C = C['class']

In [5]:
X_A.shape

(59, 13)

In [6]:
# s and r are the number of rows of A and B, respectively
# n is the number of columns of A and B, and therefore the size of w
(s, n) = X_A.shape
r = X_B.shape[0]

In [7]:
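# QP data for: minimize (1/2) w'Qw + c'w  subject to  F w >= d
Q = np.eye(n)
c = np.zeros(n)
F = np.concatenate((-X_A, X_B))
beta = 10  # beta is held fixed in this first attempt
d = np.concatenate((np.ones(s) + beta, np.ones(r) - beta))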

---

With $\beta$ fixed at 10, the problem passed to the solver is to minimize
$$\frac{1}{2} w^{\tau}w$$
subject to
$$
\begin{align*}
-w^{\tau}x_i &\geq 1 + \beta, \quad i = 1, 2, \ldots, s \\
w^{\tau}x_i &\geq 1 - \beta, \quad i = s+1, s+2, \ldots, s+r.
\end{align*}
$$
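The rewritten constraints have the stacked form $F w \geq d$, where $F$ holds the negated class-1 rows followed by the class-2 rows. A minimal sketch of that construction on hypothetical toy data is shown here. The helper `build_fixed_beta_qp` is an illustrative name, and the assumption that the solver expects constraints as $F w \geq d$ is inferred from the cells in this notebook rather than from the `myqp` documentation.

```python
import numpy as np

def build_fixed_beta_qp(X_neg, X_pos, beta):
    """Return (Q, c, F, d) for: min 1/2 w'Qw + c'w  subject to  F w >= d."""
    s, n = X_neg.shape
    r = X_pos.shape[0]
    Q = np.eye(n)
    c = np.zeros(n)
    # Rows of X_neg are negated so both constraint families read ">=".
    F = np.concatenate((-X_neg, X_pos))
    d = np.concatenate((np.ones(s) + beta, np.ones(r) - beta))
    return Q, c, F, d

# Toy data (hypothetical), with beta fixed at -1.
X_neg = np.array([[0.0, 0.0], [0.0, 1.0]])
X_pos = np.array([[2.0, 0.0], [2.0, 1.0]])
Q, c, F, d = build_fixed_beta_qp(X_neg, X_pos, beta=-1.0)

w = np.array([1.0, 0.0])
print(F @ w >= d)  # elementwise feasibility of w
```

Because $\beta$ is not optimized here, the resulting $w$ is only the best one for that particular offset.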

In [8]:
w, mu, z, n_iter = myqp_intpoint_proy(Q, F, c, d, verbose=False)

In [9]:
# Squared norm of w, i.e. twice the objective value (1/2) w'w
np.dot(w,w)

39.44142062257272

In [10]:
# All of these should be <= -1
np.dot(X_A, w) + beta

array([ -7.25226783,  -7.92698534, -11.22344886, -17.32449515,
        -1.00000005, -16.0025566 , -11.94967121, -11.2864576 ,
        -6.21073482,  -7.42782356, -14.607566  ,  -9.35120243,
       -12.0425962 , -10.59905352, -19.15215316, -12.87651496,
       -11.33440669,  -8.28022011, -20.96388513,  -6.34324587,
        -1.00000027,  -3.73167032,  -5.49498479,  -4.95800683,
        -1.00000062,  -1.00000002, -11.32966915,  -8.90822989,
        -3.39244637,  -4.70040728,  -9.49367399, -15.33163877,
        -3.58166799,  -8.32789458,  -6.60134758,  -2.97297289,
        -5.16606237,  -6.82581464,  -4.08906978,  -6.05815827,
        -2.66265361,  -7.35292831, -10.83106896,  -2.13470107,
        -3.0485042 ,  -9.41070034,  -8.71538117,  -6.05641206,
        -6.49361894, -12.22549694, -10.75253491, -11.02261421,
       -12.20535594, -13.72699589,  -7.1121592 ,  -6.8707921 ,
        -5.3292317 , -13.24323968, -13.41271366])

In [11]:
# All of these should be >= 1
np.dot(X_B, w) + beta

array([ 9.59073566,  4.6286704 ,  7.18597144,  6.82601296,  8.27744825,
       10.51926929,  1.35461254,  6.03517664,  7.82781294,  3.34705972,
        7.36566973,  1.0000003 , 10.48507559,  9.27694263,  2.43271049,
        1.00000006,  8.11514819, 10.33454252,  5.43929735,  3.16412963,
        6.67917917, 14.02934742,  1.92563511,  7.84702425,  3.75961198,
        4.09917927,  9.51006017,  9.97076069,  8.72256883,  2.96329952,
        7.66858717,  7.08397128,  8.75521099,  7.78154517,  9.59435791,
        8.77700688,  1.00000012,  4.32169883,  8.37351128,  2.13893381,
        7.39310706,  4.2904442 ,  8.05423062,  6.20960508, 10.52865475,
        5.36105542,  9.88990774,  6.04856203,  7.46602077, 11.63855583,
        1.00000014,  3.76714091,  9.99554551,  1.00000003,  8.85663287,
        8.865131  , 12.71346939,  9.33339273, 12.2240543 ,  5.76227081,
        5.14001884,  1.00000026,  1.00000017,  7.25424396,  3.92043064,
        3.58411914,  8.02676561,  7.27317094,  8.00781865, 11.26

---

## Folding $\beta$ into the variable vector

Let $\hat{w} = (w_1, w_2, \ldots, w_n, \beta)$. We then want to minimize

$$\frac{1}{2} \hat{w}^{\tau} Q \hat{w} = \frac{1}{2}(w_1^2 + w_2^2 + \cdots + w_n^2)$$
subject to
$$
\begin{align*}
-w^{\tau}x_i - \beta &\geq 1, \quad i = 1, 2, \ldots, s \\
w^{\tau}x_i + \beta &\geq 1, \quad i = s+1, s+2, \ldots, s+r,
\end{align*}
$$

where
$$ Q = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 0
\end{pmatrix} \in \mathbb{R}^{(n+1) \times {(n+1)}}
$$
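A minimal sketch of the augmented matrices on hypothetical toy data is given here to make the construction concrete. The helper name `build_augmented_qp` and the data are assumptions made for illustration. The candidate $\hat{w} = (1, 0, -1)$ is picked by hand to show that every constraint is met with equality and that $\beta$ contributes nothing to the objective.

```python
import numpy as np

def build_augmented_qp(X_neg, X_pos):
    """Return (Q, c, F, d) with beta appended as the last variable."""
    s, n = X_neg.shape
    r = X_pos.shape[0]
    Q = np.eye(n + 1)
    Q[-1, -1] = 0.0  # beta does not enter the objective
    c = np.zeros(n + 1)
    F = np.concatenate((-X_neg, X_pos))
    # Coefficient of beta: -1 on the negated rows, +1 on the others.
    beta_col = np.concatenate((-np.ones(s), np.ones(r)))
    F = np.column_stack((F, beta_col))
    d = np.ones(s + r)
    return Q, c, F, d

# Toy data (hypothetical): two points per class, separated along the first axis.
X_neg = np.array([[0.0, 0.0], [0.0, 1.0]])
X_pos = np.array([[2.0, 0.0], [2.0, 1.0]])
Q, c, F, d = build_augmented_qp(X_neg, X_pos)

w_hat = np.array([1.0, 0.0, -1.0])  # w = (1, 0), beta = -1
objective = 0.5 * w_hat @ Q @ w_hat
print(F @ w_hat, objective)
```

Note that $Q$ is now only positive semidefinite, so a solver that relies on a strictly positive definite $Q$ may need extra care.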

In [12]:
# s and r are the number of rows of A and B, respectively
# n is the number of columns of A and B, and therefore the size of w
(s, n) = X_A.shape
r = X_B.shape[0]

In [13]:
# Add one more variable for beta
n += 1

In [20]:
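Q = np.eye(n)
Q[-1, -1] = 0  # beta does not enter the objective
c = np.zeros(n)
F = np.concatenate((-X_A, X_B))
# Extra column with beta's coefficient: -1 for class-1 rows, +1 for class-2 rows
aux = np.concatenate((-np.ones(s), np.ones(r)))
F = np.c_[F, aux]
d = np.ones(s + r)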

In [23]:
w_hat, mu, z, n_iter = myqp_intpoint_proy(Q, F, c, d, verbose=True)

iter= 1 | ||cnpo||= 20829.824282075053
iter= 2 | ||cnpo||= 9722.298950269827
iter= 3 | ||cnpo||= 4117.119032378213
iter= 4 | ||cnpo||= 1265.1377238473754
iter= 5 | ||cnpo||= 543.1324720823604
iter= 6 | ||cnpo||= 317.1816691041723
iter= 7 | ||cnpo||= 67.23178079960307
iter= 8 | ||cnpo||= 3.3642872853175945
iter= 9 | ||cnpo||= 0.49990821668759494
iter= 10 | ||cnpo||= 0.08203615566504611
iter= 11 | ||cnpo||= 0.025933534203123502
iter= 12 | ||cnpo||= 0.013562603564740231
iter= 13 | ||cnpo||= 0.007146092851240802
iter= 14 | ||cnpo||= 0.003761926433347697
iter= 15 | ||cnpo||= 0.0019781946548297273
iter= 16 | ||cnpo||= 0.0010394801623657096
iter= 17 | ||cnpo||= 0.0005459883863643105
iter= 18 | ||cnpo||= 0.00028671670669833566
iter= 19 | ||cnpo||= 0.00015054646282582833
iter= 20 | ||cnpo||= 7.904247762319735e-05
iter= 21 | ||cnpo||= 4.1498842868780726e-05
iter= 22 | ||cnpo||= 2.1787317967026762e-05
iter= 23 | ||cnpo||= 1.1438459260513311e-05
iter= 24 | ||cnpo||= 6.0052234589608995e-06


In [24]:
# Squared norm of w_hat, beta included. This is not the QP objective, since Q zeroes out the beta term
np.dot(w_hat,w_hat)

546.8433164481445

In [25]:
w = w_hat[:-1]
beta = w_hat[-1]

In [26]:
# Squared norm of w alone (beta excluded), i.e. twice the objective value
np.dot(w,w)

6.659266436228828

In [27]:
beta

23.24185986559414

In [28]:
# All of these should be <= -1
np.dot(X_A, w) + beta

array([-4.59169878, -2.93244954, -4.06003042, -7.40660529, -1.25919704,
       -6.67971512, -5.73234948, -5.25596571, -4.24989858, -3.8125283 ,
       -5.87524333, -4.06098997, -4.64609546, -5.97430517, -8.23456705,
       -5.50664856, -5.36625466, -3.89739666, -8.35592502, -3.60491804,
       -2.4513823 , -1.9689593 , -2.86142347, -1.46609258, -1.20742986,
       -1.00000017, -4.02888944, -2.42777631, -2.82103785, -2.83072428,
       -3.69680568, -5.27692806, -2.08796216, -4.06599142, -2.71603245,
       -1.6420951 , -2.55303629, -1.95965189, -1.00000254, -4.69799146,
       -2.19111717, -2.58321673, -4.77333476, -1.37652623, -1.00000484,
       -4.87261905, -4.91744098, -3.07801793, -3.58842747, -5.59155859,
       -3.72512057, -4.8026764 , -5.41076564, -5.68020925, -3.27419886,
       -2.91066529, -3.6300607 , -4.74290045, -5.02786493])

In [29]:
# All of these should be >= 1
np.dot(X_B, w) + beta

array([4.83137926, 2.41148831, 2.70999472, 1.65704355, 3.37243739,
       3.83054555, 1.00000331, 1.75525992, 3.32241837, 1.00000468,
       3.23663184, 2.10032872, 2.0380101 , 2.75599487, 1.00000034,
       2.31619171, 4.156993  , 3.39334372, 3.01742991, 1.93864968,
       2.00442178, 5.66096009, 1.00000078, 4.06590003, 1.00000583,
       2.88006054, 3.29340926, 4.46116812, 4.37659553, 2.91818749,
       4.29571681, 3.55696377, 3.98778231, 3.30077791, 3.52833652,
       4.21811322, 1.00000155, 2.8964536 , 3.616783  , 1.8810655 ,
       2.6503904 , 2.76936023, 3.64582928, 2.70786134, 5.56190224,
       2.68222285, 3.62752349, 2.94784934, 3.15896439, 4.98326263,
       2.14034595, 3.44750167, 4.19656835, 1.00000136, 5.05526143,
       4.08482802, 6.1936241 , 5.01524215, 4.78645205, 2.30404023,
       3.21781145, 2.2949808 , 1.08960098, 2.61644789, 1.00004186,
       2.24039121, 3.92487331, 3.36352187, 4.27470722, 4.92898491,
       2.01192827])
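Entries that sit at $-1$ or $1$ up to solver tolerance correspond to active constraints, that is, the points lying on the margin. The sketch below picks them out and computes the margin width $2 / \lVert w \rVert$. The helper names and the tolerance `atol=1e-4` are assumptions chosen for illustration. The two short arrays are values copied from the outputs of cells 28 and 29.

```python
import numpy as np

def active_constraints(values, target, atol=1e-4):
    """Indices whose constraint value is within atol of the margin target."""
    return np.flatnonzero(np.isclose(values, target, rtol=0.0, atol=atol))

def margin_width(w):
    """Distance between the hyperplanes w'x + beta = -1 and w'x + beta = 1."""
    return 2.0 / np.linalg.norm(w)

# A few class-1 values (target -1) and class-2 values (target +1) from the outputs.
vals_A = np.array([-1.00000017, -4.02888944, -2.42777631, -2.82103785, -2.83072428])
vals_B = np.array([3.83054555, 1.00000331, 1.75525992, 3.32241837, 1.00000468])

active_A = active_constraints(vals_A, -1.0)
active_B = active_constraints(vals_B, 1.0)
print(active_A, active_B)

# Margin width for a simple hypothetical w with norm 5.
print(margin_width(np.array([3.0, 4.0])))
```

Applied to the full arrays, `active_constraints(np.dot(X_A, w) + beta, -1.0)` and `active_constraints(np.dot(X_B, w) + beta, 1.0)` would list the candidate support vectors of each class. The result depends on the chosen tolerance.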