# Regresión Lineal

**Daremos paso a paso** el algoritmo de regresión lineal.
Primero debemos tener unos datos que se comporten de manera lineal.

Usaremos data de grados celsius como entrada y grados fahrenheit como salida

Recordemos que la fórmula de Celsius a Fahrenheit es:
$$°F = \frac{9}{5} * °C + 32$$

1) Se hace el forward progagation con la fórmula de regresión lineal:

$$\hat{y}^{(i)} = w^T x^{(i)} + b $$

2) Luego se usa la fórmula de error cuadrático medio para medir el error: $$\begin{align*} & MSE = \frac{1}{m}\sum_{i=1}^{m}(y^{(i)} - \hat{y}^{(i)})^2 \end{align*}$$ es útil para darnos una idea de cuán alejados están los puntos de la línea que busca predecir los datos de salida. Nos ayuda a medir qué tan buena es la pareja de peso e intercepto.



3) Los pesos y sesgos deben actualizarse, cambiar, para cada vez predecir de mejor manera los datos. $${w = w - \alpha * \frac{\partial_J}{\partial w}}$$
$${b = b - \alpha * \frac{\partial_J}{\partial b}}$$ Se usa el método del descenso del gradiente. Este usa derivadas para darle mayor o menor cambio al peso y al sesgo siempre que estén influyendo en que se esté produciendo un error. Se calcula la derivada de la función de error J(w):$$J(w) = \frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})^2$$ Miremos la derivada respecto al peso: $${\frac{\partial J}{\partial w}} = -2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})x_{i}$$ y respecto al sesgo o intercepto: $${\frac{\partial J}{\partial b}} = -2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})$$ Entonces la cantidad que tienen que disminuir el peso y el sesgo es:
$$
{\frac{\partial J}{\partial w}} = w + \alpha*2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})x_{i}\\{\frac{\partial J}{\partial b}} = b + \alpha*2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})\\
{w = w + \alpha * \frac{\partial_J}{\partial w}}\\
{b = b + \alpha * \frac{\partial_J}{\partial b}}$$


Donde alfa es la tasa de aprendizaje, también conocido como el learning rate, esto sería la magnitud de los pasos que daría el algoritmo cada vez que va disminuyendo el error, o dicho de otra forma, al ir bajando las montañas del error.

No siendo más, creemos las funciones

In [3]:
import numpy as np

## Forward Propagation
$$\hat{y}^{(i)} = w^T X^{(i)} + b$$

In [4]:
def forward_propagation(w, X, b):
    #print(w.shape, X.shape)
    return np.dot(w, X) + b

## Mean Squared Error
$$\begin{align*} & MSE = \frac{1}{m}\sum_{i=1}^{m}(y^{(i)} - \hat{y}^{(i)})^2 \end{align*}$$

In [5]:
def mean_squared_error(y, y_hat):
    return np.mean(np.square(np.subtract(y, y_hat)))

## Actualización de pesos por el descenso del gradiente
$${\frac{\partial J}{\partial w}} = w + \alpha*2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})x_{i}\\{\frac{\partial J}{\partial b}} = b + \alpha*2*\frac{1}{m}\sum_{i=1}^{m}({y_{i}-b-wx_{i}})$$

In [41]:
def actualize_weights(learning_rate, w, b, X, error, m):
    w = w + learning_rate*2*np.matmul(X.T, error.mean())
    b = b + learning_rate*2*(error.mean())
    return w, b

## Normalización

In [42]:
def normalize(x, y):
    x = (np.subtract(x,x.mean()))/x.std()
    y = (np.subtract(y,y.mean()))/y.std()
    return x, y

## Entrenar modelo

In [43]:
def train(X, y, epochs, learning_rate):
    w = random()
    b = random()
    print(w, b)
    cost_value = []
    m = X.shape[0]
    for i in range(epochs):
        y_hat = forward_propagation(w, X, b)
        cost = mean_squared_error(y, y_hat)
        error = np.subtract(y, y_hat)
        w, b = actualize_weights(learning_rate, w, b, X, error, m)
        #print(w.shape, b)
        cost_value.append(cost)
        print("Epoch:", str(i+1), "\n", "mse_loss_value:", cost_value[-1])
    return w, b, cost_value

In [44]:
celsius = np.array([-40, -15, -10, -2, 0, 5, 12, 22, 34, 55, 69, 90], dtype='float')
fahrenheit = celsius*9/5 + 32

In [45]:
print(celsius, fahrenheit)

[-40. -15. -10.  -2.   0.   5.  12.  22.  34.  55.  69.  90.] [-40.    5.   14.   28.4  32.   41.   53.6  71.6  93.2 131.  156.2 194. ]


In [46]:
print(celsius.shape, fahrenheit.shape)

(12,) (12,)


In [47]:
celsius, fahrenheit = normalize(celsius, fahrenheit)
celsius = celsius.reshape(celsius.shape[0], 1)
fahrenheit= fahrenheit.reshape(fahrenheit.shape[0], 1)
print(celsius.shape, fahrenheit.shape)

(12, 1) (12, 1)


In [48]:
w, b, costos = train(celsius, fahrenheit, 300, 0.005)

0.43770752444109884 0.805335150724774


ValueError: matmul: Input operand 1 does not have enough dimensions (has 0, gufunc core with signature (n?,k),(k,m?)->(n?,m?) requires 1)

In [40]:
print(w.shape)
print(b)
print(costos[-1])

(1, 12)
0.5504845753450817
0.9999999999999999


In [17]:
(np.array([0.6, 0.6, 0.6, 1.2, 1.2, 1.2]) + np.array([3])).shape

(6,)

In [18]:
np.array([0.6, 0.6, 0.6, 1.2, 1.2, 1.2])[-1]

1.2

In [19]:
def normalizar(x, y):
    x = (x.mean()-x)/x.std()
    y = (y.mean()-y)/y.std()
    return x, y

def calcular_prediccion(x,w):
    return np.dot(x, w[1]) + w[0]

def calcular_costo(y, y_hat, m):
    return ((y - y_hat)**2).sum() / m

def entrenar(x, y, w, epocas, alfa):
    costos = []
    m = x.shape[0]
    for i in range(epocas):
        y_hat = calcular_prediccion(x, w)
        print(y_hat)
        error = (y - y_hat)
        w[0] += (2*alfa)/m*error.sum()
        w[1] += (2*alfa)/m*np.dot(x, error)
        costo = calcular_costo(y, y_hat, m)
        costos.append(costo)
    return w, costos
    
celsius, fahrenheit = normalizar(celsius, fahrenheit)
#celsius.resize((len(celsius),1))
#fahrenheit.resize(len(fahrenheit),1)

w = [0.1, 0.5]
w, costos = entrenar(celsius, fahrenheit, w, 100, 0.001)
print('w0:', w[0])
print('w1:', w[1:])

[[-0.70917988]
 [-0.36238851]
 [-0.29303023]
 [-0.18205699]
 [-0.15431368]
 [-0.0849554 ]
 [ 0.01214618]
 [ 0.15086274]
 [ 0.3173226 ]
 [ 0.60862736]
 [ 0.80283053]
 [ 1.09413529]]


ValueError: shapes (12,1) and (12,1) not aligned: 1 (dim 1) != 12 (dim 0)

In [20]:
np.dot(np.array([-40., -15., -10.,  -2.,   0.,   5.,  12.,  22.,  34.,  55.,  69.,  90.]), np.array(w[1])) #+ w[0]

array([-20. ,  -7.5,  -5. ,  -1. ,   0. ,   2.5,   6. ,  11. ,  17. ,
        27.5,  34.5,  45. ])

In [21]:
w[1]

0.5

In [24]:
b

0.17807778141838057

In [25]:
np.dot(celsius, w[1]) + w[0]

array([[-0.70937988],
       [-0.36258851],
       [-0.29323023],
       [-0.18225699],
       [-0.15451368],
       [-0.0851554 ],
       [ 0.01194618],
       [ 0.15066274],
       [ 0.3171226 ],
       [ 0.60842736],
       [ 0.80263053],
       [ 1.09393529]])

In [95]:
celsius

array([ 1.61835977,  0.92477701,  0.78606046,  0.56411398,  0.50862736,
        0.3699108 ,  0.17570763, -0.10172547, -0.4346452 , -1.01725471,
       -1.40566106, -1.98827057])

In [97]:
np.zeros((4, 1))

array([[0.],
       [0.],
       [0.],
       [0.]])

In [99]:
a = np.zeros((10, 2))

# A transpose makes the array non-contiguous
b = a.T

# Taking a view makes it possible to modify the shape without modifying
# the initial object.
c = b.view()
#c.shape = (20)
a

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

In [20]:
x = np.array([[1.],
         [2.],
         [3.],
         [4.],
         [5.],])
y = x*1.8 + 32.
w = 0.5
b = 0.1

In [23]:
x.shape

(5, 1)

In [22]:
(np.dot(x, w) + b).shape

(5, 1)

In [27]:
x2 = np.array([1., 2., 3., 4., 5.])

In [32]:
x2.reshape(x2.shape[0], 1)

array([[1.],
       [2.],
       [3.],
       [4.],
       [5.]])

In [45]:
x2 = x2.reshape(x2.shape[0], 1)
x2 - np.array([[0.5],
       [0.5],
       [0.5],
       [0.5],
       [0.5]])

array([[0.5],
       [1.5],
       [2.5],
       [3.5],
       [4.5]])

In [64]:
np.dot(x2, np.array([5]))

array([ 5., 10., 15., 20., 25.])