# Gradient descent

Gradient descent is an iterative algorithm that minimizes a function by searching for its optimal parameters. At each step it moves along the direction $d$ of steepest descent. Since the direction of steepest increase of the function is the gradient:
	
\begin{equation*}
g^{k}=\nabla f(x^{k})
\end{equation*}

we can obtain the direction of steepest descent as:
	
\begin{equation*}
d^{k}=-\frac{g^{k}}{\|g^{k}\|}
\end{equation*}


and the next point to evaluate is:
	
\begin{equation*}
x^{k+1}=x^{k}+\alpha d^{k} 
\end{equation*}

where $\alpha$ is the learning rate.

If the difference between the previous and the current value of $x$ is smaller than the stopping threshold, the iterations stop.
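The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not a definitive implementation: the function name `steepest_descent`, the test function, and the step-halving rule are assumptions that do not come from the original text. Halving is used because, with a unit-length direction $d^{k}$, every step has length exactly $\alpha$, so the distance between consecutive points can only drop below the threshold if $\alpha$ itself shrinks.

```python
import numpy as np

def steepest_descent(f, grad, x0, alpha=1.0, tol=1e-6, max_iter=10_000):
    """Normalized steepest descent: x <- x + alpha * d, with d = -g / ||g||.

    Assumption (not in the original text): alpha is halved whenever a step
    fails to decrease f, and the loop stops once alpha falls below tol.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        g_norm = np.linalg.norm(g)
        if g_norm == 0.0:       # stationary point: no descent direction
            break
        d = -g / g_norm         # unit vector of steepest descent
        x_new = x + alpha * d
        if f(x_new) < f(x):
            x = x_new           # accept the step
        else:
            alpha *= 0.5        # step overshoots the minimum: shrink it
        if alpha < tol:         # consecutive points are now closer than tol
            break
    return x

# Hypothetical test function with its minimum at (3, -1)
target = np.array([3.0, -1.0])
f = lambda x: np.sum((x - target) ** 2)
grad = lambda x: 2.0 * (x - target)

x_min = steepest_descent(f, grad, x0=[0.0, 0.0])
print(x_min)
```

For this bowl-shaped function the returned point should lie very close to `(3, -1)`; other functions may need a different initial `alpha` or tolerance.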

## Predictor

\begin{equation}
y_{pred} = \beta x + \epsilon
\end{equation}

where $\beta$ is the weight (slope) and $\epsilon$ is the bias (intercept); they correspond to `current_weight` and `current_bias` in the code.


### Cost function (least squares)

\begin{equation}
J = \frac{1}{n}\sum_{i=1}^{n} (y_i-(\beta x_i+ \epsilon))^2
\end{equation}



### Gradient

\begin{equation}
\frac{dJ}{d\beta} = -\frac{2}{n}\sum_{i=1}^{n} x_i \, (y_i-(\beta x_i+ \epsilon))
\end{equation}

\begin{equation}
\frac{dJ}{d\epsilon} = -\frac{2}{n}\sum_{i=1}^{n} (y_i-(\beta x_i+ \epsilon))
\end{equation}
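
One way to gain confidence in the two derivatives is to compare them with central finite differences of $J$. The sketch below does that on a tiny made-up data set; the function names, the sample values, and the step `h` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def cost(beta, eps, x, y):
    # J = (1/n) * sum (y_i - (beta * x_i + eps))^2
    return np.mean((y - (beta * x + eps)) ** 2)

def analytic_gradient(beta, eps, x, y):
    # The two derivatives of J with respect to beta and eps
    residual = y - (beta * x + eps)
    d_beta = -2.0 * np.mean(x * residual)
    d_eps = -2.0 * np.mean(residual)
    return d_beta, d_eps

def numerical_gradient(beta, eps, x, y, h=1e-6):
    # Central differences: (J(p + h) - J(p - h)) / (2h) for each parameter
    d_beta = (cost(beta + h, eps, x, y) - cost(beta - h, eps, x, y)) / (2 * h)
    d_eps = (cost(beta, eps + h, x, y) - cost(beta, eps - h, x, y)) / (2 * h)
    return d_beta, d_eps

# Made-up sample data, for illustration only
x_check = np.array([1.0, 2.0, 3.0, 4.0])
y_check = np.array([2.0, 4.1, 5.9, 8.2])

analytic = analytic_gradient(0.5, 0.1, x_check, y_check)
numerical = numerical_gradient(0.5, 0.1, x_check, y_check)
print("analytic: ", analytic)
print("numerical:", numerical)
```

The two pairs should agree to several decimal places; a visible mismatch usually points to a sign or scaling error in the analytic formulas.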

### Parameter update

\begin{equation}
\beta \leftarrow \beta - \mathrm{lr}\,\frac{dJ}{d\beta}
\end{equation}

\begin{equation}
\epsilon \leftarrow \epsilon - \mathrm{lr}\,\frac{dJ}{d\epsilon}
\end{equation}

where $\mathrm{lr}$ is the learning rate (`learning_rate` in the code).

https://www.geeksforgeeks.org/how-to-implement-a-gradient-descent-in-python-to-find-a-local-minimum/

In [None]:
import numpy as np
import matplotlib.pyplot as plt


In [None]:
def mean_squared_error(y_true, y_predicted):  
    cost = np.sum((y_true-y_predicted)**2) / len(y_true)
    return cost

In [None]:
def gradient_descent(x, y, iterations = 1000, learning_rate = 0.0001, 
                     stopping_threshold = 1e-6):
    # Initial guesses for the weight (beta) and the bias (epsilon)
    current_weight = 0.1
    current_bias = 0.01
    n = float(len(x))
     
    costs = []
    weights = []
    previous_cost = None
     
    for i in range(iterations):
        y_predicted = (current_weight * x) + current_bias
        current_cost = mean_squared_error(y, y_predicted)
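        # Stop once the cost changes by no more than the threshold
        # ("is not None" keeps the check valid even when the cost is exactly 0)
        if previous_cost is not None and abs(previous_cost - current_cost) <= stopping_threshold: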
            break
         
        previous_cost = current_cost
        costs.append(current_cost)
        weights.append(current_weight)
        
        # Derivatives of the cost with respect to the weight and the bias
        weight_derivative = -(2/n) * np.sum(x * (y - y_predicted))
        bias_derivative = -(2/n) * np.sum(y - y_predicted)
        
        current_weight = current_weight - (learning_rate * weight_derivative)
        current_bias = current_bias - (learning_rate * bias_derivative)
        
        print(f"Iteration {i+1}: Cost {current_cost}, "
              f"Weight {current_weight}, Bias {current_bias}")
    
    plt.figure(figsize = (8,6))
    plt.plot(weights, costs)
    plt.scatter(weights, costs, marker='o', color='red')
    plt.title("Cost vs Weights")
    plt.ylabel("Cost")
    plt.xlabel("Weight")
    plt.show()
     
    return current_weight, current_bias

In [None]:
import pandas as pd
import os

mainpath = "../datasets/"
filename = 'salary_data.csv'
fullpath = os.path.join(mainpath, filename)
dataset = pd.read_csv(fullpath)


In [None]:
data_array = np.array(dataset)
X = data_array[:,0]
Y = data_array[:,1]

In [None]:
def main():
    # Estimate weight and bias using gradient descent
    estimated_weight, estimated_bias = gradient_descent(X, Y, iterations=2000)
    print(f"Estimated Weight: {estimated_weight}\nEstimated Bias: {estimated_bias}")

    # Evaluate the fitted line at both ends of the X range.
    # Pairing min(X) with min(Y_pred) would draw the wrong line for a negative weight.
    x_line = np.array([min(X), max(X)])
    y_line = estimated_weight * x_line + estimated_bias

    # Plot the data and the regression line
    plt.figure(figsize=(8, 6))
    plt.scatter(X, Y, marker='o', color='red')
    plt.plot(x_line, y_line, color='blue', linestyle='dashed')
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.show()




In [None]:
main()
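
As a sanity check, the gradient-descent estimate can be compared with the closed-form least-squares line from `np.polyfit`. The sketch below is self-contained and makes several assumptions that are not in the original notebook: it uses synthetic data instead of `salary_data.csv`, a compact non-plotting helper named `fit_line_gd`, and a learning rate and iteration count chosen for that synthetic data only.

```python
import numpy as np

def fit_line_gd(x, y, learning_rate=0.01, iterations=5000):
    """Plain gradient descent on the mean squared error of y ~ weight * x + bias."""
    weight, bias = 0.1, 0.01          # same starting values as gradient_descent
    n = len(x)
    for _ in range(iterations):
        residual = y - (weight * x + bias)
        weight_derivative = -(2 / n) * np.sum(x * residual)
        bias_derivative = -(2 / n) * np.sum(residual)
        weight -= learning_rate * weight_derivative
        bias -= learning_rate * bias_derivative
    return weight, bias

# Synthetic data (illustrative only): a noisy line with slope 2.5 and intercept 1.0
rng = np.random.default_rng(0)
x_demo = np.linspace(0.0, 10.0, 50)
y_demo = 2.5 * x_demo + 1.0 + rng.normal(scale=0.5, size=x_demo.size)

gd_weight, gd_bias = fit_line_gd(x_demo, y_demo)
ls_weight, ls_bias = np.polyfit(x_demo, y_demo, deg=1)  # returns [slope, intercept]

print(f"Gradient descent: weight={gd_weight:.4f}, bias={gd_bias:.4f}")
print(f"Least squares:    weight={ls_weight:.4f}, bias={ls_bias:.4f}")
```

If the two results differ noticeably on other data, the descent has probably not converged yet; more iterations or a different learning rate may be needed.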