<a href="https://colab.research.google.com/github/raaraya1/Personal-Proyects/blob/main/Canales%20de%20Youtube/Python%20Engineer/Logistic_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Logistic Regression**

**Prediction (approximation)**

$$
z = wx + b
$$

$$
\hat{y} = \frac{1}{1+e^{-z}}
$$
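
As a quick illustration of the two formulas above, the following sketch computes $z$ and $\hat{y}$ for a few inputs. The values of `w`, `b`, and `x` are arbitrary assumptions chosen for the example, not values from this notebook.

```python
import numpy as np

# Hypothetical parameters and inputs, chosen only for illustration
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([[0.0, 0.5],
              [1.0, 0.0],
              [-2.0, 1.0]])

z = x @ w + b                  # linear model: z = wx + b
y_hat = 1 / (1 + np.exp(-z))   # sigmoid maps z into the interval (0, 1)

print(z)
print(y_hat)
```

A value of $z = 0$ maps to $\hat{y} = 0.5$, which is the decision threshold used later in `predict`.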

**Loss function (cross entropy)**

$$
loss = -\frac{1}{N} \sum_{i=1}^{N} \left[y^{i}\log(\hat{y}(x^{i})) + (1-y^{i})\log(1 - \hat{y}(x^{i}))\right]
$$
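
A minimal sketch of this loss in NumPy may help make it concrete. It computes the mean binary cross-entropy, written with a leading minus sign so that lower values mean better predictions. The function name `cross_entropy` and the clipping constant `eps` are assumptions added for the example; the clipping only guards against `log(0)`.

```python
import numpy as np

def cross_entropy(y_true, y_hat, eps=1e-15):
    """Mean binary cross-entropy; eps (an assumption) avoids log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat))

y_true = [1, 0, 1, 0]
good = cross_entropy(y_true, [0.9, 0.1, 0.8, 0.2])   # confident and correct
bad = cross_entropy(y_true, [0.1, 0.9, 0.2, 0.8])    # confident and wrong
print(good, bad)
```

Predictions that are confident and wrong are penalized much more heavily than predictions that are confident and correct.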

**Gradients**

$$
\left[\begin{array}{c} \frac{d\,loss}{dw} \\ \frac{d\,loss}{db} \end{array}\right] = \left[\begin{array}{c} \frac{1}{N} \sum_{i=1}^{N} x^{i}(\hat{y}(x^{i}) - y^{i}) \\ \frac{1}{N} \sum_{i=1}^{N} (\hat{y}(x^{i}) - y^{i}) \end{array}\right]
$$
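
One way to gain confidence in a gradient formula is to compare it with finite differences. The sketch below assumes the gradients of the mean cross-entropy are the averages of $x(\hat{y} - y)$ and $(\hat{y} - y)$, which is also what the `fit` method in this notebook computes. The random data, the step size `h`, and the helper names are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(w, b, X, y):
    # Mean binary cross-entropy
    y_hat = sigmoid(X @ w + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def analytic_grads(w, b, X, y):
    # Averages of x * (y_hat - y) and (y_hat - y)
    err = sigmoid(X @ w + b) - y
    return X.T @ err / len(y), np.mean(err)

# Small synthetic problem (arbitrary values)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.1

dw, db = analytic_grads(w, b, X, y)

# Central finite differences, one coordinate at a time
h = 1e-6
num_dw = np.array([
    (loss(w + h * e, b, X, y) - loss(w - h * e, b, X, y)) / (2 * h)
    for e in np.eye(3)
])
num_db = (loss(w, b + h, X, y) - loss(w, b - h, X, y)) / (2 * h)

print(np.allclose(dw, num_dw, atol=1e-6), np.isclose(db, num_db, atol=1e-6))
```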

**Gradient descent method**
- Initialize the parameters
- Iterate
 - Compute the error (loss)
 - Update the weights ($lr$ = learning rate)

 $$
 w = w - lr \cdot dw
 $$

 $$
 b = b - lr \cdot db
 $$

- Stop iterating


## **Building the algorithm from scratch**

In [1]:
import numpy as np

class LogisticRegression:
  def __init__(self, lr=0.001, n_iters=1000):
    self.lr = lr
    self.n_iters = n_iters
    self.weights = None
    self.bias = None

  def fit(self, X, y):
    # Initialize the parameters
    n_samples, n_features = X.shape
    self.weights = np.zeros(n_features)
    self.bias = 0

    # Gradient descent
    for _ in range(self.n_iters):
      # Prediction
      linear_model = np.dot(X, self.weights) + self.bias
      y_predicted = self._sigmoid(linear_model)

      # Compute the gradients, then update the weights and bias
      dw = (1/n_samples)*np.dot(X.T, (y_predicted - y))
      db = (1/n_samples)*np.sum((y_predicted - y))

      self.weights -= self.lr*dw
      self.bias -= self.lr*db

  def predict(self, X):
    linear_model = np.dot(X, self.weights) + self.bias
    y_predicted = self._sigmoid(linear_model)
    y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
    return y_predicted_cls

  def _sigmoid(self, x):
    return 1/(1 + np.exp(-x))
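
The expression `np.exp(-x)` overflows for large negative `x` (NumPy emits a `RuntimeWarning`), which can happen when the input features are not scaled. The sketch below shows one possible variant that avoids exponentiating positive numbers. The name `stable_sigmoid` is hypothetical and is not part of the class above.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that only exponentiates non-positive values (hypothetical helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1 / (1 + np.exp(-x[pos]))   # here -x <= 0, so exp cannot overflow
    ex = np.exp(x[~pos])                    # here x < 0, so exp cannot overflow
    out[~pos] = ex / (1 + ex)
    return out

vals = stable_sigmoid([-1000.0, 0.0, 1000.0])
print(vals)
```

For moderate inputs it returns the same values as `1/(1 + np.exp(-x))`; the two branches only differ in how they behave at the extremes.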



## **Testing the algorithm**

In [2]:
# Import libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

In [5]:
# Load the data
Bc = datasets.load_breast_cancer()
X, y = Bc.data, Bc.target

# Create the training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

def accuracy(y_true, y_pred):
  accuracy = np.sum(y_true == y_pred)/len(y_true)
  return accuracy

# Train the model
regressor = LogisticRegression(lr=0.0001, n_iters=1000)
regressor.fit(X_train, y_train)
predictions = regressor.predict(X_test)

# Check its performance
acc = accuracy(y_test, predictions)
acc

0.9298245614035088

## **Now let's try the sklearn algorithm**

In [7]:
# Import the algorithm
from sklearn.linear_model import LogisticRegression as LR_sk

In [10]:
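# Train the model
# Note: this call fits on the full dataset (X, y), which includes the test
# samples, so the accuracy below is optimistic. Fitting on X_train, y_train
# would give a fair comparison with the model built from scratch.
clf = LR_sk(max_iter=1000, random_state=1234)
clf.fit(X, y)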

predicciones = clf.predict(X_test)

acc_sk = accuracy(y_test, predicciones) 
acc_sk

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression


0.9473684210526315
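
The convergence warning above suggests scaling the data. A minimal sketch of that suggestion follows, assuming a `StandardScaler` placed in front of the classifier with `make_pipeline` and fitting on the training split only. The variable names `model` and `acc_scaled` are assumptions for the example, and the resulting accuracy is not reported here because it was not part of the original run.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same dataset and split parameters as in the cells above
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1234
)

# The scaler's mean and variance are learned from the training split only
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

acc_scaled = model.score(X_test, y_test)  # mean accuracy on held-out data
print(acc_scaled)
```

Wrapping the scaler and the classifier in one pipeline keeps the test samples out of both the scaling statistics and the fit.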