# Logistic Regression

Logistic regression is used to classify data into binary classes. Logistic regression can also be used for multi-class classification. Unlike linear regression, Logistic Regression does not have an explicit form, so the only way to solve it is through gradient ascent.



In [52]:
"""
Procedure LMS
1. Normalize data to have mean=0,std=1 for features
2. Add interept term to data
3. Initialize weights
4. For step in n_iters do:
5.     predict outputs - pred = sigmoid()
6.     compute loss = mean( y log pred + (1-y) log (1-pred)) 
7.     compute grad_loss = mean((pred-targets)@data) 
8.     update weights -= lr*grad_loss (maximizes log likelihood)
"""

import numpy as np

def generate_data(n,f):
    data = np.random.random_sample((n,f))+np.sqrt(np.arange(n*f).reshape(n,f))
    targets = np.concatenate((np.zeros(n//2),np.ones(n-n//2)))
    return data,targets

class LinearRegressionLMS:
    
    def __init__(self,lr=1e-2,iters=10):
        self.lr = lr
        self.iters = iters
        self.weights = []
        
    def fit(self,data,targets):
        print(self)
        n = data.shape[0]
        # normalize data + add intercept term
        data -= data.mean(0)
        data/=(data.std(0)+1e-5)
        data = self._add_intercept(data)
        f = data.shape[1]
        # init and normalize weights
        self.weights = np.random.randn(f)
        self.weights = (self.weights- self.weights.mean())/(self.weights.std()+1e-5)
        for i in range(self.iters):
            # predict
            pred = self._sigmoid(data @ self.weights)
            # compute log probs for loss
            logp0 = np.array([np.log(p) for p in pred if p >=.5])
            logp1 = np.array([1-p for p in pred if p< 0.5])
            # compute loss
            loss = np.mean(np.concatenate((logp0,logp1)))
            # compute gradient of loss
            grad_loss = data.T @ (pred-targets) 
            # update weights
            self.weights -= self.lr * grad_loss
            score = self._score(pred,targets)
            print('Step',i,'Loss',round(loss,3),'Score',score)
 
    
    def _score(self,pred,targets):
        out = [0.0 if p>=0.5 else 1.0 for p in pred]
        return np.mean([x==y for x,y in zip(out,targets)])

    def _sigmoid(self,x):
        return 1/(1+np.exp(x))
    
    def _grad_sigmoid(self,x):
        return self._sigmoid(x)*(1-self._sigmoid(x))
    
    def _add_intercept(self,data):
        intercept = np.ones(data.shape[0]).reshape(-1,1)
        return np.concatenate((intercept,data),1)
    
    def __str__(self):
        line = '='*40
        print(line)
        print('Logistic Regression')
        print(line)
        print('Hyperparamters:')
        print('lr =',self.lr,'> learning rate')
        print('iters =',self.iters,'> optimization steps')
        return line
    
data, targets = generate_data(20,3)
model = LinearRegressionLMS(lr=1e-2,iters=20)
model.fit(data,targets)

Logistic Regression
Hyperparamters:
lr = 0.01 > learning rate
iters = 20 > optimization steps
Step 0 Loss -0.208 Score 0.75
Step 1 Loss -0.175 Score 0.75
Step 2 Loss 0.043 Score 0.9
Step 3 Loss 0.085 Score 0.9
Step 4 Loss 0.246 Score 1.0
Step 5 Loss 0.283 Score 1.0
Step 6 Loss 0.314 Score 1.0
Step 7 Loss 0.339 Score 1.0
Step 8 Loss 0.359 Score 1.0
Step 9 Loss 0.375 Score 1.0
Step 10 Loss 0.388 Score 1.0
Step 11 Loss 0.398 Score 1.0
Step 12 Loss 0.406 Score 1.0
Step 13 Loss 0.413 Score 1.0
Step 14 Loss 0.419 Score 1.0
Step 15 Loss 0.424 Score 1.0
Step 16 Loss 0.428 Score 1.0
Step 17 Loss 0.431 Score 1.0
Step 18 Loss 0.494 Score 0.95
Step 19 Loss 0.497 Score 0.95
