## logistic regression multi-dimensional data

 logistic regression multi-dimensional data
 
 
 $$ F(X)=X \times W $$
 $$ H(x)= \frac{1}{1+ e ^{-F(x)}} $$
 $$ C= -\frac{1}{n} \sum_{i,j} (Y \odot log(H(x)) + (1-Y) \odot log(1-H(x)) ) $$

$X_{n \times k}$

$W_{k \times p}$

$Y_{n \times p}$

In [1]:
import numpy as np
import random

In [2]:
n, k, p=100, 8, 3 

In [3]:
X=np.random.random([n,k])
W=np.random.random([k,p])

y=np.random.randint(p, size=(1,n))
Y=np.zeros((n,p))
Y[np.arange(n), y]=1

max_itr=5000
alpha=0.01
Lambda=0.01

Gradient is as follows:
$$ X^T (H(x)-Y) + \lambda 2 W$$

In [4]:
# F(x)= w[0]*x + w[1]
def F(X, W):
    return np.matmul(X,W)

def H(F):
    return 1/(1+np.exp(-F))

def cost(Y_est, Y):
    E= - (1/n) * (np.sum(Y*np.log(Y_est) + (1-Y)*np.log(1-Y_est)))  + np.linalg.norm(W,2)
    return E, np.sum(np.argmax(Y_est,1)==y)/n

def gradient(Y_est, Y, X):
    return (1/n) * np.matmul(X.T, (Y_est - Y) ) + Lambda* 2* W

In [5]:
def fit(W, X, Y, alpha, max_itr):
    for i in range(max_itr):
        
        F_x=F(X,W)
        Y_est=H(F_x)
        E, c= cost(Y_est, Y)
        Wg=gradient(Y_est, Y, X)
        W=W - alpha * Wg
        if i%1000==0:
            print(E, c)
        
    return W, Y_est

To take into account for the biases, we concatenate X by a 1 column, and increase the number of rows in W by one

In [6]:
X=np.concatenate( (X, np.ones((n,1))), axis=1 ) 
W=np.concatenate( (W, np.random.random((1,p)) ), axis=0 )

W, Y_est = fit(W, X, Y, alpha, max_itr)

9.368653735228364 0.31
4.994251188297815 0.43
4.951873226767272 0.48
4.922370610237865 0.47
4.901694423284286 0.48
