# Logistic Regression Solver

## Data Set Up

In [1]:
# Implemented in Julia 1.0
# Vivak Patel

#Import Data
using RDatasets
UCBAdmit = RDatasets.dataset("datasets","UCBAdmissions")

#Generate Observed Variables: 1 for an "Admitted" cell, 0 otherwise
Y = map(y -> y == "Admitted" ? 1 : 0, UCBAdmit[:, 1]);


In [2]:
#Generate Explanatory Variables
using LinearAlgebra
X0 = ones(nrow(UCBAdmit))                               #Intercept
X1 = map(x -> x == "Female" ? 1 : 0, UCBAdmit[:, 2])    #Gender indicator
#Dummy-code the department, with department A as the reference level
function proc_dept(val)
    E = Matrix{Float64}(I,5,5)
    if val == "A"; return zeros(5); end
    if val == "B"; return E[:,1]; end
    if val == "C"; return E[:,2]; end
    if val == "D"; return E[:,3]; end
    if val == "E"; return E[:,4]; end
    if val == "F"; return E[:,5]; end
end

X2 = vcat(map(x -> proc_dept(x)', UCBAdmit[:, 3])...)
#Columns 1-7 form the design matrix; column 8 holds the cell counts
X = hcat(X0,X1,X2,Float64[UCBAdmit[:, 4]...]);


## Likelihood Function

We have $Pr(y_i=1|X_i) = \frac{e^{X_i^{'}\beta}}{1+e^{X_i^{'}\beta}}$, which may also be expressed as:

$$Pr(y_i=1|X_i) =\frac{1}{1+e^{-X_i^{'}\beta}}$$

At the same time we have:

$$Pr(y_i=0|X_i) =1-\frac{1}{1+e^{-X_i^{'}\beta}}$$

Combining these two probabilities gives the log-likelihood function:

$$\log l(\beta|X)= \sum_{i=1}^{N} y_i\log \left ( {\frac{1}{1+e^{-X_i^{'}\beta}}} \right ) +(1-y_i)\log\left ({1-\frac{1}{1+e^{-X_i^{'}\beta}}}\right ) $$

After algebraic simplification this may be expressed as

$$\log l(\beta|X)= \sum_{i=1}^{N} \left[ y_iX_i^{'}\beta -\log\left(1+e^{X_i^{'}\beta}\right) \right] $$
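The simplification can be sketched as follows. The shorthand $\eta_i = X_i^{'}\beta$ and $p_i = \frac{1}{1+e^{-\eta_i}}$ is introduced here only for this derivation and is not part of the original notation.

$$\begin{aligned}
\log p_i &= \log\frac{e^{\eta_i}}{1+e^{\eta_i}} = \eta_i - \log(1+e^{\eta_i}), \\
\log(1-p_i) &= \log\frac{1}{1+e^{\eta_i}} = -\log(1+e^{\eta_i}), \\
y_i\log p_i + (1-y_i)\log(1-p_i) &= y_i\eta_i - y_i\log(1+e^{\eta_i}) - (1-y_i)\log(1+e^{\eta_i}) \\
&= y_i\eta_i - \log(1+e^{\eta_i}).
\end{aligned}$$

Summing the last line over $i$ gives the simplified log-likelihood.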

In [3]:
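# Count-weighted log-likelihood: each row of X is one cell of the admissions
# table, so its term is multiplied by the number of applicants in that cell.
function admitll(B, Y, X, counts)
    n = size(X, 1)
    ll = zeros(n)
    for i = 1:n
        η = dot(X[i,:], B)   # linear predictor for row i
        ll[i] = counts[i]*(Y[i]*η - log(1 + exp(η)))
    end
    return sum(ll)
end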
    

admitll (generic function with 1 method)

In [4]:
B = [1 1 1 1 1 1 1]
admitll(B, Y, X[:,1:7], X[:,8])

-7080.907142169214

## Gradient of the Log-Likelihood

$$\nabla_\beta \log l(\beta|X)= \sum_{i=1}^{N} \nabla_\beta \left[ y_iX_i^{'}\beta -\log\left(1+e^{X_i^{'}\beta}\right) \right] $$

After a bit of algebraic manipulation, this is:

$$\nabla_\beta \log l(\beta|X) = \sum_{i=1}^{N} X_i \left ( y_i - \frac{1}{1+e^{-X_i^{'}\beta}} \right )$$
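The manipulation amounts to one application of the chain rule to each summand. As a sketch, with $\eta_i = X_i^{'}\beta$ used as shorthand (an assumption of this note, not notation from the text above):

$$\begin{aligned}
\nabla_\beta \left[ y_i\eta_i - \log(1+e^{\eta_i}) \right]
&= y_i X_i - \frac{e^{\eta_i}}{1+e^{\eta_i}}\, X_i \\
&= X_i \left( y_i - \frac{1}{1+e^{-\eta_i}} \right),
\end{aligned}$$

where the last step divides the numerator and denominator of the fraction by $e^{\eta_i}$.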

In [5]:
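# Gradient of the count-weighted log-likelihood, divided by the total count,
# i.e. the gradient of the average log-likelihood per applicant.
function admitgr(B,Y,X,counts)
    n = size(X, 1)
    gr = zeros(n,1)
    for i in 1:n
        # weighted residual: count * (observed - fitted probability)
        gr[i,1] = counts[i]*(Y[i] - 1/(1+exp(-dot(X[i,:],B))))
    end
    return (X'*gr)./sum(counts)
end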


admitgr (generic function with 1 method)

In [6]:
admitgr(B, Y, X[:,1:7], X[:,8])

7×1 Array{Float64,2}:
 -0.4931313027663358 
 -0.261427662714086  
 -0.03249242528461664
 -0.11691018726215527
 -0.10064243903561411
 -0.08740474452827923
 -0.13419467241100308
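
One way to sanity-check an analytic gradient like this is to compare it with central finite differences of the log-likelihood. The sketch below is self-contained and uses a small made-up data set rather than the admissions data; `loglik`, `grad`, and `fd_grad` are hypothetical stand-ins written for this illustration. To apply the same check to `admitgr`, keep in mind that it divides by `sum(counts)`, so it should be compared against the finite differences of `admitll` divided by `sum(counts)`.

```julia
using LinearAlgebra

# Count-weighted log-likelihood in the same form as the formula above
# (hypothetical stand-in, vectorized).
loglik(β, y, X, w) = sum(w .* (y .* (X * β) .- log.(1 .+ exp.(X * β))))

# Analytic gradient of loglik (no division by the total count).
grad(β, y, X, w) = X' * (w .* (y .- 1 ./ (1 .+ exp.(-(X * β)))))

# Central finite-difference approximation of the gradient of f at β.
function fd_grad(f, β; h=1e-6)
    g = zeros(length(β))
    for j in eachindex(β)
        e = zeros(length(β))
        e[j] = h
        g[j] = (f(β + e) - f(β - e)) / (2h)
    end
    return g
end

# Made-up data: intercept plus one binary covariate, with cell counts w.
X = [1.0 0.0; 1.0 1.0; 1.0 0.0; 1.0 1.0]
y = [1.0, 1.0, 0.0, 0.0]
w = [30.0, 10.0, 20.0, 40.0]
β = [0.3, -0.7]

g_exact = grad(β, y, X, w)
g_fd = fd_grad(b -> loglik(b, y, X, w), β)
println(norm(g_exact - g_fd))   # should be small relative to norm(g_exact)
```

If the two vectors disagree by more than finite-difference error, the usual suspects are a sign error or a missing weight in the analytic gradient.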

## Newton Method for Logistic Regression

In [7]:
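# Despite its name, this routine takes fixed-length steps along the gradient
# (gradient ascent); no Hessian is used.
function newtonMethod(S,∇S, B, alpha; 
        Y=Y, X=X[:,1:7], counts=X[:,8], ϵ=1e-8, maxiter = 25)
  i = 0
  S₀ = S(B, Y, X, counts)
  ∇S₀ = ∇S(B, Y, X, counts)
  #The stopping test sees the gradient computed at the start of the previous pass
  while norm(∇S₀) > ϵ && i < maxiter
    i += 1
    #Search Direction: gradient at the current B, transposed to a row to match B
    ∇S₀ = ∇S(B, Y, X, counts)
    p = ∇S₀'
    #Step Length: fixed (no backtracking line search)
    α = alpha
    #Update Parameter
    B += α*p
    #Update Objective Value (tracked, but not used in the stopping test)
    S₀ = S(B, Y, X, counts)
  end

  return B, i
end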

newtonMethod (generic function with 1 method)

In [8]:
sol, iters = newtonMethod(admitll, admitgr, B, 1; Y=Y, X=X[:,1:7], counts=X[:,8], maxiter=3000);
sol'

7×1 Adjoint{Float64,Array{Float64,2}}:
  0.5820509441989553 
  0.09986982421880027
 -0.04339737425335429
 -1.2625973212260113 
 -1.2946058038223807 
 -1.7393049376118885 
 -3.30647879307108   

In [9]:
iters

2845
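
The routine above steps along the gradient with a fixed step length, which is why it needs 2845 iterations. For comparison, a Newton iteration also uses the Hessian of the count-weighted log-likelihood, $-\sum_i w_i p_i (1-p_i) X_i X_i^{'}$, where $p_i$ is the fitted probability and $w_i$ the count. The following is a minimal sketch under stated assumptions, not the notebook's method: `newton_logistic` and `sigmoid` are hypothetical names, and the data are a small made-up example rather than the admissions data.

```julia
using LinearAlgebra

# Logistic function.
sigmoid(z) = 1 / (1 + exp(-z))

# Newton iteration for count-weighted logistic regression (hypothetical helper).
# X: design matrix, y: 0/1 responses, w: counts per row.
function newton_logistic(X, y, w; tol=1e-8, maxiter=25)
    β = zeros(size(X, 2))
    for iter in 1:maxiter
        p = sigmoid.(X * β)                          # fitted probabilities
        g = X' * (w .* (y .- p))                     # gradient of the log-likelihood
        norm(g) <= tol && return β, iter - 1         # converged before this step
        H = X' * (Diagonal(w .* p .* (1 .- p)) * X)  # negative Hessian
        β += H \ g                                   # Newton step
    end
    return β, maxiter
end

# Made-up data: when x = 0 there are 30 successes and 20 failures,
# when x = 1 there are 10 successes and 40 failures.
X = [1.0 0.0; 1.0 1.0; 1.0 0.0; 1.0 1.0]
y = [1.0, 1.0, 0.0, 0.0]
w = [30.0, 10.0, 20.0, 40.0]

bhat, iters = newton_logistic(X, y, w)
println(bhat)
println(iters)
```

Because this toy model has one parameter per group, the maximum-likelihood estimate can be checked by hand: the intercept is the log-odds of the first group, `log(30/20)`, and the slope is the difference in log-odds between the groups, `log(10/40) - log(30/20)`.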