# Logistic Regression Solver

## Data Setup

In [1]:
# Implemented in Julia 1.0
# Vivak Patel

#Import Data
using RDatasets
UCBAdmit = RDatasets.dataset("datasets","UCBAdmissions")

#Generate Observed Variables
Y = map(y -> y == "Admitted" ? 1 : 0, UCBAdmit[1])




24-element Array{Int64,1}:
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0
 1
 0

In [2]:
#Generate Explanatory Variables
using LinearAlgebra
X0 = ones(nrow(UCBAdmit))
X1 = map(x -> x == "Female" ? 1 : 0, UCBAdmit[2])
# Dummy-encode department, with "A" as the reference level (all zeros)
function proc_dept(val)
    E = Matrix{Float64}(I,5,5)
    if val == "A"; return zeros(5); end
    if val == "B"; return E[:,1]; end
    if val == "C"; return E[:,2]; end
    if val == "D"; return E[:,3]; end
    if val == "E"; return E[:,4]; end
    if val == "F"; return E[:,5]; end
end

X2 = vcat(map(x -> proc_dept(x)', UCBAdmit[3])...)
# Columns: intercept, female indicator, department dummies B-F, cell count
X = hcat(X0,X1,X2,Float64[UCBAdmit[4]...])


24×8 Array{Float64,2}:
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  512.0
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  313.0
 1.0  1.0  0.0  0.0  0.0  0.0  0.0   89.0
 1.0  1.0  0.0  0.0  0.0  0.0  0.0   19.0
 1.0  0.0  1.0  0.0  0.0  0.0  0.0  353.0
 1.0  0.0  1.0  0.0  0.0  0.0  0.0  207.0
 1.0  1.0  1.0  0.0  0.0  0.0  0.0   17.0
 1.0  1.0  1.0  0.0  0.0  0.0  0.0    8.0
 1.0  0.0  0.0  1.0  0.0  0.0  0.0  120.0
 1.0  0.0  0.0  1.0  0.0  0.0  0.0  205.0
 1.0  1.0  0.0  1.0  0.0  0.0  0.0  202.0
 1.0  1.0  0.0  1.0  0.0  0.0  0.0  391.0
 1.0  0.0  0.0  0.0  1.0  0.0  0.0  138.0
 1.0  0.0  0.0  0.0  1.0  0.0  0.0  279.0
 1.0  1.0  0.0  0.0  1.0  0.0  0.0  131.0
 1.0  1.0  0.0  0.0  1.0  0.0  0.0  244.0
 1.0  0.0  0.0  0.0  0.0  1.0  0.0   53.0
 1.0  0.0  0.0  0.0  0.0  1.0  0.0  138.0
 1.0  1.0  0.0  0.0  0.0  1.0  0.0   94.0
 1.0  1.0  0.0  0.0  0.0  1.0  0.0  299.0
 1.0  0.0  0.0  0.0  0.0  0.0  1.0   22.0
 1.0  0.0  0.0  0.0  0.0  0.0  1.0  351.0
 1.0  1.0  0.0  0.0  0.0  0.0  1.0   24.0
 1.0  1.0  

## Likelihood Function

We have $Pr(y_i=1|X_i) = \frac{e^{X_i^{'}\beta}}{1+e^{X_i^{'}\beta}}$, which may also be expressed as:

$$Pr(y_i=1|X_i) =\frac{1}{1+e^{-X_i^{'}\beta}}$$

At the same time we have:

$$Pr(y_i=0|X_i) =1-\frac{1}{1+e^{-X_i^{'}\beta}}$$
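
As a quick numerical sanity check on these expressions, the sketch below compares the two forms of the success probability and the complementary probability at a few values of the linear predictor. The names `sigmoid_a` and `sigmoid_b` are hypothetical helpers introduced here for illustration; they are not part of the notebook.

```julia
# Hypothetical helper names; not part of the original notebook.
# Two algebraically equivalent forms of Pr(y = 1 | x), with z = x'β.
sigmoid_a(z) = exp(z) / (1 + exp(z))
sigmoid_b(z) = 1 / (1 + exp(-z))

for z in (-2.0, 0.0, 0.5, 3.0)
    p1 = sigmoid_b(z)      # Pr(y = 1 | x)
    p0 = 1 - p1            # Pr(y = 0 | x)
    # Both forms of Pr(y = 1 | x) agree up to floating-point error
    @assert isapprox(sigmoid_a(z), p1; atol = 1e-12)
    # Pr(y = 0 | x) equals 1 / (1 + e^{z}), i.e. the same function at -z
    @assert isapprox(p0, sigmoid_b(-z); atol = 1e-12)
end
```

The second assertion uses the identity $1-\frac{1}{1+e^{-z}} = \frac{1}{1+e^{z}}$, which is what makes the later simplification of the log-likelihood work.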

We use these to construct the log-likelihood function:

$$\log l(\beta|X)= \sum_{i=1}^{N} y_i\log \left ( {\frac{1}{1+e^{-X_i^{'}\beta}}} \right ) +(1-y_i)\log\left ({1-\frac{1}{1+e^{-X_i^{'}\beta}}}\right ) $$

After algebraic simplification, this may be expressed as:

$$\log l(\beta|X)= \sum_{i=1}^{N} \left [ y_iX_i^{'}\beta -\log\left (1+e^{X_i^{'}\beta}\right ) \right ] $$
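
The simplification can be checked term by term. Writing $z_i = X_i^{'}\beta$ (a shorthand introduced here, not used elsewhere in the notebook), the two logarithms are

$$\log \left ( \frac{1}{1+e^{-z_i}} \right ) = \log \left ( \frac{e^{z_i}}{1+e^{z_i}} \right ) = z_i - \log(1+e^{z_i})$$

$$\log \left ( 1-\frac{1}{1+e^{-z_i}} \right ) = \log \left ( \frac{1}{1+e^{z_i}} \right ) = -\log(1+e^{z_i})$$

Substituting both into the $i$-th summand gives

$$y_i\left [ z_i - \log(1+e^{z_i}) \right ] - (1-y_i)\log(1+e^{z_i}) = y_i z_i - \log(1+e^{z_i})$$

because the two $\log(1+e^{z_i})$ terms carry weights $y_i$ and $1-y_i$, which sum to one.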

In [102]:
function admitll(B, Y, X, counts)
    # Log-likelihood for grouped data: row i stands for counts[i] applicants
    # who share the same outcome Y[i] and covariates X[i,:], so the whole
    # per-observation term is weighted by counts[i].
    ll = 0.0
    for i in 1:size(X, 1)
        xb = dot(X[i, :], B)    # scalar linear predictor X_i'β (dot is from LinearAlgebra, loaded above)
        ll += counts[i] * (Y[i] * xb - log1p(exp(xb)))
    end
    return ll
end
    

admitll (generic function with 1 method)

In [103]:
B = ones(7)    # coefficient vector, one entry per column of X[:,1:7]
admitll(B, Y, X[:,1:7], X[:,8])

## Gradient of the Log-Likelihood

$$\nabla_\beta \log l(\beta|X)= \sum_{i=1}^{N} \nabla_\beta \left [ y_iX_i^{'}\beta -\log\left (1+e^{X_i^{'}\beta}\right ) \right ] $$

After a bit of algebraic manipulation, this is:

$$\nabla_\beta \log l(\beta|X) = \sum_{i=1}^{N} \left ( y_i - \frac{1}{1+e^{-X_i^{'}\beta}} \right )X_i$$
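
One way to gain confidence in this gradient is to compare it against central finite differences of the log-likelihood. The sketch below does that on a small made-up data set. The names `loglik_sketch` and `grad_sketch`, the toy data, and the step size `h` are all assumptions for illustration and are not part of the notebook. It covers the ungrouped case only, with one observation per row.

```julia
# Sketch only: function names and toy data are made up for illustration.

# Log-likelihood: sum_i [ y_i * x_i'β - log(1 + exp(x_i'β)) ]
loglik_sketch(beta, y, X) = sum(y .* (X * beta) .- log1p.(exp.(X * beta)))

# Gradient: sum_i (y_i - p_i) * x_i, written in matrix form as X' * (y - p)
function grad_sketch(beta, y, X)
    p = 1 ./ (1 .+ exp.(-(X * beta)))   # fitted probabilities
    return X' * (y .- p)
end

# Toy data (NOT the UCBAdmissions data): intercept plus one covariate
X_toy = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]
y_toy = [0.0, 0.0, 1.0, 1.0]
beta0 = [0.1, -0.2]

# Central finite differences, one coordinate at a time
h = 1e-6
fd = zeros(length(beta0))
for j in 1:length(beta0)
    e = zeros(length(beta0))
    e[j] = 1.0
    fd[j] = (loglik_sketch(beta0 + h * e, y_toy, X_toy) -
             loglik_sketch(beta0 - h * e, y_toy, X_toy)) / (2h)
end

# The analytic gradient should match the numerical one closely
@assert isapprox(grad_sketch(beta0, y_toy, X_toy), fd; atol = 1e-6)
```

For grouped data such as the admissions table, where each row stands for several applicants, each summand of both the log-likelihood and the gradient would presumably be multiplied by that row's count.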