# Logistic regression

In [None]:
using DataFrames
using Plots
using RDatasets
using Convex
using SCS

This is an example of logistic regression using the iris data from `RDatasets`.
Our goal is to predict whether the iris species is versicolor
using the sepal length and width and the petal length and width.

In [None]:
iris = dataset("datasets", "iris");
iris[1:10,:]

We'll define `Y` as the outcome variable: +1 for versicolor, -1 otherwise.

In [None]:
Y = [species == "versicolor" ? 1.0 : -1.0 for species in iris.Species]

We'll create our data matrix with one column for each feature
(the first column, all ones, corresponds to the offset).

In [None]:
X = hcat(ones(size(iris, 1)), iris.SepalLength, iris.SepalWidth, iris.PetalLength, iris.PetalWidth);

Now we solve the logistic regression problem.

In [None]:
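n, p = size(X)        # n observations, p coefficients (offset + 4 features)
beta = Variable(p)    # coefficient vector to optimize
# Y is ±1, so -Y.*(X*beta) is the negated margin of each observation
problem = minimize(logisticloss(-Y.*(X*beta)))
solve!(problem, () -> SCS.Optimizer(verbose=false))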
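
For reference, the objective in the cell above can be written out explicitly. This is a sketch that assumes `logisticloss(z)` computes $\sum_i \log(1 + e^{z_i})$, as described in Convex.jl's documentation; the symbols $x_i$ (the $i$-th row of `X`) and $y_i$ (the $i$-th entry of `Y`) are notation introduced here, not names from the notebook:

$$
\min_{\beta \in \mathbb{R}^p} \; \sum_{i=1}^{n} \log\bigl(1 + \exp(-y_i \, x_i^\top \beta)\bigr)
$$

Under the logistic model $P(y_i = 1 \mid x_i) = 1 / (1 + \exp(-x_i^\top \beta))$, each term is the negative log-likelihood of observation $i$, so minimizing the sum amounts to maximum-likelihood estimation.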

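Let's see how well the model fits. We sort the observations by their fitted value,
then plot the true labels (mapped to 0 and 1) together with the fitted probabilities.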

In [None]:
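logistic(x::Real) = inv(exp(-x) + one(x))
# Order observations by their fitted linear predictor
perm = sortperm(vec(X*evaluate(beta)))
# True labels, mapped from {-1, +1} to {0, 1}
plot(1:n, (Y[perm] .+ 1)/2, st=:scatter)
# Fitted probability of versicolor, in the same order
plot!(1:n, logistic.(X*evaluate(beta))[perm])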
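
The plot gives a visual impression of the fit; a single number such as training accuracy can complement it. The following is a minimal sketch, not part of the original notebook: the helper `accuracy`, its `threshold` keyword, and the toy data `X_toy`, `Y_toy`, `b_toy` are all hypothetical names introduced for illustration. The `logistic` definition is repeated so the block runs on its own.

```julia
# Same logistic function as in the plotting cell, repeated for self-containment.
logistic(x::Real) = inv(exp(-x) + one(x))

# Hypothetical helper: predict +1 when the fitted probability is at least
# `threshold` and -1 otherwise, then return the fraction of labels matched.
function accuracy(X::AbstractMatrix, Y::AbstractVector, b::AbstractVector;
                  threshold::Real = 0.5)
    probs = logistic.(X * b)
    preds = [p >= threshold ? 1.0 : -1.0 for p in probs]
    return sum(preds .== Y) / length(Y)
end

# Toy check on made-up data: an offset column plus one feature.
X_toy = [1.0 -2.0; 1.0 -1.0; 1.0 1.0; 1.0 2.0]
Y_toy = [-1.0, -1.0, 1.0, 1.0]
b_toy = [0.0, 1.0]
accuracy(X_toy, Y_toy, b_toy)
```

With the fitted model from this notebook, the call would presumably be `accuracy(X, Y, vec(evaluate(beta)))`. Keep in mind that this measures accuracy on the training data, so it likely overstates how well the model would predict unseen flowers.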

---

*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*