# [The Elements of Statistical Learning in Julia](https://github.com/jcontesti/the-elements-of-statistical-learning-in-julia)

# Chapter: 2.3 Two Simple Approaches: Least Squares and Nearest Neighbors

## Objectives

1. Understand the basics of Linear Models and Least Squares.
2. Complete a simple classification example using Least Squares and Nearest Neighbors.

## Source code

### Classification example with linear regression

In [None]:
include("../shared/GaussiansMixture.jl")

using LinearAlgebra, Plots, RData, Statistics

# Use PyPlot as a backend for Plots
pyplot();

training_palette = [:lightblue, :orange];

In [None]:
# Load the Gaussian mixture data
mixture = load("../../data/ESL.mixture.rda", convert=true)

# Load training data
X = mixture["ESL.mixture"]["x"]
Y = mixture["ESL.mixture"]["y"];

In [None]:
# Prepend a column of ones so that β[1] acts as the intercept
X = hcat(ones(size(X, 1)), X);

In [None]:
# Compute β as the unique solution to the normal equations.
# Solving with `\` avoids forming the explicit inverse of X'X.
β = (X' * X) \ (X' * Y)  # (2.6)
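
The normal-equation solution can be cross-checked against Julia's built-in least-squares solve. The sketch below is only an illustration: `X_toy` and `y_toy` are hypothetical toy values, not the ESL mixture data, and the variable names are made up for this example.

```julia
using LinearAlgebra

# Hypothetical toy design matrix: a column of ones plus two features
X_toy = [1.0 0.0 1.0;
         1.0 1.0 0.0;
         1.0 2.0 1.0;
         1.0 3.0 3.0;
         1.0 4.0 2.0]
# Hypothetical 0/1 class labels
y_toy = [0.0, 0.0, 1.0, 1.0, 1.0]

# Normal equations (2.6): solve (X'X) β = X'y without an explicit inverse
β_normal = (X_toy' * X_toy) \ (X_toy' * y_toy)

# Backslash on the rectangular system gives the least-squares solution directly
β_direct = X_toy \ y_toy

# At the least-squares solution the residual is orthogonal to the columns of X
residual = y_toy - X_toy * β_direct
orthogonality = norm(X_toy' * residual)
```

Both routes should agree up to floating-point error when `X'X` is nonsingular; the direct solve is generally the better-conditioned choice.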

In [None]:
# Define the fitted linear model as a function of the two inputs
Ŷ(x1, x2) = β[1] + x1 * β[2] + x2 * β[3]

# And finally plot the results
plot_decision_boundary_on_scatter(X, Y, Ŷ, "Figure 2.1.")
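
For reference, the linear decision boundary can be written in closed form. This assumes the usual convention for classes coded 0/1, where a point is assigned to the second class when the fitted value exceeds 0.5. The plotting helper lives in `GaussiansMixture.jl` and is not shown here, so treat the 0.5 threshold as an assumption about what it draws. With $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$ standing for `β[1]`, `β[2]`, `β[3]`:

$$
\hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 = 0.5
\quad\Longrightarrow\quad
x_2 = \frac{0.5 - \hat{\beta}_0 - \hat{\beta}_1 x_1}{\hat{\beta}_2},
\qquad \hat{\beta}_2 \neq 0
$$

The boundary is therefore a straight line in the $(x_1, x_2)$ plane.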

### Classification example with Nearest Neighbors

In [None]:
using NearestNeighbors

In [None]:
# Transpose the data: NearestNeighbors expects one point per column
data = mixture["ESL.mixture"]["x"]'

kdtree = KDTree(data);

In [None]:
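# Define the k-nearest-neighbor fit (2.8): the mean response of the k
# training points closest to (x1, x2). `k` is a global set in the cells below.
Ŷ(x1, x2) = (1/k) * sum(Y[knn(kdtree, [x1, x2], k, true)[1]])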
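
To make the averaging explicit, here is a minimal brute-force sketch of the same k-nearest-neighbor fit that does not rely on the NearestNeighbors package. The function `knn_average` and the toy arrays `P_toy` and `g_toy` are hypothetical names introduced for illustration; a k-d tree is still the better choice for anything but tiny data sets.

```julia
# Hypothetical toy training set: one point per row, classes coded 0/1
P_toy = [0.0 0.0;
         0.0 1.0;
         1.0 0.0;
         3.0 3.0;
         3.0 4.0;
         4.0 3.0]
g_toy = [0, 0, 0, 1, 1, 1]

# Brute-force k-NN average: mean label of the k closest training points
function knn_average(P, g, x, k)
    # Squared Euclidean distance from x to every training point
    dists = [sum(abs2, P[i, :] .- x) for i in 1:size(P, 1)]
    # Indices of the k smallest distances
    nearest = partialsortperm(dists, 1:k)
    return sum(g[nearest]) / k
end

near_class0 = knn_average(P_toy, g_toy, [0.2, 0.2], 3)  # all 3 neighbors are class 0
near_class1 = knn_average(P_toy, g_toy, [3.5, 3.5], 3)  # all 3 neighbors are class 1
mixed_vote  = knn_average(P_toy, g_toy, [0.2, 0.2], 5)  # 3 class-0 and 2 class-1 neighbors
```

With `k = 1` the fit at any training point is that point's own label, so the training error is zero. This is why the 1-nearest-neighbor boundary is so much more irregular than the 15-nearest-neighbor one.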

In [None]:
# Classification with 15-nearest-neighbors
k = 15
plot_decision_boundary_on_scatter(X, Y, Ŷ, "Figure 2.2.")

In [None]:
# Classification with only 1-nearest-neighbor
k = 1
plot_decision_boundary_on_scatter(X, Y, Ŷ, "Figure 2.3.")