# Advanced Supervised Machine Learning

### K-Nearest Neighbour

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
from utils_common import generate_data
import random

In [None]:
# Generate a random data set
m = generate_data(0, 50, 0, 50, 150, 0.4)
n = generate_data(0, 50, 0, 50, 150, 0.45)
o = generate_data(30, 50, 30, 50, 10, 0.1)

cols = [random.randint(0, 1) for _ in range(10)]
radii = [1, 2, 3, 4, 5]


In [None]:
# K-Nearest Neighbour Regression
# Mean of the K-Nearest Neighbours is used as the prediction
zoom = True
plt.xlabel("X")
plt.ylabel("Y")
plt.scatter(40, 40, color='red', label='Target Point')
if not zoom:
    plt.scatter(m[0], m[1], color='black')
    plt.scatter(n[0], n[1], color='black')
else:
    plt.scatter(o[0], o[1], color='black')
    plt.plot([],[],color='green', linestyle='--', linewidth=1, label='Euclidean Distance')
    for xi, yi in zip(o[0], o[1]):
        plt.plot([40, xi], [40, yi], color='green', linestyle='--', linewidth=1)
plt.legend(loc='upper left')
plt.title("K-Nearest Neighbour")
plt.show()
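
The plot above only draws the distances. As a minimal sketch of how the regression prediction itself could be computed, the function below takes the mean of the values belonging to the `k` closest points. The names `knn_regress`, `points` and `values` and the toy numbers are assumptions made for illustration; they do not come from `generate_data`.

```python
import math

def knn_regress(train_points, train_values, target, k=3):
    """Predict a value for `target` as the mean of its k nearest neighbours."""
    # Pair each training point's Euclidean distance to the target with its value
    distances = [
        (math.dist(point, target), value)
        for point, value in zip(train_points, train_values)
    ]
    # Sort by distance and keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The prediction is the mean of the neighbours' values
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical toy data: (x, y) positions, each with a known numeric value
points = [(38, 41), (42, 39), (35, 45), (48, 32), (31, 49)]
values = [10.0, 12.0, 20.0, 40.0, 50.0]

prediction = knn_regress(points, values, target=(40, 40), k=3)
print(prediction)  # 14.0, the mean of 10.0, 12.0 and 20.0
```

A larger `k` smooths the prediction because more distant points are averaged in, while a smaller `k` follows the closest points more tightly.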

In [None]:
# K-Nearest Neighbour Classification
# Mode Class of the K-Nearest Neighbours is used as the prediction
zoom = True
fig, ax = plt.subplots()
plt.scatter(40, 40, color='red', label='Target Point')
if not zoom:
    plt.scatter(m[0], m[1], color='gold', label='Class 0')
    plt.scatter(n[0], n[1], color='indigo', label='Class 1')
else:
    plt.scatter([], [], c='gold', label='Class 0')
    plt.scatter([], [], c='indigo', label='Class 1')
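    # Map each class label to the same colour used in the legend
    plt.scatter(o[0], o[1], c=['gold' if col == 0 else 'indigo' for col in cols])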
    plt.plot([],[],color='green', linestyle='--', linewidth=1, label='Euclidean Distance')
    for radius in radii:
        circle = Circle((40, 40), radius, color='blue', fill=False, linestyle='--')
        ax.add_patch(circle)
    for xi, yi in zip(o[0], o[1]):
        plt.plot([40, xi], [40, yi], color='green', linestyle='--', linewidth=1)
ax.set_xlabel("X")
ax.set_ylabel("Y")
plt.legend()
plt.title("K-Nearest Neighbour Classification")
plt.show()
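
For classification, the prediction is the most common class among the `k` closest points. The sketch below is one simple way to express that vote. The function name `knn_classify`, the toy points and their labels are illustrative assumptions rather than part of the notebook's data.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, target, k=3):
    """Predict the class of `target` as the mode class of its k nearest neighbours."""
    # Pair each training point's Euclidean distance to the target with its class label
    distances = [
        (math.dist(point, target), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the votes for each class and return the most common one
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (x, y) positions, each labelled Class 0 or Class 1
points = [(38, 41), (42, 39), (35, 45), (48, 32), (31, 49)]
labels = [1, 0, 1, 0, 0]

predicted_class = knn_classify(points, labels, target=(40, 40), k=3)
print(predicted_class)  # 1, because two of the three nearest points are Class 1
```

With two classes, an odd `k` avoids tied votes. Note that the answer can change with `k`: using all five points here gives Class 0.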

#### Measuring Distance

KNN measures how close two points are using either Euclidean distance (the straight-line distance) or Manhattan distance (the length of a grid-like, right-angle path). Euclidean distance is the most commonly used measure in KNN.

In [None]:
# Visualise Euclidean & Manhattan Distance
A, B = np.array([2, 3]), np.array([8, 7])
fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(*A, color='blue', s=100, label='A')
ax.scatter(*B, color='red', s=100, label='B')
# Euclidean (straight line)
ax.plot([A[0], B[0]], [A[1], B[1]], color='green', lw=2, label='Euclidean')
# Manhattan (right-angle path)
ax.plot([A[0], B[0]], [A[1], A[1]], color='orange', ls='--', lw=2)
ax.plot([B[0], B[0]], [A[1], B[1]], color='orange', ls='--', lw=2, label='Manhattan')
ax.annotate('A', A + [0.2, -0.2], fontsize=12, color='blue')
ax.annotate('B', B + [0.2, 0.2], fontsize=12, color='red')
ax.grid(True, linestyle=':')
ax.set(xlim=(0, 10), ylim=(0, 10), aspect='equal', xlabel='X', ylabel='Y', title='Euclidean vs Manhattan Distance')
ax.legend()
plt.show()
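
The two distances drawn above can also be calculated directly. This short sketch uses the same points A = (2, 3) and B = (8, 7) as the plot; the variable names `euclidean` and `manhattan` are chosen here for illustration.

```python
import math

A, B = (2, 3), (8, 7)

# Euclidean: square root of the sum of squared differences on each axis
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(A, B)))

# Manhattan: sum of the absolute differences on each axis
manhattan = sum(abs(a - b) for a, b in zip(A, B))

print(f"Euclidean: {euclidean:.3f}")  # Euclidean: 7.211
print(f"Manhattan: {manhattan}")      # Manhattan: 10
```

The Manhattan distance is never shorter than the Euclidean distance, because the right-angle path cannot beat the straight line between the same two points.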

### Neural Network Course Specifications

<figure>
    <center><img src="images\NN_Course-Specs.png" alt="Course Specs Neural Network image" width="500" />
    <figcaption><p><em>Source: Page 29 of the Software Engineering Course Specifications</em></p>
    </figcaption></center>
</figure>

Neural networks were designed to mimic the processing inside the human brain. They consist of a series of interconnected nodes (artificial neurones). Each neurone can accept a binary input signal and potentially output another signal to connected nodes.

#### Training cycle

Internal weightings and threshold values for each node are determined in the initial training cycle for each neural network. The system is exposed to a series of inputs with known responses. Linear regression with backward chaining is used to iteratively determine the set of unique values required for output. Regular exposure to the training cycle results in improved accuracy and pattern matching.
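
As a rough illustration of a training cycle, the sketch below trains a single artificial neurone to behave like a logical AND gate. It uses the classic perceptron learning rule as a simple stand-in, which is an assumption made here and not the "linear regression with backward chaining" method named in the course specifications. The function names, learning rate and number of passes are also illustrative choices.

```python
def step(total, threshold):
    """Fire (output 1) only when the weighted input reaches the threshold."""
    return 1 if total >= threshold else 0

def train_neurone(samples, learning_rate=0.5, epochs=20):
    """Adjust weights and threshold from inputs with known responses."""
    weights = [0.0, 0.0]
    threshold = 0.0
    for _ in range(epochs):
        for inputs, expected in samples:
            total = sum(w * x for w, x in zip(weights, inputs))
            error = expected - step(total, threshold)
            # Nudge each weight towards the known response
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            # Lower the threshold if the neurone should have fired, raise it otherwise
            threshold -= learning_rate * error
    return weights, threshold

# Inputs with known responses: the truth table for logical AND
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, threshold = train_neurone(and_samples)
predictions = [
    step(sum(w * x for w, x in zip(weights, inputs)), threshold)
    for inputs, _ in and_samples
]
print(predictions)  # [0, 0, 0, 1]
```

Each pass over the samples moves the weights and threshold a little closer to values that reproduce the known responses, which is the sense in which repeated exposure improves accuracy.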

#### Execution cycle

In the diagram, connections between nodes with the strongest weightings are drawn thicker, representing a higher priority in determining the final output. The execution cycle follows the training cycle and utilises the internal values developed during the training cycle to determine the output.
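
To make the execution cycle concrete, the sketch below passes inputs through a tiny network whose weights and thresholds are already fixed. The values are hand-picked assumptions, standing in for values that a training cycle would normally produce, and are chosen so the network computes exclusive OR (XOR). The names `neurone` and `run_network` are illustrative.

```python
def neurone(inputs, weights, threshold):
    """Output 1 when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def run_network(x1, x2):
    """Execution cycle: apply fixed weights layer by layer to produce an output."""
    # Hidden layer (hand-picked weights, assumed for illustration)
    h_or = neurone([x1, x2], [1, 1], 1)        # fires if either input is 1
    h_nand = neurone([x1, x2], [-1, -1], -1)   # fires unless both inputs are 1
    # Output layer: fires only if both hidden neurones fired
    return neurone([h_or, h_nand], [1, 1], 2)

outputs = [run_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```

No values change during execution. The network only reads the stored weights and thresholds, which is why the execution cycle is much cheaper than the training cycle.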

*Source: Page 29 of the Software Engineering Course Specifications*

### Decision Trees

<figure>
    <center><img src="images\decision_tree.png" alt="Decision Tree" width="500" /></center>
</figure>

1. Root Node:
   The algorithm starts with the entire dataset as the root node.
2. Splitting:
   At each node, the algorithm selects the feature and split value that best separates the data into subsets, minimising the variance (regression) or impurity (classification) of the target variable in the subsets.
3. Recursion:
   This process is repeated for each subset, creating new nodes and branches until a stopping criterion is met (e.g., maximum tree depth, minimum number of samples in a leaf).
4. Leaf Nodes:
   The leaf nodes contain the predicted values. For regression this is usually the mean of the target variable values in the corresponding subset; for classification it is the most common class.
5. Prediction:
   To predict the value for a new data point, follow the path from the root node to a leaf node based on the data point's features. The prediction is the value stored in that leaf.
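
The five steps above can be sketched in a few lines for a regression tree with a single feature. This is a simplified illustration under stated assumptions, not a full implementation: the function names, the dictionary used to represent a node, the toy data and the use of the sum of squared errors as the variance measure are all choices made here.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (a measure of variance)."""
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values)

def best_split(xs, ys):
    """Splitting: find the threshold that minimises total error in the two subsets."""
    best_threshold, best_error = None, float("inf")
    for threshold in sorted(set(xs))[1:]:  # skip the smallest value so neither side is empty
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold

def build_tree(xs, ys, depth=0, max_depth=2):
    """Recursion: keep splitting until the depth limit or no split is possible."""
    threshold = best_split(xs, ys) if depth < max_depth else None
    if threshold is None:
        return mean(ys)  # Leaf node: store the mean of the subset
    left = [(x, y) for x, y in zip(xs, ys) if x < threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x >= threshold]
    return {
        "threshold": threshold,
        "left": build_tree(*zip(*left), depth + 1, max_depth),
        "right": build_tree(*zip(*right), depth + 1, max_depth),
    }

def predict(tree, x):
    """Prediction: follow the path from the root to a leaf."""
    while isinstance(tree, dict):
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
    return tree

# Hypothetical toy data with two obvious groups
xs = [1, 2, 3, 10, 11, 12]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]

tree = build_tree(xs, ys, max_depth=1)  # the root node starts with the whole dataset
print(tree["threshold"])   # 10
print(predict(tree, 2))    # 6.0
print(predict(tree, 11))   # 21.0
```

With `max_depth=1` the tree makes a single split at 10, separating the two groups, and each leaf stores the mean of its group. Raising `max_depth` lets the tree keep dividing each group further.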