# <center>Project 4 on Mathematics in AI</center>

Subject: Within Distance

Name: Hesam Mousavi

Student number: 9931155

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {
inlineMath: [['$','$'], ['\\(','\\)']],
processEscapes: true},
jax: ["input/TeX","input/MathML","input/AsciiMath","output/CommonHTML"],
extensions: ["tex2jax.js","mml2jax.js","asciimath2jax.js","MathMenu.js","MathZoom.js","AssistiveMML.js", "[Contrib]/a11y/accessibility-menu.js"],
TeX: {
extensions: ["AMSmath.js","AMSsymbols.js","noErrors.js","noUndefined.js"],
equationNumbers: {
autoNumber: "AMS"
}
}
});
</script>

### How I store the dataset

I created a module called 'Dataset' which contains everything about our dataset (sample, label, number of sample, number of feature, and representor)

In [1]:
from copy import deepcopy
import numpy as np
from dataset import Dataset

dataset = Dataset('dataset/iris.csv')

norm_set = np.array((
    [1, 2],
    [1, np.inf],
    [2, 0],
    [2, 1],
    [2, np.inf],
    [np.inf, 0],
    [np.inf, 1],
    [np.inf, 2]
))

### Find $e^{d, d^{\prime}}(X, c)$

In [3]:
def find_err(dataset: Dataset, d_norm: int, dp_norm: int):
    err_from_rep = [
        np.linalg.norm(sample - dataset.representor, d_norm)
        for sample in dataset.sample]
    return np.linalg.norm(err_from_rep, dp_norm)

## Fisrt idea

As a first step, let's use an alternate search to get a good approximation of the representative on our dataset with our set of norms
$$
\begin{array}{|c|c|}
\hline d & d^{\prime} \\
\hline 1 & 2 \\
\hline 1 & \infty \\
\hline 2 & 0 \\
\hline 2 & 1 \\
\hline 2 & \infty \\
\hline \infty & 0 \\
\hline \infty & 1 \\
\hline \infty & 2 \\
\hline
\end{array}
$$

For that, I use step-decay, which after a few steps without any improvement, it will half the step size. And I'll return the best representative that we've seen if the step size is smaller than epsilon

In [2]:
def find_rep_with_AS(dataset: Dataset, d_norm: int, dp_norm: int):
    best_err, best_rep = np.inf, None
    step_size, no_improve = 1, 0
    eps, no_improve_threshold = 1e-04, dataset.number_of_feature ** 3
    random_sample = np.random.randint(0, dataset.number_of_sample)
    dataset.representor = dataset.sample[random_sample]

    while(step_size > eps):
        random_feature = np.random.randint(0, dataset.number_of_feature)
        dataset.representor[random_feature] += step_size
        this_err = find_err(dataset, d_norm, dp_norm)

        if(this_err < best_err):
            best_err = this_err
            no_improve = 0
            best_rep = deepcopy(dataset.representor)
            continue
        dataset.representor[random_feature] -= step_size

        dataset.representor[random_feature] -= step_size
        this_err = find_err(dataset, d_norm, dp_norm)
        if(this_err < best_err):
            best_err = this_err
            no_improve = 0
            best_rep = deepcopy(dataset.representor)
            continue
        dataset.representor[random_feature] += step_size

        no_improve += 1
        if(no_improve > no_improve_threshold):
            step_size /= 2

    dataset.representor = best_rep
    return best_err


for d_norm, dp_norm in norm_set:
    print(f'for d, d\' = {d_norm}, {dp_norm} error is',
          np.round(find_rep_with_AS(dataset, d_norm, dp_norm), 3))
    print(f'with representor {np.round(dataset.representor, 3)}\n')


for d, d' = 1.0, 2.0 error is 43.433
with representor [5.7   3.1   3.913 1.194]

for d, d' = 1.0, inf error is 6.05
with representor [5.9  3.05 3.85 1.55]

for d, d' = 2.0, 0.0 error is 149.0
with representor [6.1 2.8 5.7 1.2]

for d, d' = 2.0, 1.0 error is 279.845
with representor [5.917 2.917 4.181 1.358]

for d, d' = 2.0, inf error is 3.771
with representor [7.053 3.1   3.239 1.914]

for d, d' = inf, 0.0 error is 149.0
with representor [4.5 2.3 1.3 1.3]

for d, d' = inf, 1.0 error is 227.837
with representor [5.917 2.783 4.183 1.325]

for d, d' = inf, 2.0 error is 21.283
with representor [5.844 2.683 3.783 1.256]

