# Lab 1d: Let's build a Self-Organizing Map (SOM) for Synthetic Coagulation Data
A self-organizing map (SOM) is an unsupervised machine learning technique that produces a low-dimensional (typically two-dimensional) representation of high-dimensional data while preserving the topological structure of the data. For example, a data set with $n$ observations of $p$ variables could be represented as clusters of observations with similar values for the variables.
* These clusters can then be visualized as a two-dimensional map in which observations in proximal clusters have more similar values than observations in distal clusters. This can make high-dimensional data easier to visualize and analyze.
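
To make the idea concrete, here is a minimal sketch of one common way to train a SOM. It is not the `learn` function used later in this lab. The helper name `toy_som`, the 3×3 neuron grid, the Gaussian neighborhood, and the 1% radius decay per iteration are all assumptions chosen for illustration.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not the lab's `learn`): trains a small SOM on the rows of X.
# Neurons sit on a grid_rows × grid_cols lattice; W[k, :] is the weight vector of neuron k.
function toy_som(X::Matrix{Float64}; grid_rows::Int = 3, grid_cols::Int = 3,
        iterations::Int = 500, α::Float64 = 0.1, radius::Float64 = 1.5,
        rng = MersenneTwister(1234))

    n, p = size(X)
    K = grid_rows * grid_cols

    # lattice coordinates of each neuron, used to measure neighborhood distance
    coords = [(Float64(r), Float64(c)) for r in 1:grid_rows for c in 1:grid_cols]

    # initialize the weights from randomly chosen observations
    W = X[rand(rng, 1:n, K), :]

    for t in 1:iterations
        x = X[rand(rng, 1:n), :] # pick a random observation

        # best matching unit (BMU): the neuron whose weights are closest to x
        bmu = argmin([norm(x .- W[k, :]) for k in 1:K])

        for k in 1:K
            d2 = (coords[k][1] - coords[bmu][1])^2 + (coords[k][2] - coords[bmu][2])^2
            h = exp(-d2 / (2 * radius^2))        # Gaussian neighborhood weight
            W[k, :] .+= α * h .* (x .- W[k, :])  # pull neuron k toward x
        end

        radius = max(0.99 * radius, 0.1) # shrink the neighborhood, with a floor
    end

    return W
end

# Usage on made-up data: two well-separated blobs in two dimensions
X = vcat(randn(MersenneTwister(1), 20, 2), randn(MersenneTwister(2), 20, 2) .+ 10.0)
W = toy_som(X)
```

Each row of `W` is a neuron's weight vector. Neurons that are neighbors on the lattice are pulled toward the same observations, which is what gives the map its topology-preserving character.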

## Setup
We set up the computational environment by including the `Include.jl` file. The `Include.jl` file loads external packages, various functions that we will use in the exercise, and custom types to model the components of our lab problem.

In [3]:
include("Include.jl");

learn (generic function with 1 method)

## Prerequisites: Constants and measurement data
Fill me in

In [5]:
dataset = MySyntheticDataset() |> d-> d["ensemble"]; # optional keyword arg visit::Int where visit = {1 | 2 | 3}

The keys of the dataset dictionary are the `actual` patient indexes. Each key points to the `synthetic` patient measurement vectors constructed by building a model of the original data distribution. To explore this data, specify an original patient index (one of the keys of the dataset dictionary) in the `original_patient_index::Int` variable:

In [27]:
original_patient_index = 5; # i ∈ {keys}

Next, we'll build a data matrix with the `synthetic` measurement vectors for the specified original patient index, one row per synthetic patient and one column per measurement. We'll store this in the `D::Array{Float64,2}` matrix.

In [78]:
D = let

    M = dataset[original_patient_index];
    number_of_rows = length(M); # number of synthetic patients
    number_of_cols = length(M[1]); # number of measurements (features)
    D = Array{Float64,2}(undef, number_of_rows, number_of_cols);

    # M is indexed from 0 here, while the rows of D are indexed from 1
    for i ∈ 0:(number_of_rows - 1)
        for j ∈ 1:number_of_cols
            D[i+1,j] = M[i][j];
        end
    end
    
    D;
end

101×33 Matrix{Float64}:
 1.0       3.11674  1.2722     …  27.3417  100.0     44.1833  1476.5
 0.907661  1.34553  0.0467624     26.8426  100.123   41.236   1298.9
 0.914448  1.42047  3.24366       36.7955  100.119   54.2847  1436.83
 0.9014    2.04816  2.75776       45.0829   99.9167  62.6917  1651.85
 0.90399   1.34557  2.17829       25.1745  100.027   32.6987  1716.12
 0.901527  1.80023  2.70127    …  28.6956   99.8424  41.7497  1720.1
 0.901474  1.87665  0.561092      17.8382   99.9466  28.0801  1707.69
 0.912968  1.75272  1.68913       16.0171   99.5636  33.9347  1309.5
 0.903468  1.74534  0.79968       30.9937   99.8898  47.095   1563.59
 0.904401  1.69023  1.24625       40.9327   99.4321  58.1156  1461.08
 0.906582  1.51657  1.64198    …  24.3037   99.769   39.5752  1612.7
 0.91318   1.58922  0.332276      22.7526  100.095   33.2077  1514.16
 0.915021  1.76303  0.359262      32.9741   99.9894  56.474   1519.62
 ⋮                             ⋱             ⋮                
 0.91456
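
The loop above can also be written more compactly. The sketch below uses a toy stand-in for `dataset[original_patient_index]`, assuming it behaves like a dictionary keyed by `0, 1, …` whose values are equal-length vectors. The names `M_toy` and `D_toy` and the values are made up for illustration.

```julia
# Toy stand-in for dataset[original_patient_index]: keys 0..2, three features each (made-up values)
M_toy = Dict(0 => [1.0, 2.0, 3.0], 1 => [4.0, 5.0, 6.0], 2 => [7.0, 8.0, 9.0])

# Collect the vectors in key order, place them side by side as columns, then transpose
# so that each measurement vector becomes one row of the data matrix
D_toy = permutedims(reduce(hcat, [M_toy[i] for i in 0:(length(M_toy) - 1)]))
```

The explicit double loop and this version build the same matrix. The loop is easier to step through, while the comprehension is shorter.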

Finally, let's set up some constants that we'll use later.

In [84]:
number_of_neurons = 100; # how many neurons are we going to use?
number_of_examples = size(D,1); # number of synthetic patients
number_of_features = size(D,2); # number of features (measurements)

## Task 1: Set up the SOM model instance
Fill me in

In [None]:
model = let

    model = build(MySimpleSelfOrganizingMapModel, (
        number_of_neurons = number_of_neurons,
        number_of_features = number_of_features,
        α = (t::Int) -> 0.1, # constant learning rate function
        σ = (t::Int, radius::Float64) -> 0.99*radius, # neighborhood radius function
    ));

    model; # return the model instance
end
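
The `α` and `σ` entries are plain functions of the iteration counter `t`. The sketch below shows what they evaluate to over a few iterations, assuming (this is not confirmed by the lab code) that the trainer calls `σ(t, radius)` with the previous radius at every step. Under that assumption the radius decays geometrically, as $0.99^{t}$ times its initial value. The names `α_demo`, `σ_demo`, and `radius_schedule`, and the initial radius of `10.0`, are hypothetical.

```julia
α_demo = (t::Int) -> 0.1                            # constant learning rate
σ_demo = (t::Int, radius::Float64) -> 0.99 * radius # shrink the radius by 1% per call

# Hypothetical helper: apply σ repeatedly, starting from initial_radius
function radius_schedule(σ::Function, initial_radius::Float64, steps::Int)
    radii = Vector{Float64}(undef, steps + 1)
    radii[1] = initial_radius
    for t in 1:steps
        radii[t + 1] = σ(t, radii[t]) # assumed: the previous radius is fed back in
    end
    return radii
end

radii = radius_schedule(σ_demo, 10.0, 100)
```

A slowly shrinking radius lets distant neurons move together early in training, and then restricts updates to the immediate neighbors of the best matching neuron later on.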

## Task 2: Learn the SOM weight parameters
Fill me in