# Example Demonstrating Metric Repair

The first step is to generate the Swiss roll data set. This can be done using the following snippet of code:

In [27]:
using Plots, Interact, MRMissing, Distances
plotly()

function genData(n)
    theta = sort(pi*rand(n))
    x = cos.(3*theta).*theta/3
    y = sin.(3*theta).*theta/3
    z = rand(n)
    return [x y z]
end

genData (generic function with 1 method)

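Now we can visualize the data. Because $\texttt{genData}$ sorts $\texttt{theta}$, consecutive rows lie next to each other along the roll, so plotting the rows in five blocks of 400 gives each segment of the roll its own color: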

In [29]:
n = 2000

p = genData(n)
scatter(p[1:400,1], p[1:400,2], p[1:400,3])
scatter!(p[401:800,1], p[401:800,2], p[401:800,3])
scatter!(p[801:1200,1], p[801:1200,2], p[801:1200,3])
scatter!(p[1201:1600,1], p[1201:1600,2], p[1201:1600,3])
scatter!(p[1601:end,1], p[1601:end,2], p[1601:end,3])

Once we have our data, let us take a look at the 2-dimensional unrolling of this data that is produced by $\texttt{Isomap}$:

In [5]:
X = MRMissing.Isomap(p,12,2);

In [30]:
scatter(X[1:400,1], X[1:400,2])
scatter!(X[401:800,1], X[401:800,2])
scatter!(X[801:1200,1], X[801:1200,2])
scatter!(X[1201:1600,1], X[1201:1600,2])
scatter!(X[1601:end,1], X[1601:end,2])

Now let us take the distance matrix used for the above projection, corrupt it a little, and see what happens when we then run Isomap. Here we corrupt the distance matrix by adding Gaussian noise to its entries while maintaining non-negativity and symmetry: negative entries are clamped to zero, the diagonal is reset to zero, and the result is symmetrized.

In [20]:
D = pairwise(Euclidean(1e-12), p', p', dims=2) # pairwise distances between the rows of p

# Add the noise while maintaining non-negativity
Dpert = max.(D + randn(n,n)/5,0)
for i = 1:n
    Dpert[i,i] = 0 
end

Dpert = (Dpert + Dpert')/2; # maintain symmetry

Dc = copy(Dpert); # keep a copy for later
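
The corruption above keeps the matrix symmetric and non-negative, but nothing forces it to keep satisfying the triangle inequality, which is the property a repair step has to restore. The following is a minimal sketch of how one might count such violations. The helper `count_triangle_violations` and the small random point set are assumptions made for illustration; they are not part of `MRMissing`, and the points are a stand-in rather than the Swiss roll.

```julia
using LinearAlgebra, Random

# Hypothetical helper: count ordered triples of distinct indices (i, j, k)
# for which D[i,j] > D[i,k] + D[k,j] + tol, i.e. the triangle inequality fails.
function count_triangle_violations(D::AbstractMatrix; tol=1e-9)
    n = size(D, 1)
    violations = 0
    for k in 1:n, j in 1:n, i in 1:n
        if i != j && j != k && i != k && D[i, j] > D[i, k] + D[k, j] + tol
            violations += 1
        end
    end
    return violations
end

Random.seed!(0)
m = 60                                    # small sample keeps the O(m^3) loop cheap
pts = rand(m, 3)                          # stand-in points in the unit cube
Dtrue = [norm(pts[i, :] - pts[j, :]) for i in 1:m, j in 1:m]

# Same corruption recipe as in the cell above
Dnoisy = max.(Dtrue + randn(m, m) / 5, 0)
for i in 1:m
    Dnoisy[i, i] = 0
end
Dnoisy = (Dnoisy + Dnoisy') / 2

println("violations (true distances):  ", count_triangle_violations(Dtrue))
println("violations (noisy distances): ", count_triangle_violations(Dnoisy))
```

The same function can be applied to a repaired matrix to check whether the triangle inequality has been restored. On the full 2000-point matrix the cubic loop is slow, so checking a random subset of rows and columns is a reasonable compromise.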

Now let us run Isomap on the corrupted distances as follows:

In [21]:
Dmin = MRMissing.Kmin(Dpert,12)
Dmani = MRMissing.apsp(Dmin)
Xp = mds(Dmani,2);
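
For readers who want to see what these three steps amount to, here is a rough, self-contained sketch of an Isomap-style pipeline in plain Julia: a k-nearest-neighbor graph, all-pairs shortest paths (Floyd-Warshall), and classical MDS. It is an assumption, based only on their names, that $\texttt{Kmin}$, $\texttt{apsp}$, and $\texttt{mds}$ play these roles; the helpers `knn_graph`, `shortest_paths`, and `classical_mds` below are hypothetical and are not the package's implementation. The sketch also assumes the neighbor graph is connected, since otherwise infinite distances would reach the MDS step.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper: keep only the edges from each point to its k nearest
# neighbors (excluding itself); all other entries become Inf.
function knn_graph(D::AbstractMatrix, k::Integer)
    n = size(D, 1)
    G = fill(Inf, n, n)
    for i in 1:n
        G[i, i] = 0.0
        others = [j for j in 1:n if j != i]
        for j in others[partialsortperm(D[i, others], 1:k)]
            G[i, j] = D[i, j]
            G[j, i] = D[i, j]              # keep the graph symmetric
        end
    end
    return G
end

# Hypothetical helper: Floyd-Warshall all-pairs shortest paths, O(n^3).
function shortest_paths(G::AbstractMatrix)
    S = copy(G)
    n = size(S, 1)
    for k in 1:n, j in 1:n, i in 1:n       # k is the outermost loop
        S[i, j] = min(S[i, j], S[i, k] + S[k, j])
    end
    return S
end

# Hypothetical helper: classical MDS on a distance matrix, returning n x d coordinates.
function classical_mds(S::AbstractMatrix, d::Integer)
    n = size(S, 1)
    J = I - fill(1 / n, n, n)              # centering matrix
    B = -0.5 * J * (S .^ 2) * J            # double-centered Gram matrix
    E = eigen(Symmetric(B))                # eigenvalues in ascending order
    idx = n:-1:(n - d + 1)                 # indices of the d largest eigenvalues
    vals = max.(E.values[idx], 0)          # clamp small negative eigenvalues
    return E.vectors[:, idx] .* sqrt.(vals)'
end

# Toy example: 50 points on a half circle, "unrolled" to one dimension.
t = collect(range(0, pi, length=50))
pts = [cos.(t) sin.(t)]
D = [norm(pts[i, :] - pts[j, :]) for i in 1:50, j in 1:50]

S = shortest_paths(knn_graph(D, 4))
Y = classical_mds(S, 1)
println("correlation with arc position: ", abs(cor(Y[:, 1], t)))
```

The shortest-path step is what replaces straight-line distances with distances measured along the neighbor graph, which is why noisy input distances can distort the result so badly: a few spuriously short edges create shortcuts across the manifold.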

In [31]:
scatter(Xp[1:400,1], Xp[1:400,2])
scatter!(Xp[401:800,1], Xp[401:800,2])
scatter!(Xp[801:1200,1], Xp[801:1200,2])
scatter!(Xp[1201:1600,1], Xp[1201:1600,2])
scatter!(Xp[1601:end,1], Xp[1601:end,2])

Here we can see that the low-dimensional projection computed from the corrupted distances has no structure whatsoever. In fact, all the points seem to have collapsed to the origin. Now let us use the saved copy of the corrupted distances: we run IOMR to correct the distances and see what happens.

In [34]:
Dfixed = Dc + IOMR(Dc) # fix the distances
Dfmin = MRMissing.Kmin(Dfixed,12)
Dfmani = MRMissing.apsp(Dfmin)
Xf = mds(Dfmani,2);

In [36]:
scatter(Xf[1:400,1], Xf[1:400,2])
scatter!(Xf[401:800,1], Xf[401:800,2])
scatter!(Xf[801:1200,1], Xf[801:1200,2])
scatter!(Xf[1201:1600,1], Xf[1201:1600,2])
scatter!(Xf[1601:end,1], Xf[1601:end,2])

Now that we have corrected the distances, the manifold structure reappears! The structure is not exactly what we obtained with the true distances, but it is far better than what we had with the corrupted distances.