# Clustering
* This notebook illustrates our method's application to clustering.
* The analogous bars method is used to identify topological features that are preserved during clustering.

In [1]:
using Revise
includet("../../../extension_method.jl")
includet("gen_points.jl")

In [2]:
using .ext
using .gen_points
using Distances
using Eirene
using Plots
using JLD

# 1. Load points
* The points are generated by the `generate_points` function in `points.jl`.
* The clusters and centroids are generated by the `get_centroids` function in `points.jl`.

In [3]:
# load points
data = load("points.jld")
P = collect(transpose(data["points"]))
Q = collect(transpose(data["centroids"]))
D = data["D"];

# count points and centroids (rows of P and Q)
n_points = size(P, 1)
n_centroids = size(Q, 1)

# split D into blocks: the first n_points rows/columns index P, the rest index Q
D_P = D[1:n_points, 1:n_points]           # pairwise distances within P
D_Q = D[n_points+1:end, n_points+1:end]   # pairwise distances within Q
D_P_Q = D[1:n_points, n_points+1:end]     # rows (landmarks): P, columns (witnesses): Q
D_Q_P = D[n_points+1:end, 1:n_points];    # rows (landmarks): Q, columns (witnesses): P
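
The block structure of `D` can be checked on a toy example. The sketch below is illustrative only: the names `P_toy`, `Q_toy`, and `D_toy` are hypothetical, and Euclidean distance is an assumption made for simplicity (the notebook's `D` is loaded from `points.jld` and may use a different metric).

```julia
using LinearAlgebra

# Toy data (hypothetical): 4 points and 2 centroids in the plane, one per row.
P_toy = [0.0 0.0; 1.0 0.0; 0.0 1.0; 1.0 1.0]
Q_toy = [0.5 0.0; 0.5 1.0]

# Stack P on top of Q and build the full distance matrix (Euclidean, by assumption).
X = vcat(P_toy, Q_toy)
n = size(X, 1)
D_toy = [norm(X[i, :] - X[j, :]) for i in 1:n, j in 1:n]

# Split into the same four blocks used in the notebook.
n_p = size(P_toy, 1)
D_P_toy   = D_toy[1:n_p, 1:n_p]          # within P
D_Q_toy   = D_toy[n_p+1:end, n_p+1:end]  # within Q
D_P_Q_toy = D_toy[1:n_p, n_p+1:end]      # rows: P, columns: Q
D_Q_P_toy = D_toy[n_p+1:end, 1:n_p]      # rows: Q, columns: P
```

Because `D_toy` is symmetric, the two cross blocks are transposes of each other, so they differ only in which set plays the role of landmarks.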


In [4]:
# plot points
plot_P_Q(P,Q)

# 2. Apply the analogous bars method

## 2(a) Compute four barcodes

In [5]:
# run VR persistence
VR_P = eirene(D_P, record = "all")
VR_Q = eirene(D_Q, record = "all");

# run Witness persistence
W_P = compute_Witness_persistence(D_P_Q, maxdim = 1)
W_Q = compute_Witness_persistence(D_Q_P, maxdim = 1);
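
The internals of `compute_Witness_persistence` are not shown in this notebook. As a rough illustration, the sketch below assumes the standard witness complex convention, in which a simplex on landmarks appears at parameter ε once some witness lies within ε of all of its vertices. The function `witness_edge_births` is hypothetical and only computes when each edge would appear.

```julia
# D_LW[i, j] = distance from landmark i to witness j
# (rows: landmarks, columns: witnesses, as in D_P_Q above).
function witness_edge_births(D_LW::AbstractMatrix{<:Real})
    n_landmarks = size(D_LW, 1)
    births = Dict{Tuple{Int,Int},Float64}()
    for i in 1:n_landmarks-1, j in i+1:n_landmarks
        # For each witness, take the larger of its distances to the two endpoints;
        # the edge appears at the smallest such value over all witnesses.
        births[(i, j)] = minimum(max.(D_LW[i, :], D_LW[j, :]))
    end
    return births
end

# Toy cross-distance matrix: 3 landmarks, 2 witnesses.
D_LW = [1.0 3.0;
        2.0 1.0;
        4.0 2.0]
births = witness_edge_births(D_LW)
```

Under this convention, swapping the roles of landmarks and witnesses amounts to transposing the input matrix, which is how `D_P_Q` and `D_Q_P` relate.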

In [6]:
# plot barcodes
barcode_P = barcode(VR_P, dim = 1)
barcode_W_PQ = barcode(W_P["eirene_output"], dim = 1)
barcode_W_QP = barcode(W_Q["eirene_output"], dim = 1)
barcode_Q = barcode(VR_Q, dim = 1)

p1 = plot_barcode(barcode_P, title = "barcode(VR(P))")
p2 = plot_barcode(barcode_W_PQ, title = "barcode(W(P,Q))", lw = 3)
p3 = plot_barcode(barcode_W_QP, title = "barcode(W(Q,P))", lw = 3)
p4 = plot_barcode(barcode_Q, title = "barcode(VR(Q))", lw = 3)

plot(p1, p2, p3, p4, layout = grid(4,1), size = (500, 700))
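
When deciding which bar to examine, it can help to rank bars by length. The sketch below assumes a barcode is stored as an n×2 array with one row per bar and columns (birth, death); `toy_barcode` is made-up data, not output from this notebook.

```julia
# Hypothetical barcode: one row per bar, columns are (birth, death).
toy_barcode = [0.2 0.5;
               0.1 1.4;
               0.3 0.4]

# Length (persistence) of each bar.
lengths = toy_barcode[:, 2] .- toy_barcode[:, 1]

# Bar indices ordered from longest to shortest.
ranked = sortperm(lengths, rev = true)
```

The first entry of `ranked` is the row index of the longest bar, which can then be passed on as the selected bar.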

## 2(b) Apply the similarity-centric analogous bars method

In [7]:
# select witness bar
W_PQ_bar = 2

# run similarity-centric analogous bars method
extension_P, extension_Q = run_similarity_analogous(VR_P =  VR_P, 
                                                    D_P = D_P, 
                                                    VR_Q = VR_Q, 
                                                    D_Q = D_Q, 
                                                    W_PQ = W_P, 
                                                    W_PQ_bar = W_PQ_bar, 
                                                    dim = 1);

Plot the baseline bar extensions

In [8]:
plot_analogous_bars(extension_P, extension_Q)

Plot the baseline cycle extensions

In [9]:
# select baseline cycle extension in P
epsilon0_P = extension_P["epsilon_0"]
cycle_P = extension_P["cycle_extensions"][epsilon0_P]["baseline"]

# select baseline cycle extension in Q
epsilon0_Q = extension_Q["epsilon_0"]
cycle_Q = extension_Q["cycle_extensions"][epsilon0_Q]["baseline"]

p1 = plot_cycle_square_torus(P, Q, cycle = cycle_P, cycle_loc = "P", title = "cycle extension to VR(P)"; legend = false)
p2 = plot_cycle_square_torus(P, Q, cycle = cycle_Q, cycle_loc = "Q", title = "cycle extension to VR(Q)"; legend = false)

plot(p1, p2, layout = grid(1,2), size = (600, 300))

## 2(c) Explore cycle extensions and bar extensions under fixed interval decompositions of `barcode(VR(P))` and `barcode(VR(Q))`
* (i) Plot the extension parameters for both `barcode(VR(P))` and `barcode(VR(Q))`.
* (ii) Find all cycle extensions and bar extensions (non-interactive).


### 2(c)(i) Plot parameters

In [10]:
p1 = plot_pY(extension_P, title = "barcode(VR(P))")
p2 = plot_pY(extension_Q, title = "barcode(VR(Q))")

plot(p1, p2, layout = grid(1, 2), size = (700, 300))

Both extensions have unique parameters.

### 2(c)(ii) Find all cycle extensions and bar extensions

In [11]:
CE_P, BE_P = find_CE_BE(extension_P)
CE_Q, BE_Q = find_CE_BE(extension_Q);

In [12]:
CE_P

Dict{Any,Any} with 1 entry:
  1.03942 => Dict{Any,Any}(0=>[[137, 179], [17, 137], [14, 179], [9, 194], [8, …

In [13]:
CE_Q

Dict{Any,Any} with 1 entry:
  1.99801 => Dict{Any,Any}(0=>[[3, 4], [2, 10], [6, 7], [1, 7], [8, 9], [5, 8],…

Both extensions have a unique cycle extension.
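
The printed cycle extensions are lists of edges, each given as a pair of vertex indices. A quick sanity check on such a list is sketched below, assuming Z/2 coefficients, where a set of distinct edges is a cycle exactly when every vertex has even degree. The helpers `vertex_degrees` and `is_cycle` and the data `toy_cycle` are hypothetical.

```julia
# A cycle as a list of edges, each edge a pair of vertex indices (toy data).
toy_cycle = [[1, 2], [2, 3], [3, 4], [1, 4]]

# Count how many edges touch each vertex.
function vertex_degrees(cycle)
    deg = Dict{Int,Int}()
    for (u, v) in cycle
        deg[u] = get(deg, u, 0) + 1
        deg[v] = get(deg, v, 0) + 1
    end
    return deg
end

# Over Z/2 (an assumption), the boundary vanishes iff every vertex has even degree.
is_cycle(cycle) = all(iseven, values(vertex_degrees(cycle)))

# Sorted list of vertices that the cycle passes through.
cycle_vertices = sort(collect(keys(vertex_degrees(toy_cycle))))
```

The same functions could be applied to an edge list taken from `CE_P` or `CE_Q` to list the points or centroids that a cycle extension visits.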

## 2(d) Explore the bar extension result under alternative interval decompositions of `C_VR`
* Up to this point, the bar extension result has been obtained for some fixed interval decompositions of `barcode(VR(P))` and `barcode(VR(Q))`. 
* In this section, we explore all alternative bar extensions. 

In [17]:
# find alternative bar extensions
alt_BE_P = find_alt_BE(extension_P, BE_P)
alt_BE_Q = find_alt_BE(extension_Q, BE_Q);

In [18]:
alt_BE_P

Dict{Any,Any} with 1 entry:
  1.03942 => Any[[24]]

In [19]:
alt_BE_Q

Dict{Any,Any} with 1 entry:
  1.99801 => Any[[1]]

There are no alternative bar extensions.
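
This check can be written programmatically. The sketch below assumes one particular reading of the output shape: a dictionary mapping each parameter to a list of bar extensions, where more than one entry in the list would indicate alternatives. The function `has_alternatives` and both toy dictionaries are hypothetical.

```julia
# Toy dictionaries shaped like the outputs above: parameter => list of bar extensions.
no_alternatives   = Dict(1.0 => [[24]])
with_alternatives = Dict(1.0 => [[24], [24, 7]])

# Assumed reading: more than one listed bar extension for a parameter
# means alternative bar extensions exist.
has_alternatives(d) = any(length(exts) > 1 for exts in values(d))
```

With this reading, a dictionary whose every value holds a single bar extension, as in the outputs above, reports no alternatives.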