# Example Extension Method: Vietoris-Rips to Vietoris-Rips
* This notebook shows an application of the <b>bar-to-bars extension method</b> to compare barcodes obtained from two different Vietoris-Rips filtrations. 
* <b> Implementation </b>: Our function implements a component-wise bar-to-bars extension method with $\mathbb{F}_2$ coefficients. It assumes that all bars of barcodes have unique death times. 
* <b> Comparing VR to VR </b>: Given a point cloud `P`, let `D_Z` and `D_Y` be two distinct distance functions on `P`. Let `Z` and `Y` be the two resulting Vietoris-Rips filtrations (corresponding to $Z^{\bullet}$ and $Y^{\bullet}$ in the paper). Given a selected bar `Z_bar` of `barcode(Z)`, the extension method finds all of its representations in `barcode(Y)`. 
* <b> Example data</b>: We study how a topological feature in a point cloud is preserved during dimensionality reduction. In particular, we let `Z` denote the Vietoris-Rips filtration resulting from the original point cloud, and we let `Y` denote the Vietoris-Rips filtration resulting from the reduced-dimension point cloud. 
* <b> Contents </b>
    1. Load points and visualize
    2. Examine the two VR barcodes
    3. Apply bar-to-bars extension method
    4. Explore cycle extension & bar extension under fixed interval decomposition of `barcode(Y)`.
    5. Explore alternative bar extensions under all possible interval decompositions of `barcode(Y)`.

In [1]:
using Revise
includet("../../../extension_method.jl")



In [2]:
using .ext
using Distances
using Eirene
using JLD
using Plots
using Printf

# 1. Load points and visualize
* `P_original`: points sampled from a trefoil knot
* `P_pca`: result of PCA into two dimensions
* We'll use `P_original` and `P_pca` to compute the distance matrices `D_original` and `D_pca` on `P`
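
The notebook loads a precomputed `P_pca` and does not show how it was produced. Purely as an illustration, a two-dimensional PCA projection of an `n × 3` point matrix could be sketched as below. `pca_project` is a hypothetical helper (it is not part of `extension_method.jl`), and the random matrix `X` only stands in for real data.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper (assumption, not from the notebook): project the rows of X
# onto their top-k principal components using the SVD of the centered data.
function pca_project(X::AbstractMatrix, k::Integer)
    Xc = X .- mean(X, dims = 1)   # center each coordinate
    F = svd(Xc)                   # Xc = U * Diagonal(S) * V'
    return Xc * F.V[:, 1:k]       # n × k matrix of principal-component scores
end

X = randn(200, 3)        # stand-in for an n × 3 point cloud
X2 = pca_project(X, 2)
size(X2)                 # (200, 2)
```

Rows are treated as points here, matching the `dims = 1` convention used for the distance matrices later in the notebook.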

In [3]:
# load points 
P_original = load("points.jld2", "trefoil_knot")
P_pca = load("points_2D.jld", "points_2d");



In [None]:

# the original points were generated using the following code. 
# P_pca was obtained by 
"""
### Option 1 
# number of points
n = 200

# sample angles
t = rand(n) * 2 * π

# torus radii
R1 = 2
R2 = 1

P_x = (R1 .+ R2 .* cos.(3 .* t) ).* cos.(2 .* t) 
P_y = (R1 .+ R2 .* cos.(3 .* t)) .* sin.(2 .* t) 
P_z = R2 .* sin.(3 .* t)  

P = cat(P_x, P_y, P_z, dims = 2);


### Option 2
# number of points
n = 200

# sample angles
t = rand(n) * 2 * π

# coordinates
P_x = sin.(t) + 2 * sin.(2 * t)
P_y = cos.(t) - 2 * cos.(2 * t)
P_z = - sin.(3 * t);

P = cat(P_x, P_y, P_z, dims = 2)
"""

Plot original points in 3D

In [4]:
# plot points
scatter3d(P_original[:,1], P_original[:,2], P_original[:,3], label = "", xaxis = nothing, yaxis = nothing, zaxis = nothing)

Plot 2-dimensional points

In [5]:
# plot PCA
plot(P_pca[:,1], P_pca[:,2], 
    seriestype = :scatter, 
    label = "",
    framestyle = :box,
    xaxis = nothing,
    yaxis = nothing,
    markersize = 8,
    title = "visualization of PCA",
    size = (300, 300)
    )

# 2. Examine the two VR barcodes

* The extension method compares two filtrations built on the point cloud `P`.
* Compute the following two distance matrices:
    * `D_original`: Distance computed from `P_original` 
    * `D_pca`: Distance computed from `P_pca` 
* Consider the two resulting filtrations:
    * `C_original`: Vietoris-Rips filtration on `D_original`
    * `C_pca`: Vietoris-Rips filtration on `D_pca`
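
To make the construction concrete: in a Vietoris-Rips filtration built from a distance matrix `D`, a simplex is present at scale `ε` once all pairwise distances among its vertices are at most `ε` (assuming the common convention that compares distances with `ε` itself). The sketch below lists only the edges present at a given scale. `rips_edges` is a hypothetical helper for illustration, not a function of Eirene or of `extension_method.jl`.

```julia
# Hypothetical helper: edges (i, j), i < j, present in the Vietoris-Rips
# complex at scale ε, i.e. pairs of points with D[i, j] <= ε.
function rips_edges(D::AbstractMatrix, ε::Real)
    n = size(D, 1)
    return [(i, j) for i in 1:n for j in i+1:n if D[i, j] <= ε]
end

# Toy distance matrix on three points
D = [0.0 1.0 2.0;
     1.0 0.0 1.5;
     2.0 1.5 0.0]

rips_edges(D, 1.5)   # [(1, 2), (2, 3)]
```

Because `D_original` and `D_pca` differ, the same pair of points can enter the two filtrations at different scales, which is why the two barcodes differ.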

## 2(a) Prepare distance matrices

In [6]:
D_original = pairwise(Euclidean(), P_original, P_original, dims = 1)
D_pca = pairwise(Euclidean(), P_pca, P_pca, dims = 1);
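
With `dims = 1`, `pairwise` treats each row of the input matrix as one point, so each result is an `n × n` matrix. A minimal hand-written equivalent makes that convention explicit. `euclidean_distance_matrix` is a hypothetical helper used only for illustration.

```julia
using LinearAlgebra

# Hypothetical helper: Euclidean distance matrix where each ROW of P is a point.
function euclidean_distance_matrix(P::AbstractMatrix)
    n = size(P, 1)
    D = zeros(n, n)
    for i in 1:n, j in i+1:n
        d = norm(P[i, :] .- P[j, :])
        D[i, j] = d
        D[j, i] = d      # the matrix is symmetric
    end
    return D
end

P_toy = [0.0 0.0;
         3.0 4.0;
         0.0 1.0]
D_toy = euclidean_distance_matrix(P_toy)
D_toy[1, 2]   # 5.0
```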

## 2(b) Observe the two VR barcodes

In [7]:
# run Eirene
C_original = eirene(D_original)
C_pca = eirene(D_pca);

# specify dimension of interest
dim = 1

# compute barcodes
barcode_orig = barcode(C_original, dim = dim)
barcode_pca = barcode(C_pca, dim = dim)

# plot two barcodes
p1 = plot_barcode(barcode_orig, title = "barcode(C_original) in dim 1", lw = 3)
p2 = plot_barcode(barcode_pca, title = "barcode(C_pca) in dim 1", lw = 3)
plot(p1, p2, layout = grid(2,1), size = (500, 500))

In [28]:
class2 = classrep(C_pca, dim = 1, class = 2)
class3 = classrep(C_pca, dim = 1, class = 3)
class4 = classrep(C_pca, dim = 1, class = 4)
class5 = classrep(C_pca, dim = 1, class = 5)

class2 = [class2[:,i] for i in 1:size(class2,2)]
class3 = [class3[:,i] for i in 1:size(class3,2)]
class4 = [class4[:,i] for i in 1:size(class4,2)]
class5 = [class5[:,i] for i in 1:size(class5,2)];
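
The comprehensions above split the matrix returned by `classrep` into one vector per column, presumably because `plot_cycle_single` expects a cycle as a list of edges. An equivalent form using `eachcol` is sketched below on a small stand-in matrix. The values are made up and are not Eirene output.

```julia
# Stand-in for a class representative: each column is one edge,
# given as a pair of vertex indices (made-up values).
rep = [1 2 3;
       2 3 1]

# Column-by-column comprehension, as in the cell above
edges_a = [rep[:, i] for i in 1:size(rep, 2)]

# Equivalent form; collect copies each column view into its own Vector
edges_b = [collect(col) for col in eachcol(rep)]

edges_a == edges_b   # true
```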

In [34]:
plot_cycle_single(transpose(P_pca), cycle = class2)

In [38]:
plot_cycle_single(transpose(P_pca), cycle = class3)

In [36]:
plot_cycle_single(transpose(P_pca), cycle = class4)

In [37]:
plot_cycle_single(transpose(P_pca), cycle = class5)

We'll now use the bar-to-bars extension method to understand how the long bar in `barcode(C_original)` is represented in `barcode(C_pca)`.

# 3. Apply bar-to-bars extension method

<b> Hover over the above barcode to select the interval of interest. </b>
* We selected bar 31 (the long bar) in the top barcode.
* Let `Z` denote the filtration that contains the selected bar. In this case, `Z` refers to the Vietoris-Rips filtration on the original point cloud (`C_original`). 
* Let `Y` denote the other filtration. In this case, `Y` refers to the Vietoris-Rips filtration on the reduced-dimension point cloud (`C_pca`). 
* The extension method will then find all representations of bar 31 in the barcode of `C_pca`.

<b> Using the bar-to-bars extension method </b>
* Since we are implementing the bar-to-bars extension method to compare two VR filtrations, we'll use the function `run_extension_VR_to_VR_bar`.

In [8]:
# set variables
VR_Z = C_original
D_Z = D_original
VR_Y = C_pca
D_Y = D_pca
Z_bar = 31

# run bar-to-bars extension method
extension = run_extension_VR_to_VR_bar(C_Z = VR_Z,
                                       D_Z = D_Z,
                                       C_Y = VR_Y,
                                       D_Y = D_Y,
                                       Z_bar = Z_bar);

# 4. Explore the cycle extension & bar extension under fixed interval decomposition of `C_Y`
* As mentioned in "code_details.pdf", the function `run_extension_VR_to_VR_bar()` computes the component-wise cycle and bar extensions. This section illustrates the use of various functions to explore all cycle extensions and bar extensions.
* For bar extensions, we only consider the result under a fixed interval decomposition of $PH_k(Y^{\bullet})$. 
* This section is organized as follows.  
    * (a) Plotting the parameters `p_Y`.   
    * (b) Interactively exploring the baseline and offset bar extensions at various parameters.  
    * (c) Finding all cycle extensions and bar extensions (non-interactive). 
* Both subsections (b) and (c) illustrate how one may understand the bar extensions. If the barcodes of `C_auxiliary_filtration` and `C_Y` contain a large number of bars, the non-interactive method may take a while.

## 4(a) Plot all nontrivial `p_Y`

Plot all values of `extension["nontrivial_pY"]` on the target barcode $\text{BC}_k(Y^{\bullet})$.

In [9]:
plot_pY(extension, lw = 4)

## 4(b) Interactive exploration of baseline and offset bar extensions
* We use the function `return_extension_results_at_parameter()`, an interactive function that asks the user to select the following:
    * A parameter in `p_Y` 
    * Offset bar extensions
* The function shows the baseline bar extension at the selected parameter, along with the final bar extension (baseline bar extension + selected offset bar extensions).
* The function returns a plot object that highlights the final bar extension.

In [19]:
p = return_extension_results_at_parameter(extension);

*** Parameter key, value pair *** 
key: 1 parameter: 1.027719 
key: 2 parameter: 1.422955 
key: 3 parameter: 1.440683 
key: 4 parameter: 1.682451 
key: 5 parameter: 1.684553 
key: 6 parameter: 1.685191 
key: 7 parameter: 1.686671 
key: 8 parameter: 1.690335 
key: 9 parameter: 1.691053 
key: 10 parameter: 1.695975 
key: 11 parameter: 1.701137 
key: 12 parameter: 1.703031 
key: 13 parameter: 1.707123 
key: 14 parameter: 1.709780 
key: 15 parameter: 1.709919 
key: 16 parameter: 1.711123 
key: 17 parameter: 1.716143 
key: 18 parameter: 1.717690 
key: 19 parameter: 1.723076 
key: 20 parameter: 1.726520 
key: 21 parameter: 1.726885 
key: 22 parameter: 1.728641 
key: 23 parameter: 1.730666 
key: 24 parameter: 1.731487 
key: 25 parameter: 1.733846 
key: 26 parameter: 1.734056 
key: 27 parameter: 1.735527 
key: 28 parameter: 1.736430 
key: 29 parameter: 1.737819 
key: 30 parameter: 1.737966 
key: 31 parameter: 1.738558 
key: 32 parameter: 1.738914 
key: 33 parameter: 1.739488 
key: 34 parameter


Select a key for parameter 1


Selected parameter: 1.0277185186806626

Baseline bars extension at selected parameter: [5]

*** Offset bar extensions at selected parameter *** 
key: 1 offset bar extension: [2]
key: 2 offset bar extension: [3]
key: 3 offset bar extension: [4]



Select keys for offset bar extensions. 
Leave blank to select none. 
To select multiple keys, separate keys with space. ex) 1 2 3 :  1



Baseline bars extension at selected parameter: [5]
Selected offset bar extension: [2]
Final bar extensions: [5, 2]

In [20]:
plot(p, lw = 5)

## 4(c) All cycle extensions and bar extensions 
* We use the function `find_CE_BE()` to find all cycle extensions and bar extensions at every parameter.
* Let `CE`, `BE` be the outputs of the function.
* Given a parameter `param`, `CE[param]` is a dictionary. Its keys are indices and its values are the cycle extensions at that parameter. 
    * That is, `CE[param][i]` is the i-th cycle extension at the given parameter.
* Given parameter `param` and an index `i` of the cycle extension, `BE[param][i]` is the bar extension of the corresponding cycle extension. 
* If one wishes to compute the cycle extensions and the bar extensions only at a specific parameter, use the function `find_CE_BE_at_param(extension, param)`. This will result in the same output as `CE[param]` and `BE[param]`. 
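
The nested layout described above can be pictured with mock data. In the sketch below, `CE_mock` and `BE_mock` are made-up stand-ins: the real dictionaries come from `find_CE_BE(extension)`, and the value types used here (edge lists and bar-index lists) are assumptions. The keys start at 0 to mirror the indices used in the plotting cells of this notebook.

```julia
# Mock data only (assumption): cycle extensions as lists of edges,
# bar extensions as lists of bar indices, both keyed by parameter and index.
CE_mock = Dict(1.03 => Dict(0 => [[1, 2], [2, 3], [3, 1]],
                            1 => [[1, 2], [2, 4], [4, 1]]))
BE_mock = Dict(1.03 => Dict(0 => [5],
                            1 => [2, 5]))

param_mock = 1.03

# Iterate over the indices in sorted order instead of hard-coding a range
indices = sort(collect(keys(CE_mock[param_mock])))
for i in indices
    println("cycle extension ", i, ": ", length(CE_mock[param_mock][i]),
            " edges, bar extension ", BE_mock[param_mock][i])
end
```

Looking up the keys this way avoids having to know in advance how many cycle extensions exist at a parameter.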


In [21]:
# find all cycle extensions & bar extensions (under a fixed interval decomposition of the target filtration) 
# at every parameter
CE, BE = find_CE_BE(extension);

<b> Plot cycle extensions </b>

In [22]:
# select parameter
param = extension["nontrivial_pY"][1]
@printf("number of cycle extensions at parameter %.4f : %i", param, length(CE[param]))

number of cycle extensions at parameter 1.0277 : 8

In [23]:
# plot the 8 cycle extensions at selected parameter
# (the loop assumes the keys of CE[param] are 0, ..., 7)
ms = 3
titlefontsize = 10

p_objects = []
for i = 0:7
    p = plot_cycle_single(transpose(P_pca), cycle = CE[param][i], markersize = ms, title = "cycle extension "*string(i), titlefontsize = titlefontsize)
    push!(p_objects, p)
end

plot(p_objects..., layout = grid(2, 4), size = (700, 300))

Plot the <b>bar extensions</b>
* First select a parameter
* Then select a cycle extension 
* Find and plot the corresponding bar extensions

In [24]:
# select parameter 
param = extension["nontrivial_pY"][1]
@printf("number of cycle extensions at parameter %.4f : %i", param, length(CE[param]))

number of cycle extensions at parameter 1.0277 : 8

In [25]:
# select cycle extension 
y = 0

# find the corresponding bar extension
be = BE[param][y]

# plot the bar extension
barcode_pca = barcode(C_pca, dim = dim)
p = plot_barcode(barcode_pca, title = "selected bar extension", lw = 3, selected_bars = be, epsilon= param, v_line = [param])
plot(p, size = (500, 300))

For speed, one may wish to find the cycle extensions and bar extensions only at a specific parameter. To do so, use the function `find_CE_BE_at_param()`.

In [40]:
# select parameter
param = extension["nontrivial_pY"][1]
CE_param, BE_param = find_CE_BE_at_param(extension, param);

# one can now re-run the CE and BE plotting functions

# 5. Explore the bar extension result under alternative interval decompositions of `C_Y`
* Up to this point, the bar extension result has been obtained for some fixed interval decomposition $\mathcal{D}:\mathbb{I}_{\text{BC}_k(Y^{\bullet})} \to H_k(Y^{\bullet})$. In particular, we used the default interval decomposition that is used by Eirene.
* In this section, we present various functions that allow us to find the full bar extensions under all possible interval decompositions of $PH_k(Y^{\bullet})$. The goal of this section is to explore $S(\tau, Y^{\bullet})$ from Algorithm 3 of the paper. We'll refer to this set as <b>alternative bar extensions</b> since these arise from alternative choices of the interval decomposition.
* We present three different methods for exploring the collection of bar extensions $S(\tau, Y^{\bullet})$. The appropriate tool depends on the sizes of the barcodes of `C_auxiliary_filtration` and `C_Y`. 

1. Find all alternative bar extensions for all parameters.  
    * This is recommended for data with small barcodes. 
    * This finds the full $S(\tau, Y^{\bullet}) = \{ S^{\mathcal{D} \circ L^{-1}}_{[y]} \mid \ell \in p_Y, [y] \in \mathfrak{E}_{\ell}, L \in L_Y \}$ in Algorithm 3 of the paper.
2. Find alternative bar extensions at specific parameters.  
    * This is recommended for data with medium-sized barcodes.
    * Given a parameter $\ell$, this method finds $S(\tau, Y^{\bullet}; \ell) = \{ S^{\mathcal{D} \circ L^{-1}}_{[y]} \mid [y] \in \mathfrak{E}_{\ell}, L \in L_Y \}$.
3. Find alternative bar extensions of a specific bar extension.
    * This is recommended for data with large barcodes.
    * Given a selected parameter $\ell$ and cycle extension $[y] \in \mathfrak{E}_{\ell}$, this method finds $\{S^{\mathcal{D} \circ L^{-1}}_{[y]} \mid L \in L_Y \}$. 
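
Read directly off the three definitions above, the methods are nested: the set from method 3 is one piece of the set from method 2, which in turn is one piece of the full set from method 1.

$$
S(\tau, Y^{\bullet}) = \bigcup_{\ell \in p_Y} S(\tau, Y^{\bullet}; \ell),
\qquad
S(\tau, Y^{\bullet}; \ell) = \bigcup_{[y] \in \mathfrak{E}_{\ell}} \{ S^{\mathcal{D} \circ L^{-1}}_{[y]} \mid L \in L_Y \}.
$$

So restricting to one parameter, or to one cycle extension, trades completeness for a smaller computation.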


In this notebook, we implement (1): find all alternative bar extensions.

In [41]:
# find all bar extensions at all parameters (for a FIXED interval decomposition of C_Y)
_, BE = find_CE_BE(extension)

# find all alternative bar extensions (under alternative interval decompositions of C_Y)
BE_alt = find_alt_BE(extension, BE);

Exploring the results: Given a parameter `param`, `BE_alt[param]` returns all possible bar extensions at the given parameter.

In [42]:
# select parameter
param = extension["nontrivial_pY"][1]

BE_alt[param]

8-element Array{Any,1}:
 [5]
 [2, 5]
 [3, 5]
 [4, 5]
 [2, 3, 5]
 [2, 4, 5]
 [3, 4, 5]
 [2, 3, 4, 5]
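
In this example, the eight lists above coincide with the baseline bar extension `[5]` joined with every subset of the offset bars `2`, `3`, `4` reported in section 4(b). The sketch below reproduces that enumeration as a consistency check only. It is not a reimplementation of `find_alt_BE`, `baseline_plus_offset_subsets` is a hypothetical helper, and in general offset bar extensions may contain several bars, in which case combining them with $\mathbb{F}_2$ coefficients need not be a plain union.

```julia
# Values copied from the interactive output in section 4(b) at parameter 1.0277
baseline = [5]
offsets = [2, 3, 4]   # each offset bar extension is a single bar in this example

# Hypothetical helper: join the baseline with every subset of the offset bars
function baseline_plus_offset_subsets(baseline, offsets)
    result = Vector{Vector{Int}}()
    for mask in 0:(2^length(offsets) - 1)
        # bit k of mask decides whether offsets[k] is included
        chosen = [offsets[k] for k in 1:length(offsets) if ((mask >> (k - 1)) & 1) == 1]
        push!(result, sort(vcat(baseline, chosen)))
    end
    # order by size, then lexicographically, to match the listing above
    return sort(result, by = x -> (length(x), x))
end

candidates = baseline_plus_offset_subsets(baseline, offsets)
length(candidates)   # 8
```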

In [43]:
# plot one of the alternative bar extensions

# select an alternative bar extension
alt = BE_alt[param][2]
barcode_Y = barcode(extension["C_Y"], dim = dim)
p = plot_barcode(barcode_Y, selected_bars = alt, lw = 3,
                    epsilon = param, v_line = [param],
                    title = "alternative intervals")
plot(p)