# Embedding Visualization

Embedding visualization methods that have proven to be meaningful.

## Choose Train Run

In [None]:
# ==== MNIST ========
dataset = "mnist"

run_id = "run-0011-CNN_mnist_32_0.9776"
#run_id = "run-0012-CNN_mnist_32_0.9768"
#run_id = "run-0013-CNN_mnist_32_0.9797"
#run_id = "run-0014-CNN_mnist_32_0.9744"

In [None]:
# ==== CIFAR 10 ========
dataset = "cifar10"

# Residual
run_id = "run-0016-CNN_cifar10_128_0.8093" # Seed 42, SAM
# run_id = "run-0018-CNN_cifar10_128_0.8499" # Seed 42
# run_id = "run-0020-CNN_cifar10_128_0.8079" # Seed 11, SAM
# run_id = "run-0022-CNN_cifar10_128_0.8519" # Seed 11

# No Residual
# run_id = "run-0017-CNN_cifar10_128_0.8072" # Seed 42, SAM
# run_id = "run-0019-CNN_cifar10_128_0.8487" # Seed 42
# run_id = "run-0021-CNN_cifar10_128_0.8054" # Seed 11, SAM
# run_id = "run-0023-CNN_cifar10_128_0.8509" # Seed 11

In [None]:
dataset = "cifar10"
run_id = "run-0041-ViT_cifar10_256_0.8107"

In [None]:
from helper.visualization import Run
run = Run(run_id, dataset)

## The Training

In [None]:
run.plot_training_records()

### Confusion Matrix Development

In [None]:
%matplotlib ipympl
%matplotlib widget
run.confusion_matrix(annotate=True)

## Embedding Drift & CKA Similarities

The embedding drift measures how far the embeddings move between training snapshots:
- **Multi-scale skips:** for each snapshot index `i`, compare its embedding `E_i` to earlier snapshots `E_{i - 2**n}` for `n = 0,1,…,4` (skip lengths 1, 2, 4, 8, 16).
- **Mean Euclidean distance:**
  ```python
  drift = np.linalg.norm(current_snapshot - previous_snapshot, axis=1).mean()
  ```
- **Result:** a dict mapping each skip length to a time series of drift values, showing how rapidly, and at what scales, the embedding space is evolving.

In [None]:
run.plot_embedding_drifts()
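
The drift computation described above can be sketched as follows. This is a minimal illustration, not the code behind `run.plot_embedding_drifts()`: the function name `embedding_drifts`, the `(T, N, D)` array layout (snapshots, samples, embedding dimensions), and the toy data are all assumptions made for this example.

```python
import numpy as np

def embedding_drifts(snapshots, max_power=4):
    """Mean Euclidean drift between snapshot i and snapshot i - 2**n.

    snapshots: array of shape (T, N, D), assumed layout (hypothetical).
    Returns a dict {skip_length: array of T - skip_length drift values}.
    """
    snapshots = np.asarray(snapshots, dtype=float)
    drifts = {}
    for n in range(max_power + 1):
        skip = 2 ** n
        if skip >= len(snapshots):
            break  # not enough snapshots for this skip length
        # Pair every snapshot i >= skip with snapshot i - skip.
        diff = snapshots[skip:] - snapshots[:-skip]  # (T - skip, N, D)
        # Per-sample Euclidean distance, averaged over the N samples.
        drifts[skip] = np.linalg.norm(diff, axis=2).mean(axis=1)
    return drifts

# Toy data: a random walk in embedding space (not real training output).
rng = np.random.default_rng(0)
toy_snapshots = np.cumsum(rng.normal(size=(20, 100, 8)), axis=0)
toy_drifts = embedding_drifts(toy_snapshots)
```

Each entry of `toy_drifts` is shorter than the number of snapshots by its skip length, because the first `skip` snapshots have no earlier partner to compare against.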

This plot shows **1 − CKA similarity** over time, representing the **structural change** in the embedding space.
Lower values indicate high similarity (stable structure), while higher values reflect greater representational drift.
It allows direct comparison with Euclidean embedding drift and helps identify when and how much the internal structure evolves during training.
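
For reference, a `1 − CKA` distance between two snapshots could be computed as in the sketch below. It assumes the linear variant of CKA; the notebook does not state which variant `run.plot_cka_similarities` uses, and the names `linear_cka` and `cka_distance` are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two embedding matrices of shape (N, D1) and (N, D2).

    Rows must refer to the same N samples in both matrices.
    """
    # Center each feature so the comparison ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)

def cka_distance(X, Y):
    """1 - CKA: 0 for structurally identical embeddings, larger for more change."""
    return 1.0 - linear_cka(X, Y)
```

Linear CKA is invariant to rotations and uniform rescaling of the embedding space, so this distance can stay near zero even when the Euclidean drift is large. That is why the two plots complement each other.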

In [None]:
run.plot_cka_similarities(y_lim=0.3)

## Eigenvalue Development
This plot shows the **top 10 PCA eigenvalues** of the embedding space over training time.
Each curve represents the variance explained by a principal direction.
Changes in the eigenvalue spectrum reveal how the dimensional structure of the embeddings evolves, e.g., early compression, later expansion, or stabilization of representational capacity.
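
As an illustration of what is being plotted, the top eigenvalues for a single snapshot can be obtained from the covariance matrix of the embeddings. The sketch below is an assumption about the computation, not the implementation of `run.eigenvalues()`; the function name `top_pca_eigenvalues` is hypothetical.

```python
import numpy as np

def top_pca_eigenvalues(embeddings, k=10):
    """Largest k eigenvalues of the covariance of an (N, D) embedding matrix."""
    X = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = X.T @ X / (len(X) - 1)       # (D, D) sample covariance
    eigvals = np.linalg.eigvalsh(cov)  # ascending order for symmetric matrices
    return eigvals[::-1][:k]           # descending, keep the top k
```

Applying this to every snapshot and stacking the results would give one curve per principal direction, where each value is the variance along that direction.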

In [None]:
run.eigenvalues()

# Compare Visualizations
The hyperparameters used below were chosen through evaluation.

### Compute

In [None]:
from helper.visualization import generate_pca_animation

ani_pca_all = generate_pca_animation(run, fit_basis='all')
ani_pca_window = generate_pca_animation(run, fit_basis='window', window_size=16).denoise(do_cka_similarities=False)

In [None]:
from helper.visualization import generate_tsne_animation

tsne_blended = generate_tsne_animation(
    run,
    tsne_update=0.2
)

In [None]:
from helper.visualization import generate_umap_animation

umap_ani = generate_umap_animation(
    run,
    metric='cosine',
    n_neighbors=20,
    min_dist=0.2
).denoise(do_cka_similarities=False)

In [None]:
from helper.visualization import generate_mphate_animation

#mphate_ani = generate_mphate_animation(
#    run,
#    #t=t, #TODO Best PARAMETERS
#)

### Visualize

In [None]:
%matplotlib ipympl
%matplotlib widget
from helper.visualization import show_animations

show_animations(
    animations=[
        ani_pca_all,
        ani_pca_window,
        tsne_blended,
        umap_ani,
        #mphate_ani
    ],
    custom_titles=[
        "PCA on all",
        "PCA window denoised",
        "t-SNE",
        "UMAP",
        #"M-PHATE"  # re-enable together with mphate_ani above
    ],
    figsize_per_plot=(4, 4),
    cols=3,
    shared_axes=False,
    add_confusion_matrix=True,
    annotate_confusion_matrix=True,
    interpolate=True,
    steps_per_transition=3,
)