# Latent Space Traversal

This module provides functions for exploring the latent space of embeddings through systematic traversal and interpolation between concept representations.
Traversal supports analysis of semantic relationships, identification of concept clusters, and generation of intermediate embeddings that can be used for reasoning, analogy, or visualization.


## Latent Space Traversal Analysis
This section performs latent space traversal experiments to investigate the structure and semantic organization of the embedding space. Specifically, we explore the paths connecting the embeddings of a selected pair of entities (dotproduct and matrix_multiplication) through both Euclidean (linear) and hyperbolic (geodesic) interpolation. Each path is sampled at 10 evenly spaced values of the interpolation parameter t, from t = 0 at the first endpoint to t = 1 at the second, so that we can observe how the semantic neighborhood evolves along the path.
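
The two interpolation schemes described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code in `hyperbolic_traversal_pickle.py`: it assumes the hyperbolic embeddings live in the Poincaré ball (all vectors have norm below 1), and the helper names `linear_path`, `geodesic_path`, `mobius_add`, and `mobius_scale` are hypothetical. The geodesic is computed with the standard Möbius formula `u ⊕ (t ⊗ ((-u) ⊕ v))`.

```python
import math


def dot(x, y):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def mobius_add(x, y):
    """Mobius addition in the Poincare ball (curvature -1)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom
            for a, b in zip(x, y)]


def mobius_scale(t, x):
    """Mobius scalar multiplication: move a fraction t along the ray to x."""
    norm = math.sqrt(dot(x, x))
    if norm == 0.0:
        return list(x)
    factor = math.tanh(t * math.atanh(norm)) / norm
    return [factor * a for a in x]


def sample_ts(num_points):
    """Evenly spaced parameters in [0, 1], e.g. 0.00, 0.11, 0.22, ... for 10 points."""
    return [i / (num_points - 1) for i in range(num_points)]


def linear_path(u, v, num_points=10):
    """Euclidean interpolation: (1 - t) * u + t * v."""
    return [(t, [(1 - t) * a + t * b for a, b in zip(u, v)])
            for t in sample_ts(num_points)]


def geodesic_path(u, v, num_points=10):
    """Hyperbolic interpolation along the Poincare-ball geodesic from u to v."""
    direction = mobius_add([-a for a in u], v)
    return [(t, mobius_add(u, mobius_scale(t, direction)))
            for t in sample_ts(num_points)]


# Toy 2-D vectors (placeholders, not real embeddings).
u, v = [0.1, 0.2], [-0.3, 0.4]
for t, point in geodesic_path(u, v, num_points=4):
    print(f"t={t:.2f}", [round(c, 3) for c in point])
```

Both paths share the same endpoints. They differ in between because the geodesic bends toward the origin of the ball, whereas the linear path follows the straight chord.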

The rationale for selecting the pair (dotproduct, matrix_multiplication) is grounded in their mathematical proximity: both are central to linear algebra, and intermediate concepts such as matrix-vector_multiplication, replicatecol, or dot_product are expected to lie semantically between them. Therefore, a well-structured latent space should populate the intermediate regions with relevant and generalizable entities.

From the traversal paths, we observe that both the Euclidean and hyperbolic spaces capture intermediate concepts at the early points and midpoints of the path, including replicatecol, matrix_map, matrix-vector_multiplication, and dot_product. These entities are semantically aligned with the seed entities, which suggests that the space is hierarchically and semantically organized.
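
Reading off the "nearest entities" at a point on the path amounts to a top-k lookup under the distance of the chosen geometry. The sketch below is an assumption about how such a lookup could work, not the script's actual implementation: `nearest_entities` is a hypothetical helper, the hyperbolic case uses the standard Poincaré-ball distance, and the three-entry embedding dictionary is a toy placeholder.

```python
import math


def sq_norm(x):
    """Squared Euclidean norm."""
    return sum(a * a for a in x)


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball."""
    diff = sq_norm([a - b for a, b in zip(x, y)])
    arg = 1 + 2 * diff / ((1 - sq_norm(x)) * (1 - sq_norm(y)))
    return math.acosh(max(1.0, arg))  # guard against rounding below 1


def euclidean_distance(x, y):
    """Straight-line distance, used for the Euclidean model."""
    return math.dist(x, y)


def nearest_entities(point, embeddings, top_k=10, distance=poincare_distance):
    """Return the names of the top_k entities closest to `point`."""
    ranked = sorted(embeddings, key=lambda name: distance(point, embeddings[name]))
    return ranked[:top_k]


# Toy embeddings (placeholder values, not taken from the real model).
embeddings = {
    "vector": [0.1, 0.0],
    "matrix": [0.6, 0.0],
    "rank": [0.0, 0.7],
}
print(nearest_entities([0.1, 0.0], embeddings, top_k=2))
```

Passing `distance=euclidean_distance` gives the Euclidean variant of the same lookup, so the two geometries can be compared on identical query points.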

A comparison of midpoint behaviors (t ≈ 0.5) reveals more generalization in the hyperbolic space, where entities like unit_matrix, symmetric_matrix, and addmonoidhom emerge. These are not only relevant to both endpoints but also serve as abstract generalizations that unify dot products and matrix operations. In contrast, the Euclidean space favors more direct interpolation with consistently recurring entities like matrix-vector_multiplication, dot_product, and fromblocks, suggesting a more linear blending of seed concepts rather than abstraction.

This divergence highlights a key distinction: hyperbolic interpolation encourages semantic abstraction, making it more suitable for hierarchical knowledge representation, while Euclidean interpolation preserves semantic continuity, offering clearer paths for symbolic or stepwise reasoning. The consistent appearance of core entities across both models reinforces the latent space's robustness, while the difference in generalization at the midpoint underscores the complementary nature of Euclidean and hyperbolic geometries in capturing different aspects of concept relationships.
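
One way to quantify how much the two geometries agree at each step is the Jaccard overlap between their neighbor lists. The sketch below is illustrative only: `jaccard` and `path_overlap` are hypothetical helpers, and the neighbor lists are short placeholders that reuse entity names mentioned above rather than actual script output.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of entity names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def path_overlap(neighbors_a, neighbors_b):
    """Step-by-step overlap between two traversal paths of equal length."""
    return [jaccard(x, y) for x, y in zip(neighbors_a, neighbors_b)]


# Placeholder neighbor lists for three steps of each path (not real output).
euclidean_neighbors = [
    ["dot_product", "replicatecol"],
    ["matrix-vector_multiplication", "dot_product", "fromblocks"],
    ["matrix-vector_multiplication", "matrix_map"],
]
hyperbolic_neighbors = [
    ["dot_product", "replicatecol"],
    ["unit_matrix", "symmetric_matrix", "dot_product"],
    ["matrix-vector_multiplication", "matrix_map"],
]
print(path_overlap(euclidean_neighbors, hyperbolic_neighbors))
```

A dip in overlap near the middle of the path would correspond to the midpoint divergence discussed above, while values near 1 at the ends would reflect the shared core entities.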

In [None]:
# Example embedding path
# embedding_path = "../Mathlib4_embeddings/Latent_space_analysis/model_dict_poincare_300_my_dataset"
!python hyperbolic_traversal_pickle.py \
    --embedding_path "path/to/your/embedding"  \
    --model hyperbolic \
    --dim 300 \
    --visualize \
    --num_points 10 \
    --top_k 10 \
    --word_pairs '[["vector", "matrix"]]'

Loaded embedding dictionary with 276 entities
=== Latent Space Traversal: vector -> matrix ===
Using Hyperbolic (geodesic) interpolation

Point 1 (t=0.00):
  Nearest entities: ['vector', '3x3_matrix', 'rank', 'projection_map', 'preserving_addition_and_zero', 'preserves_the_addition_operation', 'placing_them_in_blocks', 'relation', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'permutation']

Point 2 (t=0.11):
  Nearest entities: ['vector', '3x3_matrix', 'rank', 'projection_map', 'preserving_addition_and_zero', 'preserves_the_addition_operation', 'placing_them_in_blocks', 'relation', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'permutation']

Point 3 (t=0.22):
  Nearest entities: ['vector', '3x3_matrix', 'rank', 'projection_map', 'preserving_addition_and_zero', 'preserves_the_addition_operation', 'placing_them_in_blocks', 'relation', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'permutation']

Point 4 (t=0.33):
  Nearest entiti

In [None]:
!python hyperbolic_traversal_pickle.py \
    --embedding_path "path/to/your/embedding"  \
    --word_pairs '[["vector", "matrix"]]' \
    --num_points 10 \
    --dim 300 \
    --visualize \
    --analyze_paths

Loaded embedding dictionary with 276 entities
=== Latent Space Traversal: vector -> matrix ===
Using Hyperbolic (geodesic) interpolation

Point 1 (t=0.00):
  Nearest entities: ['3x3_matrix', 'output', 'over_its_diagonal', 'pairs_of_elements', 'pairwise', 'permutation', 'perpendicular', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'placing_them_in_blocks', 'preserves_the_addition_operation']

Point 2 (t=0.11):
  Nearest entities: ['3x3_matrix', 'output', 'over_its_diagonal', 'pairs_of_elements', 'pairwise', 'permutation', 'perpendicular', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'placing_them_in_blocks', 'preserves_the_addition_operation']

Point 3 (t=0.22):
  Nearest entities: ['3x3_matrix', 'output', 'over_its_diagonal', 'pairs_of_elements', 'pairwise', 'permutation', 'perpendicular', 'placing_given_matrices_along_the_diagonal_and_zeros_elsewhere', 'placing_them_in_blocks', 'preserves_the_addition_operation']

Point 4 (t=0.33):
  Nearest ent