# Challenge: Hyperbolic Embedding via Graph Learning
Authors: Rishi Sonthalia and Xinyue Cui

## Introduction

Many datasets are represented more faithfully in hyperbolic space than in Euclidean space. These are typically datasets with semantically rich hierarchies, such as text, social networks, cell development trees, and phylogenetic (evolutionary) trees. Nature example paper.

However, optimization in hyperbolic space is
1) Non-convex
2) Highly unstable

We would like embeddings of data into hyperbolic space that avoid these issues. Optimization-based methods are also slow, and in many cases we only care about the tree-like structure of the data.

Hence we look at combinatorial algorithms to embed the data into hyperbolic space. 


## Related work

### Embedding Literature

1) Poincaré, Lorentz - optimization-based techniques for embedding into these manifolds
2) Representation Tradeoff - non-linear optimization
3) New Chami paper on hyperbolic PCA
4) Sarkar and code-based constructions for embedding a tree

The above are slow. 
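
For reference, the distance functions of the two models named above can be written as follows. This is the standard curvature $-1$ form; the notation ($d_{\mathbb{B}}$, $d_{\mathbb{L}}$) is chosen here for illustration and is not taken from the works listed.

$$
d_{\mathbb{B}}(u, v) = \operatorname{arcosh}\left(1 + \frac{2\,\lVert u - v\rVert^2}{(1 - \lVert u\rVert^2)(1 - \lVert v\rVert^2)}\right), \qquad \lVert u\rVert, \lVert v\rVert < 1
$$

$$
d_{\mathbb{L}}(x, y) = \operatorname{arcosh}\left(-\langle x, y\rangle_{\mathcal{L}}\right), \qquad \langle x, y\rangle_{\mathcal{L}} = -x_0 y_0 + \sum_{i=1}^{n} x_i y_i, \quad \langle x, x\rangle_{\mathcal{L}} = -1, \; x_0 > 0
$$

In the Poincaré form the denominator shrinks toward zero as points approach the boundary of the ball, which is plausibly one source of the numerical difficulties noted in the introduction.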

Algorithms for learning trees

1) Construct Tree (CS community)
2) Level Tree (math community working on algorithms)
3) Neighbor Joining (biology community)
4) Low Stretch Trees (CS community, different problem)

### Other implementations

1) TreeRep - Julia code by the authors exists on GitHub. C code by start-up people also exists.

2) Sarkar - Julia code by Stanford people exists. The start-up people are currently working on their own implementation.


## Implementation




In [5]:
import networkx as nx
import geomstats.backend as gs

In [2]:
import TreeRep

INFO: Using numpy backend


## Synthetic Tests

1) Recovering known trees
2) Learning trees of random graphs
3) Learning trees for random points in hyperbolic space
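
As a minimal sketch of the input for the "recovering known trees" test, the block below builds a random weighted tree and its shortest-path distance matrix. The helper names (`random_weighted_tree`, `four_point_violation`), the random-attachment construction, and the weight range are assumptions made for illustration. The four-point check is not part of this notebook; it is included only as one standard way to confirm that a matrix is a tree metric.

```python
import itertools

import networkx as nx
import numpy as np


def random_weighted_tree(n, max_weight=10.0, seed=0):
    """Hypothetical helper: attach each new node to a uniformly chosen earlier node."""
    rng = np.random.default_rng(seed)
    T = nx.Graph()
    T.add_node(0)
    for child in range(1, n):
        parent = int(rng.integers(0, child))
        T.add_edge(parent, child, weight=float(rng.uniform(0.1, max_weight)))
    return T


def four_point_violation(D):
    """Largest gap between the two biggest pairwise sums over all quadruples.

    For an exact tree metric the two largest of the three sums are equal,
    so the returned value is zero up to floating-point error.
    """
    worst = 0.0
    for w, x, y, z in itertools.combinations(range(len(D)), 4):
        sums = sorted([D[w, x] + D[y, z], D[w, y] + D[x, z], D[w, z] + D[x, y]])
        worst = max(worst, sums[2] - sums[1])
    return worst


n_tree = 12
tree = random_weighted_tree(n_tree)
# Dense matrix of weighted shortest-path distances, rows ordered 0..n_tree-1
D_tree = nx.floyd_warshall_numpy(tree, nodelist=list(range(n_tree)))
violation = four_point_violation(D_tree)
```

The resulting `D_tree` has the same form as the matrix `D` passed to `TreeRep.TreeRep` elsewhere in this notebook, so the learned tree can be compared against `tree`. The brute-force check visits every quadruple and is only practical for small `n`.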

Embedding trees

1) Show the scaling effect of the embedding
2) Take random points in hyperbolic space, learn a tree, embed it, and then compare the embedded points with the originals
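
The second test above could start from something like the following sketch, which samples points in the Poincaré disk, computes their pairwise hyperbolic distances, and defines one possible comparison score. The sampling scheme (uniform in Euclidean area inside radius `max_radius`, not uniform in hyperbolic measure), the function names, and the choice of average relative distortion as the score are all assumptions, not something the notebook prescribes.

```python
import numpy as np


def sample_poincare_disk(n, max_radius=0.95, seed=0):
    """Hypothetical sampler: uniform in Euclidean area within the given radius."""
    rng = np.random.default_rng(seed)
    radius = max_radius * np.sqrt(rng.uniform(size=n))
    angle = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])


def pairwise_poincare_distances(X):
    """All pairwise distances in the Poincare ball model (curvature -1)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_diffs = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    denom = np.outer(1.0 - sq_norms, 1.0 - sq_norms)
    return np.arccosh(1.0 + 2.0 * sq_diffs / denom)


def average_distortion(D_true, D_approx):
    """Mean relative error over distinct pairs (assumes distinct points)."""
    iu = np.triu_indices_from(D_true, k=1)
    return np.mean(np.abs(D_approx[iu] - D_true[iu]) / D_true[iu])


points = sample_poincare_disk(50)
D_hyp = pairwise_poincare_distances(points)
```

Here `D_hyp` plays the role of the input metric. After learning a tree and embedding it, the distance matrix of the embedded points would be passed as `D_approx` to `average_distortion`.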

## Applications 

1) Pure embeddings: embed the karate club graph
2) Dimensionality reduction: take high-dimensional data, learn a tree, and then embed it into two-dimensional space
3) Learning tree structure, such as a phylogenetic tree
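
For the karate club application, the input metric could be prepared as in this sketch. Using unweighted hop counts as the distance is an assumption. The variable names are illustrative only.

```python
import networkx as nx
import numpy as np

# Zachary's karate club graph ships with networkx; nodes are labeled 0..33
G_karate = nx.karate_club_graph()
n_karate = G_karate.number_of_nodes()

# Unweighted shortest-path (hop) distances between every pair of nodes
hops = dict(nx.all_pairs_shortest_path_length(G_karate))
D_karate = np.zeros((n_karate, n_karate))
for i in range(n_karate):
    for j in range(n_karate):
        D_karate[i, j] = hops[i][j]
```

`D_karate` is a dense distance matrix of the same kind as the `D` built from random graphs in the cell below, so it can be handed to `TreeRep.TreeRep` in the same way.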

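In [9]:
from importlib import reload
reload(TreeRep)  # pick up local edits to the TreeRep module

n = 200  # number of nodes in each random graph
for trial in range(100,5000,100):  # trial is only a repetition counter
  # Dense Erdos-Renyi graph with random edge weights in [0, 10)
  G = nx.gnp_random_graph(n, 0.7)
  for u, v in G.edges():
    G[u][v]['weight'] = gs.random.rand()*10

  # All-pairs shortest-path distances, copied into a dense n x n matrix
  d = nx.algorithms.shortest_paths.dense.floyd_warshall(G)
  D = gs.zeros((n,n))
  for i in range(n):
    for j in range(n):
      D[i,j] = d[i][j]

  # Learn a tree from the metric D
  T = TreeRep.TreeRep(D)
  T.learn_tree()

  # Compare sizes; a tree is 1-edge-connected but not 2-edge-connected
  print(G.number_of_nodes(), G.number_of_edges())
  print(T.G.number_of_nodes(),T.G.number_of_edges())
  print(nx.is_k_edge_connected(T.G,1), nx.is_k_edge_connected(T.G,2))
  print()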

200 13870
289 288
True False

200 13941
271 270
True False

200 13917
273 272
True False

200 13873
297 296
True False



KeyboardInterrupt: ignored
