# Hyperbolic Embedding of ICD-9 Mappings 
<br>
`Drew Wilimitis`


## Writing/Research Project Outline
___

### Abstract 

As discussed in a recent survey of representation learning for Electronic Health Records (Weng and Szolovits, 2019), clinical data now come from a vast, disparate set of sources. These sources reflect a complex, multi-layered network of interactions, not only between social actors (patients and clinicians) but also between medical ontologies, with growing promise for integration with modern genetics.

### Introduction

This problem of representation learning challenges the progress of biomedical informatics and clinical science in two ways: poor representations can yield less accurate predictive models, and they can erode human interpretability and explainability of the resulting algorithms. Many SOTA methods for representation learning are highly sophisticated deep-learning algorithms that involve immensely expensive transfer learning and can reach hundreds of millions of parameters; BERT is one example, its undeniable success in NLP tasks notwithstanding.

Weng, Wei-Hung, and Peter Szolovits. “Representation Learning for Electronic Health Records.” arXiv:1909.09248 (2019).

Beaulieu-Jones, Brett K., Isaac S. Kohane, and Andrew L. Beam. “Learning Contextual Hierarchical Structure of Medical Concepts with Poincairé Embeddings to Clarify Phenotypes.” Department of Biomedical Informatics, Harvard Medical School.

### Brief Background: Hyperbolic Geometry

### Method 1: Apply the Poincaré Embedding Algorithm

### Method 2: Apply Lorentz Embedding

### Method 3: Lorentzian Distance Learning??

### Evaluation (Comparison to Euclidean & Earlier Approach)

### Load libraries and helper functions

In [1]:
# import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
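# the 'seaborn' style name was deprecated in matplotlib 3.6 and later removed;
# applying seaborn's default theme gives a comparable look
sns.set_theme()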
%matplotlib inline
import networkx as nx
import sys
import os

# import modules within repository
#my_path = 'C:\\Users\\dreww\\Desktop\\hyperbolic-learning\\utils' # path to utils folder
#sys.path.append(my_path)
#from utils import *
#from embed import train_embeddings, load_embeddings, evaluate_model
#from hkmeans import HyperbolicKMeans, plot_clusters

# ignore warnings
import warnings
warnings.filterwarnings('ignore');

# display multiple outputs within a cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all";

### Format output given by R script

In [2]:
# load the pipe-delimited tables written by the R script
all_codes = pd.read_csv('data/tmp/allcodes.csv', sep="|", encoding='latin1', false_values=['"'])
majors = pd.read_csv('data/tmp/majors.csv', sep="|")
# chapters and sub-chapters are stored one per column, so transpose to one per row
chapters = pd.read_csv('data/tmp/chapters.csv', sep="|").transpose()
sub_chapters = pd.read_csv('data/tmp/subchapters.csv', sep="|").transpose()

df = pd.DataFrame(columns=['parent', 'child'])
# print(all_codes.head(3))

# handle chapters
chapters = chapters.reset_index()
chapters.columns = ['name', 'start', 'end']
chapters['range'] = 'c_' + chapters['start'].map(str) + '_' + chapters['end'].map(str)

chap_name_dict = dict(zip(chapters['name'], chapters['range']))
chap_range_dict = dict(zip(chapters['range'], chapters['name']))

# handle sub-chapters (same layout as chapters, with an 's_' prefix)
sub_chapters = sub_chapters.reset_index()
sub_chapters.columns = ['name', 'start', 'end']
sub_chapters['range'] = 's_' + sub_chapters['start'].map(str) + '_' + sub_chapters['end'].map(str)

subchap_name_dict = dict(zip(sub_chapters['name'], sub_chapters['range']))
subchap_range_dict = dict(zip(sub_chapters['range'], sub_chapters['name']))
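
The range labels built above (`c_<start>_<end>` for chapters, `s_<start>_<end>` for sub-chapters) can act as node names when filling the `parent`/`child` edge list. The following is only a minimal sketch of that idea, not the notebook's actual procedure: the toy table, the sample codes, and the helper `find_range` are illustrative assumptions, and the helper handles purely numeric codes only (V and E codes would need separate treatment).

```python
import pandas as pd

# Toy stand-in for the chapters frame (illustrative rows, not read from the CSV files).
toy_chapters = pd.DataFrame({
    'name': ['Infectious And Parasitic Diseases', 'Neoplasms'],
    'start': ['001', '140'],
    'end': ['139', '239'],
})
toy_chapters['range'] = 'c_' + toy_chapters['start'].map(str) + '_' + toy_chapters['end'].map(str)


def find_range(major_code, ranges):
    """Hypothetical helper: return the range label whose bounds contain a numeric code."""
    value = int(major_code)
    for _, row in ranges.iterrows():
        if int(row['start']) <= value <= int(row['end']):
            return row['range']
    return None  # no range in the table encloses this code


# Build (parent, child) edges from chapter ranges to three-digit major codes.
toy_majors = ['003', '042', '150']
edges = pd.DataFrame(
    [(find_range(code, toy_chapters), code) for code in toy_majors],
    columns=['parent', 'child'],
)
print(edges)
```

The toy frame stores `start` and `end` as strings so that leading zeros survive in the labels; whether the real tables keep them depends on how the CSV files are parsed.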

## Issue with template formatting

In [3]:
# display and HTML are exposed by IPython.display; the IPython.core.display path is deprecated
from IPython.display import display, HTML
#display(HTML("<style>.container { width:100% !important; }</style>"))
#display(HTML('<style>'
#                 'div.text_cell_selected { border-left: 5px solid #ff0000 !important; border-radius: 3px; }'
#             '</style>'
#
display(HTML('<style>'
                 'div.cell.text_cell.selected::before, .edit_mode div.cell.text_cell.selected:before, '
                 'div.cell.text_cell.selected:before, div.cell.text_cell.selected.jupyter-soft-selected:before {background: #ffffff !important; background-color: #ffffff !important; border-color: #f2f2f2 !important;} '
             '</style>'
))                  

In [4]:
from IPython.display import display, HTML
display(HTML(
    '<style>'
         'div.text_cell_render{width:100%; margin-left:3%; margin-right:auto; } '
    '</style>'
))