[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DSIMB/PoincareMSA/blob/master/PoincareMSA_colab.ipynb)

<img src="https://github.com/DSIMB/PoincareMSA/blob/master/.github/PoincareMSA_small_logo.png?raw=true" height="100" style="height:100px;margin-left: 0px;">

# Poincaré maps for visualization of large protein families

**Authors**: Anna Klimovskaia Susmelj, Yani Ren, Yann Vander Meersche, Jean-Christophe Gelly and Tatiana Galochkina

PoincaréMSA builds an interactive projection of an input protein multiple sequence alignment (MSA) using a method based on Poincaré maps described by Klimovskaia et al. [1]. It reproduces both the local proximities of protein sequences and the hierarchy contained in the given data. Thus, sequences located closer to the center of the projection correspond to proteins that share the most general functional properties and/or appeared at earlier stages of evolution. Source code is available at https://github.com/DSIMB/PoincareMSA.

[1] Klimovskaia, A., Lopez-Paz, D., Bottou, L. et al. Poincaré maps for analyzing complex hierarchies in single-cell data. Nat Commun 11, 2966 (2020).

# Comparative plots

To check whether the Poincaré representation really captures the data better than other dimensionality reduction methods, it is important to also plot the selected group of proteins using traditional methods. This notebook takes a feature matrix and the associated labels as input and plots the results of PCA, t-SNE, MDS and UMAP.

## INPUT FEATURES

In [None]:
import pandas as pd
import plotly.express as px

# Input a matrix of features in .csv format, as well as the associated labels.
name_of_feature_file = 'feature_file_name.csv'
name_of_label_file = 'label_file_name.csv'

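# Read and load the feature matrix and the associated labels
features = pd.read_csv(name_of_feature_file, delimiter=',')
labels = pd.read_csv(name_of_label_file, delimiter=',')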

## PCA

In [None]:
from sklearn.decomposition import PCA
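
The cell above only imports `PCA`. As a minimal sketch of how the projection could be computed, the example below runs a two-component PCA on a synthetic feature matrix. The synthetic `features` and `labels`, the column names `PC1`/`PC2`, and the standardization step are assumptions made for illustration; they are not taken from the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the notebook's inputs (assumption: one row per
# protein, numeric feature columns only, one label per row).
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(60, 10)))
labels = pd.Series(["family_A"] * 30 + ["family_B"] * 30, name="label")

# Standardize each column so that no single feature dominates the variance
# (an illustrative choice, not necessarily the authors' preprocessing).
scaled = StandardScaler().fit_transform(features.values)

# Project onto the first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(scaled)

# Collect coordinates and labels in one table for plotting.
pca_df = pd.DataFrame(coords, columns=["PC1", "PC2"])
pca_df["label"] = labels.values

# Fraction of the total variance captured by each component.
print(pca.explained_variance_ratio_)
```

The resulting table can be passed to `px.scatter(pca_df, x="PC1", y="PC2", color="label")` to obtain a scatter plot colored by label.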

## t-SNE

In [None]:
from sklearn.manifold import TSNE
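
A two-dimensional t-SNE embedding could be computed along the following lines. This is a sketch on synthetic data: the `features` and `labels` objects, the perplexity value and the output column names are assumptions chosen for illustration. Note that scikit-learn requires the perplexity to be smaller than the number of samples.

```python
import numpy as np
import pandas as pd
from sklearn.manifold import TSNE

# Synthetic stand-in for the notebook's inputs (assumption: numeric
# features, one row per protein).
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(60, 10)))
labels = pd.Series(["family_A"] * 30 + ["family_B"] * 30, name="label")

# Perplexity roughly sets the effective number of neighbours considered for
# each point; 15 is an illustrative value for 60 samples.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
coords = tsne.fit_transform(features.values)

# Collect coordinates and labels in one table for plotting.
tsne_df = pd.DataFrame(coords, columns=["tSNE1", "tSNE2"])
tsne_df["label"] = labels.values
```

Because t-SNE is stochastic, fixing `random_state` keeps the layout reproducible between runs. Distances between well-separated clusters in the resulting plot should be interpreted with caution.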

## MDS

In [None]:
from sklearn.manifold import MDS
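
For MDS, a minimal sketch on synthetic data is given below. It relies on scikit-learn's default behaviour, which computes Euclidean distances between the rows of the feature matrix. The synthetic inputs and the output column names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.manifold import MDS

# Synthetic stand-in for the notebook's inputs (assumption: numeric
# features, one row per protein).
rng = np.random.default_rng(0)
features = pd.DataFrame(rng.normal(size=(60, 10)))
labels = pd.Series(["family_A"] * 30 + ["family_B"] * 30, name="label")

# Find a 2D layout whose pairwise distances approximate the pairwise
# Euclidean distances in the original feature space.
mds = MDS(n_components=2, random_state=0)
coords = mds.fit_transform(features.values)

# Collect coordinates and labels in one table for plotting.
mds_df = pd.DataFrame(coords, columns=["MDS1", "MDS2"])
mds_df["label"] = labels.values

# Residual stress of the final layout (lower means a better fit).
print(mds.stress_)
```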

## UMAP

In [None]:
import umap
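
A UMAP embedding could be obtained with the `umap-learn` package as sketched below. The helper `umap_embedding` and its default parameter values are hypothetical and introduced only for illustration. The import is placed inside the function so that the cell can still be executed when `umap-learn` is not installed.

```python
import numpy as np


def umap_embedding(features, n_neighbors=15, min_dist=0.1, random_state=0):
    """Return a 2D UMAP embedding of a (n_samples, n_features) matrix.

    Hypothetical helper; the parameter defaults are illustrative.
    """
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(
        n_components=2,
        n_neighbors=n_neighbors,    # size of the local neighbourhood
        min_dist=min_dist,          # how tightly points may be packed
        random_state=random_state,  # fixed seed for a reproducible layout
    )
    return reducer.fit_transform(np.asarray(features))
```

Calling `umap_embedding(features.values)` returns an array with one row per protein and two columns, which can be plotted in the same way as the other projections.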