InfectionMedicineProteomics/BINN

Biologically Informed Neural Network (BINN)

Docs License: MIT PyPI version Python application DOI Open In Colab

BINN documentation is available here.

The BINN package creates a sparse neural network from a pathway file and an input file. The examples in the docs use the Reactome pathway database and a proteomic dataset to generate the network. The package also lets you train the network and interpret it using SHAP, and it includes plotting functions for generating Sankey plots. The article presenting BINN can be found here.

This repo is accompanied by a Colab notebook for easy use.


Installation

BINN can be installed via pip:

pip install binn

The package can also be installed from source by cloning the repository with git and installing it in editable mode:

git clone git@github.com:InfectionMedicineProteomics/BINN.git
pip install -e BINN/
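
After either install route, a quick check that the package is visible to the current Python environment can save some debugging later. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this example and is not part of BINN. It assumes the distribution is registered under the name `binn`, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "binn" is the distribution name used in `pip install binn`.
    version = installed_version("binn")
    print(f"binn version: {version}" if version else "binn is not installed")
```

If the script reports that binn is not installed, the most common cause is that pip installed into a different environment than the one running the script.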

Usage

The complete pipeline to create, train and interpret a BINN is:

from binn import BINN, BINNDataLoader, BINNTrainer, BINNExplainer
import pandas as pd

# Load your data
data_matrix = pd.read_csv("../data/sample_datamatrix.csv")
design_matrix = pd.read_csv("../data/sample_design_matrix.tsv", sep="\t")

# Initialize BINN
binn = BINN(data_matrix=data_matrix, network_source="reactome", n_layers=4, dropout=0.2)

# Initialize DataLoader
binn_dataloader = BINNDataLoader(binn)

# Create DataLoaders
dataloaders = binn_dataloader.create_dataloaders(
    data_matrix=data_matrix,
    design_matrix=design_matrix,
    feature_column="Protein",
    group_column="group",
    sample_column="sample",
    batch_size=32,
    validation_split=0.2,
)

# Train the model
trainer = BINNTrainer(binn)
trainer.fit(dataloaders=dataloaders, num_epochs=100)

# Explain the model
explainer = BINNExplainer(binn)
single_explanations = explainer.explain_single(dataloaders, split="val", normalization_method="subgraph")
single_explanations
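
Before running the pipeline above, it can help to confirm that the two input tables line up. The sketch below assumes a layout inferred from the `create_dataloaders` arguments: the data matrix has one feature column (`"Protein"`) plus one column per sample, and the design matrix has one row per sample with `"sample"` and `"group"` columns. That layout is an assumption, and `check_inputs` is a hypothetical helper written for this example, not a BINN function. The toy tables are made up.

```python
import pandas as pd


def check_inputs(data_matrix, design_matrix, feature_column="Protein",
                 sample_column="sample", group_column="group"):
    """Return a list of problems found; an empty list means the tables line up."""
    problems = []
    # Check that each table carries the columns the pipeline refers to.
    for name, frame, required in (
        ("data_matrix", data_matrix, [feature_column]),
        ("design_matrix", design_matrix, [sample_column, group_column]),
    ):
        for column in required:
            if column not in frame.columns:
                problems.append(f"{name} is missing column '{column}'")
    if problems:
        return problems

    # Assumed layout: every sample named in the design matrix is a data column.
    samples = design_matrix[sample_column].astype(str)
    missing = [s for s in samples if s not in data_matrix.columns]
    if missing:
        problems.append(f"samples without a data column: {missing}")
    if samples.duplicated().any():
        problems.append("design_matrix lists some samples more than once")
    if data_matrix[feature_column].duplicated().any():
        problems.append(f"duplicate entries in '{feature_column}'")
    return problems


# Toy tables, made up for illustration only.
toy_data = pd.DataFrame({"Protein": ["P1", "P2"], "s1": [1.0, 2.0], "s2": [3.0, 4.0]})
toy_design = pd.DataFrame({"sample": ["s1", "s2", "s3"], "group": [1, 2, 1]})
print(check_inputs(toy_data, toy_design))
```

In the toy example the design matrix names a sample `s3` that has no column in the data matrix, so the check reports it. With real data, the same call would take the `data_matrix` and `design_matrix` loaded at the top of the pipeline.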

The explanations returned by explain_single can be visualized as a network:

from binn.plot.network import visualize_binn

# Number of top-ranked nodes to show in each layer
layer_specific_top_n = {"0": 10, "1": 7, "2": 5, "3": 5, "4": 5}
plt = visualize_binn(
    single_explanations,
    top_n=layer_specific_top_n,
    plot_size=(20, 10),
    sink_node_size=500,
    node_size_scaling=200,
    edge_width=1,
    node_cmap="coolwarm",
)
plt.title("Interpreted network")
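
The per-layer `top_n` dictionary keeps only the highest-ranked nodes in each layer. To see which nodes such a cutoff would keep, a ranking like the one below can be computed directly from an explanations table. This is a rough sketch under stated assumptions: it assumes the explanations are a pandas DataFrame with one row per edge, and the column names `source_layer`, `source_node` and `importance` are guesses that should be checked against the actual output. `top_nodes_per_layer` is a hypothetical helper, not part of BINN, and it may not match how `visualize_binn` selects nodes internally.

```python
import pandas as pd


def top_nodes_per_layer(explanations, top_n, layer_column="source_layer",
                        node_column="source_node", value_column="importance"):
    """Keep the top_n[layer] highest-scoring nodes in each layer."""
    # Sum the importance over all edges that leave the same node.
    per_node = explanations.groupby(
        [layer_column, node_column], as_index=False
    )[value_column].sum()

    kept = []
    for layer, group in per_node.groupby(layer_column):
        # Layers without an entry in top_n keep all of their nodes.
        n = top_n.get(str(layer), len(group))
        kept.append(group.nlargest(n, value_column))
    if not kept:
        return per_node
    return pd.concat(kept, ignore_index=True)


# Toy edge table, made up for illustration only.
toy = pd.DataFrame({
    "source_layer": [0, 0, 0, 1, 1],
    "source_node": ["A", "B", "A", "X", "Y"],
    "importance": [0.5, 0.2, 0.1, 0.3, 0.9],
})
top = top_nodes_per_layer(toy, {"0": 1, "1": 1})
print(top)
```

The dictionary keys are strings to match the `layer_specific_top_n` convention used above, so the layer value is converted with `str` before the lookup.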


Cite

If you use this package, please cite: Hartman, E., Scott, A.M., Karlsson, C. et al. Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis. Nat Commun 14, 5359 (2023). https://doi.org/10.1038/s41467-023-41146-4

@article{BINN,
  title = {Interpreting biologically informed neural networks for enhanced proteomic biomarker discovery and pathway analysis},
  volume = {14},
  ISSN = {2041-1723},
  url = {http://dx.doi.org/10.1038/s41467-023-41146-4},
  DOI = {10.1038/s41467-023-41146-4},
  number = {1},
  journal = {Nature Communications},
  publisher = {Springer Science and Business Media LLC},
  author = {Hartman,  Erik and Scott,  Aaron M. and Karlsson,  Christofer and Mohanty,  Tirthankar and Vaara,  Suvi T. and Linder,  Adam and Malmstr\"{o}m,  Lars and Malmstr\"{o}m,  Johan},
  year = {2023},
  month = sep 
}

Contributors

Erik Hartman, infection medicine proteomics, Lund University

Aaron Scott, infection medicine proteomics, Lund University
