NKI-CCB/sobolev_alignment

Sobolev Alignment

This GitHub repository contains the implementation of Sobolev Alignment, a computational framework designed to align pre-clinical and tumor scRNA-seq data. Sobolev Alignment combines a deep generative model with a kernel method to detect non-linear processes that are shared by a source (e.g., cell line) and a target (e.g., tumor) dataset.

Getting started

Please refer to the documentation. In particular, the

Installation

You need to have Python 3.8 or newer installed on your system. If you don't have Python installed, we recommend installing Mambaforge.

The installation can be done in two steps.

1. Install Sobolev Alignment

You can install Sobolev Alignment and (almost) all dependencies using the following command:

pip install sobolev-alignment

The resulting package is ready to use, but it will rely on scikit-learn instead of Falkon, which results in substantially worse performance.

2. Install Falkon

To employ large-scale GPU-accelerated kernel methods, we turn to Falkon. Installation instructions for Falkon are available in the FalkonML documentation. The previous installation step has already taken care of the various dependencies required for Falkon (i.e., cython, scipy and torch).
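As a quick sanity check after both steps, the sketch below reports which kernel library can be imported in the current environment. The function name `available_kernel_backend` is hypothetical, and the snippet only tests importability; it does not reproduce how Sobolev Alignment itself chooses between Falkon and scikit-learn.

```python
import importlib.util


def available_kernel_backend():
    """Return the name of the first importable kernel library, or None.

    Hypothetical helper: it only checks whether the modules can be found,
    not whether a GPU is usable or which one the package will pick.
    """
    if importlib.util.find_spec("falkon") is not None:
        return "falkon"
    if importlib.util.find_spec("sklearn") is not None:
        return "scikit-learn"
    return None


print(available_kernel_backend())
```

If this prints `scikit-learn`, the Falkon installation most likely did not succeed in the active environment.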

Tutorial

In the folder tutorial, you will find several tutorials in the form of Jupyter notebooks exemplifying how to use the package. Specifically:

  • process_data.ipynb: example of how to pre-process the data prior to using Sobolev Alignment. If your data has already been processed, the main step consists of adding the counts as a layer, e.g., an.layers['counts'] = an.X.
  • tutorial_simple.ipynb: basic example on how to run Sobolev Alignment with basic parameters.
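The counts-layer step mentioned for already-processed data can be sketched as a small helper. The name `add_counts_layer` is hypothetical, and the sketch assumes that `an.X` still holds the raw counts when it is called. It stores a copy rather than a reference, so that later in-place transformations of `X` do not silently change the stored counts.

```python
def add_counts_layer(adata, layer="counts"):
    """Store a copy of adata.X under adata.layers[layer].

    `adata` is expected to behave like an AnnData object: it needs an `X`
    attribute with a .copy() method (NumPy arrays and SciPy sparse matrices
    both have one) and a dict-like `layers` attribute.
    """
    if layer in adata.layers:
        # Do not overwrite an existing layer.
        return adata
    adata.layers[layer] = adata.X.copy()
    return adata
```

With an AnnData object `an`, calling `add_counts_layer(an)` has the same effect as `an.layers['counts'] = an.X.copy()`.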

Additional packages need to be installed to use MNN on top of Sobolev Alignment:

  • rpy2, which can be installed using pip (pip install rpy2).
  • The R package batchelor. To install it, activate your conda environment, start R and enter the following commands:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("batchelor")

Frequent issues

Issues with the compiler.

Due to incompatibilities between g++, gcc and cuda, the installation of FalkonML sometimes fails. The following steps can help alleviate potential issues:

  • Prior to installing Falkon, re-install torch 1.11.
  • Check compatibility between your cuda version and the one installed with torch.
  • Use cxx-compiler=1.2.0 (available on conda-forge), which is compatible with cuda 11.3.
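To check which cuda version torch was built against, a minimal sketch such as the following can help. The function name `torch_cuda_report` is hypothetical. The reported build version can then be compared by hand with the cuda toolkit installed on the system.

```python
def torch_cuda_report():
    """Summarise the installed torch build and its CUDA support."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return {"torch": None, "cuda_build": None, "cuda_available": False}
    return {
        "torch": torch.__version__,
        # CUDA version torch was compiled with (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # True only if a CUDA device can actually be used at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_cuda_report())
```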

Issues with Jaxlib (MacOS)

For Mac users, the jaxlib version installed from PyPI sometimes causes issues. In that case, we advise re-installing jaxlib from conda, and subsequently re-installing scvi-tools:

mamba install jaxlib
mamba install scvi-tools

Incompatibilities with numba

Errors are sometimes raised due to numba inconsistencies, caused by clashes between different packages. Re-installing numba seems to fix these issues:

pip install numba --force-reinstall
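When reporting such a problem, it can help to list the installed versions of the packages involved. The sketch below uses only the standard library. The helper name `installed_versions` and the default package list (numba, llvmlite, numpy) are assumptions made for illustration, since the clashing packages are not named here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numba", "llvmlite", "numpy")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Other distribution names, such as jaxlib or scvi-tools, can be passed in the same way.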

Please feel free to contact the development team by e-mail or by creating an issue.

Workflow presentation

Sobolev Alignment workflow

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out at the following e-mail address: s [dot] mourragui [at] hubrecht [dot] eu.

Citation

If you find this package useful, please cite our publication:

Identifying commonalities between cell lines and tumors at the single cell level using Sobolev Alignment of deep generative models, Mourragui et al., 2022, bioRxiv