Sparse bottleneck neural networks for exploratory non-linear visualization of Patch-seq data

This repository implements a sparse bottleneck neural network that predicts electrophysiological properties of neurons from their gene expression. The code accompanies the paper available on arXiv.

Requirements:

TensorFlow 2 and Keras; we used version 2.7.0 of both (https://keras.io/#installation). Glmnet, a package for fitting generalized linear models with penalties such as ridge and lasso (https://github.com/bbalasub1/glmnet_python).
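To confirm that an environment matches the versions listed above, something like the following sketch may help. It only reads installed package metadata, so nothing heavy is imported. The helper name `check_versions` is hypothetical and not part of this repo, and the distribution names `tensorflow` and `keras` are assumed to be the ones used by pip.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions stated in the requirements above.
EXPECTED = {"tensorflow": "2.7.0", "keras": "2.7.0"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_versions().items():
        status = "ok" if ok else "differs from the tested version"
        print(f"{package}: installed={installed} ({status})")
```

A mismatch does not necessarily mean the notebooks will fail; 2.7.0 is simply the version that was used here.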

For the sparse bottleneck neural network framework and the linear models, cross validation takes ~10 minutes on average. Once it has been run, the results can be pickled so that the models do not need to be rerun every time you plot. Pickled results are in KerasSavedModels/scala_2020_tf2, KerasSavedModels/gouwens_2020_tf2, KerasSavedModels/stoeckius_2017_tf2 and KerasSavedModels/scala_2020_with_ion_channel_genes_tf2. They can be used directly in the notebooks for plotting and reproducing the figures.
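As a minimal sketch of how pickled results could be inspected outside the notebooks, the helper below loads every pickle file found in one directory. The function name `load_pickled_results` and the `*.pickle` file pattern are assumptions; the actual file names and the structure of their contents are defined by the notebooks.

```python
import pickle
from pathlib import Path


def load_pickled_results(directory, pattern="*.pickle"):
    """Load every file matching `pattern` in `directory` into a dict keyed by file name."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        with open(path, "rb") as handle:
            # Only unpickle files from a source you trust.
            results[path.name] = pickle.load(handle)
    return results


if __name__ == "__main__":
    # Directory name taken from the paragraph above; the pattern is an assumption.
    loaded = load_pickled_results("KerasSavedModels/scala_2020_tf2")
    for name, obj in loaded.items():
        print(name, type(obj).__name__)
```

If the directory does not exist or holds no matching files, the function returns an empty dict rather than raising.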

Data

All the data needed to run the notebooks end to end can be found in the folders data/scala_2020, data/gouwens_2020 and data/stoeckius_2017. The first dataset, which we call the M1 dataset in our paper, comes from a Nature study based on previous work conducted in this lab and collaborating labs. The second, which we call the V1 dataset, comes from work by the Allen Institute reported in Cell. The third, which we call the CITE-seq dataset, was reported in Nature Methods.

Preprocessing

Preprocessing can be found in preprocess.ipynb. To run it for the M1 dataset, un-archive data/scala_2020/final_patch_seq_allcells_exon_allreads.rar and put final_patch_seq_allcells_exon_allreads.txt in that same folder. For the CITE-seq dataset, download GSE100866_CBMC_8K_13AB_10X-RNA_umi.csv from the NCBI portal and put it in data/stoeckius_2017. Preprocessing produces .pickle files in the respective data directories, and these files are already included in this GitHub repo (note that for CITE-seq, you will need to unzip it first and put it in the same directory). You therefore do not have to run the preprocessing and can move directly on to cross validation and full-dataset visualizations.
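Before opening preprocess.ipynb, it can be useful to check that the two raw input files named above are in place. The sketch below does only that; the helper name `missing_inputs` is hypothetical, and the paths are the ones given in the paragraph above, taken relative to the repository root.

```python
from pathlib import Path

# Raw inputs named in the preprocessing instructions, relative to the repo root.
REQUIRED_INPUTS = [
    "data/scala_2020/final_patch_seq_allcells_exon_allreads.txt",
    "data/stoeckius_2017/GSE100866_CBMC_8K_13AB_10X-RNA_umi.csv",
]


def missing_inputs(root=".", required=REQUIRED_INPUTS):
    """Return the required input files that are not present under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing before preprocessing can run:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All raw input files are in place.")
```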

Cross validation

Cross validation has been performed for both the linear framework (sparse reduced-rank regression, sRRR) and the nonlinear framework (sparse bottleneck neural network, sBNN) on the M1, V1 and CITE-seq datasets. The notebooks are M1_cv_2d_bottleneck.ipynb, M1_cv_64d_bottleneck.ipynb, M1_cv_with_ion_channel_genes_2d_bottleneck.ipynb, V1_cv_2d_bottleneck.ipynb, V1_cv_64d_bottleneck.ipynb, CITE-seq_cv_2d_bottleneck.ipynb and CITE-seq_cv_64d_bottleneck.ipynb. Results are pickled in KerasSavedModels, so figures can also be produced without rerunning the cross validation.
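For readers unfamiliar with the idea, the following is a generic sketch of how k-fold cross validation splits cells into training and test sets. It is not taken from the notebooks: the function name `kfold_indices`, the number of folds, the shuffling and the seed are all assumptions made for illustration, and the notebooks define the procedure actually used.

```python
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into n_folds groups of (nearly) equal size.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


if __name__ == "__main__":
    for fold, (train, test) in enumerate(kfold_indices(n_samples=20, n_folds=4)):
        print(f"fold {fold}: {len(train)} train cells, {len(test)} test cells")
```

Each cell appears in exactly one test set, so every model is evaluated only on cells it was not trained on.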

Figures

With the optimal parameters determined by cross validation, one can run models on the full datasets in M1_full_dataset_2d_bottleneck.ipynb, M1_full_dataset_64d_bottleneck.ipynb, M1_full_dataset_with_ion_channel_genes_2d_bottleneck.ipynb, M1_full_dataset_with_ion_channel_genes_64d_bottleneck.ipynb, V1_full_dataset_2d_bottleneck.ipynb, V1_full_dataset_64d_bottleneck.ipynb, CITE-seq_full_dataset_2d_bottleneck.ipynb and CITE-seq_full_dataset_64d_bottleneck.ipynb. Trained neural network weights are saved in KerasSavedModels and can then be used to produce the final figures in combined_figures.ipynb.
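If you prefer to run the full-dataset notebooks non-interactively, one possible approach is to call `jupyter nbconvert` on each of them in turn and finish with combined_figures.ipynb. The sketch below is an assumption about workflow, not something this repo ships: the helper names are hypothetical, it assumes `jupyter` is on the PATH, and it defaults to a dry run that only prints the commands.

```python
import subprocess

# Notebook names as listed in the paragraph above.
FULL_DATASET_NOTEBOOKS = [
    "M1_full_dataset_2d_bottleneck.ipynb",
    "M1_full_dataset_64d_bottleneck.ipynb",
    "M1_full_dataset_with_ion_channel_genes_2d_bottleneck.ipynb",
    "M1_full_dataset_with_ion_channel_genes_64d_bottleneck.ipynb",
    "V1_full_dataset_2d_bottleneck.ipynb",
    "V1_full_dataset_64d_bottleneck.ipynb",
    "CITE-seq_full_dataset_2d_bottleneck.ipynb",
    "CITE-seq_full_dataset_64d_bottleneck.ipynb",
]


def nbconvert_command(notebook):
    """Build the command that executes a notebook in place."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", notebook]


def run_all(notebooks=FULL_DATASET_NOTEBOOKS, dry_run=True):
    """Execute the model notebooks, then the figure notebook; print only when dry_run."""
    for notebook in list(notebooks) + ["combined_figures.ipynb"]:
        command = nbconvert_command(notebook)
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # stop at the first failing notebook


if __name__ == "__main__":
    run_all(dry_run=True)
```

Running the figure notebook last reflects the order described above, since it relies on the weights saved by the model notebooks.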

Work conducted under supervision of Dmitry Kobak and Philipp Berens.

Contact: yves.bernaerts@uni-tuebingen.de

Website: http://www.eye-tuebingen.de/berenslab/members/yves-bernaerts/
