A scRNA-seq interface to bulk RNA-seq datasets

Ingredients:

Methods:

1. Normalize each dataset so that its expression values sum to 1.
2. Binarize the normalized expression level of each gene using the qualNorm binarization method: cluster the values into two clusters and split them by the cluster means.
3. Build a search index over the binarized profiles, using Hamming distance.
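
The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code in this repository: it normalizes each sample (row), finds a per-gene threshold with a simple one-dimensional two-cluster k-means (one reading of "clustering into two clusters and separating by means", not necessarily what qualNorm does), reuses those thresholds for the query, and replaces a real index with a brute-force Hamming scan. All function names and the toy data are hypothetical.

```python
import numpy as np

def normalize(X):
    """Scale each sample (row) so that its expression values sum to 1."""
    X = np.asarray(X, dtype=float)
    totals = X.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave all-zero samples unchanged
    return X / totals

def gene_threshold(values, n_iter=20):
    """1-D two-cluster k-means; return the midpoint of the two cluster means."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if lo == hi:  # constant gene: nothing to split
        return lo
    for _ in range(n_iter):
        high = values > (lo + hi) / 2
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2

def hamming_search(index_bits, query_bits, k=1):
    """Brute-force scan: return the k nearest rows and their Hamming distances."""
    dists = (index_bits != query_bits).sum(axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

# Toy bulk matrix: 4 samples (rows) x 3 genes (columns).
bulk = normalize([[10, 0, 0], [8, 1, 1], [0, 5, 5], [1, 4, 5]])
thresholds = np.array([gene_threshold(bulk[:, g]) for g in range(bulk.shape[1])])
index_bits = bulk > thresholds

# Assumption: the query is binarized with the thresholds learned from the bulk data.
query_bits = normalize([[0, 3, 2]])[0] > thresholds
ids, dists = hamming_search(index_bits, query_bits, k=2)
```

A linear scan is fine for a handful of samples; for a collection the size of ARCHS4 an approximate nearest-neighbor index would replace `hamming_search`.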

How to deal with missing genes? Is a "skyline query" a good method?
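
One way to make the question concrete is sketched below. It assumes each binarized profile comes with a boolean mask of the genes that were actually measured, counts mismatches only over genes shared with the query, and then keeps the candidates that no other candidate beats on both criteria (fewer mismatches, more shared genes). That non-dominated set is what a skyline query returns. The function names, masks, and toy data are hypothetical, and whether this is a good answer to the question remains open.

```python
import numpy as np

def masked_hamming(index_bits, index_mask, query_bits, query_mask):
    """Count mismatches only over genes measured in both the query and each sample."""
    shared = index_mask & query_mask
    mismatches = ((index_bits != query_bits) & shared).sum(axis=1)
    overlap = shared.sum(axis=1)
    return mismatches, overlap

def skyline(mismatches, overlap):
    """Indices of candidates not dominated by any other candidate.

    Candidate j dominates i if it has no more mismatches and no less overlap,
    and is strictly better on at least one of the two.
    """
    keep = []
    for i in range(len(mismatches)):
        dominated = any(
            j != i
            and mismatches[j] <= mismatches[i]
            and overlap[j] >= overlap[i]
            and (mismatches[j] < mismatches[i] or overlap[j] > overlap[i])
            for j in range(len(mismatches))
        )
        if not dominated:
            keep.append(i)
    return keep

# Toy index: 3 samples x 4 genes; a False mask entry means the gene is missing.
index_bits = np.array([[1, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 1]], dtype=bool)
index_mask = np.array([[1, 1, 1, 1], [1, 1, 0, 0], [1, 1, 1, 0]], dtype=bool)
query_bits = np.array([1, 0, 1, 1], dtype=bool)
query_mask = np.array([1, 1, 1, 1], dtype=bool)

mismatches, overlap = masked_hamming(index_bits, index_mask, query_bits, query_mask)
front = skyline(mismatches, overlap)
```

In this toy case the skyline keeps one sample that matches perfectly on few shared genes and one that has a single mismatch over all genes, which shows the trade-off a single distance score would hide.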

Missing gene imputation? What kind of model to use?

Protein Atlas
