A deep-learning approach to spot detection and base-calling for identifying sgRNA barcodes in OPS data. This repository contains the code to train and evaluate the model, as well as materials for a paper on this project. Evaluations compare against SCALLOPS and against Yuan Gao's internship benchmarking of open-source spot-calling methods.
Note that anywhere the code says "Oracle", it refers to the hard BAD loss in the manuscript.
pip install filelock pytorch_lightning torch kornia pandas tqdm "numpy<2"
Additional requirement for barcall/create_new_bc_dataset.py:
pip install git+https://github.com/Genentech/scallops.git
To run BarCall inference or training:
- ISS images must be saved as a memmap, the cell segmentation must be saved as a memmap, and the mean and std of the ISS images must be saved. These functions are all defined in preprocess_image_files.py, and an example of running them on the snippet of PERISCOPE zarr files is included in mini_tutorial.ipynb; a rough sketch of the expected outputs is shown after this list.
- For training, a dataset must be used. `spots_meta.parquet` and `l1_matching_unamimous_cliques.parquet` are included here. The former uses the output of DeepBase as base-calling labels, and the latter uses Scallops as base-calling labels.
- For inference, checkpoints for BarCall with hard and soft BAD loss, respectively, are provided in checkpoints/.
- BarCall inference can then be run either with `barcall/5nm.sh` or in `barcall/barcall_inference.ipynb`.
- If running inference or training on a plate other than A, you need to overwrite MEANS and STDS with the well_stats from preprocess_image_files.py (an example of this for plate C can be seen in barcall_inference.ipynb).
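As a minimal sketch of that preprocessing (the real helpers live in preprocess_image_files.py; the file names, array layout, and statistic axes below are illustrative assumptions, not the repository's API):

```python
# Illustrative only: the actual helpers are defined in preprocess_image_files.py.
# File names and the (cycles, channels, H, W) layout here are assumptions.
import numpy as np

iss = np.load("iss_image.npy")            # assumed (cycles, channels, H, W) ISS stack
cells = np.load("cell_segmentation.npy")  # assumed (H, W) integer label mask

# Save the ISS stack and the segmentation as memmaps so tiles can be read lazily.
iss_mm = np.memmap("iss.memmap", dtype=iss.dtype, mode="w+", shape=iss.shape)
iss_mm[:] = iss
iss_mm.flush()

cells_mm = np.memmap("cells.memmap", dtype=cells.dtype, mode="w+", shape=cells.shape)
cells_mm[:] = cells
cells_mm.flush()

# Per-channel mean/std ("well_stats") used to normalize the ISS images; for plates
# other than A these values replace the hard-coded MEANS and STDS.
means = iss.mean(axis=(0, 2, 3))
stds = iss.std(axis=(0, 2, 3))
np.save("well_stats.npy", np.stack([means, stds]))
```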
For a DeepBase model:
- To train, run basecalling_model/train_bc.py with any of the parameters at the top. It similarly requires a memmap of the ISS image, but not the cell map or the well stats. It uses the same `l1_matching_unamimous_cliques.parquet` dataset files as BarCall. Default parameters for DeepBase are in train_deepbase.sh.
- To use it for inference, see basecalling_model/deep_base_eval.ipynb. This saves a .parquet with the barcode label for each spot provided to the model; a sketch of loading that output follows this list.
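A rough example of working with that output is below; the output path and column names ("spot_id", "barcode") are assumptions for illustration, so check deep_base_eval.ipynb for the actual schema.

```python
# Sketch of inspecting DeepBase per-spot calls; path and column names are assumed.
import pandas as pd

calls = pd.read_parquet("deepbase_calls.parquet")  # hypothetical output file
print(calls.head())

# Distribution of called barcodes across spots (assumes a "barcode" column).
print(calls["barcode"].value_counts().head(10))
```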
Table 1 is generated in table_3_1.ipynb.
Tables 2, 3, and 4, as well as Figure 3, are all generated in spot_finding/barcall_evaluation.ipynb.
The .parquet files referenced in the manuscript are generated in spot_finding/barcall_inference.ipynb and basecalling/barcall_inference.ipynb.
See mini_tutorial.ipynb. It operates on a tiny crop of an ISS image from PERISCOPE so it is cheap to store and quick to run.
Thanks to Joshua Gould, Sergio Hleap, Bo Li, Yuan Gao, Monica Ge, Avtar Sing, Amy Chuong, and David Richmond for their leadership, support, advice, and contributions to this project.
See spot_basecalling_model/checkpoints and basecalling_model/checkpoints
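As a rough sketch of loading one of these checkpoints with pytorch_lightning (the class name BarCallModel, its import path, and the checkpoint file name are placeholders, not the repository's actual names):

```python
# Illustrative only: substitute the repository's real LightningModule subclass and
# checkpoint path; "BarCallModel" and the file name below are placeholders.
import torch
from barcall.model import BarCallModel  # hypothetical import path

model = BarCallModel.load_from_checkpoint("checkpoints/hard_bad_loss.ckpt")
model.eval()

with torch.no_grad():
    # `batch` would be a preprocessed ISS tile prepared as in preprocess_image_files.py.
    # preds = model(batch)
    pass
```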