GitHub - MViscardi-UCSC/ANDROMEDA: Alignment-based Nucleotide Detection and Read Optimization for Mapping Errors, Deaminations, and Alterations

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃            _   _ _____  _____   ____  __  __ ______ _____            ┃
┃      /\   | \ | |  __ \|  __ \ / __ \|  \/  |  ____|  __ \   /\      ┃
┃     /  \  |  \| | |  | | |__) | |  | | \  / | |__  | |  | | /  \     ┃
┃    / /\ \ | . ` | |  | |  _  /| |  | | |\/| |  __| | |  | |/ /\ \    ┃
┃   / ____ \| |\  | |__| | | \ \| |__| | |  | | |____| |__| / ____ \   ┃
┃  /_/    \_\_| \_|_____/|_|  \_\\____/|_|  |_|______|_____/_/    \_\  ┃
┃                                                                      ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Alignment-based Nucleotide Detection and Read Optimization for Mapping Errors, Deaminations, and Alterations

Overview

ANDROMEDA is a bioinformatics tool designed to process mapped sequencing reads and extract information from ambiguous nucleotide positions. The tool identifies barcode and UMI sequences from predefined mapping positions, assigns tags to BAM files, and detects mismatched bases resulting from RNA modifications (e.g., deamination by TadA or other base editors). It also enables UMI grouping, consensus calling, and error annotation to facilitate high-confidence sequence reconstruction.
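To make the mismatch-detection idea concrete, here is a minimal sketch in plain Python. It is not ANDROMEDA's implementation: the function names `find_mismatches` and `deamination_candidates` are hypothetical, the alignment is assumed to be ungapped, and adenosine deamination (A-to-I) is assumed to be read out as an A-to-G mismatch on the sequenced strand.

```python
from typing import List, Tuple


def find_mismatches(ref: str, read: str, ref_start: int = 0) -> List[Tuple[int, str, str]]:
    """Return (reference_position, ref_base, read_base) for each mismatch.

    Assumes the read aligns to the reference without gaps, starting at
    ref_start (0-based). Positions where either base is N are skipped.
    """
    mismatches = []
    for offset, read_base in enumerate(read.upper()):
        ref_base = ref[ref_start + offset].upper()
        if "N" in (ref_base, read_base):
            continue
        if ref_base != read_base:
            mismatches.append((ref_start + offset, ref_base, read_base))
    return mismatches


def deamination_candidates(mismatches):
    """Keep only A-to-G mismatches, the signature assumed here for A-to-I editing."""
    return [m for m in mismatches if m[1] == "A" and m[2] == "G"]


reference = "ACGTAACGTA"
read = "GTAGCG"  # toy read aligned at reference position 2
hits = find_mismatches(reference, read, ref_start=2)
print(hits)
print(deamination_candidates(hits))
```

Reads mapped to the reverse strand would show the same edit as a T-to-C mismatch against the reference, which this sketch does not handle.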

Key Features

Map-based Barcode & UMI Extraction – Extracts nucleotide sequences from user-defined ambiguous positions in mapped reads.
BAM Tagging – Annotates BAM files with extracted barcode/UMI sequences for downstream analysis.
Modification Detection – Identifies mismatched bases caused by base modifications (e.g., deamination).
UMI Grouping – Leverages UMI-tools to cluster reads.
PCR Collapsing & Consensus Calling – UMI groups are collapsed to generate high-confidence consensus sequences with additional Phred averaging for each nucleotide.
Error & Confidence Tagging – Adds metadata on sequence confidence, ambiguous bases, and read support.
Modular Design – Individual processing steps can be run independently, allowing seamless integration into other workflows.

Running ANDROMEDA with uv

To run ANDROMEDA with uv, you will need to have the following installed:

uv
samtools

The uvx command will do the rest (gathering the correct Python version and all other dependencies)!

To run the full pipeline, use the run-all subcommand. The command below prints its usage and options:

uvx --from git+https://github.com/MViscardi-UCSC/ANDROMEDA andromeda run-all --help
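Before invoking uvx, it can help to confirm that both prerequisites are on your PATH. The following is a small sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of ANDROMEDA.

```python
import shutil


def missing_tools(tools=("uv", "samtools")):
    """Return the names of required executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("uv and samtools were both found.")
```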

Full Installation

ANDROMEDA requires Python 3.11+ and dependencies such as pysam, UMI-tools, and samtools (see Dependencies below). Install using:

# Clone the repository
git clone https://github.com/MViscardi-UCSC/ANDROMEDA.git
cd ANDROMEDA

# Create a virtual environment (optional but recommended)
python -m venv andromeda_env
source andromeda_env/bin/activate  # On Windows use: andromeda_env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Usage

ANDROMEDA consists of multiple modules that can be run independently or together as a pipeline with run-all.

Within the andromeda folder, these modules are currently working:

ref_pos_picker.py helps users pick the correct reference coordinates for their UMIs.
extract.py takes a mapped BAM file, a reference genome, and a list of positions to extract from the reads. It outputs a new BAM file with the extracted sequences stored in the read tags.
umi_group.py takes the output of extract.py and groups reads by UMI using UMI-tools. It outputs a new BAM file with the UMI sequences in the read tags.
consensus.py takes the output of umi_group.py and generates a consensus sequence for each UMI group. It outputs a new BAM file of the consensus sequences.

More specific instructions for each module can be found below, with extended details available with the --help flag.

0. Pick Reference Positions for UMIs

python andromeda ref_pos_picker <ref.fasta> <output_parent_directory>
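As a rough illustration of what picking reference positions involves, the sketch below locates runs of N in a reference sequence. This assumes that UMI or barcode sites are represented as stretches of N in the reference; that convention, and the helper name `find_n_runs`, are assumptions for illustration rather than a description of how ref_pos_picker works.

```python
import re
from typing import List, Tuple


def find_n_runs(sequence: str, min_length: int = 1) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) spans of consecutive N bases."""
    return [
        match.span()
        for match in re.finditer(r"N+", sequence.upper())
        if match.end() - match.start() >= min_length
    ]


# Toy reference: an 8-nt ambiguous stretch and a 4-nt ambiguous stretch.
reference = "ACGT" + "N" * 8 + "ACGTACGT" + "N" * 4 + "AC"
print(find_n_runs(reference))
print(find_n_runs(reference, min_length=6))
```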

1. Extract Barcodes and UMIs from Mapped Reads

python andromeda extract <ref.fasta> <mapped.bam> <output_parent_directory>
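The core of position-based extraction can be sketched without any BAM library: given a read sequence and its (query position, reference position) pairs, collect the read bases that align to a chosen reference span. The function name `extract_at_ref_span` and the gap handling are assumptions for illustration, not ANDROMEDA's actual code.

```python
from typing import Iterable, List, Optional, Tuple

AlignedPair = Tuple[Optional[int], Optional[int]]


def extract_at_ref_span(
    query_sequence: str,
    aligned_pairs: Iterable[AlignedPair],
    ref_start: int,
    ref_end: int,
    gap_char: str = "-",
) -> str:
    """Collect read bases aligned to reference positions [ref_start, ref_end).

    aligned_pairs holds (query_pos, ref_pos) tuples, with None marking a gap
    on that side. Deleted reference positions are emitted as gap_char;
    insertions in the read are ignored.
    """
    bases: List[str] = []
    for query_pos, ref_pos in aligned_pairs:
        if ref_pos is None or not (ref_start <= ref_pos < ref_end):
            continue
        bases.append(gap_char if query_pos is None else query_sequence[query_pos])
    return "".join(bases)


# Toy read aligned from reference position 2, with reference position 6 deleted.
read_seq = "GTTCAGAC"
pairs = [(0, 2), (1, 3), (2, 4), (3, 5), (None, 6), (4, 7), (5, 8), (6, 9), (7, 10)]
umi = extract_at_ref_span(read_seq, pairs, ref_start=4, ref_end=8)
print(umi)
```

With pysam, tuples of this shape are what `AlignedSegment.get_aligned_pairs()` returns, and the extracted string could then be stored on the read with `set_tag`.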

2. Group UMIs Using UMI-tools

python andromeda umi_group <tagged.bam (from extract step)> <output_parent_directory>
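For intuition about what UMI grouping does, here is a deliberately simplified sketch: UMIs are visited from most to least abundant, and each one joins the first existing group whose representative is within one mismatch. This is a greedy Hamming-distance clustering written for illustration, not the network-based methods that UMI-tools provides, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis: Iterable[str], max_distance: int = 1) -> Dict[str, str]:
    """Map each observed UMI to a representative UMI.

    UMIs are visited from most to least frequent; each joins the first
    representative within max_distance mismatches, or starts a new group.
    """
    counts = Counter(umis)
    representatives = []
    assignment = {}
    for umi, _ in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        for rep in representatives:
            if hamming(umi, rep) <= max_distance:
                assignment[umi] = rep
                break
        else:
            # No representative was close enough, so this UMI founds a group.
            representatives.append(umi)
            assignment[umi] = umi
    return assignment


observed = ["ACGT"] * 5 + ["ACGA"] * 1 + ["TTTT"] * 3 + ["TTTA"] * 1
print(group_umis(observed))
```

Low-count UMIs that differ from an abundant UMI by a single base are treated as likely PCR or sequencing errors and folded into the abundant one.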

3. Create Consensus Sequences

python andromeda consensus <ref.fasta> <grouped.bam (from group step)> <output_parent_directory>
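The consensus step can be pictured as a per-position vote across the reads in one UMI group, with a quality score derived from the supporting reads. The sketch below assumes equal-length, already-aligned reads, a simple majority vote, ties reported as N, and an arithmetic mean of Phred scores; ANDROMEDA's own rules may differ, and the function name `consensus` is illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def consensus(reads: Sequence[Tuple[str, List[int]]]) -> Tuple[str, List[int]]:
    """Build a consensus from equal-length (sequence, phred_scores) pairs.

    Each position takes the most common base (ties become N with quality 0);
    its quality is the rounded mean Phred score of the reads supporting it.
    """
    length = len(reads[0][0])
    if any(len(seq) != length or len(quals) != length for seq, quals in reads):
        raise ValueError("all reads and quality lists must share one length")
    bases, quals_out = [], []
    for i in range(length):
        column = Counter(seq[i] for seq, _ in reads)
        ranked = column.most_common(2)
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            bases.append("N")
            quals_out.append(0)
            continue
        best = ranked[0][0]
        support = [quals[i] for seq, quals in reads if seq[i] == best]
        bases.append(best)
        quals_out.append(round(sum(support) / len(support)))
    return "".join(bases), quals_out


umi_group = [
    ("ACGT", [30, 30, 30, 30]),
    ("ACGT", [20, 40, 30, 10]),
    ("ACTT", [30, 30, 10, 30]),
]
print(consensus(umi_group))
```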

Or Run All the Steps Together

python andromeda run-all <ref.fasta> <mapped.bam> <output_parent_directory>
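If you want to launch the pipeline from a Python script, one option is a thin wrapper around the documented command. This sketch assumes it is run from the repository root, as the commands above are; the wrapper functions and the file names in the example are placeholders, not part of ANDROMEDA.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_run_all_command(ref_fasta: str, mapped_bam: str, output_dir: str) -> List[str]:
    """Assemble the documented run-all invocation as an argument list."""
    return [sys.executable, "andromeda", "run-all", ref_fasta, mapped_bam, output_dir]


def run_all(ref_fasta: str, mapped_bam: str, output_dir: str) -> None:
    """Check that the inputs exist, then run the pipeline from the repository root."""
    for path in (ref_fasta, mapped_bam):
        if not Path(path).is_file():
            raise FileNotFoundError(path)
    # check=True raises CalledProcessError if the pipeline exits with an error.
    subprocess.run(build_run_all_command(ref_fasta, mapped_bam, output_dir), check=True)


# Placeholder file names; only the command is built here, nothing is executed.
print(build_run_all_command("ref.fasta", "mapped.bam", "results"))
```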

Dependencies

Python 3.11+
samtools
UMI-tools
pandas
biopython
seaborn and matplotlib for plotting (currently required; making these optional is a TODO)

Future Features

Support for base modifications, not just mismatches due to deamination
Integration with alternative consensus-calling algorithms
Enhanced confidence scoring for ambiguous base calls

License

MIT License. See LICENSE for details.

Contributors

Developed by Marcus Viscardi and Liam Tran in the Arribere Lab at UCSC. Contributions welcome!

Contact

For questions or feedback, please contact marcus.viscardi@gmail.com or open an issue on GitHub.

