PanDora: Pangenome Database of Reference Assemblies

A non-redundant human pangenome database, built from reference assemblies, for decontaminating microbiome sequence data.

This database integrates genomes from the following projects:

The Genome Reference Consortium human reference assembly, GRCh38 (NCBI RefSeq: GCF_000001405.40)
The T2T Consortium human reference assembly, CHM13v2.0 (NCBI RefSeq: GCF_009914755.1)
The Human Pangenome Reference Consortium (HPRC) Year 1 assemblies (available on GitHub or NCBI; PRJNA730823)
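To inspect the two RefSeq assemblies listed above on their own, the NCBI datasets CLI can fetch them by accession. The sketch below only prints the commands, so it is safe to run as-is. The helper name `fetch_cmd` and the output file names are illustrative assumptions, and whether create-pandora downloads the assemblies this way is not stated here.

```shell
# Accessions copied from the project list above.
accessions="GCF_000001405.40 GCF_009914755.1"

# fetch_cmd (hypothetical helper): print the datasets command that would
# download the genome sequence for one accession into <accession>.zip.
fetch_cmd() {
    printf 'datasets download genome accession %s --include genome --filename %s.zip\n' "$1" "$1"
}

# Print one command per accession; nothing is downloaded at this point.
for acc in $accessions; do
    fetch_cmd "$acc"
done
```

Piping the printed lines to `sh` would run the downloads, which are large.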

To recreate the database

Requirements

The following tools must be installed and available in your PATH:

datasets (NCBI datasets CLI)
panSieve (https://github.com/cpnh/panSieve)
seqtk
seqkit
vg
bcftools
tabix
bgzip
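A quick way to confirm that the tools listed above are visible on your PATH is sketched below. The function name `check_tools` is an illustrative assumption. panSieve is left out of the check because the name of its executable is not given here.

```shell
# check_tools (hypothetical helper): report each named command that is not
# found on PATH (to stderr) and return non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Check the tools from the requirements list.
check_tools datasets seqtk seqkit vg bcftools tabix bgzip ||
    echo "Install the missing tools before running create-pandora." >&2
```

The check only tests that each command can be found; it does not verify versions.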

# Install dependencies using conda
conda env create -f environment.yml

# Clone panSieve and add its src directory to your PATH
# (an absolute path keeps this working after you change directory)
git clone https://github.com/cpnh/panSieve
export PATH="$PWD/panSieve/src:$PATH"

Usage

create-pandora -g ref/hprc-v1.0-minigraph-grch38.gfa

Documentation

Full documentation is available in the docs directory.

License

MIT
