GitHub - epiviz/EpivizQuindex: Fast Index search across a large collection of genomic files (BigWigs or BigBeds)

EpivizQuindex

Genomic analysis pipelines and workflows often use specialized file formats for manipulating and quickly finding data in genomic regions of interest. These file formats contain an index as part of their specification, which allows users to perform random access queries. With a collection of these files, however, it is time-consuming to read every file and extract the data for a region of interest. The goal of the Quindex approach is to "index the index" of these files and provide fast access to large collections of genomic data across files.
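To make the "index the index" idea concrete, here is a toy sketch. It is not the data structure Quindex uses internally; the class `ToyIndexOfIndexes` and its methods are hypothetical names invented for illustration. It only shows how a combined lookup can narrow a region query down to the files that might contain data, so the remaining files never need to be opened.

```python
from collections import defaultdict


class ToyIndexOfIndexes:
    """Toy illustration only; NOT the structure Quindex actually builds."""

    def __init__(self):
        # chromosome name -> list of (start, end, file_name)
        self._regions = defaultdict(list)

    def add_file(self, file_name, regions):
        """Register the half-open regions [start, end) a file has data for."""
        for chrom, start, end in regions:
            self._regions[chrom].append((start, end, file_name))

    def files_overlapping(self, chrom, start, end):
        """Return the sorted names of files overlapping [start, end)."""
        return sorted({
            name
            for r_start, r_end, name in self._regions.get(chrom, [])
            if r_start < end and start < r_end
        })


toy = ToyIndexOfIndexes()
toy.add_file("a.bigwig", [("chr2", 0, 500_000), ("chr3", 0, 100_000)])
toy.add_file("b.bigwig", [("chr2", 1_000_000, 2_000_000)])

# Only a.bigwig has data in chr2:0-900000, so b.bigwig need not be read.
print(toy.files_overlapping("chr2", 0, 900_000))
```

A linear scan like this would not scale to large collections; it is shown only to clarify what a cross-file index has to answer.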

Usage

To import the package, simply run:

from epivizquindex import EpivizQuindex

Create the index

Define the genome range and set base_path to the folder that will hold the index. The index object is then created with EpivizQuindex.EpivizQuindex(genome, base_path=base_path).

base_path should be a folder. If the path does not exist, Quindex will create it.
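The two inputs might be set up as in the sketch below. It assumes that `genome` is a dictionary mapping chromosome names to their lengths in base pairs; this README does not show the expected format, so check the package documentation. The two lengths are from the hg19 assembly, and the folder name `quindex_demo` is hypothetical.

```python
import os
import tempfile

# Assumption: `genome` maps chromosome names to lengths in base pairs.
# The values below are hg19 lengths; a real genome would list every
# chromosome you intend to query.
genome = {
    "chr1": 249250621,
    "chr2": 243199373,
}

# Hypothetical location for the index; any folder path can be used here.
base_path = os.path.join(tempfile.gettempdir(), "quindex_demo")

# With the package installed, the index is then created as shown
# elsewhere in this README:
# index = EpivizQuindex.EpivizQuindex(genome, base_path=base_path)
```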

Add files to the index with add_to_index():

f1 = "/path_to_your_file/some.bigwig"
f2 = "/path_to_your_file/someOther.bigwig"
# add each file to the index
index.add_to_index(f1)
index.add_to_index(f2)

Perform an in-memory query

Once the index is created, query a specific chromosome and range:

index.query("chr2", 0, 900000)

You can also restrict the query to a specific file:

index.query("chr2", 0, 900000, file=f1)

Store and load computed index to disk

Use to_disk() to store the index on disk and from_disk() to load it back into memory. Both use the base_path given when the index was created.

# storing the precomputed index
index.to_disk()
# reading a precomputed index
index = EpivizQuindex.EpivizQuindex(genome, base_path=base_path)
index.from_disk()

Perform a search without loading

The index can also be searched without loading it into memory:

# keep the index on disk instead of loading it into memory
memory = False
index = EpivizQuindex.EpivizQuindex(genome, base_path=base_path)
index.from_disk(load=memory)
index.query("chr2", 0, 900000, in_memory=memory)
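The example above passes the same flag to both `from_disk(load=...)` and `query(..., in_memory=...)`. A small wrapper can keep the two in sync. This is a sketch: `load_and_query` is a hypothetical helper, and it assumes that passing `in_memory=True` explicitly behaves like the default in-memory query. The recording stub lets the example run without the package.

```python
def load_and_query(index, chrom, start, end, in_memory=True):
    """Load a stored index and query it with matching load/in_memory flags."""
    index.from_disk(load=in_memory)
    return index.query(chrom, start, end, in_memory=in_memory)


class _RecordingIndex:
    """Stand-in for a real index; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def from_disk(self, load=True):
        self.calls.append(("from_disk", load))

    def query(self, chrom, start, end, in_memory=True):
        self.calls.append(("query", chrom, start, end, in_memory))
        return []


stub = _RecordingIndex()
load_and_query(stub, "chr2", 0, 900000, in_memory=False)
print(stub.calls)
```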

Note

This project has been set up using PyScaffold 4.2.3. For details and usage information on PyScaffold see https://pyscaffold.org/.

