Predicting the impact of regulatory SNPs from cell and tissue specific DNase-footprints

Hughes-Genome-Group/sasquatch

Sasquatch

Description

Sasquatch uses DNase-seq data to pile up average DNase I cut profiles over all possible short k-mers in the context of open chromatin. By identifying and quantifying footprints in these average profiles, Sasquatch infers a k-mer's potential to bind transcription factors in a tissue-specific manner. Based on comparative analysis, Sasquatch predicts the damaging potential of sequence variations in the tissue context of interest, independent of a specific genotype. Furthermore, in silico mutation analysis profiles larger sequences for transcription factor binding sites that are actively bound in the tissue of interest. With a large repository of preprocessed, publicly available DNase-seq data, complemented by our own deep DNase-seq in human primary erythroid cells, Sasquatch provides a powerful tool for rapid transcription factor profiling and for prioritising and interpreting non-coding sequence variations.
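To make the pile-up idea concrete, here is a minimal, language-neutral sketch in Python. It is not Sasquatch's R implementation: the function names `average_cut_profiles` and `footprint_depth`, the window size, and the toy depth measure (mean shoulder signal minus mean core signal) are all assumptions made for illustration, and the input is a single toy sequence with made-up per-base cut counts rather than real DNase-seq data.

```python
from collections import defaultdict


def average_cut_profiles(sequence, cuts, k=3, flank=2):
    """Average per-base cut counts in a window around every k-mer occurrence.

    Hypothetical helper for illustration only. `cuts[i]` is the number of
    DNase I cuts observed at position i of `sequence`.
    """
    if len(sequence) != len(cuts):
        raise ValueError("sequence and cuts must have the same length")
    if flank < 1:
        raise ValueError("flank must be at least 1")

    width = k + 2 * flank
    sums = defaultdict(lambda: [0.0] * width)
    counts = defaultdict(int)

    # Only use occurrences whose full window (flank + k-mer + flank) fits.
    for start in range(flank, len(sequence) - k - flank + 1):
        kmer = sequence[start:start + k]
        window = cuts[start - flank:start + k + flank]
        for i, value in enumerate(window):
            sums[kmer][i] += value
        counts[kmer] += 1

    return {kmer: [s / counts[kmer] for s in total] for kmer, total in sums.items()}


def footprint_depth(profile, k, flank):
    """Toy measure (an assumption, not Sasquatch's metric):
    mean cut signal in the flanking shoulders minus mean signal over the k-mer."""
    core = profile[flank:flank + k]
    shoulders = profile[:flank] + profile[flank + k:]
    return sum(shoulders) / len(shoulders) - sum(core) / len(core)


if __name__ == "__main__":
    toy_sequence = "TTACGTTACGTT"
    toy_cuts = [4, 4, 0, 0, 0, 4, 4, 0, 0, 0, 6, 6]  # made-up counts
    profiles = average_cut_profiles(toy_sequence, toy_cuts, k=3, flank=2)
    for kmer, profile in sorted(profiles.items()):
        print(kmer, profile, footprint_depth(profile, k=3, flank=2))
```

In this toy input the k-mer `ACG` occurs twice with no cuts over the k-mer itself and cuts on both sides, so its averaged profile shows a dip flanked by shoulders, which is the kind of shape the description refers to as a footprint.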

Getting Started

  1. Clone the repository.

  2. Get data. The repository ships only a minimal set of dummy example data, enough to run the example scripts. To work with real data, visit our webtool site and download the data of interest from our repository of preprocessed DNase-seq data. Details about the different samples are available here. Extract the downloads into ./sasquatch/data/human/DNase or ./sasquatch/data/mouse/DNase. Each tissue dataset comes in its own directory, which should become a subdirectory of your local sasquatch/data/organism/DNase directory. (If you extract into any other directory, make sure that you point Sasquatch to that directory and keep every tissue dataset in a separate subdirectory.)

  3. In your R script, source the R functions in ./R_utility/functions_sasq_r_utility.R, point to your local data repository, and set the basic parameters as described at the beginning of the Vignette and the Example R-script.

  4. Also make sure to check out our webtool for further documentation about Sasquatch and average DNase-footprints analysis.
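Before sourcing the R functions, it can help to confirm that the data from step 2 landed in the expected layout. The following is a small optional sketch in Python, not part of Sasquatch: the helper name `list_tissue_datasets` is hypothetical, and it only assumes the `data/<organism>/DNase/<tissue>` layout described above.

```python
from pathlib import Path


def list_tissue_datasets(data_root, organism):
    """Return the names of tissue subdirectories under <data_root>/<organism>/DNase.

    Hypothetical helper for illustration; returns an empty list if the
    DNase directory does not exist.
    """
    dnase_dir = Path(data_root) / organism / "DNase"
    if not dnase_dir.is_dir():
        return []
    # Every tissue dataset is expected to sit in its own subdirectory;
    # stray files at this level are ignored.
    return sorted(p.name for p in dnase_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains ./sasquatch
    for organism in ("human", "mouse"):
        print(organism, list_tissue_datasets("./sasquatch/data", organism))
```

An empty list for an organism you downloaded data for usually means the archive was extracted one level too high or too low.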

Documentation

Vignette running through the basic Sasquatch analysis steps.

Reference Manual for all implemented functions.

Example R-script running through most of Sasquatch's functions.

Also check out the general introduction to the Sasquatch approach and to average DNase I footprints on our webtool site.

Links

Here, you can reach our Sasquatch webtool implementation as well as the repository of preprocessed DNase-seq data to download for local use.

Paper: http://www.genome.org/cgi/doi/10.1101/gr.220202.117

License

Sasquatch is published under GPLv3 or later.
