foriin/DamID-seq

DamID-seq

A set of scripts used to analyze DamID-seq experiments in the Laboratory of Analysis of Gene Regulation of IMG RAS, Moscow

Some of these scripts were written by Ludo Pagie of Bas van Steensel's research group at the Netherlands Cancer Institute; the rest were developed in collaboration with Alexey Pindyurin's group at IMCB SB RAS.

Dependencies

To run these scripts successfully, you need the following programs installed and working on your system:

You must set the paths to these programs yourself, as bash variables, in aligner.sh and reads2bins.sh.
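
As an illustration of this kind of setup, here is a minimal sketch that stores tool locations in variables and checks that each one can be found. The variable names `BOWTIE2` and `HTSEQ_COUNT` and the helper `check_tool` are hypothetical; the real variable names are whatever aligner.sh and reads2bins.sh define.

```shell
# Hypothetical variable names; use the ones actually defined in
# aligner.sh and reads2bins.sh. Each may be a full path or a name on PATH.
BOWTIE2="${BOWTIE2:-bowtie2}"
HTSEQ_COUNT="${HTSEQ_COUNT:-htseq-count}"

# Return 0 if the given program can be found, non-zero otherwise.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of every configured tool without stopping at the first miss.
for tool in "$BOWTIE2" "$HTSEQ_COUNT"; do
    if check_tool "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool" >&2
    fi
done
```

Running a check like this before the pipeline makes a wrong path show up immediately rather than halfway through an alignment.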

The R scripts may require packages that are not installed in your library. Install them from CRAN or, if a package is not available there, from Bioconductor.

Before you start

If you clone the repository, you will find the make.bins.R script in the DamIDseq/bins folder. Open it, set the desired bin size, and run it. It generates a gff file and a txt file: the former for HTSeq-count and the latter for the DamID-seq_analysis.R script. You can re-run the script with a different bin size to produce files for other bin sizes.
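
To make the idea of fixed-size bins concrete, the sketch below prints GFF-style lines for consecutive bins along one chromosome. It is not the make.bins.R implementation: the function name `make_bins`, the source and feature labels, and the `ID=` attribute are assumptions for illustration, and the files produced by make.bins.R may be laid out differently.

```shell
# Print one tab-separated GFF-style line per bin.
# $1 = chromosome name, $2 = chromosome length (bp), $3 = bin size (bp)
make_bins() {
    awk -v chr="$1" -v len="$2" -v size="$3" 'BEGIN {
        OFS = "\t"
        n = 0
        for (start = 1; start <= len; start += size) {
            end = start + size - 1
            if (end > len) end = len      # the last bin may be shorter
            n++
            # Columns: seqname, source, feature, start, end, score, strand, frame, attribute
            print chr, "sketch", "bin", start, end, ".", ".", ".", "ID=" chr "_bin" n
        }
    }'
}

# A made-up 1000 bp chromosome split into 300 bp bins.
make_bins chrTest 1000 300
```

Coordinates here are 1-based and inclusive, as in GFF, so a 1000 bp sequence with 300 bp bins yields three full bins and one shorter final bin.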

Once you have set the paths to the programs and the bowtie2 indices in the .sh scripts, set the path to your fastq files and the path to the directory where you want to keep the output in parameterfile.txt in the DamIDseq folder. (You may change the ASSEMBLY parameter if you have the corresponding indices prepared and know what you are doing.) Example:

SPECIES=fly
FASTQ_FILES='/home/johndoe/work/projectX/run33/*.gz'
ASSEMBLY=dm3
OUTPUT_DIR='/home/johndoe/work/projectX/OUT'
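
The example above uses plain `NAME=value` lines with shell quoting, so one way to sanity-check a parameter file is to read it with the shell and confirm that every setting is present. This is a sketch under that assumption; whether the pipeline scripts read the file in the same way is not stated here, and the helper `check_params` is hypothetical.

```shell
# Write the example settings to a temporary file for the demonstration.
params="$(mktemp)"
cat > "$params" <<'EOF'
SPECIES=fly
FASTQ_FILES='/home/johndoe/work/projectX/run33/*.gz'
ASSEMBLY=dm3
OUTPUT_DIR='/home/johndoe/work/projectX/OUT'
EOF

# Assumption: the file is valid shell and can be read with "." (source).
. "$params"

# Hypothetical helper: report any setting that is unset or empty.
check_params() {
    missing=0
    for name in SPECIES FASTQ_FILES ASSEMBLY OUTPUT_DIR; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "parameter $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

check_params && echo "all parameters set for assembly $ASSEMBLY"
rm -f "$params"
```

Because FASTQ_FILES is single-quoted, the `*.gz` pattern is stored literally and is only expanded to file names when the variable is later used unquoted.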

Last but not least, add information about your run files to the damid_description.csv file. Despite the .csv extension, it is a simple tab-delimited table with two columns: as in the example below, Data.set holds the sample description in the format TISSUE.PROTEIN.CONDITIONS.REPLICATE_NUMBER, and fastq.file holds the name of the corresponding fastq file. Example:

Data.set  fastq.file
BRAIN.CTCF.vasa(-).1   P155_CGGATG0003.fastq.gz
BRAIN.CTCF.vasa(-).2   P155_ATTGCC0011.fastq.gz
BRAIN.DAM.vasa(-).1   P155_GGCATA0001.fastq.gz
BRAIN.DAM.vasa(-).2   P155_AGTACC0008.fastq.gz
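
A table like this is easy to get subtly wrong (spaces instead of tabs, a missing replicate number), so a quick check before running the pipeline can help. The sketch below follows the column order of the example above and assumes that the four parts of Data.set contain no extra dots; the function name `check_description` is hypothetical and is not part of the repository.

```shell
# Build a small tab-delimited table for the demonstration.
desc="$(mktemp)"
printf 'Data.set\tfastq.file\n' > "$desc"
printf 'BRAIN.CTCF.vasa(-).1\tP155_CGGATG0003.fastq.gz\n' >> "$desc"
printf 'BRAIN.DAM.vasa(-).1\tP155_GGCATA0001.fastq.gz\n' >> "$desc"

# Exit status 0 if every data row has two tab-separated columns and a
# Data.set value with exactly four dot-separated parts, 1 otherwise.
check_description() {
    awk -F '\t' '
        NR == 1 { next }    # skip the header row
        NF != 2 {
            printf "line %d: expected 2 tab-separated columns\n", NR
            bad = 1
            next
        }
        split($1, parts, "[.]") != 4 {
            printf "line %d: %s is not TISSUE.PROTEIN.CONDITIONS.REPLICATE_NUMBER\n", NR, $1
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

check_description "$desc" && echo "description table looks consistent"
rm -f "$desc"
```

To check a real table, pass its path instead, for example `check_description damid_description.csv`.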
