Tools used for generation and annotation of .gff files with ChIP-seq, ATAC-seq, MNase-seq and bisulfite datasets.

hamishking/gff-annotation-tools

gff-annotation-tools

Tools developed as part of analysis for the manuscript "Polycomb repressive complex 1 shapes the nucleosome landscape but not accessibility at target genes."

Please contact h.king@qmul.ac.uk or drhamishking@gmail.com with questions or problems.

Requirements differ between scripts but most involve the use of:

For nucleosome feature annotation, output from the following may be required.

Where a list of files is used, a tab-delimited format is required, with one file per line:

file.bam filename
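
As an illustration of the list format above, the following minimal Python sketch reads such a file into (path, name) pairs. It is not part of the repository: the function name `read_sample_list` is hypothetical, and skipping blank lines and `#` comments is an assumption of this sketch rather than documented behaviour of the scripts.

```python
import csv
import io


def read_sample_list(handle):
    """Return (file_path, sample_name) pairs from a two-column tab-delimited list."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skipping blank lines and '#' comments is an assumption of this sketch.
        if not row or row[0].startswith("#"):
            continue
        if len(row) < 2:
            raise ValueError(f"expected 2 tab-separated columns, got: {row!r}")
        pairs.append((row[0], row[1]))
    return pairs


# Hypothetical file names and sample names, for illustration only.
example = io.StringIO("sample1.bam\tWT_rep1\nsample2.bam\tKO_rep1\n")
print(read_sample_list(example))
```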

General GFF annotation scripts

BED2GFF.pl

Annotates a 3- or 4-column .bed file using HOMER's annotatePeaks.pl and converts it to a 9-column .gff file.

GFF2TXT.pl

Converts a 9-column .gff file to a tab-delimited table for use in R or other database packages.
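
To show what flattening a 9-column GFF line into a table row can look like, here is a minimal Python sketch. It is not the GFF2TXT.pl implementation. It assumes the ninth column holds GFF3-style `key=value` pairs separated by semicolons, which may differ from the attribute layout these scripts actually write; the function name `gff_line_to_row` is hypothetical.

```python
# Names of the eight fixed GFF columns; the ninth is the attributes column.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end", "score", "strand", "phase"]


def gff_line_to_row(line):
    """Flatten one 9-column GFF line into a dict of column name -> value."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    row = dict(zip(GFF_COLUMNS, fields[:8]))
    # Assumption: attributes are 'key=value' pairs separated by ';'.
    for item in fields[8].split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            row[key.strip()] = value.strip()
    return row


# Hypothetical example line; the attribute names are illustrative only.
line = "chr1\tHOMER\tpeak\t1001\t2000\t.\t+\t.\tID=peak1;Overlap=TRUE"
print(gff_line_to_row(line))
```

Each attribute key becomes its own column, so rows with the same set of annotations can be written out as a rectangular table.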

GFF_BED.Overlap.pl

Intersects GFF intervals with a .bed file and annotates each interval's overlap as TRUE/FALSE. Requires BEDtools.
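
The script delegates the intersection to BEDtools. The sketch below only illustrates the underlying overlap test and the coordinate conventions involved: GFF intervals are 1-based and inclusive, whereas BED intervals are 0-based and half-open. The function name `gff_overlaps_bed` is hypothetical, and requiring at least one shared base is an assumption of this sketch.

```python
def gff_overlaps_bed(gff_start, gff_end, bed_start, bed_end):
    """Return True if a GFF interval shares at least one base with a BED interval.

    GFF coordinates are 1-based inclusive; BED coordinates are 0-based half-open.
    Both intervals are assumed to lie on the same chromosome.
    """
    gff_start0 = gff_start - 1  # convert the GFF start to 0-based half-open
    return gff_start0 < bed_end and bed_start < gff_end


# GFF 101-200 covers the same bases as BED 100-200.
print(gff_overlaps_bed(101, 200, 199, 300))
print(gff_overlaps_bed(101, 200, 200, 300))
```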

GFF_BAM.Count.pl

Annotates GFF file intervals with read counts or reads per kilobase per million (RPKM) from a list of indexed bam files. Requires samtools.
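
For reference, RPKM normalises a read count by interval length (in kilobases) and by library size (in millions of reads). The Python sketch below shows the standard formula only. It does not reproduce how GFF_BAM.Count.pl obtains its counts or library totals, and the function name `rpkm` is hypothetical.

```python
def rpkm(read_count, interval_length_bp, total_mapped_reads):
    """Reads per kilobase of interval per million mapped reads."""
    if interval_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("interval length and total reads must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (interval_length_bp * total_mapped_reads)


# 100 reads in a 1 kb interval from a library of 1 million mapped reads.
print(rpkm(100, 1000, 1_000_000))
```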

GFF_BedGraph.Coverage.pl

Annotates intervals in GFF file with mean coverage for bedgraph files in a provided list. Requires BEDtools.
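
The README does not state how the mean is computed. The sketch below shows one possible definition, a length-weighted mean in which bases without a bedgraph record count as zero; the script's BEDtools-based calculation may differ. The function name `mean_coverage` is hypothetical.

```python
def mean_coverage(interval_start, interval_end, bedgraph_records):
    """Length-weighted mean coverage over a 0-based half-open interval.

    bedgraph_records is an iterable of (start, end, value) tuples on the same
    chromosome as the interval. Bases not covered by any record contribute 0,
    which is an assumption of this sketch.
    """
    length = interval_end - interval_start
    if length <= 0:
        raise ValueError("interval must have positive length")
    total = 0.0
    for start, end, value in bedgraph_records:
        # Number of bases this record shares with the interval.
        overlap = min(end, interval_end) - max(start, interval_start)
        if overlap > 0:
            total += overlap * value
    return total / length


print(mean_coverage(0, 10, [(0, 5, 2.0), (5, 10, 4.0)]))
```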

GFF_BisulfitePercentMethylation.pl

Annotates intervals in GFF file with Percent Methylation using HOMER TagDirectories made from Bismark-derived bisulfite cytosine.cov files. Requires HOMER.

Annotation of GFF files with nucleosome features

GFF_ATAC.FragSizes.pl

Annotates intervals in GFF file with median and average fragment sizes of paired-end ATAC-seq reads. Requires samtools, st.
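
As an illustration of the statistic, the Python sketch below computes the median and mean fragment size from SAM-format text lines, such as those printed by `samtools view` for one interval. It reads the template length (TLEN, the ninth SAM column) and keeps only positive values so that each read pair is counted once. This is an assumption-laden sketch, not the script's actual pipeline (which uses st), and the function name `fragment_size_stats` is hypothetical.

```python
import statistics


def fragment_size_stats(sam_lines):
    """Return (median, mean) of fragment sizes from SAM text lines, or None."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])  # TLEN: signed template length
        if tlen > 0:
            # The leftmost mate carries the positive TLEN; count each pair once.
            sizes.append(tlen)
    if not sizes:
        return None
    return statistics.median(sizes), statistics.mean(sizes)
```

Returning `None` for intervals without properly paired reads is a design choice of this sketch; a real annotation would need to decide what to write in that case.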

GFF_MNase.DANPOS2.Fuzziness.pl

Annotates GFF file with statistics of nucleosome fuzziness score calculated by DANPOS2 dpos. Requires a list of .smooth.positions.xls outputs from DANPOS2 with corresponding sample names. NB. Not to be used with differential DANPOS2 files, i.e. ref_adjust or integrative.xls. Requires BEDtools, st, DANPOS2 output.

GFF_MNase.DANPOS2.InterDyadDistance.pl

Annotates GFF file with statistics of distances between nucleosome positions identified using the DANPOS2 dpos command. Requires a list of .smooth.positions.xls outputs from DANPOS2 with corresponding sample names. NB. Not to be used with differential DANPOS2 files, i.e. ref_adjust or integrative.xls. Requires BEDtools, st, DANPOS2 output.
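
The quantity being summarised can be sketched as the distance between consecutive nucleosome dyad (summit) positions that fall within one GFF interval. The Python sketch below assumes the positions have already been extracted and restricted to a single interval on one chromosome. The function names are hypothetical, and which statistics the script reports is not stated here, so median and mean are shown only as examples.

```python
import statistics


def inter_dyad_distances(dyad_positions):
    """Distances between consecutive dyad positions, after sorting."""
    ordered = sorted(dyad_positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def summarise_distances(dyad_positions):
    """Return (median, mean) inter-dyad distance, or None with fewer than 2 dyads."""
    distances = inter_dyad_distances(dyad_positions)
    if not distances:
        return None
    return statistics.median(distances), statistics.mean(distances)


# Hypothetical dyad positions within one interval.
print(inter_dyad_distances([480, 100, 290, 660]))
```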

GFF_MNase.DANPOS2.Occupancy.pl

Annotates GFF file with statistics of nucleosome summit occupancy calculated by DANPOS2 dpos. Requires a list of .smooth.positions.xls outputs from DANPOS2 with corresponding sample names. NB. Not to be used with differential DANPOS2 files, i.e. ref_adjust or integrative.xls. Requires BEDtools, st, DANPOS2 output.

GFF_NucleoATAC.InterDyadDistance.pl

Annotates GFF file with statistics of distances between nucleosome positions identified using NucleoATAC. Requires a list of nucmap_combined.bed outputs from NucleoATAC with corresponding sample names. Requires BEDtools, st, NucleoATAC output.

GFF_NucleoATAC.NucPosFuzziness.pl

Annotates GFF file with statistics of nucleosome fuzziness score calculated by NucleoATAC. Requires a list of nucpos.bed outputs from NucleoATAC with corresponding sample names. Requires BEDtools, st, NucleoATAC output.
