BulkSeq

This set of scripts accompanies the chapter 'Identification of parent-of-origin-dependent QTLs using bulk-segregant sequencing (Bulk-Seq)' in the book Plant Chromatin Dynamics (Springer, 2018).

GenomeSNPmask.py: remove or replace known SNP positions in a genome sequence file (FASTA)
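To illustrate the idea of SNP masking, here is a minimal sketch in plain Python. It is not the code of GenomeSNPmask.py: the function names `mask_snps` and `mask_fasta`, the 1-based position convention, the use of `N` as the replacement character, and the toy input are all assumptions made for this example. It also writes each sequence on a single line rather than preserving the original line wrapping.

```python
def mask_snps(sequence, positions, mask_char="N"):
    """Return `sequence` with the given 1-based positions replaced by `mask_char`."""
    bases = list(sequence)
    for pos in positions:
        # Ignore positions that fall outside the sequence.
        if 1 <= pos <= len(bases):
            bases[pos - 1] = mask_char
    return "".join(bases)


def mask_fasta(fasta_text, snps_by_chrom, mask_char="N"):
    """Mask SNP positions in every record of a FASTA-formatted string.

    `snps_by_chrom` maps a record name (first word of the header)
    to an iterable of 1-based SNP positions (hypothetical input format).
    """
    out = []
    for record in fasta_text.strip().split(">")[1:]:
        header, *seq_lines = record.splitlines()
        name = header.split()[0]
        seq = "".join(seq_lines)
        masked = mask_snps(seq, snps_by_chrom.get(name, ()), mask_char)
        out.append(">" + header + "\n" + masked)
    return "\n".join(out) + "\n"


# Toy example: mask positions 2 and 5 on Chr1, leave Chr2 untouched.
example = ">Chr1 toy\nACGTACGT\n>Chr2\nGGCC\n"
print(mask_fasta(example, {"Chr1": [2, 5]}))
```

Masking known SNPs in the reference is commonly done so that reads from either parental accession align equally well, which reduces mapping bias toward the reference allele.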

mapping.sh: minimal set of commands to filter and map reads from a fastq file, call SNPs, and output an out.vcf file with allele frequencies

snpFile.R: retrieve publicly available SNP data for the Cvi-0 and Ler-1 accessions of Arabidopsis thaliana, merge them, and output a reformatted SNP matrix (snpm.txt)

cleanCounts.R: merge the SNP matrix (snpm.txt) with the measured allele frequencies (out.vcf), filter the result, and output a counts.csv file with allele counts

pool.R: combine allele frequencies from two samples (obtained with cleanCounts.R) and calculate relative frequencies along chromosomes
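As a rough illustration of comparing two pools along a chromosome, the sketch below computes a per-SNP allele frequency in each pool, takes the difference at positions covered in both, and smooths it with a rolling mean. It is written in Python rather than R and does not reproduce pool.R: the function names, the `{position: (ref_count, alt_count)}` input layout, the use of a simple difference as the "relative frequency", and the window size are all assumptions chosen for this example.

```python
def allele_frequency(ref_count, alt_count):
    """Fraction of reads supporting the alternative allele, or None if uncovered."""
    total = ref_count + alt_count
    return alt_count / total if total else None


def frequency_difference(pool_a, pool_b):
    """Return sorted (position, freq_a - freq_b) pairs for SNPs covered in both pools.

    Each pool maps a position to (ref_count, alt_count) (hypothetical layout).
    """
    diffs = []
    for pos in sorted(set(pool_a) & set(pool_b)):
        freq_a = allele_frequency(*pool_a[pos])
        freq_b = allele_frequency(*pool_b[pos])
        if freq_a is not None and freq_b is not None:
            diffs.append((pos, freq_a - freq_b))
    return diffs


def rolling_mean(values, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Toy counts for one chromosome in two pools.
pool_a = {100: (10, 10), 200: (5, 15), 300: (15, 5), 400: (0, 0)}
pool_b = {100: (10, 10), 200: (15, 5), 300: (5, 15), 500: (3, 3)}

diffs = frequency_difference(pool_a, pool_b)
smoothed = rolling_mean([d for _, d in diffs], window=2)
print(diffs)
print(smoothed)
```

Smoothing over neighbouring SNPs reduces the sampling noise of individual positions, so that regions where the two pools differ consistently stand out along the chromosome.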

Requires:

FastQC 0.11.3; cutadapt 1.8.3; Samtools 1.2 (using htslib 1.2.1); Bowtie 2 2.2.9; R 3.3.1 with scales 0.4.0, ggplot2 2.1.0, and zoo 1.7-13; Python 3.4.0 with Biopython (Bio) 1.65


Example fastq datasets that can be used in this analysis are available in the ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-5196:

WT_pool_1 (1.56 GB)

mea_pool_1 (2 GB)
