BulkSeq

This set of scripts accompanies the chapter 'Identification of parent-of-origin-dependent QTLs using bulk-segregant sequencing (Bulk-Seq)' in the book Plant Chromatin Dynamics (Springer, 2018).

GenomeSNPmask.py: remove or replace the bases at known SNP positions in a genome sequence file (FASTA)
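To illustrate what masking known SNP positions means, here is a minimal sketch in plain Python. It is not the code in GenomeSNPmask.py: the function name `mask_snps`, the 1-based coordinates, and the use of `N` as the replacement character are all assumptions made for this example.

```python
def mask_snps(sequence, snp_positions, mask="N"):
    """Return a copy of `sequence` with the given positions replaced by `mask`.

    Assumption: positions are 1-based, as in VCF files. Positions that fall
    outside the sequence are ignored.
    """
    bases = list(sequence)
    for pos in snp_positions:
        if 1 <= pos <= len(bases):
            bases[pos - 1] = mask
    return "".join(bases)


# Hypothetical toy sequence with known SNPs at positions 2 and 5.
masked = mask_snps("ACGTACGT", [2, 5])
print(masked)  # ANGTNCGT
```

Masking the reference in this way is one common approach to reducing mapping bias towards the reference allele; whether the script removes or replaces positions by default is not stated here.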

mapping.sh: minimal set of commands to filter and map reads from a fastq file, call SNPs and output an out.vcf file with allele frequencies

snpFile.R: retrieve publicly available SNP data for the Cvi-0 and Ler-1 accessions of Arabidopsis thaliana, merge them and output a reformatted SNP matrix (snpm.txt)

cleanCounts.R: merge information from the SNP matrix (snpm.txt) with the measured allele frequencies (out.vcf), filter and output a counts.csv file with allele counts
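The merge-and-filter step can be pictured with the following sketch. It is written in Python for illustration only and does not reproduce cleanCounts.R: the data layout (dictionaries keyed by chromosome and position), the function name `parental_counts`, and the `min_depth` threshold are hypothetical, and the filters the real script applies may differ.

```python
def parental_counts(known_snps, observed, min_depth=10):
    """Combine known parental alleles with observed base counts.

    known_snps: {(chrom, pos): (allele_a, allele_b)} for the two parents.
    observed:   {(chrom, pos): {base: read_count}} from the mapped pool.
    Returns a list of (chrom, pos, count_a, count_b) rows.
    """
    rows = []
    for site, (allele_a, allele_b) in sorted(known_snps.items()):
        counts = observed.get(site)
        # Skip sites without coverage and sites where the parents do not differ.
        if counts is None or allele_a == allele_b:
            continue
        n_a = counts.get(allele_a, 0)
        n_b = counts.get(allele_b, 0)
        # Assumed filter: require a minimum number of informative reads.
        if n_a + n_b < min_depth:
            continue
        rows.append((site[0], site[1], n_a, n_b))
    return rows


# Hypothetical toy data.
known = {
    ("Chr1", 100): ("A", "G"),
    ("Chr1", 250): ("C", "T"),
    ("Chr1", 300): ("G", "G"),
}
observed = {
    ("Chr1", 100): {"A": 12, "G": 8},
    ("Chr1", 250): {"C": 2, "T": 3},
}
print(parental_counts(known, observed))  # [('Chr1', 100, 12, 8)]
```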

pool.R: combine allele frequencies from two samples (obtained with cleanCounts.R) and calculate relative frequencies along chromosomes
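As a rough picture of comparing two pools along a chromosome, the sketch below computes the per-site allele frequency in each pool, takes the difference at shared sites, and smooths it with a rolling mean. This is an assumption about the general idea, not a description of pool.R: the function names, the use of a simple difference as the "relative frequency", and the window size are all hypothetical.

```python
def allele_frequency(n_a, n_b):
    """Fraction of informative reads carrying allele A."""
    return n_a / (n_a + n_b)


def frequency_difference(pool1, pool2):
    """Per-site difference in allele A frequency at positions present in both pools.

    Each pool maps position -> (count_a, count_b) for one chromosome.
    """
    shared = sorted(set(pool1) & set(pool2))
    return [
        (pos, allele_frequency(*pool1[pos]) - allele_frequency(*pool2[pos]))
        for pos in shared
    ]


def rolling_mean(values, window):
    """Mean over each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical counts for two pools on one chromosome.
pool_a = {100: (10, 10), 200: (15, 5), 300: (5, 15)}
pool_b = {100: (10, 10), 200: (10, 10), 400: (1, 1)}

diffs = frequency_difference(pool_a, pool_b)
print(diffs)                                   # [(100, 0.0), (200, 0.25)]
print(rolling_mean([d for _, d in diffs], 2))  # [0.125]
```

Smoothing over neighbouring SNPs reduces the sampling noise of individual sites, so that regions where the two pools consistently differ stand out.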

Requires:

FastQC 0.11.3; cutadapt 1.8.3; Samtools 1.2 (using htslib 1.2.1); Bowtie 2 2.2.9; R 3.3.1 (with scales 0.4.0, ggplot2 2.1.0, zoo 1.7-13); Python 3.4.0 (with Biopython 1.65)

Example fastq datasets that can be used in this analysis are available in the ArrayExpress database (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-5196:
WT_pool_1 (1.56GB)

mea_pool_1 (2GB)
