IBDcalculation

In this repository you can find the pipeline to calculate identity by descent (IBD) from paper: Genomic and chemical evidence for local adaptation in resistance to different herbivores in Datura stramonium by De-la-Cruz et al. 2020

Paper https://onlinelibrary.wiley.com/doi/10.1111/evo.14097

ddRad-seq data to calculate IBD is available in GenBank under the Bioproject "comming soon"

Programs

Lep-MAP3
Samtools
BWA
awk

The first step after trimming bad sequences is align all the individuals to reference genomes. This can be achieved using BWA program

1.- Create a folder and stored all the sequences (PE) of each individual

2.- Is important to note that for hundreds of PE reads the name of the files should be like teo1_1.fq and teo1_2.fq. This script considers to eliminate the column of the PE direction. At the end you will have all the aligned sequences per each individual in a .bam file. Also the original name of the individuals files are mantained

for i in *.fq; do yolo=$(echo "$i" | rev | cut -c 6- | rev | uniq); bwa mem index_tic_geno ${yolo}_1.fq ${yolo}_2.fq -t 20 | samtools view -Sb -@ 20 - > ${yolo}.bam; done &> log.txt &

Then sort all the bam files

for f in *.bam; do samtools sort -@ 30 $f > ${f%\.*}_sorted.bam; done &

Calculating Identity by Descent values between all individuals and each parent.

The sequencing data processing pipeline for Lep-MAP3 is based on "SAMtools mpileup" and custom scripts. This scripts can be found in https://sourceforge.net/p/lep-map3/wiki/LM3%20Home/#ibd YOU CAN ALSO FIND DOCUMENTATION TO INSTALL Lep-MAP3 there

pileupParser2.awk and pileup2posterior.awk, provided in scripts.zip (also provided in this tutorial). The usage is

All the stored -bam files for each individual are de imput to obtain GENOTYPE POSTERIOR PROBABILITIES

samtools mpileup -q 10 -Q 10 -s $(cat sorted_bams)|awk -f pileupParser2.awk|awk -f pileup2posterior.awk|gzip >post.gz

This command requires two files, sorted_bams and mapping.txt, both containing exactly one line listing the file names for sorted bams and individual names, respectively and in the same order. If the data of each individual is in its own bam, then the files can be same (but it is more clear to remove the bam suffix from the individual names). Please note that this pipeline does not work with the old version of samtools (0.X). An example is provided in this tutorial.

For example (3 individuals in 4 bams):

sorted_bams: 1.bam 1a.bam 2.bam 3.bam

mapping.txt: 1 1 2 3

The next step is calculate the IBD values between all pair-wise individuals. Input from above script

zcat post_from_pipeline_above.gz|awk '{if (NR==1) {print;print;print;print;print}; print}'|java IBD data=- >ibd.txt

relatedness to a potential parent

sort -n -r -k 3,3  ibd.txt|grep -w possible_parent|less

imda@ecologia.unam.mx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IBDcalculation

Programs

The first step after trimming bad sequences is align all the individuals to reference genomes. This can be achieved using BWA program

Then sort all the bam files

Calculating Identity by Descent values between all individuals and each parent.

The sequencing data processing pipeline for Lep-MAP3 is based on "SAMtools mpileup" and custom scripts. This scripts can be found in https://sourceforge.net/p/lep-map3/wiki/LM3%20Home/#ibd YOU CAN ALSO FIND DOCUMENTATION TO INSTALL Lep-MAP3 there

The next step is calculate the IBD values between all pair-wise individuals. Input from above script

relatedness to a potential parent

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
README.md		README.md
mapping.txt		mapping.txt
pileup2posterior.awk		pileup2posterior.awk
pileupParser2.awk		pileupParser2.awk
sorted_bams		sorted_bams

icruz1989/IBDcalculation

Folders and files

Latest commit

History

Repository files navigation

IBDcalculation

Programs

The first step after trimming bad sequences is align all the individuals to reference genomes. This can be achieved using BWA program

Then sort all the bam files

Calculating Identity by Descent values between all individuals and each parent.

The sequencing data processing pipeline for Lep-MAP3 is based on "SAMtools mpileup" and custom scripts. This scripts can be found in https://sourceforge.net/p/lep-map3/wiki/LM3%20Home/#ibd YOU CAN ALSO FIND DOCUMENTATION TO INSTALL Lep-MAP3 there

The next step is calculate the IBD values between all pair-wise individuals. Input from above script

relatedness to a potential parent

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages