Population structure: ADMIXTURE/STRUCTURE and DAPC analysis. \
Diversity stats: diversity, Tajima’s D, and estimation of recombination events. \
SFS: Calculation of the unfolded SFS by polarizing a subset of SNPs with the P. reichenowi outgroup. \
The final report can be found in ./report/report.pdf.
VCF files downloaded from: ftp://ngs.sanger.ac.uk/production/malaria/Resource/34/Pf7_vcf/
Filtering for quality-control pass (?), biallelic SNPs…
bcftools view \
  --include 'FILTER="PASS" && N_ALT=1 && TYPE="snp"' \
  --samples-file "$samples_dir/Pf7_african_samples_ids.txt" \
  --output-type z \
  --output-file Pf3D7_02_v3.SNP.vcf.gz \
  Pf3D7_02_v3.pf7.vcf.gz
bcftools index -t Pf3D7_02_v3.SNP.vcf.gz
The average VQSLOD value for the 1846 African samples, filtered only by the criteria above, is 4.238 for chr2 and 4.215 for chr11.
| | chr2 | chr11 |
|---|---|---|
| SNPs | 83,810 | 231,511 |
| avg. VQSLOD | 4.238 | 4.215 |

| | chr2 | chr11 |
|---|---|---|
| SNPs | 16,269 | 44,040 |
| avg. VQSLOD | 7.481 | 7.427 |
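
The SNP counts and mean VQSLOD values reported above could be reproduced with a small helper like the one below. This is a minimal sketch, assuming VQSLOD is stored as an INFO tag and that one value per line is piped in from `bcftools query`. The function name `summarise_values` and the file name `input.vcf.gz` are placeholders, not part of the original pipeline.

```shell
# Summarise one numeric value per line (e.g. VQSLOD): prints "<count> <mean>".
# Non-numeric entries such as "." (missing value) are skipped.
summarise_values() {
    awk '$1 ~ /^-?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$/ { sum += $1; n++ }
         END { if (n > 0) printf "%d %.3f\n", n, sum / n; else print "0 NA" }'
}

# Intended use (assumption: bcftools is installed, input.vcf.gz is a placeholder):
#   bcftools query -f '%INFO/VQSLOD\n' input.vcf.gz | summarise_values

# Toy input so the helper can be checked without a VCF file:
printf '4.0\n5.0\n.\n6.0\n' | summarise_values
```

The count in the first field corresponds to the number of SNP records with a numeric value, so it only matches the SNP count in the tables if every record carries a VQSLOD annotation.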
bcftools filter -S . -e 'GT=="het"' chr2/Pf3D7_02_v3.afr_samples.qSNP.vcf.gz -o chr2/Pf3D7_02_v3.afr_samples.qSNP.GT_filtered.vcf.gz -Oz
bcftools query -f '%POS %REF %ALT %AF %AC %AN\n' Pf3D7_02_v3.afr_samples.qSNP.GT_filtered.vcf.gz > Pf7.02.vcf.qSNP.AF_AC_AN.txt
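
As a first look at the table written by the query above, the ALT allele counts can be tabulated into a simple count spectrum. This is a sketch under stated assumptions: the columns are `POS REF ALT AF AC AN` separated by whitespace, AC and AN reflect the filtered genotypes, and only sites with one fixed AN are compared. The function name `allele_count_spectrum` is hypothetical. The counts are relative to the reference allele, so this is not yet the unfolded SFS, which needs the outgroup polarization.

```shell
# allele_count_spectrum AN < table
# Counts how many sites carry each ALT allele count (column 5 = AC),
# restricted to sites with the requested allele number (column 6 = AN).
# Sites that are monomorphic in the sample (AC = 0 or AC = AN) are skipped.
allele_count_spectrum() {
    awk -v an="$1" '$6 == an && $5 > 0 && $5 < an { count[$5]++ }
        END { for (ac in count) print ac, count[ac] }' | sort -n
}

# Intended use (the AN value is a placeholder to be set per data set):
#   allele_count_spectrum "$AN" < Pf7.02.vcf.qSNP.AF_AC_AN.txt

# Toy input: three polymorphic sites with AN = 4, one site with AN = 2,
# and one site fixed for ALT.
printf '%s\n' \
    '100 A T 0.25 1 4' \
    '200 C G 0.50 2 4' \
    '300 G A 0.25 1 4' \
    '400 T C 0.50 1 2' \
    '500 A G 1.00 4 4' | allele_count_spectrum 4
```

Each output line is `AC number_of_sites`. Restricting to a single AN avoids mixing sites with different amounts of missing data; projecting down to a common sample size would be an alternative.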
https://wellcomeopenresearch.org/articles/7-136/v1
https://www.malariagen.net/apps/pf7/countries/KE
- 285 samples from the Year = 2010-2014 & QC.pass = “True” at the locations Kilifi and Kisumu.
https://www.malariagen.net/apps/pf7/countries/GM
- 452 samples from Upper River and the years 2013-2017
- 520 samples, 2012-2016, QC.pass = True
- 589 samples (2010-2014), QC.pass = True
- Myanmar, Kayin (631 samples, 2016-2017)
- Malawi, Chikwawa (231 samples, 2011)
- …
Vivax data set from similar locations?
MUMmer version 4 to align the outgroup PrCDC to the Pf3D7 reference. Use an R script to visualize alignment success: https://taylorreiter.github.io/2019-05-11-Visualizing-NUCmer-Output/
Nucleotide substitutions between the Pf7 reference and the PrCDC assembled sequence: 25353
Overlapping with the SNPs from the VCF file: 927
The first P. reichenowi reference genome, PrCDC, was assembled by mapping to the Pf7 reference genome (cite:&otto-2014-genom-sequen). The reference genome PrSY57 was constructed by mapping to PrCDC (cite:&sundararaman-2016-genom-crypt).
Check the divergence between PrSY57 and Pf7. If there is too much intra-specific polymorphism, we cannot determine the ancestral state of the sample polymorphisms. How to do this? Alignment or mapping?
How to set the ancestral allele information using the outgroup?
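
One possible way to approach the ancestral-allele question is to join the SNP table with a per-position table of outgroup bases and call the allele that matches the outgroup ancestral. The sketch below assumes a hypothetical two-column outgroup file (`POS BASE`, one line per position aligned between Pf3D7 and the outgroup, for a single chromosome) and the `POS REF ALT AF AC AN` table from `bcftools query`. The function name `polarise_snps` and both file layouts are illustrative assumptions, not the method used for the report.

```shell
# polarise_snps OUTGROUP_TABLE < SNP_TABLE
#   OUTGROUP_TABLE: "POS BASE" per aligned position (hypothetical format)
#   SNP_TABLE:      "POS REF ALT AF AC AN" as written by bcftools query
# Prints "POS ANCESTRAL DERIVED DERIVED_COUNT AN" for polarisable SNPs.
# SNPs at unaligned positions, or where the outgroup carries a third base,
# are dropped because their ancestral state cannot be inferred this way.
polarise_snps() {
    awk 'NR == FNR { out[$1] = toupper($2); next }              # first file: outgroup bases
         !($1 in out) { next }                                   # position not aligned: skip
         out[$1] == $2 { print $1, $2, $3, $5, $6; next }        # REF is ancestral
         out[$1] == $3 { print $1, $3, $2, $6 - $5, $6; next }   # ALT is ancestral
        ' "$1" -
}

# Toy example: REF ancestral at 100, ALT ancestral at 200,
# third base at 300, no outgroup alignment at 400.
outgroup=$(mktemp)
printf '100 A\n200 G\n300 T\n' > "$outgroup"
printf '%s\n' \
    '100 A T 0.25 1 4' \
    '200 C G 0.50 2 4' \
    '300 G A 0.25 1 4' \
    '400 T C 0.50 1 2' | polarise_snps "$outgroup"
rm -f "$outgroup"
```

With a single outgroup sequence, polymorphism within the outgroup species and recurrent mutation can both lead to mis-polarized sites, which is one reason to check the divergence between the outgroup assemblies first.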
https://grunwaldlab.github.io/Population_Genetics_in_R/DAPC.html
Use the DAPC tutorial to run DAPC with k-1 retained components to avoid overfitting the data.
- What do Plasmodium’s parasitic lifestyle traits contribute to its transmission demography (i.e. fluctuating population sizes or multiple-merger events) and adaptive processes (such as positive selection and/or coevolution)?
- Do we find signatures of multiple mergers or dormancy in P. falciparum populations?
- Maybe: What signatures do the Pfsa loci show, when accounting for their demography? What about compared to genes under positive selection?
Citations and PDF files can be found in the ~/biblio directory.
- Rich et al., 2000: Population structure and recent evolution of Plasmodium falciparum; cite:&rich-2000-popul-struc;
- Nderu et al., 2019: Genetic diversity and population structure of P. falciparum in Kenyan-Ugandan border areas; cite:&nderu-2019-genet-diver;
- Amambua-Ngwa et al., 2019: Major subpopulations of P. falciparum in sub-Saharan Africa; cite:&amambua-ngwa-2019-major-subpop;
- Meyer et al., 2002: Genetic diversity of P. falciparum: asexual stages; cite:&meyer-2002-review;
- Benavente, 2021: Genetic structure and selection patterns of Plasmodium vivax in South Asia and East Africa; cite:&benavente-2021-distin-genet;
- Nygaard et al., 2010: Long- and Short-term selective forces on Malaria Parasite Genomes; cite:&nygaard-2010-long-short;
- Sundararaman et al., 2016: Genomes of cryptic chimpanzee Plasmodium reveal key evolutionary events leading to human malaria; cite:&sundararaman-2016-genom-crypt;
- Otto et al., 2014: Genome sequencing of chimpanzee malaria parasite reveals possible pathways of adaptation to human hosts; cite:&otto-2014-genom-sequen;
- Pearson et al., 2016: Genomic analysis of local variation and recent evolution in Plasmodium vivax; cite:&nygaard-2010-long-short;
- Naung et al., 2022: Global diversity and balancing selection of 23 leading Plasmodium falciparum candidate vaccine antigens; cite:&naung-2022-global-diver;
- Noviyanti et al., 2015: Contrasting transmission dynamics of co-endemic Plasmodium; cite:&noviyanti-2015-contr-trans;
- Band, 2021: Malaria Protection due to Sickle Haemoglobin Depends on Parasite Genotype; cite:&band-2021-malar-protec;
- Raberg, 2023: Human and Pathogen Genotype-By-Genotype Interactions in the Light of Coevolution theory; cite:&raberg-2023-human-pathog;
- Brown and Tellier, 2011: Plant-parasite coevolution: Bridging the Gap between Genetics and Ecology; cite:&brown-2011-plant-paras-coevol;
- Tellier and Brown, 2021: Theory of Host-Parasite Coevolution: From Ecology to Genomics; cite:&tellier-2021-theor-host;
- Maerkle, 2021: Genomic approaches to study antagonistic coevolution in host and parasites; cite:&maerkle-2021-novel-genom;
https://www.malariagen.net/apps/pf7/
ftp://ftp.sanger.ac.uk/pub/project/pathogens/Plasmodium/reichenowi/
https://kevinkorfmann.github.io/workshop-kenya/session_1.html