Skip to content
Scripts for Wang et al. (2017) "A major locus controls local adaptation and adaptive life history variation in a perennial plant". https://www.biorxiv.org/content/early/2017/12/13/178921 https://zenodo.org/badge/100579273.svg
R Shell Perl AngelScript
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
1-Sequencing quality checking_read mapping_post_mapping filtering
2-SNP and genotype calling
3-relatedness_population structure_IBD
4-Screening SNPs with local adaptation
5-Genotype imputation
6-Positive selection
CITATION.md
LICENSE
README.md

README.md

PhotoperiodLocalAdaptation

Scripts for Wang et al. (2017) "A major locus controls local adaptation and adaptive life history variation in a perennial plant". https://www.biorxiv.org/content/early/2017/12/13/178921

Documentation of Scripts

1. Sequencing quality checking, read mapping and post-mapping filtering

    Picard_markduplicates_SwAsp.sh - Use Picard to correct for artifacts of PCR duplication

    QualityControl.sh - Use Trimmomatic v0.30 and FastQC to do sequence quality checking

    realign.sh - Use GATK to do read realignment around indels

    runBWA_mem_asp201.SwAsp.sh - Use BWA-MEM to do read mapping

2. SNP and genotype calling

    HaplotypeCaller_mem_tremula.sh - Use GATK to do SNP calling

    snpEff.gatk.sh - Use snpEff to annotate the SNPs

    vcf_to_maf.sh vcf2maf.pl - Use perl script to map each variant to only one of all possible gene isoforms

3. Relatedness, population structure and isolation-by-distance

    eigen_pop_genetics.sh - Use smartpca program in EIGENSOFT to perform PCA analysis

    fst_matrix.R - create the matrix of Fst estimates

    Isolation_by_distance.R - create the matrix of geographic distance
    
    mantel.R - Mantel test

    plink_prunedLD.sh - Use PLINK to generate Linkage-disequilibrium(LD)-trimmed SNP sets
    
    SwAsp.IBD.plot.R - plot Isolation-by-distance

4. Screening for SNPs associated with local adaptation

    env_PC.error_bar.R - Relationship between the environmental PC1 scores and the number of days with degree higher than 5
   
    GEMMA.lmm.noPC.budset.sh - Use GEMMA to do GWAS for budset

    Ekebo_Savar_budset.as - Use Asreml to estimate the genetic values of bud set
     
    LEA.K1.R LEA.K2.R LEA.K3.R - use a latent factor mixed-effect model (LFMM) implemented in the R package LEA to detect SNPs associated with first environmental PC, with the latent factors (K) from 1 to 3.
     
    LEA_zscore.94samples.R - Transform the z-scores from LFMM results to p-values


    PCAdapt.SwAsp94.R - PCAadapt test

    pcadapt_fst_cor.R - The relationship between PCadapt results and Fst values

    pca_corr_env.R - PCA analysis for the environment dat

    Figure2 - scripts for recreating Figure 2

        chr10.gemma.ld_Dprime.R - calculate LD and colorise points relative to tthe top SNP in PtFT2
        
        excute_manhanttan.sh - shell script for manhattan plots
        
        FT2.gemma.ld_Dprime.R - plot associations and colorise according to LD for the PtFT2 SNPs
        
        legend.col.R - plot legend
        
        manhanttan_plot.R - manhattan plots (modified from https://cran.r-project.org/web/packages/qqman/index.html)
        
        manhanttan.3methods.R - plot manhattan plots and qqplots and combine for three types of analyses (PCAdapt, LFMM, GWAS)
        
        scaffold_likely_wrong.txt - scaffolds that are likely missasembled and therefore exluded from the plots

5. Genotype imputation

    Beagle_comparison.sh - Create several missing genotypes for simulation

    createFile_imputate.R - Create imputation files with different level of missing values

    Create_simu_missing_file.sh - Tor simulate by BEAGLE and calculate accuracy

    define_ancestral.SwAsp.pseudo_chr.sh - Use BEAGLE to do genotype imputation and also define ancestal and derived allele based on the sequences of outgroup species of P.tremuloides and P.trichocarpa

    impute_accuracy_beagle_vcf.pl - Calculate the imputation accuracy by each sample and SNP

    impute_evaluate.R - Evaluate the imputation accuracy

6. Positive selection

    ABCinference.R - Script for performing Approximate Bayesian Computation (ABC) to jointly estimate s (the strength of selection on the beneficial mutation causing the sweep) and T (the time since the beneficial allele fixed). Script modified from original provided by Ormond et al (2016) at http://jjensenlab.org/wp-content/uploads/2016/02/ABC_inference.zip
    
    angsd_SFS_SwAsp.all.sh - Use ANGSD to estimate the genetic diversity in specific groups of populations

    caviar.chr10.sh caviar.z-score.R - Run CAVIAR on specific region of Chr10

    chr10_genome.sig.angsd_tP_tajD.group_plot.R - Compare the genetic diversity between chr10 region and genome-wide levels

    chr10_genome.sig.fst.group_plot.R - Compare Fst between chr10 region and genome-wide levels

    chr10_genome.sig.h12.group_plot.R - Compare H12 and H2/H1 between chr10 region and genome-wide levels

    chr10_genome.sig.sweepfinder2.group_plot.R - Compare CLR between chr10 region and genome-wide levels

    chr10_genome.sig.ihs_nsl.R - Compare iHS and nSL values between chr10 region and genome-wide levels

    ehh.plot.sh colormap.plotting.R - Create EHH plot

    genome_wide.summary.compare.sh - Use vcftools, selscan and H12 test to perform a set of selection tests across the genome

    ms2sf2.pl - Convert msms output to input format for SweepFinder2

    plink.ldheatmap.sh LDheatmap.R - Use PLINK to calculate LD across SNPs

    run_sweep_sim.sg - Simulate independent selective sweep events using the coalescent simulation program msms and analyse the results using SweepFinder2 to assess region affected by sweep

    select.chr10.summary.sh - Use vcftools, selscan and H12 test to perform a set of selection tests on the specific region (~700kbp) on Chr10

    sweepfinder2.FT2paper.sh - Run SweepFinder2 to detect selection signals
You can’t perform that action at this time.