The gwask package includes functions for performing post-GWAS analysis of k-mer GWAS.
Some Bioconductor packages will need to be installed for this package to work:
- Biostrings
- GenomeInfoDb
- GenomicRanges
- IRanges
- MatrixGenerics
- Rsamtools
- S4Vectors
- VariantAnnotation
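One possible way to install the Bioconductor dependencies listed above is through BiocManager, the standard installer for Bioconductor packages. The sketch below is not part of gwask; `install_missing_bioc` is a hypothetical helper name used here for illustration only, and it installs just the packages that are not yet present in the R library.

```r
# Bioconductor dependencies listed in this README
bioc_deps <- c("Biostrings", "GenomeInfoDb", "GenomicRanges", "IRanges",
               "MatrixGenerics", "Rsamtools", "S4Vectors", "VariantAnnotation")

# Hypothetical helper (not part of gwask): install only the missing packages
install_missing_bioc <- function(pkgs) {
  missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing_pkgs) > 0) {
    # BiocManager itself is distributed on CRAN
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(missing_pkgs)
  }
  # Return (invisibly) the names of the packages that had to be installed
  invisible(missing_pkgs)
}

# Usage, from an R session with internet access:
# install_missing_bioc(bioc_deps)
```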
GAPIT3 also needs to be installed.
WARNING: The GAPIT3 package has recently changed its name to GAPIT, breaking compatibility with gwask. We have not tested the newly named package yet. Most users of the package will not need GAPIT functionality; in this case, simply remove the corresponding line from the DESCRIPTION file before installing. If you do need GAPIT compatibility, the recommended route at the moment is to install a version of GAPIT where it was still called GAPIT3.
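If you choose to remove the GAPIT3 dependency, the edit can be made by hand in any text editor. As a minimal sketch, assuming that GAPIT3 sits on its own line in the DESCRIPTION file, the hypothetical helper below (`drop_gapit3` is not a gwask function) drops every line that mentions GAPIT3. It is worth inspecting the file afterwards to check that the remaining fields are still well formed.

```r
# Hypothetical helper (not part of gwask): remove the lines mentioning GAPIT3
# from a DESCRIPTION file, assuming the dependency is listed on its own line
drop_gapit3 <- function(description_path) {
  lines <- readLines(description_path)
  keep <- !grepl("GAPIT3", lines, fixed = TRUE)
  writeLines(lines[keep], description_path)
  # Return (invisibly) the number of lines that were removed
  invisible(sum(!keep))
}

# Usage, after cloning the repository into the working directory:
# drop_gapit3(file.path("gwask", "DESCRIPTION"))
```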
Other required packages should be pulled in automatically from CRAN.
Package sources can be downloaded from GitHub by running:
git clone https://github.com/malemay/gwask
Then, running the following command in R from the directory that contains the cloned gwask folder should install the package from source:
install.packages("gwask", repos = NULL, type = "source")
There is no vignette available for gwask at the moment; however, all functions included in the package are individually documented. A complete list of the available functions can be found here:
- adjust_gaps: Adjust plotting position according to alignment gaps
- cluster_haplotypes: Cluster the most similar haplotypes together
- cluster_ld: Greedy clustering of an LD matrix
- extract_signals: Extract signal ranges from GWAS results
- fill_gaps: Fill the deleted positions with dashes
- format_gapit_gwas: Format GAPIT GWAS results for generating Manhattan plots
- format_haplotypes: Format haplotypes for plotting using k-mer overlap information
- format_kmer_gwas: Format k-mer GWAS results for downstream analyses
- gapit_vcf: Launch an MLM GWAS analysis on a VCF file read with VariantAnnotation::readVcf using GAPIT
- get_haplotypes: Extract and filter the haplotypes from a set of sequences
- gg_color_hue: Select colors for a discrete scale as in ggplot2
- gg_hue: Generate a vector of colors comparable to those used by ggplot2
- grid.colorscale: Plot the color scale used in a haplotype plot
- grid.haplotypes: Plot a set of haplotypes using grid functions
- grid.phenotable: Plot a contingency table of observed phenotypes and haplotypes
- is_valid_ld: Check the validity of an LD matrix
- kmer_ld: Compute the pairwise LD between k-mers based on their presence/absence
- ld_plot: Plot an LD matrix using grid functions
- ld_sort: Re-arrange the samples in an LD matrix according to user-specified criteria
- link_phenotypes: Link haplotypes to their observed phenotypes
- mafft_align: Align a set of sequences using mafft
- manhattanGrob: A function that returns a graphical object (grob) representing a Manhattan plot
- map_color: Map a set of numeric values onto a color palette
- match_kmers: Match a set of k-mers to positions on sequences
- nucdiff: Find positions that differ between two sequences
- pvalueGrob: A grob representing p-values of a GWAS analysis at a locus to plot using grid functions
- pvalue_tx_grob: Arrange a transcript grob and several p-value grobs in the same plot
- read_fasta: Read a fasta file into a named character vector
- read_kmer_pvalues: Read the (sorted) k-mer p-values from the output of a k-mer GWAS analysis
- subsample_kmers: Subsample from a set of significant k-mers prior to computing LD
- transcriptGrob: Generate a grob representing a transcript using grid functions
- transcriptsGrob: Generate a grob of all the possible transcripts for genes in a genomic region using grid
- vcf_to_gapit: Convert VCF records read with VariantAnnotation::readVcf into a format usable by GAPIT
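Since each function is documented individually, the help pages can be browsed from within R once the package is installed. The snippet below is a small sketch that uses only base R tools; it lists the functions exported by gwask and opens the help page for kmer_ld as an example. It does nothing beyond printing an empty vector if gwask is not installed.

```r
# List the exported functions if gwask is installed, otherwise an empty vector
exported <- if (requireNamespace("gwask", quietly = TRUE)) {
  sort(getNamespaceExports("gwask"))
} else {
  character(0)
}
print(exported)

# Open the documentation of a single function, here kmer_ld
if (length(exported) > 0) {
  print(help("kmer_ld", package = "gwask"))
}
```

The same pages can also be reached interactively with help(package = "gwask") or ?kmer_ld after loading the package with library(gwask).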
If you use this software, please cite our publication:
Lemay, M.-A., de Ronne, M., Bélanger, R., & Belzile, F. (2023). k-mer-based GWAS enhances the discovery of causal variants and candidate genes in soybean. The Plant Genome, 16, e20374. doi:10.1002/tpg2.20374
Voichek, Y. and Weigel, D., 2020. Identifying genetic variants underlying phenotypic variation in plants without complete genomes. Nature Genetics, 52(5), pp.534-540. https://doi.org/10.1038/s41588-020-0612-7