This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit MEME website.
periodicDNA is available in Bioconductor. To install the current release use:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("periodicDNA")
For advanced users, the most recent periodicDNA can be installed from Github as follows (might be buggy):
install.packages("devtools")
devtools::install_github("js2264/periodicDNA")
library(periodicDNA)
If you are using periodicDNA in your research, please cite:
periodicDNA: an R/Bioconductor package to investigate k-mer periodicity in DNA
J. Serizay & J. Ahringer
F1000Research, 2021
Distinctive regulatory architectures of germline-active and somatic genes in C. elegans
J. Serizay, Y. Dong, J. Jänes, M. Chesney, C. Cerrato & J. Ahringer
Genome Research, 2020
periodicDNA includes a vignette where its usage is illustrated. To access the vignette, please use:
vignette('periodicDNA')
The two main user-level functions of periodicDNA are getPeriodicity()
and
getPeriodicityTrack()
.
getPeriodicity()
is used to compute the power spectral density (PSD) of a chosen k-mer (i.e.TT
) in a set of sequences. The PSD score at a given period indicates the strength of the k-mer at this period.getPeriodicityTrack()
can be used to generate linear tracks representing the periodicity strength of a given k-mer at a chosen period, over genomic loci of interest.
data(ce11_TSSs)
PSDs <- getPeriodicity(
ce11_TSSs[['Ubiq.']],
genome = 'BSgenome.Celegans.UCSC.ce11',
motif = 'TT',
BPPARAM = MulticoreParam(12),
n_shuffling = 100
)
plotPeriodicityResults(PSDs)
data(ce11_proms)
WW_10bp <- getPeriodicityTrack(
genome = 'BSgenome.Celegans.UCSC.ce11',
granges = ce11_proms,
motif = 'WW',
period = 10,
bw_file = 'WW-10-bp-periodicity_over-proms.bw',
BPPARAM = MulticoreParam(12)
)
Warning: It is recommended to run this command across many processors
using BiocParallel. This command typically takes one day to produce
a periodicity track over 15,000 GRanges of 150 bp (with default parameters)
using BPPARAM = MulticoreParam(12)
.
It is highly recommended to run this command in a new screen
session.
Code contributions, bug reports, fixes and feature requests are most welcome. Please make any pull requests against the master branch at https://github.com/js2264/periodicityDNA and file issues at https://github.com/js2264/periodicityDNA/issues
periodicDNA is licensed under the GPL-3 license.