Commit

Add new datasets namibia_data and regional_allele_frequencies. Update vignette with code to use the new datasets. (#27)

Add new datasets namibia_data and regional_allele_frequencies. Update vignette with code to use the new datasets.

- Added new datasets namibia_data and regional_allele_frequencies.
- Updated vignette with code to use the new datasets.
m-murphy committed Jul 10, 2024
1 parent 7b56ba7 commit af7ff3e
Showing 12 changed files with 98 additions and 21 deletions.
3 changes: 1 addition & 2 deletions .lintr
@@ -1,5 +1,4 @@
linters: linters_with_defaults(
line_length_linter(120),
-indentation_linter(2)
-)
+indentation_linter(2))
encoding: "UTF-8"
5 changes: 3 additions & 2 deletions DESCRIPTION
@@ -33,7 +33,7 @@ Imports:
stats,
purrr,
rlang,
-ggplot2,
+ggplot2
URL: https://github.com/EPPIcenter/moire, https://eppicenter.github.io/moire/, https://eppicenter.ucsf.edu/resources
BugReports: https://github.com/EPPIcenter/moire/issues
Roxygen: list(markdown = TRUE)
@@ -43,7 +43,8 @@ Suggests:
rmarkdown,
markdown,
forcats,
-testthat (>= 3.0.0)
+testthat (>= 3.0.0),
+parallelly
VignetteBuilder: knitr
Depends:
R (>= 4.0.0)
25 changes: 25 additions & 0 deletions R/data.R
@@ -7,3 +7,28 @@
#' MCMC results from using the packaged simulated data and calling `run_mcmc()`
#'
"mcmc_results"

+#' Genetic and epidemiological data from Namibia
+#'
+#' A dataset containing the genetic and epidemiological data from Namibia
+#'
+#' @format A data frame with 8 columns and 2585 rows:
+#' \describe{
+#' \item{sample_id}{Sample ID}
+#' \item{HealthFacility}{Health facility}
+#' \item{HealthDistrict}{Health district}
+#' \item{Region}{Region}
+#' \item{Country}{Country}
+#' \item{locus}{Locus}
+#' \item{allele}{Allele}
+#' }
+#' @source \url{https://doi.org/10.7554/eLife.43510.018}
+"namibia_data"
+
+#' Allele frequencies for different regions
+#'
+#' A list of allele frequencies for different regions, estimated from the pf7k dataset.
+#'
+#' @format A list of lists, where each list element is a list of allele frequencies
+#' for a specific region.
+"regional_allele_frequencies"
2 changes: 2 additions & 0 deletions _pkgdown.yml
@@ -39,5 +39,7 @@ reference:
- contents:
- simulated_data
- mcmc_results
+- namibia_data
+- regional_allele_frequencies


7 changes: 7 additions & 0 deletions data-raw/namibia_data.R
@@ -0,0 +1,7 @@
+namibia_data <- readxl::read_excel("data-raw/xls/elife-43510-supp1-v2.xlsx", skip = 1) |>
+dplyr::rename(sample_id = ID) |>
+tidyr::pivot_longer(cols = 6:31, names_to = "locus", values_to = "allele") |>
+tidyr::separate_rows(allele, sep = ";") |>
+tidyr::drop_na()
+
+usethis::use_data(namibia_data, overwrite = TRUE, compress = "xz")
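Reviewer note: the reshaping done in `data-raw/namibia_data.R` can be illustrated on a toy table. This is only a sketch under assumptions: the data frame `wide`, its column names, and its values are hypothetical and are not taken from the eLife supplement. It shows one row per sample being pivoted to one row per locus, multi-allele cells being split on `;`, and missing genotypes being dropped.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide-format input: one row per sample, one column per locus.
# Multiple alleles observed at a locus are separated by ";".
wide <- data.frame(
  ID = c("S1", "S2"),
  Region = c("A", "B"),
  L1 = c("100;102", "100"),
  L2 = c("200", NA)
)

long <- wide |>
  dplyr::rename(sample_id = ID) |>
  # Locus columns are selected by position (3:4 here; 6:31 in the commit).
  tidyr::pivot_longer(cols = 3:4, names_to = "locus", values_to = "allele") |>
  # Split "100;102" into two rows, one per allele.
  tidyr::separate_rows(allele, sep = ";") |>
  # Remove sample/locus pairs with no genotype call.
  tidyr::drop_na()

print(long)
```

Selecting locus columns by position is compact but depends on the column order of the spreadsheet, so a change in the supplement's layout would silently shift the selection.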
Binary file added data-raw/xls/elife-43510-supp1-v2.xlsx
Binary file not shown.
Binary file added data/namibia_data.rda
Binary file not shown.
Binary file added data/regional_allele_frequencies.rda
Binary file not shown.
28 changes: 28 additions & 0 deletions man/namibia_data.Rd

17 changes: 17 additions & 0 deletions man/regional_allele_frequencies.Rd

15 changes: 10 additions & 5 deletions src/Makevars
@@ -1,7 +1,12 @@
+ifdef ENABLE_GLIBCXX_DEBUG
+PKG_CXXFLAGS+=-D_GLIBCXX_DEBUG
+endif
+
ifdef ENABLE_PROFILER
-PKG_LIBS +=-lprofiler
-PKG_CXXFLAGS +=-DENABLE_PROFILER
+PKG_LIBS+=-lprofiler
+PKG_CXXFLAGS+=-DENABLE_PROFILER
endif
-PKG_CXXFLAGS +=$(SHLIB_OPENMP_CXXFLAGS)
-PKG_LIBS +=$(SHLIB_OPENMP_CXXFLAGS)
-PKG_LIBS +=$(shell ${R_HOME}/bin/Rscript -e "RcppParallel::RcppParallelLibs()")
+
+PKG_CXXFLAGS+=$(SHLIB_OPENMP_CXXFLAGS)
+PKG_LIBS+=$(SHLIB_OPENMP_CXXFLAGS)
+PKG_LIBS+=$(shell ${R_HOME}/bin/Rscript -e "RcppParallel::RcppParallelLibs()")
17 changes: 5 additions & 12 deletions vignettes/namibia.Rmd
@@ -5,6 +5,7 @@ vignette: >
%\VignetteIndexEntry{Application to Namibia}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
+%\VignetteDepends{parallelly, dplyr}
---

```{r setup, include=FALSE}
@@ -19,18 +20,10 @@ knitr::opts_chunk$set(
The following code reproduces our analysis of data from Namibia as described in our paper.

```{r namibia, eval=FALSE, include=TRUE}
library(moire)
library(readr)
library(dplyr)
-# Download Nambia data from here https://doi.org/10.7554/eLife.43510.018
-# and save it as namibia_data.xlsx in the working directory
-full_dat <- readxl::read_excel("namibia_data.xlsx", skip = 1) |>
-dplyr::rename(sample_id = ID) |>
-tidyr::pivot_longer(cols = 6:31, names_to = "locus", values_to = "allele") |>
-tidyr::separate_rows(allele, sep = ";") |>
-tidyr::drop_na()
+full_dat <- moire::namibia_data
epi_dat <- full_dat |>
dplyr::select(sample_id, HealthFacility, HealthDistrict, Region, Country) |>
@@ -42,15 +35,15 @@ all_hfs <- epi_dat |>
verbose <- F
allow_relatedness <- TRUE
-burnin <- 5e3
-num_samples <- 1e4
+burnin <- 5e2
+num_samples <- 1e2
r_alpha <- 1
r_beta <- 1
eps_pos_alpha <- 1
eps_pos_beta <- 1
eps_neg_alpha <- 1
eps_neg_beta <- 1
-num_threads <- parallely::availableCores() - 1
+num_threads <- parallelly::availableCores() - 1
for (hf in all_hfs) {
hf_dat <- full_dat |>
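Reviewer note: subtracting one from the detected core count gives zero threads on a single-core machine. The sketch below shows one way to guard against that. `safe_num_threads` is a hypothetical helper, not part of moire or parallelly, and it uses base R's `parallel::detectCores()` only so that the example runs without extra packages; the vignette's `parallelly::availableCores()` could be passed in instead.

```r
# Hypothetical helper (not part of moire): reserve some cores for the rest of
# the system, but never return fewer than one thread.
safe_num_threads <- function(available, reserve = 1L) {
  # detectCores() may return NA when the core count cannot be determined.
  if (is.na(available)) {
    return(1L)
  }
  max(1L, as.integer(available) - as.integer(reserve))
}

num_threads <- safe_num_threads(parallel::detectCores())
print(num_threads)
```

Passing the core count in as an argument keeps the clamping logic independent of how the count was obtained.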
