Fixed some warnings, incremented to 0.99.3
SiminaB committed Dec 11, 2016
1 parent 0077a2f commit 38a712d
Showing 12 changed files with 60 additions and 39 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: swfdr
Title: Science-wise false discovery rate estimation
Version: 0.99.2
Version: 0.99.3
Author: Jeffrey T. Leek, Simina M. Boca
Maintainer: Simina M. Boca <smb310@georgetown.edu>, Jeffrey T. Leek <jtleek@gmail.com>
Description: This package allows users to estimate the science-wise false discovery rate from Jager and Leek,
@@ -11,7 +11,7 @@ Description: This package allows users to estimate the science-wise false discov
Depends:
R (>= 3.4)
Imports:
stats4, ggplot2, reshape2
stats4, ggplot2, reshape2, stats, dplyr
License: GPL (>= 3)
Encoding: UTF-8
LazyData: true
1 change: 1 addition & 0 deletions NAMESPACE
@@ -2,6 +2,7 @@

export(calculateSwfdr)
export(lm_pi0)
import(dplyr)
import(ggplot2)
import(reshape2)
import(stats4)
6 changes: 4 additions & 2 deletions R/BMI_GIANT_GWAS_sample-data.R
@@ -5,7 +5,9 @@
#' @docType data
#'
#' @usage data(BMI_GIANT_GWAS_sample)
#'
#'
#' @return Object of class tbl_df, tbl, data.frame.
#'
#' @format A data frame with 50,000 rows and 9 variables:
#' \describe{
#' \item{SNP}{ID for SNP (single nucleotide polymorphism)}
@@ -18,7 +20,7 @@
#' \item{N}{Total sample size considered for association of SNP and BMI}
#' \item{Freq_MAF_Int_Hapmap}{Three approximately equal intervals for the Hapmap MAFs}
#' }
#'
#'
#' @keywords datasets
#'
#' @source \url{https://www.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files#GWAS_Anthropometric_2015_BMI}
1 change: 1 addition & 0 deletions R/calculateSwfdr.R
@@ -16,6 +16,7 @@
#' @return n Number of rounded p-values between certain cutpoints (0.005, 0.015, 0.025, 0.035, 0.045, 0.05)
#'
#' @import stats4
#' @import dplyr
#' @importFrom stats dbeta lsfit pbeta smooth.spline
#'
#' @examples
6 changes: 4 additions & 2 deletions R/journals_pVals-data.R
@@ -4,9 +4,11 @@
#'
#' @docType data
#'
#' @usage data(journals_pVals)
#' @usage journals_pVals
#'
#' @format A data frame with 15,653 rows and 7 variables:
#' @return Object of class tbl_df, tbl, data.frame.
#'
#' @format A tbl data frame with 15,653 rows and 7 variables:
#' \describe{
#' \item{pvalue}{P-value}
#' \item{pvalueTruncated}{Equal to 1 if the p-value is truncated, 0 otherwise}
Binary file modified data/journals_pVals.RData
Binary file not shown.
21 changes: 12 additions & 9 deletions inst/doc/swfdrTutorial.R
@@ -2,32 +2,35 @@
library(swfdr)

## ------------------------------------------------------------------------
data(journals_pVals)
colnames(journals_pVals)

## ------------------------------------------------------------------------
table(journals_pVals$year)
table(journals_pVals$journal)

## ------------------------------------------------------------------------
journals_pVals1 <- journals_pVals[journals_pVals$year==2005 &
journals_pVals$journal == "American Journal of Epidemiology" &
journals_pVals$pvalue < 0.05,]
journals_pVals1 <- dplyr::filter(journals_pVals,
year == 2005,
journal == "American Journal of Epidemiology",
pvalue < 0.05)

dim(journals_pVals1)

## ------------------------------------------------------------------------
tt <- journals_pVals1[,2]
tt <- data.frame(journals_pVals1)[,2]
rr <- rep(0,length(tt))
rr[tt == 0] <- (journals_pVals1[tt==0,1] == round(journals_pVals1[tt==0,1],2))
pVals <- journals_pVals1[,1]
resSwfdr <- calculateSwfdr(pValues = pVals, truncated = tt, rounded = rr, numEmIterations=100)
rr[tt == 0] <- (data.frame(journals_pVals1)[tt==0,1] ==
round(data.frame(journals_pVals1)[tt==0,1],2))
pVals <- data.frame(journals_pVals1)[,1]
resSwfdr <- calculateSwfdr(pValues = pVals,
truncated = tt,
rounded = rr, numEmIterations=100)
names(resSwfdr)

## ------------------------------------------------------------------------
resSwfdr

## ------------------------------------------------------------------------
data(BMI_GIANT_GWAS_sample)
head(BMI_GIANT_GWAS_sample)
dim(BMI_GIANT_GWAS_sample)

25 changes: 14 additions & 11 deletions inst/doc/swfdrTutorial.Rmd
@@ -26,9 +26,8 @@ The science-wise false discovery rate (swfdr) is defined in @JagerEtAl2013 as th
### Example: Estimate the swfdr based on p-values from biomedical journals

We include a dataset containing 15,653 p-values from articles in 5 biomedical journals (American Journal of Epidemiology, BMJ, Jama, Lancet, New England Journal of Medicine), over 11 years (2000-2010).
This is obtained from web-scraping, using the code at \url{https://github.com/jtleek/swfdr/blob/master/getPvalues.R} and can be loaded via:
This is obtained from web-scraping, using the code at \url{https://github.com/jtleek/swfdr/blob/master/getPvalues.R} and is already loaded in the package.
```{r}
data(journals_pVals)
colnames(journals_pVals)
```

@@ -52,19 +51,24 @@ This function estimates the swfdr. It inputs the following parameters:

Given that it runs an EM algorithm, it is somewhat computationally intensive. We show an example of applying it to all the p-values from the abstracts for articles published in the American Journal of Epidemiology in 2005. First, we subset `journals_pVals` and only consider the p-values below $0.05$, as in @JagerEtAl2013:
```{r}
journals_pVals1 <- journals_pVals[journals_pVals$year==2005 &
journals_pVals$journal == "American Journal of Epidemiology" &
journals_pVals$pvalue < 0.05,]
journals_pVals1 <- dplyr::filter(journals_pVals,
year == 2005,
journal == "American Journal of Epidemiology",
pvalue < 0.05)
dim(journals_pVals1)
```

Next, we define vectors corresponding to the truncation status and the rounding status (defined as rounding to 2 decimal places) and use these vectors, along with the vector of p-values and the number of EM iterations, as inputs to the `calculateSwfdr` function:
```{r}
tt <- journals_pVals1[,2]
tt <- data.frame(journals_pVals1)[,2]
rr <- rep(0,length(tt))
rr[tt == 0] <- (journals_pVals1[tt==0,1] == round(journals_pVals1[tt==0,1],2))
pVals <- journals_pVals1[,1]
resSwfdr <- calculateSwfdr(pValues = pVals, truncated = tt, rounded = rr, numEmIterations=100)
rr[tt == 0] <- (data.frame(journals_pVals1)[tt==0,1] ==
round(data.frame(journals_pVals1)[tt==0,1],2))
pVals <- data.frame(journals_pVals1)[,1]
resSwfdr <- calculateSwfdr(pValues = pVals,
truncated = tt,
rounded = rr, numEmIterations=100)
names(resSwfdr)
```

@@ -97,9 +101,8 @@ prior probability that a hypothesis is true or false.

We consider an example from the meta-analysis of data from a genome-wide association study (GWAS) for
body mass index (BMI) from @LockeEtAl2015. A subset of this data, corresponding to 50,000
single nucleotide polymorphisms (SNPs), can be loaded using:
single nucleotide polymorphisms (SNPs), is already loaded with the package.
```{r}
data(BMI_GIANT_GWAS_sample)
head(BMI_GIANT_GWAS_sample)
dim(BMI_GIANT_GWAS_sample)
```
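
The tutorial diff above replaces base-R bracket subsetting with `dplyr::filter`. The two forms agree when every condition is `TRUE` or `FALSE`, but they differ when a condition is `NA`. The sketch below shows this on a toy table; the values are invented, and whether the real `journals_pVals` contains missing p-values is not stated in the commit.

```r
# Toy stand-in for journals_pVals; all values are invented for illustration.
toy <- data.frame(
  pvalue  = c(0.01, 0.2, 0.03, NA),
  year    = c(2005, 2005, 2006, 2005),
  journal = c("American Journal of Epidemiology",
              "American Journal of Epidemiology",
              "BMJ",
              "American Journal of Epidemiology"),
  stringsAsFactors = FALSE
)

keep <- toy$year == 2005 &
  toy$journal == "American Journal of Epidemiology" &
  toy$pvalue < 0.05

# Logical bracket subsetting: the NA condition in row 4 yields an all-NA row.
base_sub <- toy[keep, ]

# which() discards NA conditions, so only the genuine match survives.
base_safe <- toy[which(keep), ]

if (requireNamespace("dplyr", quietly = TRUE)) {
  # filter() also drops rows whose condition evaluates to NA.
  dplyr_sub <- dplyr::filter(toy,
                             year == 2005,
                             journal == "American Journal of Epidemiology",
                             pvalue < 0.05)
  stopifnot(nrow(dplyr_sub) == nrow(base_safe))
}
```

Here `base_sub` has two rows (one of them all `NA`), while `base_safe` and the `dplyr::filter` result each have one.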
Binary file modified inst/doc/swfdrTutorial.pdf
Binary file not shown.
3 changes: 3 additions & 0 deletions man/BMI_GIANT_GWAS_sample.Rd

Some generated files are not rendered by default.

7 changes: 5 additions & 2 deletions man/journals_pVals.Rd

Some generated files are not rendered by default.

25 changes: 14 additions & 11 deletions vignettes/swfdrTutorial.Rmd
@@ -26,9 +26,8 @@ The science-wise false discovery rate (swfdr) is defined in @JagerEtAl2013 as th
### Example: Estimate the swfdr based on p-values from biomedical journals

We include a dataset containing 15,653 p-values from articles in 5 biomedical journals (American Journal of Epidemiology, BMJ, Jama, Lancet, New England Journal of Medicine), over 11 years (2000-2010).
This is obtained from web-scraping, using the code at \url{https://github.com/jtleek/swfdr/blob/master/getPvalues.R} and can be loaded via:
This is obtained from web-scraping, using the code at \url{https://github.com/jtleek/swfdr/blob/master/getPvalues.R} and is already loaded in the package.
```{r}
data(journals_pVals)
colnames(journals_pVals)
```

@@ -52,19 +51,24 @@ This function estimates the swfdr. It inputs the following parameters:

Given that it runs an EM algorithm, it is somewhat computationally intensive. We show an example of applying it to all the p-values from the abstracts for articles published in the American Journal of Epidemiology in 2005. First, we subset `journals_pVals` and only consider the p-values below $0.05$, as in @JagerEtAl2013:
```{r}
journals_pVals1 <- journals_pVals[journals_pVals$year==2005 &
journals_pVals$journal == "American Journal of Epidemiology" &
journals_pVals$pvalue < 0.05,]
journals_pVals1 <- dplyr::filter(journals_pVals,
year == 2005,
journal == "American Journal of Epidemiology",
pvalue < 0.05)
dim(journals_pVals1)
```

Next, we define vectors corresponding to the truncation status and the rounding status (defined as rounding to 2 decimal places) and use these vectors, along with the vector of p-values and the number of EM iterations, as inputs to the `calculateSwfdr` function:
```{r}
tt <- journals_pVals1[,2]
tt <- data.frame(journals_pVals1)[,2]
rr <- rep(0,length(tt))
rr[tt == 0] <- (journals_pVals1[tt==0,1] == round(journals_pVals1[tt==0,1],2))
pVals <- journals_pVals1[,1]
resSwfdr <- calculateSwfdr(pValues = pVals, truncated = tt, rounded = rr, numEmIterations=100)
rr[tt == 0] <- (data.frame(journals_pVals1)[tt==0,1] ==
round(data.frame(journals_pVals1)[tt==0,1],2))
pVals <- data.frame(journals_pVals1)[,1]
resSwfdr <- calculateSwfdr(pValues = pVals,
truncated = tt,
rounded = rr, numEmIterations=100)
names(resSwfdr)
```

@@ -97,9 +101,8 @@ prior probability that a hypothesis is true or false.

We consider an example from the meta-analysis of data from a genome-wide association study (GWAS) for
body mass index (BMI) from @LockeEtAl2015. A subset of this data, corresponding to 50,000
single nucleotide polymorphisms (SNPs), can be loaded using:
single nucleotide polymorphisms (SNPs), is already loaded with the package.
```{r}
data(BMI_GIANT_GWAS_sample)
head(BMI_GIANT_GWAS_sample)
dim(BMI_GIANT_GWAS_sample)
```
