dearseq #1307

borishejblum · 2019-11-25T18:42:33Z

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Repository: https://github.com/borishejblum/dearseq

Confirm the following by editing each check box to '[x]'

I understand that by submitting my package to Bioconductor,
the package source and all review commentary are visible to the
general public.
I have read the Bioconductor Package Submission
instructions. My package is consistent with the Bioconductor
Package Guidelines.
I understand that a minimum requirement for package acceptance
is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
Passing these checks does not result in automatic acceptance. The
package will then undergo a formal review and recommendations for
acceptance regarding other Bioconductor standards will be addressed.
My package addresses statistical or bioinformatic issues related
to the analysis and comprehension of high throughput genomic data.
I am committed to the long-term maintenance of my package. This
includes monitoring the support site for issues that users may
have, subscribing to the bioc-devel mailing list to stay aware
of developments in the Bioconductor community, responding promptly
to requests for updates from the Core team in response to changes in
R or underlying software.

I am familiar with the essential aspects of Bioconductor software
management, including:

The 'devel' branch for new packages and features.
The stable 'release' branch, made available every six
months, for bug fixes.
Bioconductor version control using Git
(optionally via GitHub).

For help with submitting your package, please subscribe and post questions
to the bioc-devel mailing list.

bioc-issue-bot · 2019-11-25T18:42:34Z

Hi @borishejblum

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: dearseq
Type: Package
Title: Differential Expression Analysis for RNA-seq data
Version: 0.99.0
Date: 2019-11-21
Authors@R: c(person("Denis", "Agniel", role = c("aut"),
          email = "denis.agniel@gmail.com"),
	           person("Boris P.", "Hejblum", role = c("aut", "cre"),
	                  email = "boris.hejblum@u-bordeaux.fr"),
	           person("Marine", "Gauthier", role = c("aut"),
	                  email = "marine.gauthier@u-bordeaux.fr"))
Depends: R (>= 3.6.0)
Imports: CompQuadForm, ggplot2, GSA, KernSmooth, methods, parallel, pbapply, stats, statmod
Suggests: Biobase, BiocManager, edgeR, DESeq2, GEOquery, knitr, limma, readxl, rmarkdown, 
S4Vectors, SummarizedExperiment, testthat, covr
Description: Differential Expression Analysis RNA-seq data with variance 
   component score test accounting for data heteroscedasticity through
   precision weights. Perform both gene-wise and gene set analyses, 
   and can deal with repeated or longitudinal data. Methods are 
   detailed in: Agniel D & Hejblum BP (2017) Variance component score 
   test for time-course gene set analysis of longitudinal RNA-seq 
   data, Biostatistics, 18(4):589-604.
LazyData: true
License: GPL-2 | file LICENSE
biocViews: BiomedicalInformatics, CellBiology, DifferentialExpression, DNASeq,
        GeneExpression, Genetics, GeneSetEnrichment, ImmunoOncology, Regression, 
        RNASeq, Sequencing, SystemsBiology, TimeCourse, Transcription, 
        Transcriptomics
BugReports: https://github.com/borishejblum/dearseq/issues
Encoding: UTF-8
RoxygenNote: 7.0.0
VignetteBuilder: knitr

Add SSH keys to your GitHub account. SSH keys
will are used to control access to accepted Bioconductor
packages. See these instructions to add SSH keys to
your GitHub account.

bioc-issue-bot · 2019-11-25T19:59:56Z

A reviewer has been assigned to your package. Learn what to expect
during the review process.

IMPORTANT: Please read the instructions for setting
up a push hook on your repository, or further changes to your
repository will NOT trigger a new build.

bioc-issue-bot · 2019-11-25T20:04:12Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

mtmorgan · 2019-11-25T20:04:51Z

As preliminary comment, please clean the root level of your repository, during the Bioconductor submission process, to contain only files and folders required of a Bioconductor package. For instance, you could copy the current branch to 'full' and then remove files etc from master to bring the repository into conformance. When your package is accepted it will be cloned into the bioconductor git repository, and we do not want to include material not relevant to the nightly build of your package.

Please ensure that your package interoperates smoothly with standard Bioconductor representations. For instance, dear_seq() must accept a SummarizedExperiment, and this is the preferred way to demonstrate it's functionality in your vignette, e.g.,

se <- SummarizedExperiment(assay = London, colData = ...)
## further data manipulation, e.g., coordinated subsetting `se[, ...]`
dear_seq(se, ...)

borishejblum · 2019-11-25T23:34:40Z

Thank you very much for this preliminary feedback. We have created such a 'full' branch and removed the unecessary files from the 'master' branch. In addition, we have added support for SummarizedExperiment objects, and now demonstrate the package functionnality in the vignette using this input class.

bioc-issue-bot · 2019-11-26T12:06:56Z

Received a valid push; starting a build. Commits are:

60ad1b4 bumping version for Bioconductor check

bioc-issue-bot · 2019-11-26T12:12:28Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot · 2019-11-26T13:10:18Z

Received a valid push; starting a build. Commits are:

3e8dba5 attempt at fixing warning on BiocCheck from doc th...

bioc-issue-bot · 2019-11-26T13:14:45Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot · 2019-11-26T13:21:42Z

Received a valid push; starting a build. Commits are:

09d34aa 2nd attempt at fixing warning on BiocCheck from do...

bioc-issue-bot · 2019-11-26T13:30:46Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

bioc-issue-bot · 2019-11-26T13:55:27Z

Received a valid push; starting a build. Commits are:

520e7fa forgot to BUMP version

bioc-issue-bot · 2019-11-26T14:04:51Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

mtmorgan · 2019-12-09T03:38:10Z

9 December, 2019

Specific comments should be addressed in all locations where they apply.

DESCRIPTION, NAMESPACE

Provide a title that differentiates your package from others

vignettes

dearseqUserguide.Rmd:72 vignettes should not install packages (i.e., this code chunk should be marked eval = FALSE); the build system installs packages listed in the Suggests: field.
dearseqUserguide.RMd:103 very long lines in vignette code chunks makes them difficult to read. Revise as
```
group[which(infos_df$Status[-group_London_LTBI]=="ActiveTB")] <-
    rep(1,length(which(infos_df$Status[-group_London_LTBI]=="ActiveTB")))
```
or (much better!!!)
```
idx <- infos_df$Status[-group_London_LTBI] == "ActiveTB"
group[idx] <- 1
```
please pay attention to readaiblity throughout your code, avoiding the long lines involving multiple computational transformations and repetitive code illustrated in this short code chunk. Please heavily revise your vignette and code, along the lines indicated above, to improve readaiblity, robustness, and efficiency.

dearseqUserguide.RMd:118: London is not included in the package

> library(dearseq)
> data("London")
Warning message:
In data("London") : data set ‘London’ not found

dearseqUserguide.Rmd: 153 rather than having to insert a comment to 'Remember the Latent TB samples are excluded' consider making a single SummarizedExperiment with all the data, and then subsetting se[, !se$Status %in% "LTBI"] or similar.
dearseqUserguide.Rmd:191 Consider using a standard gene set formulation like BiocSet to represent these data. I don't understand why this code chunk is marked 'eval = FALSE', the problem with unevaluated code chunks is that the chunk becomes only a 'picture of code', is inevitably not maintained when, e.g., arguments names are changed, and then presents the user with a misleading and frustrating understanding of how the code is supposed to work.
```
baduel =
    do.call(BiocSet, baduel_gmt$genesets) %>%
    mutate_set(description = unlist(baduel_gmt$geneset.description))
```

R / man

generally seems to be ok and well-documented
PBT_gmt format is reported as a gmt object but it is really just a list with three, potentially redundant, elements (e.g., the names of baduel_gmt gene sets are provided twice, causing confusion and potential for error). Describe the content of these lists more carefully or use a formal object (e.g., from BiocSet) to reduce ambiguity.
dgsa_seq.R:335 adding !!! to warnings just leads to a kind of arms race; if it's a warning, let the user know. If it's more important than a warning, make it an error.
dgsa_seq.R:50 consider more efficient rowVars() etc., including from the matrixStats package
dgsa_seq.R:366 it is almost never necessary to explicilty invoke rm(); if memory management is really important then a better strategy is to write modular code that, more-or-less as a convenient conseuqnce, isolates memory allocations to the body of the function, and can be garbage collected automatically
dgsa_seq.R:417 options() belong to users and not to package authors. This code
```
options(warn = -1)
N_possible_perms <- factorial(ncol(y_lcpm))
options(warn = 0)
```
changes the user option warn to 0, regardless of it's initial value. A better style is
```
oopt <- options(warn = -1)
N_possible_perms <- factorial(ncol(y_lcpm))
options(oopt)
```
as seen with
```
> options(warn = 2)
> oopt = options(warn = -1)
> options(oopt)
> getOption("warn")
[1] 2
```
but a first question is why one has to change this option instead of using suppressWarnings() and perhaps more importantly can the data be conditioned to avoid warning entirely? For instance, in the above, I'm not understanding why a warning is expected at all?
dgsa_seq.R:546 preferred style is to use tryCatch() rather than try() and if (inherits...)
sp_weights.R:112 usually warning(paste(...)) (similarly, stop(), message()) can be replaced with simply warning(...).
adding doPlot conflates the computational and the visualization aspects; it would be better to have a separate, stand-alone, function for plotting.

other

the tests/testthat/test_dear_seq.R serves no purpose and should be removed
for the purposes of review, please make sure that the master branch contains only material related to the package, e.g., the GSE107991 is not part of the package.

bioc-issue-bot · 2019-12-11T16:11:16Z

Received a valid push; starting a build. Commits are:

8b17258 update following bioconductor review

bioc-issue-bot · 2019-12-11T16:16:21Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot · 2019-12-11T18:20:22Z

Received a valid push; starting a build. Commits are:

d85f2e6 version bump

bioc-issue-bot · 2019-12-11T18:25:03Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot · 2019-12-11T19:22:48Z

Received a valid push; starting a build. Commits are:

57a5509 proper link to BiocSet + appveyor bioc update

bioc-issue-bot · 2019-12-11T19:28:58Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

borishejblum · 2019-12-11T19:33:16Z

11 December, 2019

First of all, thank you so much @mtmorgan for having spent time in reviewing our package and for your feedback !

Below are my specific responses to your comments.

9 December, 2019

Specific comments should be addressed in all locations where they apply.

DESCRIPTION, NAMESPACE

Provide a title that differentiates your package from others

The title has been changed to:
"Differential Expression Analysis for RNA-seq data through a robust variance component test"

vignettes

dearseqUserguide.Rmd:72 vignettes should not install packages (i.e., this code chunk should be marked eval = FALSE); the build system installs packages listed in the Suggests: field.

This chunk has been removed from the vignette.

dearseqUserguide.RMd:103 very long lines in vignette code chunks makes them difficult to read. Revise as
group[which(infos_df$Status[-group_London_LTBI]=="ActiveTB")] <-
    rep(1,length(which(infos_df$Status[-group_London_LTBI]=="ActiveTB")))
or (much better!!!)
idx <- infos_df$Status[-group_London_LTBI] == "ActiveTB"
group[idx] <- 1
please pay attention to readability throughout your code, avoiding the long lines involving multiple computational transformations and repetitive code illustrated in this short code chunk. Please heavily revise your vignette and code, along the lines indicated above, to improve readability, robustness, and efficiency.

The code has been revised to avoid such long lines. In this particular example, having a single Summarized Experiment (as suggested below) makes subsetting much clearer.

dearseqUserguide.RMd:118: London is not included in the package
> library(dearseq)
> data("London")
Warning message:
In data("London") : data set ‘London’ not found

This chunk has been fixed (indeed London data set is directly downloaded from GEO, and not included in the package).

dearseqUserguide.Rmd: 153 rather than having to insert a comment to 'Remember the Latent TB samples are excluded' consider making a single SummarizedExperiment with all the data, and then subsetting se[, !se$Status %in% "LTBI"] or similar.

A single Summarized Experiment with all the data has been created, and is subsetted as advised.

dearseqUserguide.Rmd:191 Consider using a standard gene set formulation like BiocSet to represent these data. I don't understand why this code chunk is marked 'eval = FALSE', the problem with unevaluated code chunks is that the chunk becomes only a 'picture of code', is inevitably not maintained when, e.g., arguments names are changed, and then presents the user with a misleading and frustrating understanding of how the code is supposed to work.
baduel =
    do.call(BiocSet, baduel_gmt$genesets) %>%
    mutate_set(description = unlist(baduel_gmt$geneset.description))

You are right, and this code chunk is now marked eval = TRUE.
Regarding the gene set object format, the gmt format seem to be quite well established as it is used by the Molecular Signature Database (http://software.broadinstitute.org/gsea/msigdb/index.jsp).
I have made the dgsa_seq() function compatible with BiocSet format inputs, however, such an input would add the following code lines to the vignettes:

baduel <- do.call(BiocSet, baduel_gmt$genesets) %>%
      mutate_set(description = unlist(baduel_gmt$geneset.description))
baduel_2gs <- baduel %>%
  filter_elementset(set %in% c("VR_TBG", "DE_KAvsTBG"))

I am not sure this improves the readability of the code, and I would therefore be enclined to stick with the gmt format given it is widely used as well.

R / man

generally seems to be ok and well-documented

PBT_gmt format is reported as a gmt object but it is really just a list with three, potentially redundant, elements (e.g., the names of baduel_gmt gene sets are provided twice, causing confusion and potential for error). Describe the content of these lists more carefully or use a formal object (e.g., from BiocSet) to reduce ambiguity.

The gmt format is now extensively documented inside the dearseq package (it is a format used by the Molecular Signature Database for instance).

dgsa_seq.R:335 adding !!! to warnings just leads to a kind of arms race; if it's a warning, let the user know. If it's more important than a warning, make it an error.

Those multiple ! have been removed as they are indeed unnecessary and misleading.

dgsa_seq.R:50 consider more efficient rowVars() etc., including from the matrixStats package

matrixStats::rowVars and matrixStats::colVars are indeed much more efficient and are now being used in place of apply(..., FUN = var).

dgsa_seq.R:366 it is almost never necessary to explicitly invoke rm(); if memory management is really important then a better strategy is to write modular code that, more-or-less as a convenient consequence, isolates memory allocations to the body of the function, and can be garbage collected automatically

All such (unnecessary) calls to rm() have been removed.

dgsa_seq.R:417 options() belong to users and not to package authors. This code
options(warn = -1)
N_possible_perms <- factorial(ncol(y_lcpm))
options(warn = 0)
changes the user option warn to 0, regardless of it's initial value. A better style is
oopt <- options(warn = -1)
N_possible_perms <- factorial(ncol(y_lcpm))
options(oopt)
as seen with
> options(warn = 2)
> oopt = options(warn = -1)
> options(oopt)
> getOption("warn")
[1] 2
but a first question is why one has to change this option instead of using suppressWarnings() and perhaps more importantly can the data be conditioned to avoid warning entirely? For instance, in the above, I'm not understanding why a warning is expected at all?

Indeed, these Warnings suppressions have been removed as only strictly positive integers should be supplied to factorial() in this code.

dgsa_seq.R:546 preferred style is to use tryCatch() rather than try() and if (inherits...)

try() + if (inherits...) have been replaced by tryCatch()

sp_weights.R:112 usually warning(paste(...)) (similarly, stop(), message()) can be replaced with simply warning(...).

The unnecessary paste()s have been removed

adding doPlot conflates the computational and the visualization aspects; it would be better to have a separate, stand-alone, function for plotting.

a standalone plotting function plot_weights() has been added, and the doPlot argument has been removed to better separate computation from visualization.

other

the tests/testthat/test_dear_seq.R serves no purpose and should be removed

This empty file has been removed.

for the purposes of review, please make sure that the master branch contains only material related to the package, e.g., the GSE107991 is not part of the package.

The master branch has been stripped of any material not directly related to the package.

Thanks again for this review,
Boris

bioc-issue-bot · 2020-01-07T16:20:50Z

Received a valid push; starting a build. Commits are:

71d1383 usual version bump forgetting

bioc-issue-bot · 2020-01-07T16:25:47Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

borishejblum · 2020-01-09T16:20:57Z

Dear @mtmorgan,

I have made the solicited changes to the dearseq package (please see my comment above).

Let me know If there remain things that would need to be fixed.

Thanks a lot for you time reviewing our package,
Boris

bioc-issue-bot · 2020-01-15T09:53:03Z

Your package has been accepted. It will be added to the
Bioconductor Git repository and nightly builds. Additional
information will be posed to this issue in the next several
days.

Thank you for contributing to Bioconductor!

mtmorgan · 2020-01-15T10:11:48Z

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/borishejblum.keys is not empty), then no further steps are required. Otherwise, do the following:

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("dearseq"). The package 'landing page' will be created at

https://bioconductor.org/packages/dearseq

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

bioc-issue-bot added the 1. awaiting moderation label Nov 25, 2019

bioc-issue-bot added 2. review in progress assign a reviewer and a more thorough review of package code and documentation taking place and removed 1. awaiting moderation labels Nov 25, 2019

bioc-issue-bot assigned mtmorgan Nov 25, 2019

bioc-issue-bot added the WARNINGS label Nov 25, 2019

bioc-issue-bot added the VERSION BUMP REQUIRED label Nov 26, 2019

bioc-issue-bot removed the VERSION BUMP REQUIRED label Nov 26, 2019

bioc-issue-bot added OK VERSION BUMP REQUIRED and removed WARNINGS labels Nov 26, 2019

bioc-issue-bot removed the VERSION BUMP REQUIRED label Nov 26, 2019

bioc-issue-bot added WARNINGS VERSION BUMP REQUIRED and removed OK labels Dec 11, 2019

bioc-issue-bot removed the VERSION BUMP REQUIRED label Dec 11, 2019

bioc-issue-bot added OK and removed WARNINGS labels Dec 11, 2019

bioc-issue-bot added the VERSION BUMP REQUIRED label Jan 7, 2020

bioc-issue-bot removed the VERSION BUMP REQUIRED label Jan 7, 2020

mtmorgan added 3a. accepted will be ingested into Bioconductor daily builder for distribution and removed 2. review in progress assign a reviewer and a more thorough review of package code and documentation taking place labels Jan 15, 2020

mtmorgan closed this as completed Jan 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dearseq #1307

dearseq #1307

borishejblum commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

mtmorgan commented Nov 25, 2019

borishejblum commented Nov 25, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

mtmorgan commented Dec 9, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

borishejblum commented Dec 11, 2019

bioc-issue-bot commented Jan 7, 2020

bioc-issue-bot commented Jan 7, 2020

borishejblum commented Jan 9, 2020 •

edited

Loading

bioc-issue-bot commented Jan 15, 2020

mtmorgan commented Jan 15, 2020

dearseq #1307

dearseq #1307

Comments

borishejblum commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

bioc-issue-bot commented Nov 25, 2019

mtmorgan commented Nov 25, 2019

borishejblum commented Nov 25, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

bioc-issue-bot commented Nov 26, 2019

mtmorgan commented Dec 9, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

bioc-issue-bot commented Dec 11, 2019

borishejblum commented Dec 11, 2019

bioc-issue-bot commented Jan 7, 2020

bioc-issue-bot commented Jan 7, 2020

borishejblum commented Jan 9, 2020 • edited Loading

bioc-issue-bot commented Jan 15, 2020

mtmorgan commented Jan 15, 2020

borishejblum commented Jan 9, 2020 •

edited

Loading