RNASeqHelper #3261

Open
10 tasks done
mtekman opened this issue Dec 24, 2023 · 29 comments
@mtekman

mtekman commented Dec 24, 2023

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

  • I understand that by submitting my package to Bioconductor,
    the package source and all review commentary are visible to the
    general public.

  • I have read the Bioconductor Package Submission
    instructions. My package is consistent with the Bioconductor
    Package Guidelines.

  • I understand Bioconductor Package Naming Policy and acknowledge
    Bioconductor may retain use of package name.

  • I understand that a minimum requirement for package acceptance
    is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
    Passing these checks does not result in automatic acceptance. The
    package will then undergo a formal review and recommendations for
    acceptance regarding other Bioconductor standards will be addressed.

  • My package addresses statistical or bioinformatic issues related
    to the analysis and comprehension of high throughput genomic data.

  • I am committed to the long-term maintenance of my package. This
    includes monitoring the support site for issues that users may
    have, subscribing to the bioc-devel mailing list to stay aware
    of developments in the Bioconductor community, responding promptly
    to requests for updates from the Core team in response to changes in
    R or underlying software.

  • I am familiar with the Bioconductor code of conduct and
    agree to abide by it.

I am familiar with the essential aspects of Bioconductor software
management, including:

  • The 'devel' branch for new packages and features.
  • The stable 'release' branch, made available every six
    months, for bug fixes.
  • Bioconductor version control using Git
    (optionally via GitHub).

For questions/help about the submission process, including questions about
the output of the automatic reports generated by the SPB (Single Package
Builder), please use the #package-submission channel of our Community Slack.
Follow the link on the home page of the Bioconductor website to sign up.

@bioc-issue-bot
Collaborator

Hi @mtekman

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: RNASeqHelper
Type: Package
Title: Generate Heatmaps and Volcano Plots for your DESeq2 Genes of Interest
Version: 0.99.0
Authors@R: 
    c(person(given="Mehmet", family="Tekman",
        email = "mtekman89@gmail.com",
        role = c("aut", "cre"), comment = c(ORCID = "0000-0002-4181-2676")),
        person(given="Sebastian", family="Arnold",
        email = "sebastian.arnold@pharmakol.uni-freiburg.de",
        role = c("fnd"), comment = c(ORCID = "0000-0002-2688-9210")))
Description:
    Perform a full DESeq2 analysis for your RNA-seq data, generating
    colourful Volcano and Kmeans-clustered Heatmaps, along with
    time-series gene plots for genes of interest. QC-metrics that such
    as PCA validation are built in, and the heatmaps generated show
    z-score scaled expression between contrasts for contrasted samples
    and all. Time series plots show gene trends on normalised,
    corrected, and scaled data, for varying cluster correlation
    thresholds.
License: GPL-3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
VignetteBuilder: knitr
biocViews: RNASeq, QualityControl, GeneExpression
Imports:
    pheatmap (>= 1.0.12), patchwork (>= 1.1.3), DESeq2 (>= 1.40.2),
    dplyr (>= 1.1.3), readr (>= 2.1.4), tibble (>= 3.2.1),
    ggplot2 (>= 3.4.4), ComplexHeatmap (>= 2.16.0), ggrepel (>= 0.9.4),
    purrr (>= 1.0.2), tidyr (>= 1.3.0), grDevices, graphics,
    grid, stats, utils
Suggests:
    BiocStyle, knitr, rmarkdown, testthat (>= 3.0.0), svglite (>= 2.1.3)
Depends:
    SummarizedExperiment (>= 1.30.2)
URL: https://gitlab.com/mtekman/rnaseqhelper
BugReports: https://gitlab.com/mtekman/rnaseqhelper/issues
Config/testthat/edition: 3

@bioc-issue-bot added the "1. awaiting moderation" label (submitted and waiting clearance to access resources) on Dec 24, 2023
@lshep
Contributor

lshep commented Dec 27, 2023

Please use tempdir() to specify the temporary output directory instead of hard-coding "/tmp". This ensures cross-OS compatibility and write access for the user making the calls.
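
A minimal sketch of the requested pattern, using only base R. The file name `volcano.csv` and the toy `results` table are made up for illustration and are not taken from the package.

```r
# Build the output path under the session's temporary directory
# rather than hard-coding "/tmp".
outFile <- file.path(tempdir(), "volcano.csv")  # hypothetical file name

# Toy table standing in for real output.
results <- data.frame(gene = c("geneA", "geneB"), log2FC = c(1.5, -2.0))
write.csv(results, outFile, row.names = FALSE)

# The file lands wherever the operating system keeps temporary files.
file.exists(outFile)
```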

@lshep added the "3e. pending pre-review changes" label (review has indicated blocking concern that needs attention) on Dec 27, 2023
@mtekman
Author

mtekman commented Dec 28, 2023

Added changes:

  • now uses tempdir()
  • minor spacing fixes for other warnings

@lshep added the "pre-check passed" label (pre-review performed and ready to be added to git) and removed the "3e. pending pre-review changes" label on Jan 9, 2024
@bioc-issue-bot
Collaborator

Your package has been added to git.bioconductor.org to continue the
pre-review process. A build report will be posted shortly. Please
fix any ERROR and WARNING in the build report before a reviewer is
assigned or provide a justification on why you feel the ERROR or
WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting
up remotes to push to git.bioconductor.org. All changes should be
pushed to git.bioconductor.org moving forward. It is required to push a
version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your GitHub SSH keys for git.bioconductor.org
access. To manage keys and future access you may want to activate your
Bioconductor Git Credentials Account

@bioc-issue-bot added the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) and removed the "1. awaiting moderation" and "pre-check passed" labels on Jan 9, 2024
@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "TIMEOUT, skipped".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 8cffbf9ba1202d8a2d1100548feb12715c30f246

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
macOS 12.7.1 Monterey: RNASeqHelper_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 8d787e511e4c283fc431cd2ffebcce187d840803

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
macOS 12.7.1 Monterey: RNASeqHelper_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 297e7fb08038a4708dd73dcbc076163263f3cd0a

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
macOS 12.7.1 Monterey: RNASeqHelper_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@mtekman
Author

mtekman commented Jan 12, 2024

I keep hitting this error with svglite. It's definitely a required module for the svg device to work.

It builds fine locally on my machine, but it seems to fail on the CI. Any ideas?

@lshep
Contributor

lshep commented Jan 12, 2024

I just did a quick check and the version is 2.1.2 on merida1, but 2.1.3 is required. @jwokaty can you check why this CRAN package is not picking up the most recent version on CRAN: https://cran.r-project.org/web/packages/svglite/index.html? I will note that I just restarted the Linux builder and checked the version there, and it is 2.1.3, so on the next try you should hopefully get farther on that OS check.

@mtekman
Author

mtekman commented Jan 12, 2024

okay, I will push tomorrow just in case more time is needed, thanks!

@jwokaty

jwokaty commented Jan 12, 2024

@mtekman svglite 2.1.3 is installed on merida1 now. It wasn't installed automatically because the CRAN macOS binaries lag behind the source packages, so I had to manually update it.
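
For anyone wanting to confirm locally which version a machine has, a small sketch in base R follows. The helper name `hasMinVersion` is hypothetical; `packageVersion()` and `requireNamespace()` are standard R functions.

```r
# Hypothetical helper: TRUE when 'pkg' is installed at 'minVersion' or later.
# packageVersion() returns a "package_version" object, so the comparison is
# made component by component rather than as plain text.
hasMinVersion <- function(pkg, minVersion) {
    requireNamespace(pkg, quietly = TRUE) &&
        utils::packageVersion(pkg) >= minVersion
}

# The requirement from the DESCRIPTION file: svglite (>= 2.1.3).
hasMinVersion("svglite", "2.1.3")
```

The result depends on the machine: it is FALSE where svglite is missing or older than 2.1.3.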

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 24db05b29d1e244576bafeaee21a952e34b2d5c7

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
macOS 12.7.1 Monterey: RNASeqHelper_0.99.3.tar.gz
Linux (Ubuntu 22.04.2 LTS): RNASeqHelper_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@mtekman
Author

mtekman commented Jan 13, 2024

Thanks all! The message I am getting now is:

WARNING: R CMD check exceeded 10 min requirement

My examples (and indeed the vignette) require at least 1000 features to function properly. With fewer, the dispersion estimates from the DESeq2 fit become useless, and clustering produces nonsensical clusters that serve no educational purpose.

Is the 10min limit a hard requirement?

@lshep
Contributor

lshep commented Jan 30, 2024

It will be important to try and cut back on the check time as much as possible.

@lshep added the "2. review in progress" label (assign a reviewer and a more thorough review of package code and documentation taking place) and removed the "pre-review" label on Jan 30, 2024
@bioc-issue-bot
Collaborator

A reviewer has been assigned to your package for an in-depth review.
Please respond accordingly to any further comments from the reviewer.

@lshep
Contributor

lshep commented Feb 9, 2024

In your vignette, you are currently writing to the home directory via ~/myanalysis/ and ~/myanalysis2. This is very problematic! Please do not write to the home directory; write all output to tempdir() instead of ~/.
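
One way the vignette's output directories could be relocated, sketched in base R. The directory names are taken from the comment above; the variable names `outDir1` and `outDir2` are assumptions for illustration.

```r
# Create the per-analysis output directories under tempdir() instead of "~/".
outDir1 <- file.path(tempdir(), "myanalysis")
outDir2 <- file.path(tempdir(), "myanalysis2")

for (d in c(outDir1, outDir2)) {
    # showWarnings = FALSE keeps re-runs quiet when the directory exists.
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

dir.exists(c(outDir1, outDir2))
```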

@mtekman
Author

mtekman commented Feb 13, 2024

Hi @lshep, thanks for the warning!

I've also noticed that my code falls apart if there is no "time" component in my phenotype data, so this is something I will fix before I ask you to review it again.
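
A rough sketch of what such a guard might look like, in base R. The helper `sampleGroups` and the assumption that the phenotype data is a data frame with a `condition` column and an optional `time` column are illustrative only; the package's actual fix is not shown in this thread.

```r
# Hypothetical helper: build one grouping label per sample, using the
# "time" column only when the phenotype data actually has one.
sampleGroups <- function(pheno) {
    stopifnot("condition" %in% colnames(pheno))
    if ("time" %in% colnames(pheno)) {
        paste(pheno$condition, pheno$time, sep = "_")
    } else {
        as.character(pheno$condition)
    }
}

withTime <- data.frame(condition = c("ctrl", "treat"), time = c("0h", "6h"))
noTime <- data.frame(condition = c("ctrl", "treat"))

sampleGroups(withTime)  # labels combine condition and time
sampleGroups(noTime)    # labels fall back to condition alone
```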

Cheers!

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 1ea73ad44b62b0a6a8dbbd13653dedd62e6a629d

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 22.04.3 LTS): RNASeqHelper_0.99.7.tar.gz
macOS 12.7.1 Monterey: RNASeqHelper_0.99.7.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RNASeqHelper to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@mtekman
Author

mtekman commented Feb 17, 2024

Hello again, new version with:

  • Changed ~/ to tempdir()
  • Fix to phenotype data without time info
  • Linting checks, default checks, biocchecks
  • Reduced test time from 10m → 2m
  • BiocStyle in Vignette

Hopefully this is the one!
Happily awaiting your review.

@DarioS

DarioS commented Feb 22, 2024

I have evaluated the software and have major revisions to request, mainly about improved Bioconductor interoperability.

  • Function and variable naming does not conform to the coding style requirements in the Developer's Guide. Use camelCase, not snake_case as in the tidyverse.
  • Depends on tidyverse variable types such as tibble rather than Bioconductor variable types such as DataFrame. e.g.
    \item{long_data}{A tibble containing columns: gene, condition, value, and optionally time}. Please switch.
  • Input RNA-seq data is required to be a plain numeric matrix.
    \item{tab}{a matrix of samples (columns) and genes (rows)}.
    Needs to support RangedSummarizedExperiment input to demonstrate improved interoperability with Bioconductor.
  • Package is named RNASeqHelper but it only supports DESeq2 as back-end. How about edgeR? edgeR is 24th-ranked and DESeq2 is 26th-ranked based on monthly downloads, so such a pipeline should support both frameworks at a minimum. Don't instantly exclude about 50% of your future citations!
  • The developers of DESeq2 have a special package named tximport for importing RNA-seq data. This is important for avoiding confounding by changes in gene length but not gene abundance. RNASeqHelper ignores this best-practice.
  • All of the code is in a single file 1344 lines long. Please modularise into separate .R files to make finding functions easy.
  • The vignette shows how RNASeqHelper operates on simulated data. Instead, please demonstrate how your package is useful for analysing a real data set. Please find a data set from Experiment Data view and use that instead. Also, note that using rnorm is unsuitable for simulating RNA-seq data. makeExampleDESeqDataSet from DESeq2 is ideal.
  • Vignette needs to follow the sections of the Developer's Guide Vignettes section. In particular, there is no Installation section.
  • The unit tests seem incomplete. Most don't have expect or succeed or fail in them.
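
To illustrate the point about rnorm in the list above, here is a small base R comparison. It does not use makeExampleDESeqDataSet itself; instead it draws from a negative binomial with `rnbinom`, and the parameter values (`mu = 100`, `size = 2`) are arbitrary choices for illustration.

```r
set.seed(1)
nGenes <- 1000
nSamples <- 6

# Gaussian draws are continuous and can be negative, so they are not
# valid read counts.
gaussian <- matrix(rnorm(nGenes * nSamples, mean = 10, sd = 5), nrow = nGenes)

# Negative binomial draws are non-negative integers with variance
# mu + mu^2 / size, i.e. overdispersed relative to a Poisson.
counts <- matrix(rnbinom(nGenes * nSamples, mu = 100, size = 2), nrow = nGenes)

any(gaussian < 0)                           # negative "counts" present?
all(counts >= 0 & counts == round(counts))  # valid count matrix?
```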

@mtekman
Author

mtekman commented Feb 22, 2024

Dear Dario,

Thank you for your comprehensive review, you have definitely given me a
lot to think about.

  • RangedSummarizedExperiment, makeExampleDESeqDataSet, tximport,
    Installation, better unit tests.

Thanks for these hints, I will do these changes!
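
As a starting point for the RangedSummarizedExperiment item, input handling could be sketched roughly as follows. The helper name `getCountMatrix` is hypothetical, and the sketch assumes the counts sit in the first assay of the container; `SummarizedExperiment::assay()` is only called when such an object is actually passed in.

```r
# Hypothetical helper: return a count matrix from either a plain matrix
# or a (Ranged)SummarizedExperiment.
getCountMatrix <- function(x) {
    if (is.matrix(x)) {
        return(x)
    }
    if (methods::is(x, "SummarizedExperiment")) {
        # assay() without an index returns the first assay in the container
        # (assumed here to hold the counts).
        return(SummarizedExperiment::assay(x))
    }
    stop("'x' must be a matrix or a SummarizedExperiment")
}

# A plain matrix of genes (rows) by samples (columns) passes straight through.
tab <- matrix(1:6, nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
getCountMatrix(tab)
```

Because RangedSummarizedExperiment extends SummarizedExperiment, a single class check covers both.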

  • edgeR

I do not have too much experience with edgeR, but I think you're right that it would be a huge win to incorporate it into the library, and it would even offer a nice "keepgenes" alternative method.

  • tibble rather than DataFrame

I will remove tidyverse variable types, but can I at least keep dplyr %>% piping, or is that also bad?

  • Function and variable naming does not conform to coding style requirements in Developer's Guide. camelCase not snake_case unlike tidyverse.

All of my functions were previously camelCase, but I was told that I failed many linting checks, and so changed them all to snake_case at the behest of lintr.

Since the default lintr rules do not seem to be encouraged, may I ask which lintr rules are? I found this, but the rules do not seem to work with the current version of lintr.
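
For reference, a sketch of how the naming convention could be checked. The `isCamelCase` helper is a made-up name using a plain regular expression; the lintr part assumes lintr 3.0 or later, where `linters_with_defaults()` and `object_name_linter()` are available, and only runs if lintr is installed. This is not an official Bioconductor lintr configuration.

```r
# Hypothetical helper: TRUE for names in lower camelCase (e.g. "plotVolcano").
isCamelCase <- function(x) grepl("^[a-z][a-zA-Z0-9]*$", x)

isCamelCase(c("plotVolcano", "plot_volcano", "PlotVolcano"))

# With lintr (>= 3.0) installed, the default linters can be kept while
# switching the object-name rule from snake_case to camelCase.
if (requireNamespace("lintr", quietly = TRUE)) {
    camelLinters <- lintr::linters_with_defaults(
        object_name_linter = lintr::object_name_linter(styles = "camelCase")
    )
}
```

The resulting list can then be passed as the `linters` argument of `lintr::lint()` or `lintr::lint_package()`.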

  • All of the code is in a single file 1344 lines long. Please
    modularise into separate .R files to make finding functions easy.

I find it much easier to keep things in a single R file, since I only need to mentally grapple with a single buffer. It also greatly helps when I produce a dependency map of my code (see image below). Is this a strict requirement?

[image: dependency map of the package code]

Thanks again,
Mehmet

@DarioS

DarioS commented Feb 29, 2024

The coding style requirements are in R Code chapter. You may keep functions in one file. How about base R pipe |> ?
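
A short illustration of the base R pipe, which is available from R 4.1.0 and needs no package attached (the `%>%` operator comes from magrittr and is re-exported by dplyr). The toy `counts` vector is made up for the example.

```r
# The native pipe |> passes the left-hand side as the first argument
# of the call on the right.
counts <- c(geneA = 10, geneB = 200, geneC = 35)

topGenes <- counts |>
    sort(decreasing = TRUE) |>
    head(2) |>
    names()

topGenes  # the two genes with the highest counts
```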

@mtekman
Author

mtekman commented Feb 29, 2024

You just blew my mind about the base R pipe, I genuinely thought that was a dplyr implementation only.

I guess I will try formatR as a linter, cheers
