Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multicrispr #1486

Closed
8 tasks done
bhagwataditya opened this issue Apr 24, 2020 · 33 comments
Closed
8 tasks done

multicrispr #1486

bhagwataditya opened this issue Apr 24, 2020 · 33 comments
Assignees

Comments

@bhagwataditya
Copy link

@bhagwataditya bhagwataditya commented Apr 24, 2020

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

  • I understand that by submitting my package to Bioconductor,
    the package source and all review commentary are visible to the
    general public.

  • I have read the Bioconductor Package Submission
    instructions. My package is consistent with the Bioconductor
    Package Guidelines.

  • I understand that a minimum requirement for package acceptance
    is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
    Passing these checks does not result in automatic acceptance. The
    package will then undergo a formal review and recommendations for
    acceptance regarding other Bioconductor standards will be addressed.

  • My package addresses statistical or bioinformatic issues related
    to the analysis and comprehension of high throughput genomic data.

  • I am committed to the long-term maintenance of my package. This
    includes monitoring the support site for issues that users may
    have, subscribing to the bioc-devel mailing list to stay aware
    of developments in the Bioconductor community, responding promptly
    to requests for updates from the Core team in response to changes in
    R or underlying software.

I am familiar with the essential aspects of Bioconductor software
management, including:

  • The 'devel' branch for new packages and features.
  • The stable 'release' branch, made available every six
    months, for bug fixes.
  • Bioconductor version control using Git
    (optionally via GitHub).

For help with submitting your package, please subscribe and post questions
to the bioc-devel mailing list.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 24, 2020

Hi @bhagwataditya

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: multicrispr
Title: Multi-locus multi-purpose Crispr/Cas design
Version: 0.99.0
Authors@R: c(person("Aditya",   "Bhagwat",    NULL, "aditya.bhagwat@mpi-bn.mpg.de",  c("aut", "cre")),
  person( "Johannes", "Graumann",   NULL, 'johannes.graumann@mpi-bn.mpg.de', "sad"),
  person( "Mette",    "Bentsen",    NULL, "mette.bentsen@mpi-bn.mpg.de",     "ctb"),
  person( "Jens",     "Preussner",  NULL, "jens.preussner@mpi-bn.mpg.de",    "ctb"),
  person( "Michael",  "Lawrence",   NULL, "lawrence.michael@gene.com",       "ctb"),
  person( "Herve",    "Pages",      NULL, "hpages@fredhutch.org",            "ctb"),
  person( "Mario",    "Looso",      NULL, "mario.looso@mpi-bn.mpg.de",     c("sad","rth")))
Description: Contains functions to facilitate the design of a multi-locus Crispr/Cas library.
License: GPL-2
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.0
Depends:
    R (>= 4.0)
Imports:
    assertive,
    AnnotationHub,
    BiocGenerics,
    Biostrings,
    BSgenome,
    CRISPRseek,
    data.table,
    GenomeInfoDb,
    GenomicFeatures,
    GenomicRanges,
    ggplot2,
    grid,
    Gviz, 
    karyoploteR,
    magrittr,
    methods,
    parallel,
    plyranges,
    Rbowtie,
    reticulate,
    rtracklayer,
    stringi, 
    tidyr,
    utils
Suggests: 
    BiocStyle,
    BSgenome.Hsapiens.UCSC.hg38,
    BSgenome.Mmusculus.UCSC.mm10,
    BSgenome.Scerevisiae.UCSC.sacCer1,
    dbplyr,
    dplyr,
    ensembldb,
    htmltools,
    htmlwidgets,
    IRanges,
    knitr,
    rmarkdown,
    testthat,
    TxDb.Mmusculus.UCSC.mm10.knownGene
VignetteBuilder: knitr
biocViews: CRISPR, Software
BugReports: https://github.com/loosolab/multicrispr/issues
URL: https://github.com/loosolab/multicrispr

Add SSH keys to your GitHub account. SSH keys
will are used to control access to accepted Bioconductor
packages. See these instructions to add SSH keys to
your GitHub account.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 24, 2020

A reviewer has been assigned to your package. Learn what to expect
during the review process.

IMPORTANT: Please read the instructions for setting
up a push hook on your repository, or further changes to your
repository will NOT trigger a new build.

@mtmorgan
Copy link
Contributor

@mtmorgan mtmorgan commented Apr 24, 2020

As preliminary comments

  • Can you please expand the Description: field of the DESCRIPITION file to provide a more comprehensive abstract of your package?

  • The large number of dependencies seemed surprising given the focused nature of your package. Please examine these very carefully to evaluate the trade-off between features enabled and maintenance cost.

  • It never seems like a good idea to run anything as admin, so I was surprised to see the instructions to do so in the vignette. Also consider using [basilisk][] as a 'Bioconductor standard' way of enabling use of python dependencies @LTLA
    [basilisk]: https://bioconductor.org/packages/devel/bioc/html/basilisk.html

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 24, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@LTLA
Copy link

@LTLA LTLA commented Apr 25, 2020

By and large, basilisk should be able to handle the Python dependencies that I can see (2.7, azimuth, scikit-learn). The main issues occur on Windows, as they typically do. The success of installation of Python 2.7 on Windows is unpredictable, and installation of Python packages via pip is dependent on whether non-system DLLs have found their way into the system directories on the user machine.

@bhagwataditya
Copy link
Author

@bhagwataditya bhagwataditya commented Apr 25, 2020

Thankyou Martin & Aaron,
Looking into how to integrate your recommendations. Will get back with an update soon.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 26, 2020

Received a valid push; starting a build. Commits are:

fad8068 BioC Review Fixes
5ceb40f Merge branch 'master' of gitlab.gwdg.de:loosolab/s...
1df3959 Bump version

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 26, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 26, 2020

Received a valid push; starting a build. Commits are:

0d5f08c Update vignette

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 26, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bhagwataditya
Copy link
Author

@bhagwataditya bhagwataditya commented Apr 26, 2020

Thankyou @mtmorgan and @LTLA

I followed your recommendations and:

  • updated DESCRIPTION with a more comprehensive abstract
  • removed dependencies that were not strictly required
  • removed the admin rights requirement (is actually only required if conda is installed under admin rights)

Regarding the failure to build: this is puzzling to me. I can successfully build it in a win R-devel as well as a linux R-devel environment, without getting the errors of the BioC machines.

Regarding basilisk: thank you for letting me know - looks interesting! That should be unrelated to the current issue of non-building, however. I have taken care in the (current) vignette to not run the python code (it's anyways only an optional last step in the whole gRNA design process). So I would prefer to first get the current vignette and package building.

Thank you for your feedback!

Aditya

@mtmorgan
Copy link
Contributor

@mtmorgan mtmorgan commented Apr 26, 2020

Since the vignette is failing, I installed the package

multicrispr master$ bioc-devel CMD INSTALL .

(using R 4.0.0 and BiocManager::install(version = "devel"), making sure BiocManager::valid()` returns (something close to) TRUE).

I noticed that the git repository contains 'build products', i.e., the file vignettes/designing_grna_libraries.R. This file should not be present, because it will be created when the package is built. Nonetheless, I then ran 'Stangle' on the vignette to get the R code

multicrispr master$ cd vignettes
multicrispr/vignettes master$ bioc-devel CMD Stangle designing_grna_libraries.Rmd

This gave me an up-to-date version of designing_grna_libraries.R. So I started R and sourced that file

multicrispr/vignettes master$ bioc-devel
> source("designing_grna_libraries.R", echo=TRUE)
...
> tfbs0 <- bed_to_granges(bedfile, genome = 'mm10')
	Read SRF.bed into GRanges
Error in bed$thickStart : $ operator is invalid for atomic vectors
> traceback()
12: .local(con, format, text, ...)
11: import(FileForFormat(con, format), ...)
10: import(FileForFormat(con, format), ...)
9: import(con, format = "bed", ...)
8: import(con, format = "bed", ...)
7: rtracklayer::import.bed(bedfile, genome = genome)
6: rtracklayer::import.bed(bedfile, genome = genome)
5: bed_to_granges(bedfile, genome = "mm10") at designing_grna_libraries.R#52
4: eval(ei, envir)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("designing_grna_libraries.R", echo = TRUE)

digging a little more

> bedfile
[1] "/Users/ma38727/Library/R/4.0/Bioc/3.11/library/multicrispr/extdata/SRF.bed"
> rtracklayer::import(bedfile)
Error: $ operator is invalid for atomic vectors

a little more digging after methods("import")

> debug(import, signature = c("BEDFile", "ANY", "ANY"))
> rtracklayer::import(bedfile)
...
Browse[3]> next
debug: if (!is.null(bed$thickStart)) {
    thickEnd <- bed$thickEnd
    if (is.null(thickEnd))
        thickEnd <- bed$end
    bed$thick <- IRanges(bed$thickStart + 1L, thickEnd)
    bed$thickStart <- bed$thickEnd <- NULL
}
Browse[3]> next
Error in bed$thickStart : $ operator is invalid for atomic vectors

and the next time round

Browse[2]>
debug: bed <- DataFrame(read.table(con, colClasses = bedClasses, as.is = TRUE,
    na.strings = c(".", na.strings), comment.char = "", sep = sep,
    quote = ""))
Browse[2]>
debug: colnames(bed) <- bedNames[bedNames %in% colnames]
Browse[2]> class(bed)
[1] "DFrame"
attr(,"package")
[1] "S4Vectors"
Browse[2]> dim(bed)
[1] 3 1
Browse[2]> bed
DataFrame with 3 rows and 1 column
                                                                           V1
                                                                  <character>
1                                  version https://git-lfs.github.com/spec/v1
2 oid sha256:2e5c03b513ea39388a19c7b7aa3118320d644bfc81085d51a7170e2e155744a6
3                                                                  size 91414

which is looking kind of strange -- column is advertised as V1 but actually

Browse[2]> names(bed)
[1] "chrom"
Browse[2]> bed$chrom
[1] "version https://git-lfs.github.com/spec/v1"
[2] "oid sha256:2e5c03b513ea39388a19c7b7aa3118320d644bfc81085d51a7170e2e155744a6"
[3] "size 91414"

and there's no thickStart column...

Browse[2]> bed$thickStart
Error in bed$thickStart : $ operator is invalid for atomic vectors

This seems very funky. So I confirmed that

BiocManager::valid()

returns TRUE, turned off debugging

> undebug(import, signature = c("BEDFile", "ANY", "ANY"))

and checked whether the examples on the import.bed help page ran...

example(import.bed)

They did run successfully.

A first guess is that SRF.bed is malformed. Other than that, I'm at a loss, and call on @lawremi or perhaps @hpages for advice... The simple reproducible example is, in a new R session,

$ bioc-devel
> rtracklayer::import("multicrispr/inst/extdata/SRF.bed")
Error in bed$thickStart : $ operator is invalid for atomic vectors

@bhagwataditya
Copy link
Author

@bhagwataditya bhagwataditya commented Apr 27, 2020

Thankyou @mtmorgan!

I can reproduce the error with the installed data file on rocker-devel:

> rtracklayer::import.bed(system.file("extdata/SRF.bed", package = 'multicrispr'))
Error: $ operator is invalid for atomic vectors

Interestingly enough, the source data file does not give any issues:

> rtracklayer::import.bed("inst/extdata/SRF.bed")

GRanges object with 1974 ranges and 2 metadata columns:
         seqnames              ranges strand |         name     score
            <Rle>           <IRanges>  <Rle> |  <character> <numeric>
     [1]     chr2   83972890-83972905      + | SRF_MA0083.3   8.61532
     ...      ...                 ...    ... .          ...       ...
  [1974]     chr4 106237089-106237104      + | SRF_MA0083.3   9.73780
  -------
  seqinfo: 21 sequences from an unspecified genome; no seqlengths

What could be happening?

Aditya

@bhagwataditya
Copy link
Author

@bhagwataditya bhagwataditya commented Apr 27, 2020

Oh, this seems to be a git-lfs issue.
When I vim the installed file, I get:

> vim /usr/local/lib/R/site-library/multicrispr/extdata/SRF.bed

version https://git-lfs.github.com/spec/v1
oid sha256:2e5c03b513ea39388a19c7b7aa3118320d644bfc81085d51a7170e2e155744a6
size 91414

When I vim the source file, I get:

> vim multicrispr/inst/extdata/SRF.bed

chr2    83972889        83972905        SRF_MA0083.3    8.61532 +
chr1    85603908        85603924        SRF_MA0083.3    9.7378  -
chr17   70963622        70963638        SRF_MA0083.3    9.7378  -
...

So I guess I should remove these files from git LFS tracking - will do and then push

@mtmorgan
Copy link
Contributor

@mtmorgan mtmorgan commented Apr 27, 2020

Yes, the build system is just going to git clone without LFS support.

If these are large files then probably you should either (a) identify a (much) more modest subset for demonstrating your software or (b) make the files available via ExperimentHub as a package.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

8fe7da3 rm vignettes/designing_grna_libraries (BioC review...
dfec823 git-lfs untrack "*.bed" and rm unused extdata file...

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

c80df81 rm lfs-tracked BED file
e64fafd Add (lfs untracked) example BED file

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

b43ff81 Update .gitignore to satisfy BiocCheckGitClone()
7d186cd rm multicrispr.Rproj and .gitattributes
5698f22 Add multicrispr.Rproj and .gitattributes (but now ...
1f35518 Fix path issue in add_target_counts()

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT, WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

2706eeb rm unused (and timeout causing) functions EnsDb.Mm...
ef7c4ef rm duplicated NEWS and shorten line

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

808d04a Re-roxygenize

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Received a valid push; starting a build. Commits are:

fb5063a AnnotationHub: Imports -> Suggests

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Apr 27, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

@nturaga
Copy link
Member

@nturaga nturaga commented May 14, 2020

Review

R CMD build

  • ok

R CMD INSTALL

  • ok

DESCRIPTION

  • ok

NAMESPACE

  • ok

NEWS

  • ok

R

  • Remove commented out code.

  • Use BiocParallel instead of parallel::mclapply().

  • Are you sure you have cut down dependencies enough? There are still quite a few. Loading the package takes a while.

vignette

  • ok

man

  • ok

extdata

  • Just saw a .pptx file in the folder, and are you sure you want this?

@bhagwataditya
Copy link
Author

@bhagwataditya bhagwataditya commented May 14, 2020

Hi Nitesh, thank you for your nice feedback :-), which I followed up on:

R

  • I removed commented out code
  • There is reason why I am using parallel::mclapply rather than BiocParallel:
    • I first used BiocParallel::bplapply().
      That works great on linux, but it fails on windows.
      support.bioconductor.org/p/92587/ explains likely why
      Windows doensn't allow forking (where a child process inherits parent env)
      Instead bplapply defaults to multithreading.
      In that scenario, the reticulation env is not correctly passed.
      mclapply seems like best choice: fork on linux - execute serially on win
  • package dependencies: I am actually really using functionality from each of these. Cutting down on dependencies would also mean cutting down on functionality. I can do this if it is really required, but it would be a pity.

extdata

  • :-) This ppt is not a real presentation, it's just a single slide with the "source" of the overview figure in my vignette.

Okay then, let me push my updates

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented May 14, 2020

Received a valid push; starting a build. Commits are:

ffa1738 Bump version

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented May 14, 2020

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

@bioc-issue-bot
Copy link
Collaborator

@bioc-issue-bot bioc-issue-bot commented Jun 1, 2020

Your package has been accepted. It will be added to the
Bioconductor Git repository and nightly builds. Additional
information will be posed to this issue in the next several
days.

Thank you for contributing to Bioconductor!

@mtmorgan
Copy link
Contributor

@mtmorgan mtmorgan commented Jun 1, 2020

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/bhagwataditya.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("multicrispr"). The package 'landing page' will be created at

https://bioconductor.org/packages/multicrispr

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

@mtmorgan mtmorgan closed this as completed Jun 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants