Create packrat projects using jetpack's CLI. #9

Open
grabear opened this issue Oct 12, 2018 · 5 comments

@grabear
Member

grabear commented Oct 12, 2018

Along with #7, this will be another easy addition to rut. Simply install jetpack (https://github.com/ankane/jetpack), and then install the jetpack CLI:

jetpack::cli()
@grabear grabear added this to the Hackseq 2018 - Standalone CLI milestone Oct 12, 2018
@grabear grabear added this to New Issues in Hackseq 2018 - rut via automation Oct 12, 2018
@BrunoGrandePhD
Contributor

Looking into whether packrat/jetpack can address our needs.

@BrunoGrandePhD BrunoGrandePhD self-assigned this Oct 12, 2018
@BrunoGrandePhD
Contributor

BrunoGrandePhD commented Oct 12, 2018

I want to make sure I understand what packrat and jetpack do.

packrat

Features

From their website:

We built packrat to solve these problems. Use packrat to make your R projects more:

Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.

Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.

Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.

Here are its essential features and my thoughts associated with each one:

  1. It installs libraries in a local directory as opposed to the global or user library.
    • This is a must for rut.
    • I have experience doing this manually using .Rprofile and .Renviron.
  2. It offers a means to install the same environment on another computer.
    • We could rely on files like requirements.txt for Python/pip/conda.
    • Key question: do we keep track of "top-level" packages (i.e. the ones that the user installs directly), or all of the dependencies?
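
A minimal sketch of the manual approach from point 1, written in Python on the assumption that rut ends up driving R from the outside. It creates a project-local library and writes a .Renviron that sets R_LIBS_USER to it. The function name and the default directory name `r-library` are hypothetical, not something packrat, jetpack, or rut defines.

```python
from pathlib import Path


def init_local_library(project_dir, lib_name="r-library"):
    """Create a project-local R library and point R at it via .Renviron.

    `lib_name` is a hypothetical directory name chosen for illustration.
    Note that this overwrites any existing .Renviron in the project.
    """
    project = Path(project_dir).resolve()
    lib_dir = project / lib_name
    # R only adds R_LIBS_USER to .libPaths() if the directory exists.
    lib_dir.mkdir(parents=True, exist_ok=True)
    # R reads .Renviron from the working directory at startup, so this
    # takes effect when R is launched from the project root.
    renviron = project / ".Renviron"
    renviron.write_text(f"R_LIBS_USER={lib_dir.as_posix()}\n", encoding="utf-8")
    return lib_dir
```

The absolute path keeps the sketch simple but ties the file to one machine, so a portable version would need to regenerate .Renviron on each computer.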

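To make the key question in point 2 concrete, here is a rough sketch of the difference between the "top-level" list and the full set of dependencies. The dependency graph and the package names are placeholders invented for illustration; a real implementation would have to get this information from R itself.

```python
def resolve_all(top_level, depends_on):
    """Return every package reachable from the top-level ones, sorted."""
    seen = set()
    stack = list(top_level)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        # Packages without an entry are treated as having no dependencies.
        stack.extend(depends_on.get(pkg, []))
    return sorted(seen)


# Hypothetical dependency graph with placeholder package names.
depends_on = {
    "pkgA": ["pkgC", "pkgD"],
    "pkgB": ["pkgD"],
    "pkgD": ["pkgE"],
}
top_level = ["pkgA", "pkgB"]                 # what the user installs directly
locked = resolve_all(top_level, depends_on)  # what actually ends up installed
```

Tracking only `top_level` resembles a hand-written requirements.txt, while tracking `locked` (ideally with versions) is closer to a lock file. The two are not exclusive: one file could hold the user's intent and another the resolved result.
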
Other thoughts

Apparently, packrat should be compatible with Bioconductor.

As for performance, it seems like we could either address the issues with packrat (see this issue), or implement our own package that achieves these features while avoiding packrat's mistakes. I need to figure out what packrat does that makes it so slow. The linked issue sheds some light on this.

jetpack

As far as I can tell, the essential features of jetpack are the following:

  1. It enables installing specific package versions (unlike install.packages()) and offers similar functions for other operations (e.g. remove, update).
    • It presumably uses devtools for this, i.e. install_version().
  2. It leverages packrat for maintaining local libraries.
    • If we could achieve what packrat offers without using packrat, we would improve performance (assuming that is possible at all).
  3. It exposes a command-line interface for the R functions.
    • This is something we could leverage instead of developing our own interface.
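
As a rough sketch of points 1 and 3 combined, the snippet below shows how a command-line front end might translate an `install` subcommand into an R expression. Everything here is an assumption for illustration: the `rut install` interface is hypothetical, and the use of `remotes::install_version()` is a guess based on the point above, not a description of how jetpack works internally. The CRAN mirror URL is also just one possible choice.

```python
import argparse
import re

# R package names start with a letter, contain letters, digits or dots,
# and do not end with a dot. Validating them avoids injecting arbitrary
# R code into the generated expression.
_PKG_RE = re.compile(r"^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$")
_VERSION_RE = re.compile(r"^[0-9]+([.-][0-9]+)+$")
_REPOS = "https://cloud.r-project.org"


def install_expr(package, version=None):
    """Build the R expression that installs a package (optionally pinned)."""
    if not _PKG_RE.match(package):
        raise ValueError(f"invalid package name: {package!r}")
    if version is None:
        return f'install.packages("{package}", repos = "{_REPOS}")'
    if not _VERSION_RE.match(version):
        raise ValueError(f"invalid version: {version!r}")
    return (
        f'remotes::install_version("{package}", '
        f'version = "{version}", repos = "{_REPOS}")'
    )


def build_parser():
    """Hypothetical CLI: `rut install <package> [--version X.Y.Z]`."""
    parser = argparse.ArgumentParser(prog="rut")
    sub = parser.add_subparsers(dest="command", required=True)
    install = sub.add_parser("install", help="install an R package")
    install.add_argument("package")
    install.add_argument("--version", default=None)
    return parser
```

The resulting string would still need to be handed to R (for example through Rscript), which is left out here.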

Other thoughts

It's not clear whether jetpack is compatible with Bioconductor. Their README says it isn't compatible, but this issue hints that it is.

Conclusion

I think jetpack achieves a lot of what we want. The main caveats are its reliance on packrat (slow...) and the issue of compatibility with Bioconductor (perhaps a non-issue). Regarding its relationship with packrat, the jetpack developer is looking into getting rid of this dependency in the next release (source). Scratch that: I misread the issue. The developer wants to remove all dependencies except for packrat and remotes. Accordingly, I'm thinking we could fork jetpack and replace the dependency on packrat with a method relying strictly on .Rprofile and/or .Renviron.

@sdhutchins
Member

I think you have thought everything through well for both packages.

I would personally opt for jetpack given it will probably be a lot easier to initially set up.

You made a great point about implementing our own package. I think that should be our longer-term goal. However, if we fork jetpack and can make changes that give us what we need, then we may solve those issues sooner.

@BrunoGrandePhD
Contributor

I talked with @grabear about this. Here's the outcome from that conversation:

  • I didn't realize that the bulk of the package fits in fewer than 900 lines of R code. I thought it would be more.
  • At this rate, it might take as much time understanding and tweaking his code as reimplementing a version in Python. This is worth considering.
  • While Python might make it easier to develop a CLI, we also have to balance that against the technical debt incurred by having to interact with R (e.g. through rpy2, system commands, or something else).
  • Another factor in deciding whether we should fork jetpack or develop our own package is the set of features we want to see in rut. Issue #17 (Implement rut's features) lists several features that are probably beyond the scope of jetpack.
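
For the "system commands" option mentioned above, a minimal sketch might look like the following. It assumes Rscript is on the PATH and that pointing R_LIBS_USER at a project-local directory is enough to isolate the library. The helper names are hypothetical.

```python
import os
import shutil
import subprocess


def rscript_command(expr):
    """Command line that evaluates one R expression with Rscript."""
    return ["Rscript", "-e", expr]


def r_env(lib_dir, base_env=None):
    """Copy the environment and point R at a project-local library."""
    env = dict(os.environ if base_env is None else base_env)
    env["R_LIBS_USER"] = str(lib_dir)
    return env


def run_r(expr, lib_dir):
    """Run an R expression in a subprocess and return the completed process."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    return subprocess.run(
        rscript_command(expr),
        env=r_env(lib_dir),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if R exits with an error
    )
```

This avoids a compiled bridge such as rpy2 at the cost of passing everything through strings and exit codes, which is part of the technical debt to weigh.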

(I might add more thoughts tomorrow.)

@sdhutchins
Member

Great thoughts.

  • An additional thought on rpy2...if at any point we intend for beRi to be used on Windows, rpy2 makes that far more daunting and less likely. Tons of issues installing rpy2 on Windows.

@BrunoGrandePhD BrunoGrandePhD removed their assignment Jun 25, 2021