# Install R packages



## Generally useful R packages

In [1]:
# This code defines a time-saving function that checks whether the needed packages are already installed and
# installs only those that are missing. The `install_if_missing` function is used below.

install_if_missing <- function(packages) {
    # Compute the missing packages once, then install only those.
    missing_packages <- setdiff(packages, rownames(installed.packages()))
    if (length(missing_packages) > 0) {
        install.packages(missing_packages, repos = "https://cloud.r-project.org/")
    }
}

In [2]:
# Use the command defined above to install the R packages (in parentheses) below 
install_if_missing(c('devtools', 'tidyverse','viridis', 'ggthemes', 'pryr', 'skimr',
                     'testthat', 'reticulate', 'data.table', 'RCurl', 'xml2'))

# There may be a warning in pink that says 'lib' is unspecified, which you can ignore. 
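
To confirm that the installation step succeeded, it can help to list any packages that still cannot be found. The following is a minimal sketch; `find_missing_packages` is a hypothetical helper name introduced here for illustration and is not part of this notebook or of any package.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
find_missing_packages <- function(packages) {
    # requireNamespace() returns FALSE (rather than raising an error) when a
    # package is absent. Note that it loads the namespace of packages it finds.
    is_installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
    packages[!is_installed]
}

# An empty character vector means every listed package is available.
find_missing_packages(c("stats", "utils"))
```

Passing the same vector that was given to `install_if_missing` shows at a glance which installs, if any, need a closer look.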

## Leonardo R package

Leonardo is a service that provides access to interactive tools like Jupyter, RStudio, and Hail running in the cloud inside the Terra security boundary.

In [1]:
# Install the package of libraries that Leonardo needs, which is hosted on GitHub
devtools::install_github('DataBiosphere/Ronaldo')

Skipping install of 'Ronaldo' from a github remote, the SHA1 (426459ff) has not changed since last install.
  Use `force = TRUE` to force installation



# Confirm that the R packages loaded properly

In [7]:
# Warnings that objects are masked between R packages are to be expected and you can ignore them 
library(viridis)    # A nice color scheme for plots.
library(ggthemes)   # Common themes to change the look and feel of plots.
library(scales)     # Graphical scales map data to aesthetics in plots.
library(testthat)   # Testing functions.
library(assertthat) # Assertion functions.
library(pryr)       # Memory usage functions.
library(skimr)      # Summary statistics for dataframes.
library(bigrquery)  # BigQuery R client.
library(tidyverse)  # Data wrangling packages.
library(reticulate) # Python R client.
library(Ronaldo)    # Leonardo R package.

Loading required package: viridisLite


Attaching package: ‘scales’


The following object is masked from ‘package:viridis’:

    viridis_pal



Attaching package: ‘testthat’


The following object is masked from ‘package:devtools’:

    test_file



Attaching package: ‘skimr’


The following object is masked from ‘package:testthat’:

    matches


── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

✔ ggplot2 3.3.5     ✔ purrr   0.3.4
✔ tibble  3.1.5     ✔ dplyr   1.0.7
✔ tidyr   1.1.4     ✔ stringr 1.4.0
✔ readr   2.0.2     ✔ forcats 0.5.1

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ readr::col_factor()    masks scales::col_factor()
✖ purrr::compose()       masks pryr
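
A masking message means that two attached packages export an object with the same name, and the package attached most recently takes precedence. When it matters which version is called, the `package::name` form removes the ambiguity. The sketch below uses only packages that ship with R, so it does not depend on the libraries loaded above; the `stats::filter()` call is purely an illustration.

```r
# List names that are defined in more than one attached package or environment.
masked_names <- conflicts()
head(masked_names)

# Call a function through its namespace to bypass masking entirely.
# stats::filter() is commonly masked by dplyr::filter() once the tidyverse is attached.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
as.numeric(smoothed)
```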

This notebook installs the most recent versions of R packages from [CRAN](https://cran.r-project.org/) and Python packages from [pip](https://pypi.org/project/pip/) onto your VM. Additionally, some packages come from [GitHub](https://github.com/) or [Cloud Source Repositories](https://cloud.google.com/source-repositories/).

1. If you encounter any errors, first try restarting the kernel and running all cells: `Kernel -> Restart & Run All`.
1. If an R package still fails to install:
 - Open a terminal by clicking the terminal icon next to 'Notebook Runtime' in the upper right corner of the window.
 - Type `R` to start R in the terminal.
 - Type `install.packages("qwraps2")` to get a more detailed error message. Replace `qwraps2` with the name of whichever package is failing to install.
1. If that error message tells you how to resolve the issue, great! If not, copy and paste the error message into Google Search for more help.
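
As an alternative to the terminal, the warnings and errors raised during an install can be collected inside the notebook. This is only a sketch under stated assumptions: `capture_install_messages` is a hypothetical helper, and its `installer` argument exists so that the function can be tried with a stand-in before pointing it at `install.packages`.

```r
# Hypothetical helper: run an installer and collect the warnings and errors it raises.
capture_install_messages <- function(pkg, installer = utils::install.packages) {
    messages <- character(0)
    withCallingHandlers(
        tryCatch(
            installer(pkg),
            # Errors stop the install; record the message and carry on.
            error = function(e) {
                messages <<- c(messages, paste("error:", conditionMessage(e)))
            }
        ),
        # install.packages() usually reports failures as warnings; record and silence them.
        warning = function(w) {
            messages <<- c(messages, paste("warning:", conditionMessage(w)))
            invokeRestart("muffleWarning")
        }
    )
    messages
}

# Example call (commented out because it contacts CRAN):
# capture_install_messages("qwraps2")
```

The returned character vector can then be pasted into a search engine, exactly as suggested in the last step above.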

# Install the SRA Toolkit

Download and install the SRA Toolkit (`sratoolkit`).

See https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc for documentation.

Note: the toolkit's `bin` directory is not on the PATH of this notebook session, so invoke each tool by its full pathname.
For example, instead of `srapath SRR000001`, run `~/sratoolkit/bin/srapath SRR000001`.

In [29]:
tk_download <- "https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.11.2/sratoolkit.2.11.2-ubuntu64.tar.gz"
fname <- basename(tk_download)
# Strip both extensions (.tar.gz) to get the name of the extracted directory.
dname <- tools::file_path_sans_ext(tools::file_path_sans_ext(fname))
tool_path <- path.expand(file.path("~", "sratoolkit"))
# Download and unpack only if the toolkit is not already installed.
if (!dir.exists(tool_path)) {
    old_wd <- setwd("~")
    system2("curl", c("-LO", tk_download))
    system2("tar", c("-xf", fname))
    file.rename(dname, tool_path)
    file.remove(fname)
    setwd(old_wd)
    # Record the PATH change for future terminal sessions; it does not affect this R session.
    path_add <- paste0("export PATH=\"$PATH:", file.path(tool_path, "bin"), "\"")
    write(path_add, file = path.expand("~/.bashrc"), append = TRUE)
}
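
Because the tools have to be called by full pathname from within the notebook, a small wrapper can save some typing. The following is a minimal sketch that assumes the toolkit lives in `~/sratoolkit`, as described above; `sra_tool` is a hypothetical helper name.

```r
# Hypothetical helper: build the full path to an SRA Toolkit executable,
# assuming the toolkit was installed to ~/sratoolkit.
sra_tool <- function(tool, toolkit_dir = file.path("~", "sratoolkit")) {
    path.expand(file.path(toolkit_dir, "bin", tool))
}

srapath <- sra_tool("srapath")

# Run the tool only if it is actually present at that location.
if (file.exists(srapath)) {
    system2(srapath, "SRR000001")
}
```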

In [35]:
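# Install NCBI Entrez Direct (EDirect). TODO: run this only once.
# The command substitution must be double-quoted so that `sh -c` receives the whole installer script.
system2("sh", c("-c", shQuote("$(curl -fsSL ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh)", type = "cmd")))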


# Provenance

Provenance is a record of the exact environment used to run the notebook. It is useful when collaborating and when you return to a notebook months after your initial analysis. Recording it is also a best practice for reproducible research.

In [8]:
# Output all session information
devtools::session_info()

─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.1 (2021-08-10)
 os       macOS Big Sur 10.16         
 system   x86_64, darwin17.0          
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/New_York            
 date     2021-10-13                  

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date       lib source                                
 assertthat  * 0.2.1   2019-03-21 [1] CRAN (R 4.1.0)                        
 backports     1.2.1   2020-12-09 [1] CRAN (R 4.1.0)                        
 base64enc     0.1-3   2015-07-28 [1] CRAN (R 4.1.0)                        
 bigrquery   * 1.4.0   2021-08-05 [1] CRAN (R 4.1.0)                        
 bit           4.0.4   2020-08-04 [1] CRAN (R 4.1.0)              
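
To keep a provenance record alongside the analysis results, the session information can also be written to a file. This sketch swaps in base R's `utils::sessionInfo()` so that it runs without devtools; the file name and the use of `tempdir()` are assumptions made to keep the example free of side effects, and a project directory is the more useful destination in practice.

```r
# Write the session information to a text file.
# The location below is an assumption; choose any path that suits your project.
provenance_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(utils::sessionInfo()), provenance_file)

# Show the first few lines of the saved record.
head(readLines(provenance_file))
```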

Copyright 2019 The Broad Institute, Inc., Verily Life Sciences, LLC All rights reserved.

This software may be modified and distributed under the terms of the BSD license. See the LICENSE file for details.