-
Notifications
You must be signed in to change notification settings - Fork 0
/
rocker_setup.Rmd
164 lines (111 loc) · 6.62 KB
/
rocker_setup.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
---
title: "Connecting to R Studio Server via a docker on a remote server"
author:
- name: "David Zhang"
affiliation: UCL
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
bookdown::html_document2:
figure_caption: yes
code_folding: show
theme: spacelab
highlight: kate
df_print: paged
toc: true
toc_float: true
number_sections: true
---
```{r setup, include = FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE, message = FALSE)
```
> Aim: Connect to R Studio server using docker
<br><br>
# Background
- [R Studio server](https://www.rstudio.com/products/rstudio/download-server/) allows user's to code using a convenient IDE, whilst harnessing the processing power of a computing cluster/server.
- However, the installation and maintenance of the R Studio server software can be time-consuming. Also, the free version of R Studio server only permits a single version of `R` to be used. Assuming you are not willing to pay, this can limit analyses that depend on packages across multiple versions of R.
- Using an docker image from the [rocker project](https://hub.docker.com/r/rocker/rstudio) with R/R Studio server pre-installed, you can easily bypass the two limitations above. A user can connect to the `rocker` image via an `ssh` tunnel, accessing R Studio server through their local web browser. Each docker image can have a unique installation of R Studio server, thus any number of `R` versions can be used.
- The following guide with demonstrate how to install and connect to a R Studio server via a `rocker` image on a remote server.
- This guide does **not** cover the fundamentals of docker itself and it is recommended that anyone using this guide should already have a basic proficiency with docker.
<br><br>
# Guide
## Docker installation
You must have `docker` installed on your system. To check you have `docker` installed, you can use:
```{bash check-docker-version, eval = FALSE}
# based on: https://www.digitalocean.com/community/questions/how-to-check-for-docker-installation
docker -v
echo $?
```
If you don't, install `docker`. A guide to installing `docker` can be found [here](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04).
<br><br>
## Download a docker image with R Studio server pre-installed
In order to use R Studio server, a `docker` image with R Studio server pre-installed must be downloaded. [Bioconductor](https://www.bioconductor.org/help/docker/) releases it's own image based on the `rocker project`, with other useful resources for analyses of biological data pre-installed, such as core Bioconductor packages. You can download the Bioconductor docker image using:
```{bash download-rocker, eval = FALSE}
# the release 3.13 version of Bioconductor rocker image is used here
# be sure to check for an updated version as and when you use this guide
sudo docker pull bioconductor/bioconductor_docker:RELEASE_3_13
```
<br><br>
## Start a R Studio server process
Next, we will create a process running R Studio server on the Bioconductor docker image downloaded above. To do this, you need to use various flags together with the `docker` command. For convenience, I have created a `R` wrapper function to run these docker commands within the function [rutils::docker_run_rserver](https://github.com/dzhang32/rutils/blob/master/R/docker_shell.R). This calls the relevant docker commands from within `R` with arguments for the relevant flags, which are explained below.
First, open an `R` terminal and install the `rutils` package from GitHub via:
```{r install-rutils, eval = FALSE}
# this requires R version >= 4.0
devtools::install_github("dzhang32/rutils")
```
Next, run the `rutils::docker_run_rserver()` function within the `R` terminal. At a minimum, you should set the `image`, `port` and `name` arguments explained below. Setting the `verbose` argument to TRUE will print the flags that were used within the `docker` command and can be useful for debugging or logging your session.
```{r docker-run-rserver-ex1, eval = FALSE}
rutils::docker_run_rserver(
image = "bioconductor/bioconductor_docker:RELEASE_3_13", # rocker image
port = 8787, # port on which the host will have present R Studio server
name = "example", # name of docker process
verbose = TRUE # whether to print out the flags passed to the docker command
)
```
<br><br>
## Connecting to a R Studio server process
Now that the R Studio server process is running, you can now map the `localhost` of your local machine to the `port` on the remote server presenting R Studio server (specified above as 8787). An example `ssh` command is shown below and should be run on your **local** terminal:
```{bash ssh-tunnel, eval = FALSE}
# make sure the port specified here matches the port you have used above in
# rutils::docker_run_rserver()
ssh -i path_to_pem.pem \
-X -N -f -L localhost:8787:localhost:8787 \
user@ip
```
If the above `ssh` command has run successfully, you will now be able to access R Studio server by going to the address `localhost:8787` on your local browser. The default login details for the Bioconductor docker are:
Username: **rstudio**
Password: **bioc**
More details of the Bioconductor docker can be found [here](https://www.bioconductor.org/help/docker/).
<br><br>
## Mounting volumes
Most analyses relies on data that is stored on the original host, therefore not (by default) accessible by the docker process. Therefore, it is often useful to mount the required files, allowing them to be accessible by the `docker` process. Mounting can be configured using `rutils::docker_run_rserver()` via the arguments `volumes`, `volumes_ro`.
The user permissions for accessing the mounted `volumes` are dictated by the `USERID` and `GROUPID` arguments. These should be set matching the user you would like to mirror the permissions of. On linux, the `USERID` and `GROUPID` of the current user can be obtained via the `bash` command `id`.
Below is an example of running `rutils::docker_run_rserver()` whilst mounting volumes:
```{r docker-run-rserver-ex2, eval = FALSE}
# volumes - paths will be mounted with user permissions
# matching user specified by the USERID and GROUPID arguments
# volumes_ro - paths will be mounted with read-only access
rutils::docker_run_rserver(
image = "bioconductor/bioconductor_docker:RELEASE_3_13",
port = 8787,
name = "example_2",
verbose = TRUE,
volumes = c(
"/path/to/mounted/dir"
),
volumes_ro = c(
"/path/to/mounted/dir"
),
permissions = "match",
USERID = 1000,
GROUPID = 1000
)
```
<br><br>
# Reproducibility
```{r reproducibility, echo = FALSE}
# Session info
library("sessioninfo")
options(width = 120)
session_info()
```