Add documentation for Overlay-RStudio dependency install (#283)
* add installing dependencies section for RStudio and make minor changes to RStudio section

* finalize some language

* incorporate Trevor's suggestion for sentence change

* remove loading of apptainer module since we are no longer creating a module for apptainer

* remove additional reference to Apptainer module

* fix a couple of grammar mistakes
b-reyes committed Dec 6, 2023
1 parent 4288449 commit 5a8a505
Showing 4 changed files with 65 additions and 2 deletions.
67 changes: 65 additions & 2 deletions docs/gateways/OnDemand.md
@@ -193,10 +193,73 @@ Now that we have our environment correctly created, we can launch a Jupyter sess
2. Click “Launch” to submit the RStudio job to the queue. The wait time depends on the number of cores and time requested. The preset options provided generally start within a few moments.
3. Once your RStudio session is ready, you can click “Connect to RStudio Server”. An interactive RStudio session will be started in a new window.
![](OnDemand/rstudio_session_custom_launch.png)
- Please note that the first time you launch the session, it may take a while before you can connect. This is because we are creating a unique [persistent overlay](https://apptainer.org/docs/user/main/persistent_overlays.html) for you, which gives you the ability to install dependencies. Subsequent launches will not take as long.
4. To shut down an RStudio server, go to the "File" menu at the top and choose "Quit session...". If you have made changes to the workspace, you will be asked whether you would like to save them to `~/.RData`. Saving is not necessary, but can be helpful. Once completed, a prompt will notify you that your R session has ended and will give you the option to restart a server, if desired. Note, however, that neither quitting the session nor closing the window cancels the job you are running. To terminate the job, use the “My Interactive Sessions” tab in Open OnDemand.

**_Important Notes:_**
* We have designed the RStudio app in Open OnDemand such that it employs versions of R that match the versions of R that are also available in the CURC module stack. This is done to facilitate moving between using RStudio for interactive work and running larger R workflows as batch jobs on Alpine or Blanca. Due to system constraints, packages you install in a given version of R in RStudio will not be available if you load the equivalent version of the R module, and vice versa. You will need to (re-)install the packages you need when using the equivalent module. This is because RStudio is run from an Ubuntu [container](../Software/Containerizationon.html).

###### Installing dependencies for RStudio (currently available only on Alpine)

As previously mentioned, the RStudio application is run from an Ubuntu [container](../Software/Containerizationon.html). More specifically, the application uses an Ubuntu container paired with a [persistent overlay](https://apptainer.org/docs/user/main/persistent_overlays.html) that is unique to each user. For this reason, when installing a library via `install.packages`, you may receive an error because the container and overlay lack a system dependency that the library requires. For example, let's try to install the library `XVector` with the Bioconductor package manager `BiocManager` by running the commands below at the R command prompt.
```
install.packages("BiocManager")
library(BiocManager)
BiocManager::install("XVector")
```
- Please note that if you are ever given the prompt "Update all/some/none? [a/s/n]:", always choose "n". The items cannot be updated because RStudio must be launched from a read-only container, which cannot be modified. However, choosing the wrong option should not harm anything.

When the above lines are executed, the `XVector` install eventually fails with the following error.

![](OnDemand/xvector_install_error.png)

This install failed because our container and overlay do not have `zlib` installed. To remedy this, we can install `zlib` by modifying our overlay. To do this, we must first completely close the RStudio session __AND__ delete the job. This is necessary because our overlay cannot be changed if it is being used. Next, open up a terminal in Open OnDemand by selecting "Clusters" -> "Alpine Shell" from the top menu bar.

![](OnDemand/alpine_shell_depiction.png)
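Because the overlay cannot be modified while a job is still using it, it can be worth confirming from this shell that no RStudio job remains. The following is a minimal sketch that uses the standard Slurm `squeue` command; the helper name `list_my_jobs` is hypothetical and is not something CURC provides.

```shell
# Sketch: list your own Slurm jobs so you can confirm the RStudio job is gone.
# list_my_jobs is a hypothetical helper name, not a CURC-provided command.
list_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        # squeue --user restricts the listing to one user's jobs.
        squeue --user "${USER:-$(id -un)}"
    else
        echo "squeue not found; Slurm commands are not on this shell's PATH"
    fi
}

list_my_jobs
```

If the RStudio job still appears in the listing, terminate it from the “My Interactive Sessions” tab before continuing.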

Next, start an interactive session on a compute node.
```
acompile --ntasks=4
```
Once on a compute node, we can modify the overlay by opening a shell in the container with the overlay attached, using `--fakeroot`.
```
apptainer shell --fakeroot --bind /projects,/scratch/alpine,$CURC_CONTAINER_DIR_OOD --overlay /projects/$USER/.rstudioserver/rstudio-server-4.2.2_overlay.img $CURC_CONTAINER_DIR_OOD/rstudio-server-4.2.2.sif
```
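If the command above reports that the overlay file cannot be found, a quick check like the one below may help narrow down the cause. It is only a sketch: the helper names `overlay_path` and `check_overlay` are hypothetical, and the path layout is simply copied from the command above for R 4.2.2.

```shell
# Sketch with hypothetical helper names (not part of any CURC tooling).

# Build the overlay path used in the command above from an R version and,
# optionally, a username (defaults to the current user).
overlay_path() {
    r_version="$1"
    user_name="${2:-${USER:-}}"
    printf '/projects/%s/.rstudioserver/rstudio-server-%s_overlay.img\n' \
        "$user_name" "$r_version"
}

# Report whether the overlay file exists; return non-zero if it does not.
check_overlay() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

check_overlay "$(overlay_path 4.2.2)" || true
```

A `missing` result most likely means RStudio has not yet been launched once from Open OnDemand for that R version, since the overlay is created on the first launch.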
You should now be in a shell whose prompt starts with `Apptainer>`. In this shell we can install system packages using the standard Ubuntu package manager. Let's go ahead and install `zlib1g-dev`, which will give us `zlib.h`.
```
apt-get update
apt install zlib1g-dev
```
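Before leaving the `Apptainer>` shell, you may want to confirm that the header is now present. The sketch below assumes the usual Ubuntu location for the header, `/usr/include/zlib.h`; the helper name `have_header` is hypothetical.

```shell
# Sketch: check that a header file exists (run inside the Apptainer> shell).
# have_header is a hypothetical helper; /usr/include/zlib.h is the location
# where Ubuntu's zlib1g-dev package normally places the header.
have_header() {
    if [ -f "$1" ]; then
        echo "present: $1"
    else
        echo "not found: $1"
        return 1
    fi
}

have_header /usr/include/zlib.h || true
```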
Once completed, the overlay will be updated and you can exit the shell and compute node by executing `exit` twice.
```
exit
exit
```
Now, we can start up a new RStudio session and attempt the XVector install again.
```
BiocManager::install("XVector")
```
We should now see that the XVector install goes through!

![](OnDemand/successful_x_vector_install_rstudio.png)

**_Important Notes:_**
- Currently, this functionality is only available on Alpine. Once we update the operating system on Blanca, we will enable this functionality.
- For users who want to utilize the command line version of R or run a script without RStudio, this can be done using Apptainer. Below we provide two methods that can be used once a user has access to an Alpine compute node:
- To utilize R in an interactive session, you can execute the following command to start the container.
```
apptainer shell --bind /projects,/scratch/alpine,$CURC_CONTAINER_DIR_OOD --overlay /projects/$USER/.rstudioserver/rstudio-server-4.2.2_overlay.img:ro $CURC_CONTAINER_DIR_OOD/rstudio-server-4.2.2.sif
```
You can then launch R and interact with it (you can also use `Rscript` here).
```
Apptainer> R
> library(XVector)
```
- To execute the script "test_R.r" without an interactive session, you can execute the following command.
```
apptainer exec --bind /projects,/scratch/alpine,$CURC_CONTAINER_DIR_OOD --overlay /projects/$USER/.rstudioserver/rstudio-server-4.2.2_overlay.img:ro $CURC_CONTAINER_DIR_OOD/rstudio-server-4.2.2.sif Rscript test_R.r
```
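Note that the two commands above append `:ro` to the overlay path, while the install step earlier did not. In Apptainer, a `:ro` suffix on an `--overlay` argument mounts the overlay read-only, which suits running R, whereas installing system packages needs the overlay writable (together with `--fakeroot`). The sketch below captures that distinction; `overlay_spec` and its `install`/`run` modes are hypothetical names used only for illustration.

```shell
# Sketch: choose the --overlay argument for a given purpose.
# overlay_spec and its mode names are hypothetical, for illustration only.
overlay_spec() {
    mode="$1"
    path="$2"
    case "$mode" in
        install) echo "$path" ;;       # writable: for installing packages
        run)     echo "$path:ro" ;;    # read-only: for running R or Rscript
        *)       echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

overlay="/projects/${USER:-}/.rstudioserver/rstudio-server-4.2.2_overlay.img"
echo "install: --overlay $(overlay_spec install "$overlay")"
echo "run:     --overlay $(overlay_spec run "$overlay")"
```

Keeping routine use read-only also avoids changing the overlay by accident while running scripts.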


##### VS Code-Server

Binary file added docs/gateways/OnDemand/alpine_shell_depiction.png
Binary file added docs/gateways/OnDemand/xvector_install_error.png