# BFX Workshop Prerequisite Review

Welcome back to the Fall 2022 and Spring 2023 Bioinformatics (BFX) Workshop. 


## Unix command line familiarity

Whether you use macOS or Linux, the Unix family of operating systems shares a similar look and feel. There are standard commands for navigating a filesystem and utilities for performing common tasks on files and directories. This workshop does not specifically cover Unix commands, but a few good resources are listed below:

* [Computing Basics – The Unix Shell and Command Line Basics](https://becker.wustl.edu/civicrm/?page=CiviCRM&q=civicrm/event/info&reset=1&id=533) is a virtual workshop offered by [Bernard Becker Medical Library](https://becker.wustl.edu/) in collaboration with WUIT's Research Infrastructure Services (RIS) and will introduce the Unix command line interface, the Unix file system, and basic Unix commands.
    * Attendees will learn to create, explore and manage Unix folders and files.
    * [Register Now](https://becker.wustl.edu/civicrm/?civiwp=CiviCRM&q=civicrm%2Fevent%2Fregister&id=533&reset=1) September 15th, 2022 10:00 AM - 11:30 AM [Add to Calendar](https://becker.wustl.edu/civicrm/?civiwp=CiviCRM&q=civicrm%2Fevent%2Fical&reset=1&id=533)
    * Do you have questions about this event? Contact ndonwimaze at wustl.edu
* [LinkedIn Learning: Unix for Mac OS X Users](https://www.linkedin.com/learning/unix-for-mac-os-x-users) is an easy-to-follow macOS video tutorial introducing users to the Unix features shared with other Linux/Unix-based operating systems (OS).
    * Although the video is several years old and designed for Mac OS X specifically, not much has changed with Unix.
    * All current Washington University students, faculty, and staff have access to LinkedIn Learning’s resources, which cover a broad range of topics.
    * Please visit this [WUSTL IT page for LinkedIn Learning login information](https://it.wustl.edu/items/linkedin-learning/).
* For a more traditional user experience, the classic look and feel of [this UNIX Introduction](http://www.ee.surrey.ac.uk/Teaching/Unix/unixintro.html) will provide a background on the operating system itself as well as introductions to UNIX commands, utilities, and navigation tools.
* Numerous other Linux/UNIX tutorials exist; feel free to post your favorite in the [#bfx-workshop](https://ictsprecisionhealth.slack.com/archives/C040Q704WS2) Slack channel.


## Software you'll need to have installed:

There are numerous ways to install Python, R, Notebooks, etc. In fact, you will see various ways of installing software and configuring your environment throughout the workshop. Each presenter and/or software toolkit may require or suggest a different approach. The install sections below (Miniconda, Jupyter, Docker, Java, R, and Git) describe one set of requirements for loading, executing, and editing Jupyter Notebooks to complete some workshop tutorials on a local compute device, e.g., a macOS laptop or other Unix/Linux personal device.

## Windows
Following the steps below is not advised for Windows users, as many of the Jupyter notebooks will not run successfully in a Windows environment. To work around this, we recommend a Linux-based approach. The course has been tested using Ubuntu. [Follow these instructions](https://wustl.box.com/s/atjrz45e03gk8ks8mhdebkbz7b4ho70y).


### Miniconda
Conda is a package manager that works on Windows, Linux, and macOS. We will use a lightweight version of the package manager called Miniconda to install packages and manage the system environment. This tool should help minimize dependencies and differences in system-level libraries or operating systems across the participants in the BFX Workshop.

Available installers for Miniconda can be found [here](https://docs.conda.io/en/latest/miniconda.html#miniconda) through the official Conda project documentation. NOTE: Please choose the latest version of Python, e.g., 3.9, although some users may already be using Python 3.8 or 3.7. Ignore Python 2 installers, and please update existing Python 2 installs.

For macOS, assumed to be the most common OS used in this course, a Python 3.8 installer (from v2020-2021) is circled in red on the image of the available installers below. Please use 3.9 or the latest version when installing for the first time:
![macOS Miniconda Installers](https://github.com/genome/bfx-workshop/raw/master/archive/v2020-2021/images/miniconda_macos_installers.png)
UPDATE: You will need to choose an installer appropriate for your Mac chipset, e.g., Intel x86 vs. Apple M1 architecture.

Once you've completed the install process, open a fresh terminal and check that conda is now installed by displaying the help menu:

```
conda -h
```

An example image of what a terminal might look like when running the above command:
![Example Conda Help](https://github.com/genome/bfx-workshop/raw/master/archive/v2020-2021/images/conda_help.png)

### Jupyter
Now that conda is installed, installing Jupyter is straightforward. Other installation methods will work, but for the sake of simplicity, we will install using conda like so:

```
conda install jupyter
```

Check that a recent version of Python 3 is now used in your base conda environment:
```
python -V
```

As of September 8th, 2022, the base conda Python 3.8 version is 3.8.13. The example image below is from 2020 and shows v3.8.3:
![Example Python Version](https://github.com/genome/bfx-workshop/raw/master/archive/v2020-2021/images/python_version.png)
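
The same check can also be made from inside a notebook cell. The snippet below is a minimal sketch rather than part of the official workshop setup; the helper name `python_version_ok` and the default minimum of 3.7 (the oldest Python 3 release mentioned above) are assumptions chosen for illustration.

```python
import sys


def python_version_ok(minimum=(3, 7)):
    """Return True when the running interpreter is at least `minimum`.

    The default of (3, 7) is an assumption for illustration only.
    """
    return sys.version_info[:2] >= minimum


# Print the interpreter version and whether it meets the assumed minimum.
print(sys.version.split()[0], "OK" if python_version_ok() else "too old")
```

If the reported version starts with 2, the notebook is probably not using the Python from your base conda environment.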


### Docker
We will not be walking through all of the steps involved with installing Docker. For complete and updated instructions for installing Docker for Mac, please see this [link](https://docs.docker.com/docker-for-mac/install/).

Once Docker is installed, confirm that it is available by checking its version:

```
docker --version
```

Now let's walk through the most basic Docker example:

```
docker run hello-world
```

What just happened?

### Java
Java is required to use the Integrative Genomics Viewer (IGV) locally on your workstation or laptop. Please see the operating system install instructions for Java located [here](https://www.java.com/en/download/help/download_options.xml).

### R

We will use conda to install the essential R libraries needed, at a minimum, to create an R Jupyter Notebook.

```
conda install r-essentials
```

### Git
We will not be installing git as part of this tutorial.

However, on macOS you can install git as part of the Xcode Command Line Tools, which are Apple-distributed developer tools for compiling and developing software on macOS. Only the Command Line Tools are needed to use `git` commands. To confirm that git is available, check its version:

```
git --version
```
#### Additional Git Resources
* LinkedIn Learning [Learning Git and GitHub](https://www.linkedin.com/learning-login/share?account=57884865&forceAccount=false&redirect=https%3A%2F%2Fwww.linkedin.com%2Flearning%2Flearning-git-and-github-14213624%3Ftrk%3Dshare_ent_url%26shareId%3DBNNkL8hRQAqPOHmqiNOyZg%253D%253D)

#### Git Exercise
Let's clone the class git repository from GitHub; see this [link](https://github.com/genome/bfx-workshop).

![Example GitHub Clone/Download Code](https://github.com/genome/bfx-workshop/raw/master/archive/v2020-2021/images/github_clone_repo.png)

NOTE: You have the option of using either HTTPS or SSH to clone the class git repo. Below is an example of using SSH.

![Example Git Clone Command](https://github.com/genome/bfx-workshop/raw/master/archive/v2020-2021/images/github_clone_cmd.png)


### Jupyter Notebook
Now that we have Miniconda, Jupyter, and Git installed and functional, we can begin using the Jupyter Notebook as an interactive shell and development environment.

First, we should navigate on the filesystem, using `cd`, to the directory where we cloned the course repository.

Listing the directory with `ls` should show a `README.md` file, an `images` directory, and perhaps several `*.ipynb` files.

NOTE: What does the `*` character represent above?
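
If you would like to check your answer, the shell treats `*` as a wildcard that matches any run of characters, including none. The snippet below is a small sketch of that matching rule using Python's standard `fnmatch` module; the file names in `names` are hypothetical and only stand in for what `ls` might show.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, similar to what `ls` might show in the repo.
names = ["README.md", "images", "week_01.ipynb", "week_02.ipynb"]

# Keep only the names that match the shell-style pattern `*.ipynb`.
notebooks = [name for name in names if fnmatchcase(name, "*.ipynb")]
print(notebooks)
```

Only the two names ending in `.ipynb` are kept, which is the same selection `ls *.ipynb` would make in a directory containing those files.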

From the terminal command line in the git repo directory, start a Jupyter Notebook. The following command should launch a browser window showing the contents of the repo. From there, you can launch *THIS* tutorial in an interactive browser session.

```
jupyter notebook
```

NOTE: At this point, you can switch from the static GitHub view of this tutorial to the interactive Jupyter Notebook session.

#### IPython and Shell Commands

Many common shell commands can be executed directly through a Notebook using IPython.
For more on IPython, see this [link](https://jakevdp.github.io/PythonDataScienceHandbook/01.05-ipython-and-shell-commands.html).


```
In [1]:
cd ~

/Users/jwalker


In [2]:
ls

94334a6a-b3c4-4a2b-ae31-5c4bdaef2021/
Applications/
Box Sync/
Creative Cloud Files/
Desktop/
Documents/
Downloads/
Library/
MFN2_Part1.png
MFN2_Part2.png
MFN2_Part3.png
Movies/
Music/
NCBI_SARS-CoV-2.fa
NCBI_SARS-CoV-2.fa.fai
OneDrive - Washington University in St. Louis@
Pictures/
Public/
RP-2315_AC0025_v1_Exome_GCP.hs_metrics
bashrc.d/
bd0e9f28-8708-4288-9beb-79b59a91285b/
bfx-workshop/
d293fe48-db49-41f0-8d19-1a5caa12721d/
f2e01a48-b2a0-4af0-8aba-8ecdc8c1293e/
fast5.txt
git/
google-cloud-sdk/
igv/
miniconda/
miniconda.sh
opt/


In [3]:
mkdir -p ~/bfx-workshop

In [4]:
cd ~/bfx-workshop

/Users/jwalker/bfx-workshop


In [5]:
echo "Hello World"

SyntaxError: invalid syntax (568906501.py, line 1)

In [6]:
!echo "Hello World"

Hello World
```
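
Cell `In [5]` fails because `echo` is neither Python nor one of the shortcuts IPython provides for commands such as `cd`, `ls`, and `mkdir`, so the line is parsed as Python code. The `!` prefix in `In [6]` hands the line to the system shell instead. Outside IPython, plain Python can run the same command with the standard `subprocess` module. The following is a minimal sketch, assuming an `echo` executable is on your PATH (as it is on macOS and Linux).

```python
import subprocess

# Run the same command as `!echo "Hello World"`, but from plain Python.
# Passing a list of arguments avoids shell quoting issues.
result = subprocess.run(
    ["echo", "Hello World"],
    capture_output=True,  # collect stdout/stderr instead of printing them
    text=True,            # decode the output as str rather than bytes
    check=True,           # raise CalledProcessError on a non-zero exit code
)
print(result.stdout.strip())
```

The `!` syntax is more convenient inside a notebook; `subprocess.run` is the portable choice for scripts that will also run outside Jupyter.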


## Compute and Storage Access

### Google Cloud
***UNDER CONSTRUCTION***

* [WUIT Google Cloud](https://it.wustl.edu/services/cloud-computing/google-cloud-platform/)
* [Google Cloud Console](https://console.cloud.google.com/)


### WashU Local Compute and Storage
For additional information on Computing Basics with WUIT's Research Infrastructure Services (RIS) supported [Scientific Compute Platform](https://ris.wustl.edu/services/compute/), please see the following workshops offered through [Bernard Becker Medical Library](https://becker.wustl.edu/):

* [Computing Basics – Software Applications Needed to Work in High Performance Computing Environments](https://becker.wustl.edu/civicrm/?page=CiviCRM&q=civicrm/event/info&reset=1&id=535) is a virtual workshop offered by [Bernard Becker Medical Library](https://becker.wustl.edu/) in collaboration with WUIT's Research Infrastructure Services (RIS) and will introduce text editors, shell and batch scripts, Docker container technology, and the ssh protocol. 
    * Attendees will learn how to write, edit and run shell scripts, how to create a simple Docker container, and how to connect to the RIS compute platform.
    * [Register Now](https://becker.wustl.edu/civicrm/?civiwp=CiviCRM&q=civicrm%2Fevent%2Fregister&id=535&reset=1) September 22nd, 2022 10:00 AM - 11:30 AM [Add to Calendar](https://becker.wustl.edu/civicrm/?civiwp=CiviCRM&q=civicrm%2Fevent%2Fical&reset=1&id=535)
    * Do you have questions about this event? Contact ndonwimaze at wustl.edu
* [Computing Basics – Submitting Jobs to the RIS Scientific Compute Platform](https://becker.wustl.edu/civicrm/?page=CiviCRM&q=civicrm/event/info&reset=1&id=534) is a virtual workshop offered by [Bernard Becker Medical Library](https://becker.wustl.edu/) in collaboration with WUIT's Research Infrastructure Services (RIS) and will introduce basic commands for submitting jobs to the queue on the RIS compute platform.
    * Attendees will learn how to submit and run jobs/tasks using Docker and queuing system commands.
    * NOTE: This event is currently full.
    * September 29th, 2022 10:00 AM - 11:30 AM [Add to Calendar](https://becker.wustl.edu/civicrm/?civiwp=CiviCRM&q=civicrm%2Fevent%2Fical&reset=1&id=534)
    * Do you have questions about this event? Contact ndonwimaze at wustl.edu

* The home for RIS docs supporting the compute1/storage1 environment: https://docs.ris.wustl.edu/
    * A complete list of RIS workshop recordings: https://docs.ris.wustl.edu/doc/compute/compute-workshops.html
* Virtual workshops are offered in collaboration with WUIT's Research Infrastructure Services (RIS) and will introduce basic commands for submitting jobs on the RIS compute cluster. Please see ##


#### Virtual Private Network (VPN)
This course will at times require access to the compute1/storage1 High Performance Computing & Storage solutions managed by the Research Infrastructure Services (RIS) team through WUSTL IT. To log into this compute environment, you first need access to the WUSM VPN. For more information, please see the VPN section of this [link](https://it.wustl.edu/items/connect/).

#### Scientific Compute/Storage Platform

This workshop will cover workflows and pipelines that use the compute1/storage1 environment. This compute and storage environment is not required for all workshops. If you need more instruction than is listed below, a complete RIS seminar series from April 2020 is available [here](https://www.youtube.com/playlist?list=PLc5dxOEco26RhSbhBaRLeZoUFOTn7GM-Y).

##### Getting Started
Please see the "Getting Started" section of this [link](https://ris.wustl.edu/services/compute/) if you do not have access to the compute1 platform.

##### Getting Connected
If you have access, but are unsure how to connect to compute1, please see this [link](https://confluence.ris.wustl.edu/display/ITKB/Compute+Quick+Start#ComputeQuickStart-1.GettingConnected). 

NOTE: The above link is only accessible from within WUSM's network.


### LSF and Docker information

Working through this [LSF and Docker tutorial](https://gist.github.com/chrisamiller/4b17a8dd310374f078da2bf12b3e2a49) might prove useful in conjunction with this week's homework.

## Homework Assignments

Please attempt the following exercises on your own and post questions to the ICTS Precision Health [#bfx-workshop](https://ictsprecisionhealth.slack.com/archives/C040Q704WS2) Slack channel.

1. Install and/or confirm installation of Prerequisites
2. Register using the sign-up form for future communications and cloud access: https://redcap.link/bfx
3. Drop into the #bfx-workshop Slack channel and introduce yourself!
4. (OPTIONAL) Log in to the Google Cloud Console using your WUSTL Key
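
For the first homework item, one quick way to see which prerequisites are already on your PATH is sketched below. It is not an official workshop script: the `find_tools` helper and the command names in `TOOLS` are assumptions based on the tools named in this document, and a command being found says nothing about whether its version is recent enough.

```python
import shutil

# Command names assumed for the tools listed in this document.
TOOLS = ["conda", "jupyter", "docker", "java", "git", "R"]


def find_tools(names=TOOLS):
    """Map each command name to its location on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


# Print one line per tool with either its path or a NOT FOUND marker.
for name, path in find_tools().items():
    print(f"{name:8} {path or 'NOT FOUND'}")
```

Any tool reported as `NOT FOUND` is a good candidate to revisit in the matching install section above, or to ask about in the Slack channel.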
