# Hello, Workbench! 
### Here's how to get started using notebooks to analyze _All of Us_ Research Program data.
---

## What is a Workspace?

A **Workspace** is your place to store and analyze data for a specific project. Your Workspace is where you will go to build concept sets and cohorts and launch Notebooks to perform analyses. You can share a Workspace with other users, allowing them to view or edit your work. To learn more about our tools, please refer to the **Quick Tour** located on the Workbench homepage.

## What are Notebooks?
**Jupyter Notebooks** (like this one) are web applications that you can use to create and share documents that contain live code, equations, visualizations and narrative text. Like a traditional lab notebook, Jupyter Notebooks allow you to capture a complete record of your procedures, analyses, and observations such that another scientist may reproduce your observations.

### If you are familiar with using notebooks for data analysis, here are some resources to get started:
* Learn about our data model, which is based on the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) common data model: [CDR metadata documentation](https://github.com/all-of-us/pyclient/blob/master/py/aou_workbench_client/cdr/README.md)
    > Ex. Use our CDR documentation to learn that the Person table contains fields such as "person_id" and "year_of_birth"
* To search standard vocabularies used in OMOP, please use [OHDSI's Athena tool](http://athena.ohdsi.org/)
    > Ex. Use Athena to learn that a gender_concept_id = 8532 means Female
* Learn how to access the All of Us Workbench API from a notebook: [AllofUs Python Client Library README](https://github.com/all-of-us/pyclient/blob/master/py/README.md#materializecohortrequest) 
    > Ex. Use our Client Library documentation to learn what function to use to load data from our database into your notebook.
* Learn how to use concept sets and cohorts in a notebook: **Using Cohorts and Concept Sets** (*currently only available in Python*)
    > Ex. Use this example notebook as a template for starting an analysis on a cohort you created with our cohort builder tool.
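
To make the first two examples above concrete, here is a minimal sketch in R. It builds a small, made-up data frame with the Person table fields mentioned above and turns `gender_concept_id` values into readable labels. The rows are invented for illustration and are not *All of Us* data. The concept ID 8507 (Male) is an extra Athena lookup that the text above does not mention.

```r
# Hypothetical rows shaped like the OMOP Person table (not real data)
person <- data.frame(
  person_id         = c(1, 2, 3),
  year_of_birth     = c(1980, 1975, 1992),
  gender_concept_id = c(8532, 8507, 8532)
)

# Lookup of concept IDs to labels, as found with Athena
gender_labels <- c("8532" = "Female", "8507" = "Male")

# Translate each concept ID into a readable label
person$gender <- unname(gender_labels[as.character(person$gender_concept_id)])

print(person)
```

The same lookup idea should carry over to other `*_concept_id` columns once you have found their meanings in Athena.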

### [IMPORTANT] Even if you are familiar with using notebooks, read Part II below to learn what packages you need to install to analyze All of Us data in R
-----

# Using Notebooks - A Quick Overview

## Notebook Overview Part I: Cells
A notebook contains a list of rectangular boxes called **cells**. Cells contain either explanatory text or executable code and its output. 
Click once to select a cell. Double-click to edit a cell (cells in edit mode appear grey).

### This is a text cell
Add text to your notebook using Markdown cells. Change the cell type to Markdown by using the Cell menu or selecting "Markdown" from the toolbar above. To learn more, see [Working with Markdown Cells](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html).

Add math to text cells using [LaTeX](http://www.latex-project.org/) by placing the statement
within a pair of **\$** signs. 
> For example `$\int_a^b \!f(x) \, \mathrm{d}x$` becomes
$\int_a^b \!f(x) \, \mathrm{d}x$

### The next cell is a code cell
Workbench currently supports code written in Python 2, Python 3, and R. To change the programming language (called a **"kernel"**), use the Kernel menu above to choose the language/version you want.

>To execute or "run" code, select the cell, then press Shift + Enter. The cell is done executing when [*] at the left of the cell turns into [a number].

In [1]:
# This is a code cell
# Background: The All of Us Research Program is calling on one million people to join us as we try to change the future of health.

AllofUs = 1000000
x = 1
you = AllofUs/x
print(you)


[1] 1e+06


### Adding and Removing Cells

**To add a new cell:**
* click the + icon in the menubar
* use the Insert menu and choose Insert Cell Above or Insert Cell Below
* or press ESC, then A (to insert above) or B (to insert below)

> Try adding a new code cell below this cell

**To remove a cell:**
* click the scissors icon in the menubar
* use the Edit menu and choose Cut Cells or Delete Cells
* or press ESC, then X (to cut) or D D (to delete)

> Now try removing that cell you added


---
## Notebook Overview Part II. Modules, Packages, Libraries, and Extensions

### Modules
A **module** is a piece of software with specific functionality. R has no formal module construct; the closest equivalent is the package, which is the fundamental unit of reproducible R code. Loading one into your notebook gives you access to specific functions, objects, or classes (which are combinations of variables and functions).


### Packages and Libraries
A **package** is a collection of related modules or functions. In R, a **library** is simply a directory containing installed packages.
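
As a small illustration of the difference, the sketch below uses only base R to list the library directories that R searches and to check whether a package is installed in any of them. The variable names are made up for this example.

```r
# Directories (libraries) where R looks for installed packages
lib_dirs <- .libPaths()
print(lib_dirs)

# Names of all packages installed across those libraries
installed <- rownames(installed.packages())

# Check whether a given package is installed; "stats" ships with R
print("stats" %in% installed)
```

On the Workbench, one of the listed directories is likely to be the `.rpackages` folder that appears in the installation messages later in this notebook.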

**Important** The current AllofUs client library is written in Python, so to do analysis in R you will have to:
1. Install the reticulate package, which provides a comprehensive set of tools for interoperability between Python and R (details here: https://github.com/rstudio/reticulate)
2. Install additional R data analysis packages recommended in this notebook


## You must install the reticulate package to use our client library
The reticulate package provides a comprehensive set of tools for interoperability between Python and R (details here: https://github.com/rstudio/reticulate)

> Execute the cell below to install and load reticulate. Note that the `install.packages()` command installs reticulate on your system, and the `library()` command loads it into your current session.

In [2]:
# install reticulate
install.packages("reticulate")
library(reticulate)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)
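
Running `install.packages()` every time you open the notebook can be slow. One possible pattern, sketched below, is to install a package only when it is missing and then load it. The helper name `ensure_package` is hypothetical and is not part of R or the *All of Us* client library.

```r
# Hypothetical helper: install a package only when it is missing, then load it
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call loads it without installing anything
ensure_package("stats")
```

Under the same assumptions, `ensure_package("reticulate")` would install reticulate the first time it is run and simply load it afterwards.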


### Popular Packages and Libraries For Data Analysis in R
Below are popular packages we recommend installing for data analysis in R.

> Execute each cell below to install and load the package.

In [3]:
# For skimming summary statistics (details here: https://github.com/ropensci/skimr)
install.packages('skimr')
library(skimr)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)
also installing the dependency ‘pander’



In [4]:
# A collection of helpful functions for summarizing data and formatting results (details here: https://github.com/dewittpe/qwraps2)
install.packages('qwraps2')
library(qwraps2)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)
also installing the dependency ‘RcppArmadillo’



In [5]:
# An "opinionated" collection of R packages designed for Data Science
install.packages('tidyverse') 
library(tidyverse)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.0.0     ✔ purrr   0.2.5
✔ tibble  1.4.2     ✔ dplyr   0.7.6
✔ tidyr   0.8.1     ✔ stringr 1.3.1
✔ readr   1.1.1     ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
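
The Conflicts lines in the output above mean that, once the tidyverse is loaded, a bare `filter()` or `lag()` call refers to the dplyr version rather than the one in the built-in stats package. The sketch below shows one way to call the stats version explicitly with the `::` operator. It uses only base R, and the sample vector is made up.

```r
# Made-up sample values
x <- c(1, 2, 3, 4, 5)

# Explicitly call the stats version: a centered 3-point moving average
moving_avg <- stats::filter(x, rep(1 / 3, 3))
print(moving_avg)
```

The same `package::function()` form works in the other direction too, for example `dplyr::filter()`, which can make it clearer to readers which function is intended.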


In [6]:
# A nice color scheme for plots (details here: https://github.com/sjmgarnier/viridis)
install.packages('viridis')
library(viridis)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)
also installing the dependency ‘gridExtra’

Loading required package: viridisLite


In [7]:
# Common themes to change the look and feel of plots (details here: https://github.com/jrnold/ggthemes)
install.packages('ggthemes')
library(ggthemes)

Installing package into ‘/home/jupyter-user/.rpackages’
(as ‘lib’ is unspecified)


### Extensions

**Extensions** are optional plug-ins that add functionality to notebooks. Below are a few extensions that the open source community created to add functionality to the Jupyter Notebook. For a more complete list of extensions, please see http://jupyter-contrib-nbextensions.readthedocs.io/en/latest/index.html

To install the notebook extensions below, three steps are required:
1. The Python package needs to be installed with pip.
2. The notebook extensions themselves need to be copied to the Jupyter data directory.
3. The installed notebook extensions can be enabled, either by using built-in Jupyter commands, or more conveniently by using the jupyter_nbextensions_configurator server extension.

> For more details on how to enable extensions, see [Installing jupyter_contrib_nbextensions](https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html) or follow the example below

In [8]:
# 1. Run this cell to install the nbextensions package

system('pip3 install jupyter_contrib_nbextensions jupyter_nbextensions_configurator', intern = TRUE)

In [9]:
# 2. Run this cell to install an extension that allows you to collapse headings

system('jupyter nbextension install ~/.local/lib/python3.4/site-packages/jupyter_contrib_nbextensions/nbextensions/collapsible_headings --user',
       intern = TRUE)
system('jupyter nbextension enable collapsible_headings/main', intern=TRUE)

In [10]:
# 3. Run this cell to enable the collapsible headings extension installed above

system('jupyter nbextension enable collapsible_headings/main', intern = TRUE)

In [11]:
# 4. Run this cell to disable the collapsible headings extension enabled above

system('jupyter nbextension disable collapsible_headings/main', intern = TRUE)
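
The cells above do not check whether each shell command succeeded. As a hedged sketch, the helper below wraps base R's `system2()` and stops with an error when a command returns a non-zero exit status. The name `run_cmd` is hypothetical, and the `echo` call is only a harmless demonstration that assumes a Unix-like system such as the Workbench environment.

```r
# Hypothetical helper: run a shell command and stop if it fails
run_cmd <- function(command, args = character()) {
  # Capture both standard output and standard error as a character vector
  output <- suppressWarnings(
    system2(command, args, stdout = TRUE, stderr = TRUE)
  )
  # system2() attaches a "status" attribute only when the exit status is non-zero
  status <- attr(output, "status")
  if (!is.null(status) && status != 0) {
    stop("Command failed with exit status ", status, ":\n",
         paste(output, collapse = "\n"))
  }
  output
}

# Harmless demonstration command
run_cmd("echo", "hello")
```

With the same helper, `run_cmd("jupyter", c("nbextension", "list"))` should show which extensions are currently enabled, assuming `jupyter` is on the system path.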

---
## Notebook Overview Part III. Importing Notebooks into a Workspace

To import an existing notebook from your local machine into your workspace:

1. Click the "File" menu and select "Open"
2. A new browser window will open up with a list of all the notebooks in your workspace. Click Upload. 
3. Navigate to where your notebook is saved. Click Open.

After that, you should be ready to go!