# The Jupyter Project

Today, if you use more than one programming language for data science, you probably also use a different program to edit and interact with each of those languages. R users, for example, often use RStudio, Python users use Spyder, and Julia users use Juno. 

But in recent years, an amazing effort has been underway to provide a single set of tools that work with nearly *any* underlying programming language: Jupyter (as in **Ju** (Julia) - **py** (Python) - te**R** (R)).

The idea of Jupyter is to separate the interface you are working with from the underlying programming language doing your analysis. This makes it possible to create one interface (a text editor, a window where results are displayed, etc.) that can be used to run your analyses in any number of different languages. In the Jupyter ecosystem, the program being used to actually run your analysis (e.g. Python, R) is referred to as a **kernel**. 

Although Jupyter was originally focused on unifying Julia, Python, and R, it now supports dozens and dozens of different [kernels](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels), including JavaScript, Go, Haskell, Matlab, Stata, bash, Scala, and many more.

(Note: Jupyter Notebooks used to be called IPython Notebooks before they expanded to support more languages, so if you see people talking about IPython Notebooks, just think of that as an early, Python-specific version of Jupyter Notebooks.) 

## Jupyter Notebooks

Jupyter notebooks are a tool for easily integrating text, code, and code output into a single document. This not only makes them *incredibly* useful for instructional materials (this entire site is actually built with Jupyter Notebooks), but also makes them useful as a method of sharing analyses. Using Jupyter Notebooks, you can not only share the conclusions of your analysis with colleagues, but also the code that generated those analyses, making it easy for others to see how you reached your conclusions and, crucially, play with that code to see what happens if the analysis is changed slightly. Indeed, Notebooks are so useful for sharing analyses that they've become the *de facto* standard for sharing information at many companies, including [Netflix](https://medium.com/netflix-techblog/notebook-innovation-591ee3221233).

OK, I know, that all sounds really abstract. What makes Jupyter Notebooks special is their interactivity, so it's hard to understand their value without seeing them in action.

### Jupyter Notebook Tutorial

To learn the basics of Jupyter Notebooks, please **[watch this tutorial](https://www.youtube.com/watch?v=HW29067qVWk)**. As you watch the video, follow along yourself -- since you've installed [conda](setting_up_your_computer.ipynb), you can install jupyter with the command `conda install jupyterlab` (no space between `jupyter` and `lab` -- `jupyterlab`) from the command line.

I recognize that there's a lot in here, so please give it your attention -- jupyter notebooks really are a de facto standard in data science, and you'll be using them **a lot** not only in this class, but also your other courses at Duke and in your post-Duke careers!


## Jupyter Lab

Notebooks have been around since 2014, but while they are great for instructional materials and sharing analyses, they weren't appropriate for managing full projects. In 2018, though, Jupyter launched a new tool: Jupyter Lab. 

Like the jupyter notebook interface you saw in the tutorial you just watched, jupyter lab offers full support for Jupyter Notebooks. But it can *also* support other workflows:

- Easily run and write Jupyter Notebooks (and do things you couldn't do before, like drag cells from one notebook to another, collapse cells, etc.).
- Work with a text-editor in one pane and an active kernel session in another, just like you do in RStudio or Spyder. 
- Edit popular data science file formats with live preview, such as Markdown, JSON, CSV, Vega, VegaLite, and more.

Again, this is all a little abstract for a tool that's fundamentally about interactivity, so to see Jupyter Lab in action, watch [this video](https://youtu.be/Gzun8PpyBCo?t=542) **starting 9 minutes in** and **stopping 30 minutes in (when they switch speakers)**. Don't worry about absorbing every detail --- we'll get lots of practice with Jupyter Lab below --- the goal is just to give you a sense of how Jupyter Lab works generally. 

To follow along with this video: 

- Make sure you have the most recent version of Jupyter Lab installed (the version that comes with the Anaconda distribution you already installed is a little behind) by running `conda update jupyterlab` (no space between jupyter and lab).  
- Type `jupyter lab` (**with** a space between jupyter and lab this time) on the command line to start a jupyter lab session!


## So many terms...

Yeah, sorry. Let's try and unpack things a little. 

- "jupyter": The umbrella name for the organization that aims to provide language-agnostic tools for data science programming.
- "jupyter notebooks": Notebooks made up of cells which may either contain text (written in markdown) or actual code which can be executed interactively. Language agnostic. 
- The thing that gets launched when you type `jupyter notebook` on the command line: the first interface created for interacting with jupyter notebooks. Very basic, and now largely superseded by `jupyter lab`.
- "jupyter lab": a full editor that allows you to work with jupyter notebooks **or** using text files and an interactive console.
- "IPython": an augmented way of interacting with Python that is much more feature rich than the default python console (where each line starts with `>>>`). Supports in-line graphics, interactive widgets, "magics", tab completions, etc. When you work with Python in a Jupyter Notebook, or in Jupyter Lab, you're working with an IPython session.
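
To make the "notebooks are made up of cells" idea a little more concrete: a notebook file (`.ipynb`) is just JSON. Here is a minimal sketch, using only Python's standard library, that builds a tiny two-cell notebook in version 4 of the notebook format and reads it back. The `make_cell` helper and the cell contents are made up for illustration; in practice Jupyter writes these files for you.

```python
import json


def make_cell(cell_type, source):
    """Build one notebook cell as a plain dict (illustrative helper, not a Jupyter API)."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": source}
    if cell_type == "code":
        # Code cells also record their outputs and how many times they have run.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "cells": [
        make_cell("markdown", "# My analysis\nSome explanatory text."),
        make_cell("code", "print(1 + 1)"),
    ],
    # The kernelspec entry records which kernel the notebook expects to run with.
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Serialize to the text that would be stored in a .ipynb file, then parse it back.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # ['markdown', 'code']
```

If you save `text` to a file ending in `.ipynb`, Jupyter Lab should be able to open it as a notebook with one text cell and one code cell.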

## Setting Up R with Jupyter 

When you [installed Anaconda](setup_environment.ipynb), you actually also installed Jupyter, so you can already open up Jupyter and use it with a Python kernel. However, since most people taking this course are Duke MIDS students who are also doing coursework in R, let's set up Jupyter to work with R as well. 

1. If you do *not* have R installed, [download and install it here](https://cloud.r-project.org/). If you have R installed, skip to step 2. 
2. Open R by opening your command line tool (Oh-My-Zsh on Mac, Cmder in Windows) and typing `R`. *Don't open it by double clicking its icon!*
    - If you can't open R by just typing `R`, you have to launch it by putting in the absolute path to your R installation. On a Mac, doing this requires typing something like the following into your command line (depending on exactly where R is installed on your system): `/Applications/R.app/Contents/MacOS/R`. Similarly, using Cmder on Windows you need to type something like (depending on installed version): `/c/Program Files/R/R-3.6.0/bin/R.exe`.
3. Run `install.packages("IRkernel")`
4. After installation is complete, execute the command: `IRkernel::installspec()` in R. 

That should be it. To check that the R kernel installed correctly, open a new session of Jupyter Lab (open a new console and type `jupyter lab`), and you should see buttons for both "Python 3" and "R" (though you won't have Bash, Julia, or Stata listed like I do): 

![jupyter_launch_page](images/jupyter_launch_page.png)

If you don't see a button for `R`, make sure you followed all the steps above!
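
If you'd rather check from code than hunt for a button, here is a rough sketch in Python. It relies on the `jupyter kernelspec list --json` command to report installed kernels and looks for any whose language is R (IRkernel registers itself under the name `ir` by default). The function names and the `SAMPLE` data are made up for illustration; the sample only mimics the general shape of the real output.

```python
import json
import shutil
import subprocess
from typing import List, Optional


def r_kernel_names(kernelspec_json: str) -> List[str]:
    """Return the names of installed kernels whose language is R."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return sorted(
        name
        for name, info in specs.items()
        if info.get("spec", {}).get("language", "").lower() == "r"
    )


def installed_kernels_json() -> Optional[str]:
    """Run `jupyter kernelspec list --json`; return None if that isn't possible."""
    jupyter = shutil.which("jupyter")
    if jupyter is None:
        return None  # jupyter is not on your PATH
    result = subprocess.run(
        [jupyter, "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else None


# Made-up sample with the same general shape as the real command's output.
SAMPLE = json.dumps(
    {
        "kernelspecs": {
            "python3": {
                "resource_dir": "/hypothetical/kernels/python3",
                "spec": {"display_name": "Python 3", "language": "python"},
            },
            "ir": {
                "resource_dir": "/hypothetical/kernels/ir",
                "spec": {"display_name": "R", "language": "R"},
            },
        }
    }
)

print(r_kernel_names(SAMPLE))  # ['ir']
```

To check your own machine, call `installed_kernels_json()` and, if it doesn't return `None`, pass the result to `r_kernel_names`. An empty list means Jupyter doesn't know about an R kernel yet.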

**If the command `IRkernel::installspec()` generates this error:**

```R
Error in IRkernel::installspec() : 
  jupyter-client has to be installed but “jupyter kernelspec --version” exited with code 127.
In addition: Warning message:
In system2("jupyter", c("kernelspec", "--version"), FALSE, FALSE) :
  error in running command
```

That means R can't find your installation of jupyter. That probably means anaconda isn't set up with your command line tool, so please go back and see `setup_environment.ipynb`. 
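
Exit code 127 is the usual "command not found" code, so one quick sanity check is to see which tools your command line can actually find. Here's a small sketch using Python's standard library; the `path_report` helper is just an illustrative name, and the list of commands is my guess at what's worth checking.

```python
import shutil


def path_report(commands):
    """Map each command name to its full path, or None if it isn't on the PATH."""
    return {command: shutil.which(command) for command in commands}


# Run this from the same command line session you use to start R.
for command, location in path_report(["jupyter", "conda", "R"]).items():
    print(f"{command}: {location or 'NOT FOUND on PATH'}")
```

If `jupyter` shows up as not found here, R won't be able to find it either.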

**If the command `IRkernel::installspec()` generates this error:**

```R
 xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
```

That means that R wants you to install the Apple developer tools. Just run the following command at the command line (not in R, but in your bash or oh-my-zsh session): `xcode-select --install`, or if that doesn't work, run `sudo xcode-select --reset`. This should lead to a somewhat slow software installation.

## Jupyter Lab Exercises!

Click here for some [Jupyter Lab exercises](exercises/Exercise_jupyterlab.ipynb)!