# Introduction to JupyterLab

In [1]:
# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
    cmd = "pip install --upgrade watermark"
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()

In [2]:
import numpy as np
import scipy.integrate

import bokeh.io
import bokeh.plotting
bokeh.io.output_notebook()

In this lesson, you will learn about different ways of interacting with the Python interpreter and, importantly, the basics of using JupyterLab. All of your homework will be submitted as Jupyter notebooks, so this is something you will need to master. It will be useful for you to go over [the intro to LaTeX](intro_to_latex.ipynb) to learn how to use $\LaTeX$ in your Jupyter notebooks.

You should, of course, read [the official JupyterLab documentation](https://jupyterlab.readthedocs.io/en/stable/) as well.

We will start by introducing the Python interpreter.

## The Python interpreter

Before diving into the Python interpreter, I pause here to remind you that this course is not meant to teach Python syntax (though you will learn that). The things you learn here are meant to help you understand how to use your computer for data analysis more generally. Think of it this way: part of the mission of this course is to help you unleash the power of your computer on your biological problems. Python is just the language of instruction. That said, let's start talking about how Python works.

Python is an **interpreted language**, which means that each line of code you write is translated, or *interpreted*, into a set of instructions that your machine can understand by the **Python interpreter**. This stands in contrast to **compiled languages**.  For these languages (the dominant ones being Fortran, C, and C++), your entire code is translated into machine language before you ever run it. When you execute your program, it is already in machine language.

So, whenever you want your Python code to run, you give it to the Python interpreter.
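As a rough illustration of what it means for the interpreter to work through your code as it runs, here is a minimal sketch. The three-statement snippet and the names `source` and `log` are made up for this example. The second statement refers to a name that does not exist, so the interpreter stops there, but the first statement has already run.

```python
# Three statements; the second refers to a name that was never defined.
source = """
log.append("statement 1 ran")
log.append(not_defined_anywhere)
log.append("statement 3 ran")
"""

log = []
try:
    # exec() hands the source to the interpreter, sharing our `log` list.
    exec(source, {"log": log})
except NameError as err:
    print("Stopped with NameError:", err)

# Only the first statement left a trace.
print(log)
```

In a compiled language, a reference to an undeclared name would typically be caught when the program is translated, before any of it runs.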

There are many ways to launch the Python interpreter. One way is to type

    python
    
on the command line. This launches the vanilla Python interpreter. We will never really use this in the class. Rather, we will have a *greatly* enhanced Python experience, either using **IPython**, a feature-rich, enhanced interactive Python available through JupyterLab's console, or using a **notebook**, also launchable in JupyterLab.

## "Hello, world." and the print() function

Traditionally, the first program anyone writes when learning a new language is called "`Hello, world.`"  In this program, the words "`Hello, world.`" are printed on the screen.  The original `Hello, world.` was likely written by [Brian Kernighan](https://en.wikipedia.org/wiki/Brian_Kernighan), one of the inventors of Unix, and the author of the classic and authoritative [book](https://en.wikipedia.org/wiki/The_C_Programming_Language) on the C programming language.  In his original, the printed text was "`hello, world`" (no period nor capital `H`), but people use lots of variants.

We will first write and run this little program using a JupyterLab console. After launching JupyterLab, you probably already have the Launcher in your JupyterLab window. If you do not, you can expand the `Files` tab at the left of your JupyterLab window (if it is not already expanded) by clicking on that tab, or alternatively hit `ctrl+b` (or `cmd+b` on macOS). At the top of the `Files` tab is a `+` sign, which gives you a Jupyter Launcher.

In the Jupyter Launcher, click the `Python 3` icon under `Console`. This will launch a console, which has a large white space above a prompt that says `In []:`. You can enter Python code in this prompt, and it will be executed.

To print `Hello, world.`, enter the code below. To execute the code, hit `shift+enter`.

In [3]:
print('Hello, world.')

Hello, world.


## .py files

Now let's use our new knowledge of the `print()` function to have our computer say a bit more than just `Hello, world.` Type these lines in at the prompt, hitting `enter` each time you need a new line. After you've typed them all in, hit `shift+enter` to run them.

In [4]:
# The first few lines from The Zen of Python by Tim Peters
print('Beautiful is better than ugly.')
print('Explicit is better than implicit.')
print('Simple is better than complex.')
print('Complex is better than complicated.')

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.


Note that the first line is preceded with a `#` sign, and the Python interpreter ignored it. The `#` sign denotes a **comment**, which is ignored by the interpreter, *but very important for the human!*

While the console prompt was fine for entering these few lines, a better option is to store them in a file and then have the Python interpreter run the lines in the file. This is how you typically store Python code, and such files have the suffix `.py`.

So, let's create a `.py` file. To do this, use the JupyterLab Launcher to launch a text editor. Once it is launched, you can right click on the tab of the text editor window to change the name. We will call this file `zen.py`. Within this file, enter the four lines of code  you previously entered in the console prompt. Be sure to save it.

To run the code in this file, you can invoke the Python interpreter at the command line, followed by the file name. I.e., enter

    python zen.py
   
at the command line. Note that when you run code this way, the interpreter exits after completion of running the code, and you do not get a prompt.
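If you would like to see the same thing done from within Python, the sketch below writes a shortened `zen.py` to a temporary directory and runs it with the interpreter, capturing what it prints. The temporary directory and the names `zen_lines` and `completed` are my choices for the illustration, not something you need for the course.

```python
import pathlib
import subprocess
import sys
import tempfile

# A shortened version of the file contents
zen_lines = [
    "print('Beautiful is better than ugly.')",
    "print('Explicit is better than implicit.')",
]

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = pathlib.Path(tmp_dir) / "zen.py"
    script_path.write_text("\n".join(zen_lines) + "\n")

    # sys.executable is the interpreter running this code; it stands in
    # for typing `python` at the command line.
    completed = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True,
        text=True,
        check=True,
    )

# The interpreter has already exited; all we have is what it printed.
print(completed.stdout)
```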

To run the code in this file using the Jupyter console, you can use the `%run` **magic function**.

    %run zen.py

To shut down the console, you can click on the `Running` tab at the left of the JupyterLab window and click on `SHUTDOWN` next to the console.

## Jupyter

At this point, we have introduced JupyterLab, its text editor, and the console, as well as the Python interpreter itself. You might be asking....

### What is Jupyter?

From the [Project Jupyter website](http://jupyter.org):
>Project Jupyter is an open source project was born out of the IPython Project in 2014 as it evolved to support interactive data science and scientific computing across all programming languages.

So, Jupyter is an extension of IPython that pushes interactive computing further. It is language agnostic, as its name suggests. The name "Jupyter" is a combination of [Julia](http://julialang.org/) (a newer excellent language for scientific computing), [Python](http://python.org/) (which you know and love), and [R](https://www.r-project.org) (the dominant tool for statistical computation).  However, you can run over 40 different languages in JupyterLab, not just Julia, Python, and R.

Central to Jupyter/JupyterLab are **Jupyter notebooks**. In fact, the document you are reading right now was generated from a Jupyter notebook. We will use Jupyter notebooks extensively in the course, along with `.py` files.

### Why Jupyter notebooks?

When writing code you will reuse, you should develop fully tested modules using `.py` files. You can always import those modules when you are using a Jupyter notebook. So, a Jupyter notebook is not good for an application where you are building reusable code or scripts. However, Jupyter notebooks are **very** useful in the following applications.

1. *Exploring data/analysis.* Jupyter notebooks are great for trying things out with code, or exploring a data set. This is an important part of the research process. The layout of Jupyter notebooks is great for organizing thoughts as you synthesize them.
2. *Developing image processing pipelines.* This is really just a special case of (1), but it is worth mentioning separately because Jupyter notebooks are especially useful when figuring out what steps are best for extracting useful data from images, which happens all-too-often in biology. Using the Jupyter notebook, you can write down what you hope to accomplish in each step of processing and then graphically show the results as images as you go through the analysis.
3. *Sharing your thinking in your analysis.* Because you can combine nicely formatted text and executable code, Jupyter notebooks are great for sharing how you go about doing your calculations with collaborators and with readers of your publications. Famously, LIGO used [a Jupyter notebook](https://losc.ligo.org/s/events/GW150914/GW150914_tutorial.html) to explain the signal processing involved in their first discovery of a gravitational wave.
4. *Pedagogy.* All of the content in this class, including this lesson, was developed using Jupyter notebooks!

Now that we know what Jupyter notebooks are and what the motivation is for using them, let's start!

### Launching a Jupyter notebook

To launch a Jupyter notebook, click on the `Notebook` icon of the JupyterLab launcher. If you want to open an existing notebook, click on it in the `Files` tab of the JupyterLab window and open it.

### Cells

A Jupyter notebook consists of **cells**.  The two main types of cells you will use are **code cells** and **markdown cells**, and we will go into their properties in depth momentarily.  First, an overview.

A code cell contains actual code that you want to run.  You can specify a cell as a code cell using the pulldown menu in the toolbar of your Jupyter notebook.  Otherwise, you can hit `esc` and then `y` (denoted "`esc, y`") while a cell is selected to specify that it is a code cell.  Note that you will have to hit enter after doing this to start editing it.

If you want to execute the code in a code cell, hit "`shift + enter`."  Note that code cells are executed in the order you shift-enter them.  That is to say, the ordering of the cells for which you hit "`Shift + Enter`" is the order in which the code is executed.  If you did not explicitly execute a cell early in the document, its results are not known to the Python interpreter. **This is a very important point and is often a source of confusion and frustration for students.**
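To make the point about execution order concrete, here is a minimal sketch that mimics a notebook with two cells. The dictionary `cells` and the function `run_cells()` are hypothetical stand-ins for the notebook machinery. Each "cell" is just a string of code, and all cells run in one shared namespace, as they do in a real notebook.

```python
# Two pretend cells; the second depends on the first.
cells = {
    "cell_1": "a = 5",
    "cell_2": "b = a + 1",
}


def run_cells(order):
    """Execute the named cells, in the given order, in a fresh namespace."""
    namespace = {}
    for name in order:
        exec(cells[name], namespace)
    return namespace


# Executing the cells top to bottom works as expected.
in_order = run_cells(["cell_1", "cell_2"])
print(in_order["b"])

# Skipping the first cell means `a` is unknown to the interpreter.
try:
    run_cells(["cell_2"])
except NameError as err:
    print("NameError:", err)
```

The second call fails even though the definition of `a` sits right there in the "notebook," because that cell was never executed.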

Markdown cells contain text. The text is written in **markdown**, a lightweight markup language. You can read about its syntax [here](http://daringfireball.net/projects/markdown/syntax). Note that you can also insert HTML into markdown cells, and this will be rendered properly. As you are typing the contents of these cells, the results appear as text.  Hitting "`Shift + Enter`" renders the text in the formatting you specify.

You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting "`esc, m`" in the cell. Again, you have to hit enter after using the quick keys to bring the cell into edit mode.

In general, when you want to add a new cell, you can click the `+` icon on the notebook toolbar. The shortcut to insert a cell below is "`esc, b`" and to insert a cell above is "`esc, a`." Alternatively, you can execute a cell and automatically add a new one below it by hitting "`alt + enter`."

### Code cells

Below is an example of a code cell printing `Hello, world.` Notice that the output of the print statement appears in the same cell, though separate from the code block.

In [5]:
# Say hello to the world.
print('Hello, world.')

Hello, world.


If you evaluate a Python expression that returns a value, that value is displayed as output of the code cell. This only happens, however, for the last line of the code cell.

In [6]:
# Would show 9 if this were the last line, but it is not, so shows nothing
4 + 5

# I hope we see 11.
5 + 6

11

Note that if the last line does not return a value, such as if we assigned a variable, there is no visible output from the code cell.

In [7]:
# Variable assignment, so no visible output.
a = 5 + 6

In [8]:
# However, now if we ask for a, its value will be displayed
a

11
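If you do want to see a value that is not on the last line of a cell, one simple option is to print it explicitly. The variable names in this short sketch are only for illustration.

```python
# An intermediate value is shown only if we print it explicitly.
first_sum = 4 + 5
print(first_sum)

# Printing also works for the final value, whether or not it is the
# last line of the cell.
second_sum = 5 + 6
print(second_sum)
```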

### Display of graphics

We will be using [Bokeh](http://bokeh.pydata.org/) almost exclusively during the course. To make sure the Bokeh plots get shown in the notebook, you should execute

    bokeh.io.output_notebook()
    
in your notebook. It is good practice to execute this in the first cell of the notebook. Let us now make a plot using Bokeh.

In [9]:
# Generate data to plot
x = np.linspace(0, 2 * np.pi, 200)
y = np.exp(np.sin(np.sin(x)))

# Set up plot
p = bokeh.plotting.figure(
    frame_height=200,
    frame_width=250,
    x_axis_label='x',
    y_axis_label='y',
    x_range=[0, 2 * np.pi],
)

# Populate glyph
p.line(x, y, line_width=2)

bokeh.io.show(p)

### Proper formatting of cells

Generally, it is a good idea to keep cells simple. You can define one function, or maybe two or three closely related functions, in a single cell, and that's about it. When you define a function, you should make sure it is properly commented with a descriptive doc string. Below is an example of how I might generate a plot of the Lorenz attractor (which I choose just because it is fun) with code cells and markdown cells with discussion of what I am doing. (The doc string in this function is nice, but longer than is necessary for homework submitted in class. At least something akin to the first line of the doc string must appear in function definitions in your submitted notebooks.)

Between cells, you should explain with text what you are doing. Let's look at a fun example.

We will use `scipy.integrate.odeint()` to numerically integrate the Lorenz attractor. We therefore first define a function that returns the right hand side of the system of ODEs that define the Lorenz attractor.

In [10]:
def lorenz_attractor(r, t, p):
    """
    Compute the right hand side of system of ODEs for Lorenz attractor.
    
    Parameters
    ----------
    r : array_like, shape (3,)
        (x, y, z) position of trajectory.
    t : dummy_argument
        Dummy argument, necessary to pass function into 
        scipy.integrate.odeint
    p : array_like, shape (3,)
        Parameters (s, k, b) for the attractor.
        
    Returns
    -------
    output : ndarray, shape (3,)
        Time derivatives of Lorenz attractor.
        
    Notes
    -----
    .. Returns the right hand side of the system of ODEs describing
       the Lorenz attractor.
        x' = s * (y - x)
        y' = x * (k - z) - y
        z' = x * y - b * z
    """
    # Unpack variables and parameters (named s, k, b as in the doc string)
    x, y, z = r
    s, k, b = p

    return np.array([s * (y - x),
                     x * (k - z) - y,
                     x * y - b * z])
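
Before integrating, it can be reassuring to check a right hand side like this against something you know analytically. The sketch below is one way to do that, using a stripped-down, pure Python version of the function (the name `lorenz_rhs` and its signature are hypothetical, chosen so the example stands alone). For $k > 1$, the Lorenz system has a fixed point at $x = y = \sqrt{b(k-1)}$, $z = k - 1$, so all three time derivatives should vanish there.

```python
import math


def lorenz_rhs(r, s, k, b):
    """Right hand side of the Lorenz ODEs for position r = (x, y, z)."""
    x, y, z = r
    return (s * (y - x), x * (k - z) - y, x * y - b * z)


# The parameter values used later in this lesson
s, k, b = 10.0, 28.0, 8.0 / 3.0

# Nontrivial fixed point: x = y = sqrt(b * (k - 1)), z = k - 1
x_fp = math.sqrt(b * (k - 1.0))
fixed_point = (x_fp, x_fp, k - 1.0)

# All three derivatives should be zero, up to floating point error.
derivatives = lorenz_rhs(fixed_point, s, k, b)
print(derivatives)
```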

With this function in hand, we just have to pick our initial conditions and time points and run the numerical integration.

In [11]:
# Parameters to use
p = np.array([10.0, 28.0, 8.0 / 3.0])

# Initial condition
r0 = np.array([0.1, 0.0, 0.0])

# Time points to sample
t = np.linspace(0.0, 30.0, 4000)

# Use scipy.integrate.odeint to integrate Lorenz attractor
r = scipy.integrate.odeint(lorenz_attractor, r0, t, args=(p,))

# Unpack results into x, y, z.
x, y, z = r.transpose()

Now, we'll construct a plot of the trajectory using Bokeh.

In [12]:
# Set up plot
p = bokeh.plotting.figure(
    frame_height=200,
    frame_width=200,
    x_axis_label='x',
    y_axis_label='z',
)

# Populate glyph
p.line(x, z)

bokeh.io.show(p)

### Best practices for code cells

Here is a summary of some general rules for composing and formatting your code cells.

1. Keep the width of code in cells below 80 characters. This is not a hard limit, but you should strive for it and consider 88 characters a hard limit.
2. Keep your code cells short. If you find yourself having one massive code cell, break it up.
3. Provide complete doc strings for any functions you define. You can and should have comments in your code, but you really should not need much because your markdown cells around the code cells should clearly describe what you are trying to do.
4. Do all of your imports in the first code cell at the top of the notebook. With the exception of "`from ... import ...`" imports, import one module per line. You should also include `bokeh.io.output_notebook()` in the top cell when using Bokeh.
5. For submitting assignments, **always** display your graphics in the notebook.
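
If you want to check the first rule for yourself, a small helper along these lines could flag the offending lines. This is only a sketch: `find_long_lines()` is a hypothetical function written for this illustration, not part of Jupyter, and it takes the source of a cell as a plain string. It also doubles as an example of a short function with a doc string.

```python
def find_long_lines(source, limit=88):
    """
    Return (line number, length) for each line of `source`
    that is longer than `limit` characters.
    """
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A pretend cell: a short first line and a very long second line
cell_source = "x = 1\n" + "y = " + " + ".join(["x"] * 30) + "\n"

print(find_long_lines(cell_source))
```

For this pretend cell, only the second line is reported, since at 121 characters it is well past the limit.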

### Markdown cells

Markdown cells contain text.  The text is written in **markdown**, a lightweight markup language.  The list of syntactical constructions at [this link](http://daringfireball.net/projects/markdown/syntax) is pretty much all you need to know for standard markdown.  Note that you can also insert HTML into markdown cells, and this will be rendered properly.  As you are typing the contents of these cells, the results appear as text.  Hitting "`shift + enter`" renders the text in the formatting you specify.

You can specify a cell as being a markdown cell in the Jupyter tool bar, or by hitting "`esc, m`" in the cell.  Again, you have to hit enter after using the quick keys to bring the cell into edit mode.

In addition to HTML, some $\LaTeX$ expressions may be inserted into markdown cells. $\LaTeX$ (pronounced "lay-tech") is a document markup language that uses the $\TeX$ typesetting software. It is particularly well-suited for beautiful typesetting of mathematical expressions. In Jupyter notebooks, the $\LaTeX$  mathematical input is rendered using software called MathJax. This is usually run off of a remote server, so if you are not connected to the internet, your equations may not be rendered. You will use $\LaTeX$ extensively in preparation of your assignments. There are plenty of resources on the internet for getting started with $\LaTeX$, but you will only need a tiny subset of its functionality in your assignments. [The next part of this lesson](intro_to_latex.ipynb) and cheat sheets you can find online (such as [this one](http://users.dickinson.edu/~richesod/latex/latexcheatsheet.pdf)) are useful references.
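
As a small illustration, the source of a markdown cell that typesets the Lorenz equations from the doc string earlier in this lesson might look like the following. This is a sketch of what you would type into the cell; the particular layout (an `align` environment with the parameters named $s$, $k$, and $b$) is my choice, and there are other equally good ways to write it.

```latex
The Lorenz system is

\begin{align}
\dot{x} &= s\,(y - x), \\
\dot{y} &= x\,(k - z) - y, \\
\dot{z} &= x\,y - b\,z,
\end{align}

with parameters $s$, $k$, and $b$.
```

Inline math goes between single dollar signs, as in `$s$`, while the `align` environment gives displayed equations lined up at the `&` characters.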

### Quick keys

There are some keyboard shortcuts that are convenient to use in JupyterLab. (They do not all work in Colab.) We already encountered `Shift + Enter` to run a code cell. Importantly, pressing `Esc` brings you into command mode in which you are not editing the contents of a single cell, but are doing things like adding cells. Below are some useful quick keys. If two keys are separated by a `+` sign, they are pressed simultaneously, and if they are separated by a `-` sign, they are pressed in succession.

|Quick keys | mode | action |
|:---:|:---:|:---:|
|`Esc - m` | command | switch cell to Markdown cell|
|`Esc - y` | command | switch cell to code cell|
|`Esc - a` | command | insert cell above|
|`Esc - b` | command | insert cell below|
|`Esc - d - d` | command | delete cell|
|`Alt + Enter` | edit | execute cell and insert a cell below |

There are many others (and they are shown in the pulldown menus within JupyterLab), but these are the ones I seem to encounter most often.

## Rendering of notebooks as HTML

When you submit homework, you will also submit an HTML rendering of your notebooks. To save a notebook as HTML, you can click `File` → `Export Notebook As...` → `Export Notebook to HTML`.

## Computing environment

At the end of every lesson, and indeed at the end (or beginning) of any notebook you make, you should include information about the computing environment including the version numbers of all packages you use. This helps reproducibility. The [watermark package](https://github.com/rasbt/watermark) is quite useful for this. The watermark package is an **IPython magic extension**. These extensions allow convenient functionality within IPython or Jupyter notebooks. In general, to use magic functions, you precede them with a `%` sign (or a double `%%`) in a cell. We use the built-in `%load_ext` magic function to load watermark, and then we use `%watermark` to invoke it.

We use the `-v` flag to ask watermark to give us the Python and IPython version numbers and the `-p` flag to give us version numbers on specified packages we've used. We can also use a `-m` flag to give information about the machine running the notebook, and you should do that, but I will not do that for this course to avoid clutter.

Your versions might not always match (especially if you are using Colab), but doing this is good practice and can help with debugging.

In [13]:
%load_ext watermark
%watermark -v -p numpy,scipy,bokeh,jupyterlab

Python implementation: CPython
Python version       : 3.11.5
IPython version      : 8.15.0

numpy     : 1.24.3
scipy     : 1.11.1
bokeh     : 3.2.1
jupyterlab: 4.0.6
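
If watermark is not available, you can gather similar information with the standard library alone. The sketch below is one way to do it, not a replacement for watermark. The helper `package_versions()` is a hypothetical function written for this illustration; it relies on `importlib.metadata`, which is part of the standard library in Python 3.8 and later.

```python
import importlib.metadata
import platform


def package_versions(package_names):
    """Map each package name to its installed version, or 'not installed'."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


print("Python version:", platform.python_version())
for name, version in package_versions(["numpy", "scipy", "bokeh"]).items():
    print(f"{name}: {version}")
```

The version numbers printed will depend on your own installation.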

