# Lab 0: Introduction to Jupyter Notebooks in Noteable
#### Accelerated Natural Language Processing (INFR11125)

## General goals of the ANLP labs

Welcome to our first lab! The materials in our labs are designed to help you get familiar with:
- some commonly used **tools and packages** for NLP, which you will also need to use in the assignment.
- some of the fundamental **concepts** covered in the course, and how they apply in practice.

This first lab is self-paced, so you can work through it on your own. From the next lab onward, each lab is designed to be done during the timetabled lab sessions, where you can get help from your partner and the lab demonstrators. Some labs might take a bit longer, so if you don't finish during the session, please complete the lab in your own time afterward.


## What you will learn in this lab

This lab introduces Jupyter notebooks, which we will use for the labs throughout this course. We'll be using Noteable, which is a special version of JupyterLab. If you get stuck or want to learn more about Jupyter, notebooks, JupyterLab, or Noteable, you can check out the following links:
    
- [Noteable User Guide](https://noteable.edina.ac.uk/user-guide/) This guide is the most Noteable-specific, and also includes overview information about notebooks and the JupyterLab interface.
- [JupyterLab Documentation](https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html) Less Noteable-specific, but even more information!

There's also a lot of help within the Noteable interface itself! In the Help tab at the top right, you'll find links to lots of reference materials for various tools we'll use in this course.

While we aren't teaching Python in this course, we know many of you are new to it. In this lab, we'll also talk about sources you can use to get help with Python and the Python packages we'll use in this course, as well as some debugging techniques for Jupyter notebooks. Don't worry if you can't understand all the code yet! You might want to come back to the code in this notebook a bit later in the course if you get stuck, or come to the help hours!

The tutorial below focuses on JupyterLab and Noteable, but Jupyter notebooks can be run in many environments. Most things will be the same, but the appearance, the commands, and the layout of the interface might differ a bit from how they are described here. Different editors also have different features: for example, Colab allows real-time collaboration and access to GPUs, while Noteable comes with all the packages you'll need for this course pre-installed and offers sophisticated debugging features (which we use in this tutorial!).

## 1. What is Jupyter?
Jupyter is a platform for interactive computational documents called notebooks. A Jupyter notebook is a type of file that lets you weave together code, text, equations/mathematical notation, visualizations and multimedia. You can run code live alongside formatted explanatory text and see its output directly next to the code. Jupyter notebooks are organized into cells--short snippets which are either text or code.

## 2. Command mode and edit mode

In the notebook, there are two modes: *edit mode* and *command mode*. By default the notebook begins in *command mode*, where you can select cells. In order to edit a cell, you need to be in *edit mode*.

When you are in command mode, you can press `Enter` to switch to edit mode. The contents of the currently selected cell will become editable, and a cursor will appear. You can edit the cell, then hit `Escape` when you're done to return to command mode.

**YOU TRY:** Select this cell by clicking on it with the mouse. You should see a blue bar on the left. (You can also use the arrow keys to move the selection around.) Now, press `Enter` to enter edit mode. Notice how the appearance of the cell changes. Make an edit to the cell, then press `Escape` to return to command mode.

You'll notice that while you're now in command mode (arrow keys move between cells), the previous cell still looks like code! This is because Jupyter will only *render* the markdown when you tell it to. To do this, we need to "run" the cell by pressing **`Ctrl-Enter`**, and then it will go back to looking like it did originally.

**YOU TRY:** Use the arrow keys to select the previous cell, then hit `Ctrl-Enter`. The cell should now look like nicely formatted text again, with the edit you made!

## 3. Text cells
While in this course you'll primarily need to edit code cells, it's worth knowing a little bit about how to edit and format text in Jupyter notebooks. Jupyter notebooks use Markdown, a simple way of typing formatted text without needing buttons for styles. You can make text **bold** or *italic*, or type an equation using LaTeX ($y=m\cdot x+b$). You can also make headings, lists, and more. 

If you're interested, you can learn more about Markdown here:
[Markdown cheat sheet](http://nestacms.com/docs/creating-content/markdown-cheat-sheet)

**YOU TRY:** Select this cell and hit `Enter` to see the Markdown code used to make this cell. How are we making headings, bold text, and italic text? Try formatting some text of your own. Don't forget to hit `Ctrl-Enter` when you're done to see the result!

## 4. Code cells

Cells in a Jupyter notebook are either code cells or markdown (text) cells. In this course, we will use a Python *kernel*, which means Python will be used for executing the code cells. You can also use Jupyter for other programming languages, like R and Julia. When you run a code cell using `Ctrl-Enter`, the output of the code will display below the cell. To the left of a cell, you will see a number in brackets. This helps you keep track of the order in which code cells were executed. Empty brackets indicate that a cell hasn't been run yet. If you re-execute a cell, the number will change.

<div class="alert alert-danger">
Values defined in one cell stay defined in later cells!! This can cause trouble if you're not careful.
</div>
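
To make the warning above concrete, here is a minimal sketch. It assumes each commented section is a separate notebook cell, executed in the order shown; the variable name `visits` is purely illustrative.

```python
# Cell 1: define a variable
visits = 0

# Cell 2: uses the value left behind by Cell 1
visits += 1

# Cell 2, run a second time: the kernel still remembers the previous result
visits += 1

print(visits)  # 2
```

Running the same cell twice changes the result, because the kernel keeps the value from the previous run rather than starting over.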

**YOU TRY:** Execute the following code cell a few times! What happens?

In [None]:
import random

print("I'm a code cell! I print a random number: " + str(random.randint(1,10)))

You don't actually have to include a print statement to display a value in a Jupyter notebook.

**YOU TRY:** Run the following code cells. You might find `Shift-Enter` useful--it also runs the current cell, but rather than keeping the same cell selected like `Ctrl-Enter` does, it selects the following cell, so you can run a lot of cells in a row without using the arrow keys. Which cells produce an output, and which don't? What value is displayed, and why?

In [None]:
a = 10
b = 20
a

In [None]:
b

In [None]:
a
b = 30

In [None]:
b
a

**YOU TRY:** Create a new cell below this one, either by hitting the + symbol in the menu bar just above the notebook, or by typing `b` in Command Mode. Use `print` to produce more than one line of output from a Jupyter cell. (Hint: use two print statements!)

![](images/new_cell.png)



## 5. Changing cell types

When you're writing a Jupyter notebook, you might accidentally make a code cell when you need a markdown cell, or a markdown cell when you need a code cell. In Noteable, there's a dropdown menu for this just above the notebook:
![](images/cell_type_dropdown.png)

There are also commands for this in Command Mode!
- `m` - Changes the current cell into a Markdown cell
- `y` - Changes the current cell into a code cell

**YOU TRY:** The following four cells have the wrong cell type! Follow the instructions in the cells to convert them to the correct types.

In [None]:
I'm a *code* cell, but I should be a *markdown* cell. Use the *dropdown menu* to make me a markdown cell!

# I'm a markdown cell, but I should be a code cell. Use the dropdown menu to make me a code cell!

import random
print("Success!")
for _ in range(30):
    print(random.choice(["✨","⭐","🌸","💖","🌟","💫"]), end=" ")

In [None]:
I'm a *code* cell, but I should be a *markdown* cell. Use `m` in Command Mode to make me a markdown cell!

# I'm a markdown cell, but I should be a code cell. Use y in Command Mode to make me a code cell!

import random
print("Success!")
for _ in range(30):
    print(random.choice(["✨","⭐","🌸","💖","🌟","💫"]), end=" ")

## 6. Documentation

Sometimes, you might not know or remember exactly how a function works or what argument it takes, or what attributes an object has. Jupyter has some built-in functionality to help with this. You can use `?` in a code cell to pull up the documentation for an object or function.
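
The `?` syntax only works inside Jupyter/IPython. As a rough sketch of what it builds on, the snippet below reads a docstring using plain Python's standard `inspect` module. The choice of `len` is arbitrary and just for illustration; the docstring is only part of what `len?` would display.

```python
import inspect

# inspect.getdoc returns the cleaned-up docstring of an object.
doc = inspect.getdoc(len)
print(doc)

# The built-in help() prints a longer report that includes the same text:
# help(len)
```

This can be handy if you ever run Python outside a notebook, where `?` is not available.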



Let's figure out how this line of Python code from earlier works using `?`:
```python
print("I'm a code cell! I print a random number: " + str(random.randint(1,10)))
```

**YOU TRY:** Create a new cell below this one and put the code `random.randint?` in it, then run it with `Ctrl-` or `Shift-Enter`. What random values can the previous line of Python print?


**YOU TRY:** Create a new cell below this one and put the code `str?` in it, then run it with `Ctrl-` or `Shift-Enter`. What does `str()` do, and why do you think we needed to use it in the example?


Noteable also has a pane that makes this easier. Press `Ctrl + i` or navigate to **Help > Contextual Help** to open a pane that shows the documentation for whatever Python object is under your cursor.

**YOU TRY:** Look at the code below. Activate contextual help, then put your cursor inside the `print` function to figure out what the argument `end` does.

In [None]:
for _ in range(30):
    print(random.choice(["✨","⭐","🌸","💖","🌟","💫"]), end=" ")

## 7. Tab completion

That's useful for *understanding* code, but what about when you need to *write* code, and you don't know the name of the function you need? Jupyter includes tab completion for this! When you've typed the name of an object followed by a `.` and your cursor is after the dot, you can hit the `Tab` key to see possible completions. You can use arrow keys or keep hitting `Tab` to cycle through all the possibilities. If you remember the start of a function's name, you can type the first few characters, then hit `Tab`, and Jupyter will auto-complete it for you!
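
If tab completion is unavailable (for example, in a plain Python script), the built-in `dir()` function gives a similar list of names. The sketch below is only an approximation of what the completion menu shows; the variable names `text` and `matches` are illustrative.

```python
# List the names available on a string that begin with "is",
# roughly what you would see after typing `text.is` and pressing Tab.
text = "hello"
matches = [name for name in dir(text) if name[:2] == "is"]
print(matches)
```

Each name in the list can then be looked up with `?` or contextual help to find out what it does.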

**YOU TRY:** Below, we have a string that starts with "My". In Python, strings have a built-in function that lets you check if they start with another string. Use tab completion to find this function and check if the string starts with "My". Remember to put your cursor right after the dot!

In [None]:
sentence = "My start is My"

In [None]:
sentence. # finish me!

Some of you might have gotten an error just now! If so, it's probably because you forgot to run the cell defining `sentence` before running your solution. In Jupyter, *the order in which you execute cells matters.* This can be very dangerous and easily lead to mistakes!

## 8. The kernel

When you first start a notebook, you are also starting what is called a *kernel*. This is a special program that runs in the background and executes Python code. Whenever you run a code cell, you are telling the kernel to execute the code that is in the cell, and to print the output (if any).

### Restarting the kernel

It is generally a good idea to restart the kernel periodically and start fresh. Your code may depend on a variable that you defined at some point but whose definition you later deleted. The variable still exists in the kernel's memory, so the code keeps working for you, but it won't work when someone tries to run the notebook from scratch.

<div class="alert alert-danger">
Your code should <b>always</b> be able to work if you run every cell in the notebook, <i>in order</i>, starting from a new kernel. </div>
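
As a rough illustration of why this matters, the sketch below imitates a freshly restarted kernel with an empty dictionary. The cell contents and the names `total`, `cell_source`, and `fresh_kernel` are hypothetical; a real kernel restart is done with the button, not with `exec`.

```python
# A hypothetical cell that relies on `total`, a name that was defined
# in a cell which has since been deleted from the notebook.
cell_source = "average = total / 4"

# An empty dictionary stands in for the namespace of a freshly restarted kernel.
fresh_kernel = {}

try:
    exec(cell_source, fresh_kernel)
    outcome = "ran fine"
except NameError as error:
    outcome = f"failed: {error}"

print(outcome)
```

In your own session the cell would keep working, because `total` is still in memory. Only a fresh start reveals the missing definition.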

To test that your code can do this, first restart the kernel by clicking the restart button:

![](images/restart.png)

Then, run all cells in the notebook in order by choosing **Cell$\rightarrow$Run All** from the menu above.

### Interrupting the kernel

Sometimes, you might start running a Python cell that takes a long time to execute, and realize you don't want it to (because it has a bug, because it takes too much memory and will crash the kernel, or because you don't want to wait for it to finish). In these cases, you don't have to restart the kernel. Instead, you can *interrupt* it by hitting the square stop symbol just to the left of the restart kernel button in the menu bar.
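
With the standard Python kernel, an interrupt typically shows up in the running cell as a `KeyboardInterrupt` exception. The sketch below shows one way a long-running loop could handle that; the function `count_slowly` and its parameters are made up for this example.

```python
import time

def count_slowly(steps, delay=0.0):
    """Count up to `steps`, pausing `delay` seconds between steps."""
    completed = 0
    try:
        for _ in range(steps):
            time.sleep(delay)
            completed += 1
    except KeyboardInterrupt:
        # Reached when the stop button is pressed while the loop is running
        print(f"Interrupted after {completed} steps")
    return completed

# With a small number of steps this finishes on its own.
print(count_slowly(3))
```

To try interrupting it, call it with a large `steps` value and a `delay` of about one second, then press the stop button. Variables defined in earlier cells are kept, unlike after a restart.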

## 9. Debugging in Noteable

As we mentioned previously, one thing to be very careful about with Jupyter notebooks is that variables will retain their value across cells until they are redefined or the session is restarted. This means it is easy to accidentally make errors by accessing a variable that is incorrectly defined or has been deleted. 

As an example, say your lab partner wrote the following code, which finds the longest word in `sentence` and tries to print it out letter by letter. Your partner meant to type `j` inside the loop in the second cell, but they're wondering why it printed out "r" 5 times instead of crashing.

**YOU TRY:** Run the below code.


In [None]:
# let's find the longest word!
sentence = "My start is My".split() # check what the value of sentence is, and the documentation for split() if you're not sure!
longest_word = ""
for i in range(len(sentence)):
    if len(sentence[i]) > len(longest_word):
        longest_word = sentence[i]

In [None]:
# print all the letters in the longest word
for j in range(len(longest_word)):
    print(longest_word[i]) # oops, we meant to type j

Jupyter allows us to inspect the current value of all variables using its built-in debugger. You can open the debugger using the bug icon in the menu bar:
![](images/debug.png)

**YOU TRY:** Open the debugger, and find the value of `i`. Why do you think it has this value? Does it make sense why the code prints out "r" 5 times?

The debugger is also useful when your code has an error--you can inspect the value of the variables before the error to figure out why the error is happening. 

Pretend again that your lab partner wrote the cell below, thinking that the value of `sentence` is the string "My start is My", and they're wondering why they get an error when they run it.

**YOU TRY:** Run the cell below. Find the value of `sentence` in the debugger, and use it to explain why 6 is out of range. Bonus: use `?` or contextual help to show the documentation for `split` and explain how it made the value of `sentence` what it is.

In [None]:
sentence[6]

Another useful command you can use in a code cell is `%history`. This is a special Jupyter functionality called a "magic command". Magic commands allow you to access extra information or process different types of data than vanilla Python can. In this case, `%history` shows us all the Python code which has been executed, in the order it was run.

While we won't be relying on magic commands in this course, you can learn more about the available commands [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html).

**YOU TRY:** Run the cell below and use it to find where in the command history `sentence` was redefined to have a length shorter than 6.

In [None]:
%history

## 10. Advanced Command Mode
So far in this tutorial, we've introduced a range of commands, summarized below:
- `Ctrl + Shift + h` - Brings up help window (contains full list of commands!)
- `Up` - select the previous cell
- `Down` - select the next cell
- `Enter` - Enters Edit Mode
- `m` - Converts a cell to Markdown
- `y` - Converts a cell to code
- `b` - Creates a new cell below the currently selected one

However, there are many more useful commands:
- `a` - Creates a new cell above the currently selected one
- `Shift + Up/Down` - select multiple cells above/below
- `dd` - Deletes the selected cell (press 'd' twice in a row)
- `Shift + m`  - Merge the selected cells with the following one
- `c` - Copy selected cell(s)
- `v` - Paste cell(s)
- `z` - Undo cell(!) action
- `Shift + z` - Redo cell action

**YOU TRY:** Use `Shift + Down` to select the following blank cells, and `Shift + m` to merge them. Then use `dd` to delete the merged cell, and `z` twice to restore the merged cell and unmerge it.