# Lab 0: Introduction to Jupyter Notebooks in Noteable
#### Accelerated Natural Language Processing (INFR11125)

## General goals of the ANLP labs

Welcome to our first lab! The materials in our labs are designed to help you get familiar with:
- some commonly used **tools and packages** for NLP, which you will also need to use in the assignment.
- some of the fundamental **concepts** covered in the course, and how they apply in practice.

This first lab is self-paced: work through it on your own. After this one, each lab is designed to be done during the timetabled lab sessions, where you can get help from a partner and the lab demonstrators. Some labs might take a bit longer, so if you don't finish during the session, please complete the lab in your own time afterward.


## What you will learn in this lab

This lab does not cover any NLP concepts, but is designed to introduce you to Noteable, which we will be using for the rest of the labs.  After completing it, you should be ready to get going on Lab 1 next week, where we will start working with language data!

The lab has three parts:

### Part 1: Basics of notebooks

- What is a notebook; how to make basic edits and understand the kernel.

### Part 2: Tools for productivity and debugging

- Ways to access documentation; the debugger and defensive coding tips; and useful keyboard shortcuts.

### Part 3: Tips for the remaining labs

- A very brief section to read before attending the first lab session.

  

## Different paths for different students

### If you've used Jupyter notebooks before:
- You can skim through **Part 1** quickly, but don't skip it entirely since there will likely be small differences between Noteable and other notebook interfaces you've used.
- Please work through **Parts 2 and 3** in full! Noteable has more/better debugging tools than many notebooks, so you'll almost certainly find some new things here.

### If you are new to programming/Python:
- You should aim to complete all parts of the lab. However, if you are finding the debugging sections in Part 2 hard to follow in Week 1, you may skip over them and come back to them in Week 2 or 3 after you've had slightly more programming practice. Or, visit the TA help hour!

### All students:
- The lab includes many parts for you to do, indicated with **YOU TRY:**. We hope these are self-explanatory, but things don't always go as expected. If something isn't working for you, please do re-read the instructions and check again, but don't spend more than 5-10 minutes on any given section if you get stuck! Post to Piazza or visit the TA help hour.

# Part 1: Basics of notebooks

## 1. What is Jupyter? What is Noteable?
Jupyter is a platform for interactive computational documents called notebooks. A Jupyter notebook is a file type that allows you to weave together code, text, equations/mathematical notation, visualizations and multimedia. Notebooks allow you to run code and see its outputs, right alongside formatted explanatory text. Jupyter notebooks are organized into *cells*--short snippets which are either text or code.

In this course, we'll be using *Noteable*, which is the University of Edinburgh's own version of JupyterLab. 
This tutorial therefore focuses on JupyterLab and Noteable, but Jupyter notebooks can be run in many environments. The way the notebooks work will basically be the same, but the appearance, commands, and the layout of the interface might be a bit different from how it's described here. 

Further, different notebook editors have different features: for example, Colab allows real-time collaboration and access to GPUs, whereas Noteable comes with all the packages you'll need for this course pre-installed and has sophisticated debugging features (which we will demonstrate in this tutorial!).


## 2. Where to get help beyond this tutorial

This tutorial should be self-contained, but if you do get stuck or want to learn more about Jupyter/Notebooks/JupyterLab/Noteable, you can check out the following links:
    
- [Noteable User Guide](https://noteable.edina.ac.uk/user-guide/) This guide is the most Noteable-specific, and also includes overview information about notebooks and the JupyterLab interface.
- [JupyterLab Documentation](https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html) Less Noteable-specific, but even more information!

There's also a lot of help within the Noteable interface itself! In the Help tab at the top right, you'll find links to lots of reference materials for various tools we'll use in this course.

## 3. Command mode and edit mode

In the notebook, there are two modes: *edit mode* and *command mode*. By default the notebook begins in *command mode*, where you can select cells. In order to edit a cell, you need to be in *edit mode*.

When you are in command mode, you can press `Enter` to switch to edit mode. The contents of the cell you currently selected will become editable, and a cursor will appear. You can edit the cell, then hit `Escape` when you're done to return to command mode.

**YOU TRY:** Select this cell by clicking on it with the mouse. You should see a blue bar on the left. (You can also use the arrow keys to select a different cell.) Now, press `Enter` to enter edit mode. Notice how the appearance of the cell changes. Make an edit to the cell, then press `Escape` to return to command mode.

You'll notice that while you're now in command mode (arrow keys move between cells), the previous cell still looks like code! This is because Jupyter will only *render* the text when you tell it to. To do this, we need to "run" the cell by pressing **`Ctrl-Enter`**, and then it will go back to looking like it did originally.

**YOU TRY:** Use the arrow keys to select the previous cell, then hit `Ctrl-Enter`. The cell should now look like nicely formatted text again, with the edit you made!

## 4. Text cells

Cells in a Jupyter notebook are either *code cells* or *markdown (text) cells*. In this course you'll mainly need to edit code cells. Still, it's worth knowing a little bit about how to edit and format text in Jupyter notebooks. Jupyter notebooks use *Markdown*, a simple way of typing formatted text without needing buttons for styles. You can make text **bold** or *italic*, or type an equation using LaTeX ($y=m\cdot x+b$). You can also make headings, lists, and more. 

If you're interested, you can learn more markdown here:
[Markdown cheat sheet](http://nestacms.com/docs/creating-content/markdown-cheat-sheet)

**YOU TRY:** Select this cell and hit `Enter` to see the Markdown code used to make this cell. How are we making headings, bold text, and italic text? Try formatting some text of your own. Don't forget to hit `Ctrl-Enter` when you're done to see the result!

## 5. Code cells

The other kind of cell is a *code cell*, where you can write executable code. In this course, we will use a Python *kernel*, which means Python will be used for executing (running) the code cells. You can also use Jupyter for other programming languages, like R and Julia. 

When you run a code cell using `Ctrl-Enter`, the output of the code will display below the cell. To the left of a cell, you will see a number in brackets. This helps you keep track of the order in which code cells were executed. Empty brackets indicate that a cell hasn't been run yet. If you re-run a cell, the number will change.

<div class="alert alert-danger">
Values defined in one cell stay defined in later cells!! This can cause trouble if you're not careful.
</div>

**YOU TRY:** Select the following code cell, and execute it a few times! (Remember, `Ctrl-Enter` to execute.) What happens?

In [None]:
import random

print("I'm a code cell! I print a random number: " + str(random.randint(1,10)))

You don't actually have to include a print statement to display a value in a Jupyter notebook.

**YOU TRY:** Run the following code cells, in order. You might find `Shift-Enter` useful--it runs the current cell, but rather than keeping the same cell selected like `Ctrl-Enter` does, it selects the following cell, so you can run a lot of cells in a row without using the arrow keys. Which cells produce an output, and which don't? What value is displayed, and why?

In [None]:
a = 10
b = 20
a

In [None]:
b

In [None]:
a
b = 30

In [None]:
b
a

Notice that only the value of the final line of code is shown as output (and some lines, like `b=30`, don't output anything). If you want to print more than one value, you need to use `print` statements. 
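As a small sketch of this behaviour, the cell below shows a bare expression that displays nothing and two `print` calls that always display their arguments. The variable names `width` and `height` are made up for this illustration and are not used elsewhere in the lab.

```python
# Hypothetical variables, chosen only for this illustration.
width = 3
height = 4

# A bare expression in the middle of a cell is evaluated,
# but nothing is displayed for it.
width

# print() displays its arguments wherever it appears in the cell.
print(width, height)            # both values on one line, separated by a space
print("area:", width * height)  # a label followed by a computed value
```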

**YOU TRY:** Create a new cell below this one, either by hitting the + symbol in the menu bar just above the notebook, or by typing `b` in Command Mode. 

![](images/new_cell.png)

**YOU TRY:** Now, in your new cell, add `print` statement(s) to output the values of both `a` and `b`. Before you execute the cell, look at the cells you ran above, and predict what values will be printed. (Hint: you can use a single print statement to print both variables on one line, or two print statements to print them on separate lines.) Once you've made your prediction, run the cell and check you were right!





**YOU TRY:** A common source of bugs/confusion in Jupyter notebooks is forgetting to run cells, or running them out of order. To see why, try re-running just the first cell in the sequence above, where `a` and `b` were assigned the values of 10 and 20. Now re-run the cell you added, that prints out `a` and `b`. What is the result? Is it the same or different from what you would get if you ran all the cells in order?

<div class="alert alert-danger">
You must ensure that your code <b>always</b> works if you run every cell in the notebook <b>in order</b>, starting from a new kernel (i.e., without any variables defined). </div>

We'll talk about how to test for this a bit later, when we discuss the kernel.

## 6. Changing cell types

When you're writing a Jupyter notebook, you might accidentally make a code cell when you need a markdown cell, or vice versa. In Noteable, there's a dropdown menu just above the notebook where you can change the cell type:

![](images/cell_type_dropdown.png)

There are also commands for this in Command Mode!
- `m` - Changes the current cell into a Markdown cell
- `y` - Changes the current cell into a code cell

**YOU TRY:** The following two cells have the wrong cell type! Follow the instructions in the cells to convert them to the correct types.

In [None]:
I'm  a *code* cell, but I should be a *markdown* cell. Turn me into a markdown cell!

In [None]:
# I'm a markdown cell, but I should be a code cell. Turn me into a code cell!

import random
print("Success!")
for _ in range(30):
    print(random.choice(["✨","⭐","🌸","💖","🌟","💫"]), end=" ")

**YOU TRY:** If you haven't already, run the two cells you just converted, to make sure they work as intended. Then, try changing them back to the wrong kind of cell using whichever method you didn't use before: if you used the menu bar, try using Command Mode; or vice versa.

## 7. Saving your work

Normally, your work will automatically be saved, so if you start a new Noteable session, you can pick up where you left off. However,

<div class="alert alert-danger">
If you leave your Noteable window open in a browser but are not working on it for a long time, you may get logged out (possibly without any immediately obvious signs), and changes you make might <b>not</b> be saved.
</div>

You'll likely notice problems relatively soon (e.g., your code may not run, you may get an error message about the kernel or be unable to restart it). But to be safe, we suggest you **close any Noteable windows if you're taking a break**, and restart/login again when you return, using https://noteable.edina.ac.uk/login. 

## 8. The kernel

When you first start a notebook, you are also starting what is called a *kernel*. This is a special program that runs in the background and executes Python code. Whenever you run a code cell, you are telling the kernel to execute the code that is in the cell, and to print the output (if any).

### Restarting the kernel

It is generally a good idea to restart the kernel periodically and start fresh. Your code may depend on a variable that you defined at some point but whose definition you later deleted or changed. The variable still exists in the running kernel, so everything appears to work, but your code will fail when it's run from scratch.
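To see why this hidden state matters, here is a minimal sketch that imitates a fresh kernel using plain Python. The helper `run_cells` and the variables `total` and `doubled` are invented for this illustration. A real kernel does much more than this, but the failure is the same kind you would see after a restart.

```python
def run_cells(cells):
    """Run each code string in order, sharing one initially empty namespace.

    This is only a stand-in for "restart the kernel and run all cells".
    """
    namespace = {}
    for source in cells:
        exec(source, namespace)
    return namespace


# A two-cell "notebook": the second cell depends on the first.
cells = ["total = 5", "doubled = total * 2"]
print(run_cells(cells)["doubled"])  # works when every cell is present

# Now suppose the first cell was deleted after it had already been run.
# In the old kernel `total` would still exist; from scratch it does not.
try:
    run_cells(cells[1:])
except NameError as error:
    print("Fails from scratch:", error)
```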

**YOU TRY:** To test that this notebook works from scratch, first restart the kernel by clicking the restart button. You'll get a pop-up reminder that all variables will be lost. Since that's exactly what we want now, confirm that you want to restart.

![](images/restart.png)

**YOU TRY:** Before you do anything else, go back to the earlier part of the lab where you printed out the values of `a` and `b`. Notice that the output is still visible. However, if you re-run the cell that prints both `a` and `b`, you should get an error. That's because restarting the kernel does not clear output, but it does clear all the variables in your code.

So, how do we test the code? You could re-run each cell by hand, but that's annoying. There are a few other options, available in the **Run** menu.

**YOU TRY:** For now, choose **Run$\rightarrow$Run All Above Selected Cell**, because we'd like to walk you through the cells after this point one at a time. You should now see that your `print` statement is working correctly.

To test a whole notebook at once, you can simply click on the double triangle, or on **Run$\rightarrow$Run All Cells**. 

### Interrupting the kernel

Sometimes, you might start running a Python cell that takes a long time to execute, and realize you don't want it to (e.g., because it has a bug, because it takes too much memory and will crash the kernel, or because you don't want to wait for it to execute). In these cases, you don't have to restart the kernel, but you can *interrupt* it, by hitting the square stop symbol just to the left of the restart kernel button in the menu bar. This just stops the execution of the current cell.
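In a Python kernel, an interrupt normally reaches your code as a `KeyboardInterrupt` exception, so long-running code can catch it and keep its partial results. The sketch below assumes that behaviour. The function `sum_until_interrupted` and the generator `numbers_then_interrupt` are hypothetical, and the generator raises the exception itself so that the example runs without anyone pressing the stop button.

```python
def sum_until_interrupted(numbers):
    """Add up numbers, returning the partial total if interrupted."""
    total = 0
    processed = 0
    try:
        for number in numbers:
            total += number
            processed += 1
    except KeyboardInterrupt:
        print(f"Interrupted after {processed} items; keeping the partial total.")
    return total, processed


def numbers_then_interrupt():
    """Yield 1, 2, 3, then simulate the stop button being pressed."""
    yield from (1, 2, 3)
    raise KeyboardInterrupt


print(sum_until_interrupted(numbers_then_interrupt()))
```

You won't need this pattern in the labs; it is only meant to show that interrupting stops the current cell without throwing away variables that were already defined.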

# Part 2: Tools for productivity and debugging

## 9. Python help using `?`

Sometimes, you might not know or remember exactly how a function works or what argument it takes, or what attributes an object has. You probably know that you can access help in Python by using the `help()` function. Jupyter also has some built-in functionality that's similar, but with slightly more information and nicer formatting: just use `?` in a code cell to pull up the documentation for an object or function.

Let's figure out how this line of Python code from earlier works using `?`: 
```python
print("I'm a code cell! I print a random number: " + str(random.randint(1,10)))
```

**YOU TRY:** Create a new cell below this one and put the code `random.randint?` in it, then run the cell. According to the documentation, what is the range of possible random values that might be printed?

**YOU TRY:** Create a new cell below this one and put the code `str?` in it, then run the cell. What does `str()` do, and why do you think we needed to use it in the example? *Hint:* If you aren't sure, or to check your answer, try executing the line of code *without* the `str` wrapper, that is: 
```python
print("I'm a code cell! I print a random number: " + random.randint(1,10))
```
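
The `?` syntax only works inside Jupyter/IPython. As a rough plain-Python counterpart, the sketch below reads the same documentation text with the standard library. `str.upper` is just an arbitrary example, and the variable name `doc` is our own.

```python
import inspect

# `str.upper` is an arbitrary example; any function, class or module works.
doc = inspect.getdoc(str.upper)
print(doc)

# help() prints a longer, formatted version of the same information.
help(str.upper)
```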

## 10. The Contextual Help pane

In Noteable, you can also view documentation for a function in the *Contextual Help* pane, which you can open from the menu bar using **Help > Show Contextual Help**. (The keyboard shortcut `Ctrl+i` does not seem to work consistently.)

You will see documentation for whichever function your cursor is inside of.

**Note:** Contextual help will only work for functions and variables that the Python kernel "knows" about. Essentially, this means that you must have already run the code that defines the function or variable. For example, in order to see the contextual help for the `random.choice()` function in the code below, you must have run the cell earlier in the lab that imports the `random` module.

If you are *not* seeing contextual help for a function or variable, this might be a clue that you accidentally forgot to run the code that defines it. (It doesn't *necessarily* mean that, though: there are other reasons contextual help may not appear for a function. Don't forget, you can still use `?`.)

**YOU TRY:** Activate contextual help, then click on the code cell below and move the cursor around using the arrow keys (but don't run the cell yet). Notice how the documentation changes depending on which function your cursor is inside of. 

In [None]:
for _ in range(30):
    print(random.choice(["✨","⭐","🌸","💖","🌟","💫"]), end=" ")

**YOU TRY:** Still without running the cell, can you predict what the output will be if you run it? Use contextual help to figure out what the argument `end` does. 

**YOU TRY:** Now run the cell and see if you were right. Try changing the `" "` to some other character, re-run the cell, and see if it changes the output in the way you expect.

## 11. Tab completion

Contextual help is useful for *understanding* code, but what about when you need to *write* code, and you don't know the name of the function you need? In Jupyter, you can use *tab completion* for this! 

When you've typed the name of an object followed by a `.` and your cursor is after the dot, you can hit the `Tab` key to see possible completions. You can use arrow keys or keep hitting `Tab` to cycle through all the possibilities.

Also, if you remember the start of a function's name, you can type the first few characters, then hit tab, and Jupyter will auto-complete it for you!
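
Tab completion is an interactive feature, but the built-in `dir()` function gives a similar list from code. Here is a small sketch, using `list` as an arbitrary example type (the variable `public_names` is made up for this illustration):

```python
# Names that do not begin with an underscore are the ones meant for everyday use.
public_names = [name for name in dir(list) if name[0] != "_"]
print(public_names)
```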

**YOU TRY:** Below, we define a string called `sentence` that starts with "My". In Python, strings have a built-in function that lets you check if they start with another string. Use tab completion in the second cell to find this function and check if `sentence` starts with "My". Remember to put your cursor right after the dot to use tab completion! (You will also see lots of other functions that are available for strings.)

**YOU TRY:** When you think your code is correct, run it! Don't panic if you get an error, but do see the note below.

In [None]:
sentence = "My start is My"

In [None]:
sentence. # finish me!

If you wrote and ran your code correctly, you should get the output `True`. However, some of you might have gotten an error saying that `sentence` is undefined, because you forgot to run the cell defining `sentence` before running your solution. This is just one more reminder that in Jupyter notebooks, *the order in which you execute cells matters.* This can easily lead to mistakes, and is one more reason to periodically restart the kernel and run all cells.

## 12. Debugging in Noteable

As you start writing code, you will get errors and will need to debug! Noteable has an integrated debugger, which you can access by clicking on the bug icon in the menu bar:

![](images/debug.png)

The debugger has several windows, but in this tutorial we will just focus on the top window, where you can see the values of all variables that have been defined so far.

**YOU TRY:** Open the debugger. If you've completed and run all sections of the lab so far, you should see something like the image below. The bug icon is red, indicating that the debugger is on, and there's an alphabetical list of all the variables and their values. Most of these you can ignore, but you should see the values for the `a`, `b`, and `sentence` variables that we defined above. You can also see some modules, like `random`, that have been imported. You can resize the Variables window if you want to.

![](images/variables.png)

<div class="alert alert-danger">
If you restart the kernel, you will also need to restart the debugger. 
</div>

Now we'll go through an example of using the debugger on some poorly written code.

As we mentioned previously, one thing to be very careful about with Jupyter notebooks is that variables will retain their value across cells until they are redefined or the session is restarted. This means it is easy to accidentally make errors by accessing a variable that is incorrectly defined or has been deleted. 

As an example, say your lab partner wrote the code in the two cells below, which finds the longest word in `sentence` and tries to print it out letter by letter. Your partner meant to type `j` inside the loop in the second cell, but used `i` instead.




**YOU TRY:** Look at the code in the first cell, and (before you run it) try to understand how it works. If you aren't sure what `split()` does, look up the documentation by running `str.split?` in a new cell. What do you think the value of `sentence` will be after running this code? What about `longest_word`?

**YOU TRY:** Now run both cells and check if you were right about what the first cell does, by looking at the values in the debugger. (You can click on the > next to `sentence` in the debugger to see what's in the list.)

In [None]:
# let's find the longest word!
sentence = "My start is My".split() 
longest_word = ""
for i in range(len(sentence)):
    if len(sentence[i]) > len(longest_word):
        longest_word = sentence[i]

In [None]:
# print all the letters in the longest word, one per line
for j in range(len(longest_word)):
    print(longest_word[i]) # oops, we meant to type j

Your partner accidentally used `i` instead of `j` in the loop in the second cell. But now they're wondering why it printed out "r" 5 times --- `i` isn't defined in that cell, so your partner thinks it should give an error message about an undefined variable. 

**YOU TRY:** Can you explain to your partner what happened here? You can look at the variables in the debugger if you need to.


## 13. Defensive coding


While it's hard to completely avoid problems caused by typos (like the one above), there are other ways you can pattern your code to try to avoid bugs in the first place.

We've actually created an example in this lab of a pattern you should *avoid*. We defined the variable `sentence` earlier in the lab, assigning it the value `"My start is My"`. But then, in a later cell, we used the same variable and gave it a different value. 

If we accidentally run only some of the cells, or run them out of order, we could end up later on using a value of `sentence` that's not what we expect, and it may be hard to notice that error.

To avoid this problem, a good practice to follow is:
- if you are redefining the value of a variable, try to do it within the same cell.
- if you are working across different cells, use a new variable with an appropriate name, rather than changing the value of an old variable.

This way, if you forget to run one of the cells, or run them out of order, you will get an undefined variable error, which is easy to find and fix, rather than wrong results, which are harder to notice.

For example, if we have already defined `sentence` earlier on, then it would be better to write the code for finding the longest word using the new variable `sentence_words`, as below. (We've also fixed the bug in the second cell.)

In [None]:
# let's find the longest word!
sentence_words = "My start is My".split() 
longest_word = ""
for i in range(len(sentence_words)):
    if len(sentence_words[i]) > len(longest_word):
        longest_word = sentence_words[i]

In [None]:
# print all the letters in the longest word, one per line
for j in range(len(longest_word)):
    print(longest_word[j]) # fixed: we now index with j, this loop's own variable
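
Another way to guard against this kind of bug is to avoid index variables altogether. The following is one possible sketch, not the only approach: it loops over the items directly and uses the built-in `max` with `key=len`. The function name `find_longest_word` and the variable `longest` are our own inventions.

```python
def find_longest_word(text):
    """Return the longest whitespace-separated word in text ("" if there are none)."""
    # max() with key=len returns the first of the longest words,
    # which matches how the loop above handles ties.
    return max(text.split(), key=len, default="")


longest = find_longest_word("My start is My")

# Loop over the letters directly: there is no index to mistype.
for letter in longest:
    print(letter)
```

Because the intermediate values live inside the function, they cannot leak into other cells the way `i` did in the buggy example.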

## 14. Viewing the history

Another useful command you can use in a code cell is `%history`. The `%` means this is a special Jupyter functionality called a "magic command". Magic commands allow you to access extra information or process different types of data than standard Python. In this case, `%history` shows us all the Python code which has been executed and in what order.

We won't be relying on magic commands in this course, but if you are interested, you can learn more about the (many!) available commands [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html). Also, if you're used to using the standard Python debugger (pdb) and prefer that to the Noteable debugger, you can access it with the `%debug` magic command.

**YOU TRY:** Run the cell below and use it to find where in the command history `sentence` was redefined.

In [None]:
%history

## 15. Keyboard shortcuts and advanced command mode

You can always use menus to do what you want, but when coding, it's often quicker to use keyboard shortcuts (once you learn them)!  You can use `Ctrl + Shift + h` at any time to see a complete list of keyboard shortcuts, but it's a bit overwhelming. Here are a few of the most useful ones in Noteable. Note that these might not be the same in all versions of Jupyter!

### Top keyboard shortcuts

Hopefully you're already very familiar with these from this lab!

- `Enter`/`Esc` - Enter Edit/Command Mode
- `Ctrl-Enter` - Run this cell and maintain selection
- `Shift-Enter` - Run this cell and move to the next cell

### Keyboard shortcuts in Edit mode

If you are used to using keyboard shortcuts, you are probably familiar with many of these already, such as:

- `Ctrl-x` - Cut selected region
- `Ctrl-c` - Copy selected region
- `Ctrl-v` - Paste
- `Ctrl-z`/`Shift-Ctrl-z` - Undo/Redo edit

### Keyboard shortcuts in Command mode

The commands introduced so far in this tutorial are summarized below:
- `Up`/`Down` - select the previous/next cell
- `m`/`y` - Converts a cell to Markdown/code
- `a`/`b` - Creates a new cell above/below the currently selected one

However, there are many more useful commands. Here are just a few of them:
- `Shift-Up/Down` - select multiple cells above/below
- `dd` - Deletes the selected cell (press 'd' twice in a row)
- `Shift-m`  - Merge the selected cells with the following one
- `x` - Cut selected cell(s)
- `c` - Copy selected cell(s)
- `v` - Paste cell(s)
- `z`/`Shift-z` - Undo/Redo cell action

**YOU TRY:** Use `Shift-Down` to select the following blank cells, and `Shift-m` to merge them. Then use `dd` to delete the merged cell, and `z` twice to restore the merged cell and unmerge it.

# Part 3: Tips for the remaining labs

## Using the lab computers

The computers in the Appleton Tower labs are set up with DICE, our School's version of Linux.

A few of the monitors have cables for you to plug in your own laptop, but most don't. We'd like you to use the full-sized monitors to make it easier to work with a partner (see below), so we expect that one student in each pair will need to log in using their DICE account.

You should already have a DICE account and password if you are registered for this or any other Informatics course. Please make sure you have this information with you when you attend the lab! 

If you don't have a DICE account yet, please make sure you pair up with someone who does.

You don't need to know how to use Linux for this class, but if you are interested in learning more about DICE or Linux, you can start with this quick introduction provided by our computing support team:

- [Introduction to DICE](https://opencourse.inf.ed.ac.uk/DICE-UG) (optional)

## Working with a partner

From our past experience in this course, we strongly recommend that you do the labs during your timetabled session, so that you can work with another student and get help from demonstrators. 

Please try to find a partner for the lab! Just sit down next to another student and introduce yourself, and work through the lab together. You should try to share a single keyboard and screen, and talk through each question as you go, to stay synchronized. 

Working this way will help you meet other students and learn more, since your partner may have questions or comments you hadn't thought of, and you'll learn to communicate about technical concepts, which is important for your understanding (and for the exam! and for your future job!)

For this to work well, each person should
ensure that both they *and* their partner understand what is going
on. Try to:
- Pay attention to what your partner is doing.
- Talk through the code that's provided, so you both understand how it works.
- Tell your partner, or ask them a question, if you don't understand or don't agree with their answer.
- Ask your partner if they are following along and if they agree with your answer to a question.

If neither of you knows the answer, ask one of the demonstrators in
the lab, and we can try to help you. However, *don't* expect the
demonstrator to simply tell you the answer. They may instead ask
questions or suggest ideas to help you find the answer together
with your partner.

If one person is much more familiar with Python than the other,
try putting the less experienced person at the keyboard or at least
switching frequently, so they will get more practice with basic
coding skills.


## &#127881; &#127881; Congratulations! You're done! &#x1F680; 

We know this lab had a lot of reading to do, but we hope it was useful! See you in the lab sessions next week, where we get to start working with data!