# Hello World!

- [Jupyter Notebooks](#Jupyter-Notebooks)
    - [The Kernel](#The-Kernel)
    - [Cells](#Cells)
    - [Cell Execution](#Cell-Execution)
    - [Basic Markdown Syntax](#Basic-Markdown-Syntax)
        - [Headers](#Headers)
        - [Emphasis](#Emphasis)
        - [Ordered Lists](#Ordered-Lists)
        - [Unordered Lists](#Unordered-Lists)
        - [Complex Lists](#Complex-Lists)
        - [Links](#Links)
        - [Table of Contents](#Table-of-Contents)
    - [Code Cells](#Code-Cells)
        - [Writing Python Code](#Writing-Python-Code)
        - [Python Console](#Python-Console)
    - [Navigating the Jupyter Environment](#Navigating-the-Jupyter-Environment)
        - [The Launcher](#The-Launcher)
        - [Navigating Folders](#Navigating-Folders)
        - [Saving Notebook Status](#Saving-Notebook-Status)
        - [Exporting Notebooks](#Exporting-Notebooks)
        - [Managing Active Notebooks](#Managing-Active-Notebooks)

## Jupyter Notebooks

Jupyter Notebooks provide an interactive computational environment for creating and sharing reproducible computational workflows and analyses. The name "Jupyter" is a reference to the three original programming languages (Julia, Python, and R) supported by the Jupyter Notebook project.

### The Kernel

The kernel is an instance of a programming language running in a Jupyter Notebook. It is responsible for executing the code in the notebook. Look at the top right corner of the notebook interface. Since we are using Python, you should see something like "Python 3". If you are not connected to a kernel, you will see "No Kernel" instead. When you see "No Kernel", all you have to do is click it and you will be shown a list of available kernels to select from. 

### Cells

A cell is a container for text to be displayed in the notebook or code to be executed by the kernel. As such, there are two types of cells:

1. **Markdown cells** - these contain text that is displayed in the notebook. We can format the text using Markdown syntax ([ref 1](https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet), [ref 2](https://daringfireball.net/projects/markdown/syntax)).
2. **Code cells** - these contain the code that is executed by the kernel.

A new cell can be added to the notebook with the "+" button at the top left of the notebook tab. This adds a new cell below the currently selected cell. The cell type (code or markdown) of the currently selected cell is displayed at the top of the tab and can be changed from there. You can also use the "y" (**code**) and "m" (**markdown**) keys to set the type of the selected cell. Note that these shortcuts only work when the cursor is not in the cell.

### Cell Execution

A selected cell can be executed using the play button at the top of the tab or, more conveniently, with the "Shift + Enter" key combination.

### Basic Markdown Syntax

#### Headers 

The # symbol preceding text denotes a header in Markdown. The number of # symbols indicates the header level.

# Header 1

## Header 2

### Header 3

#### Header 4

#### Emphasis

Wrapping text in the * symbol emphasizes it in Markdown: one * for *italic*, two for **bold**, and three for ***bold and italic*** text.

#### Ordered Lists 

1. First item in the list
2. Second item in the list
3. Third item in the list
    1. First sub item 
    2. Second sub item

#### Unordered Lists

- An Item 
- Another Item
- Yet Another Item
    - An item within an item
    - Another item within an item

#### Complex Lists

1. First item in the list
2. Second item in the list
3. Third item in the list
   - Unordered item within an item
   - Another unordered item within an item

- An Item 
- Another Item
- Yet Another Item
    1. First item within an item
    2. Second item within an item

#### Links

URLs and email addresses in markdown cells are automatically detected and stylized by the notebook. A link with a text label can be added using the format \[Label\]\(Link\). For example, here's the link to the [Week 1 Lab GitHub Repository](https://github.com/UM-CSS/CSSLabs-NLP)

#### Table of Contents

You can also use the markdown link format to link to sections denoted by headers, which lets you build a table of contents manually. Note the special construction of the section links in the TOC below: there is no space between # and the header text, and spaces within the header text are replaced by "-".

- [Jupyter Notebooks](#Jupyter-Notebooks)
    - [The Kernel](#The-Kernel)
    - [Cells](#Cells)
    - [Cell Execution](#Cell-Execution)
    - [Basic Markdown Syntax](#Basic-Markdown-Syntax)
    - and so on.

Jupyter now has a built-in document outline and TOC. However, it is not added to the document when the notebook is exported to other file formats (e.g., pdf, html, latex), so building a TOC manually is still worth learning.
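The link construction described above can be sketched in Python. This is a minimal illustration and not part of Jupyter itself: the helper names `header_to_anchor` and `build_toc` are hypothetical, and the sketch assumes header text made up only of letters, digits, and spaces.

```python
def header_to_anchor(header_text):
    """Return a section link target: '#' followed by the text with spaces replaced by '-'."""
    return "#" + header_text.strip().replace(" ", "-")


def build_toc(headers):
    """Build Markdown TOC lines from (level, text) pairs; level 1 is the outermost entry."""
    lines = []
    for level, text in headers:
        indent = "    " * (level - 1)  # four spaces per nesting level
        lines.append(f"{indent}- [{text}]({header_to_anchor(text)})")
    return "\n".join(lines)


# Hypothetical input: a few of the headers used in this notebook
headers = [
    (1, "Jupyter Notebooks"),
    (2, "The Kernel"),
    (2, "Cell Execution"),
]
print(build_toc(headers))
```

The printed lines can be pasted into a markdown cell to serve as a TOC.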

### Code Cells

In Jupyter, a code cell can be recognized by the square brackets to the left of the cell. An unexecuted code cell has **[ ]:**, while an executed code cell has **[integer]:**. The integer within the square brackets in a code cell denotes the order of execution. At any given moment, the code cell with the highest number is the one that has been executed last. Sometimes code cells take time to finish executing. A code cell that is still executing or is waiting to execute will have **[\*]:** next to it.

<font color='red'>Caution: </font> Since Jupyter is an interactive environment, you can execute code cells in any order. Further, a code cell can be executed any number of times. It is very common to end up with an inconsistent or unexpected notebook state after losing track of the order in which cells were executed. In general, we recommend the following:

1. Execute code cells in order.
2. When a change is made to a code cell at some intermediate point in the notebook, execute the entire notebook from the first code cell.
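The effect of repeated or out-of-order execution can be illustrated with a small simulation. This is only a sketch: the dictionary `state` and the functions `cell_1` and `cell_2` are hypothetical stand-ins for the kernel's memory and two code cells.

```python
state = {}  # stands in for the variables the kernel keeps in memory


def cell_1():
    state["total"] = 10  # first cell: create the variable


def cell_2():
    state["total"] = state["total"] * 2  # second cell: update the variable in place


# Running the cells once, in order, gives the expected value.
cell_1()
cell_2()
in_order_result = state["total"]
print(in_order_result)  # 20

# Running the second cell again changes the value, even though no code was edited.
cell_2()
rerun_result = state["total"]
print(rerun_result)  # 40
```

Executing the entire notebook from the first code cell corresponds to calling `cell_1` again before `cell_2`, which resets `total` to a known value.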

#### Writing Python Code

Let's write some very basic Python code in the next few cells, starting with the **print** function. **print** is used to display text and the values of variables (a variable is a container for a value) in an output cell.

In [13]:
# print simple text
print("This is the first output")

This is the first output


Notice that you can use the # symbol to add a text comment to the code cell.

In [16]:
# print some text given as separate values
print("This is the ",2,"nd output", sep="")

This is the 2nd output


The **sep** argument tells Python what to place between the multiple values (separated by commas) given as input. The default **sep** is a single space.
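A few more calls may help show the effect of **sep**. The values printed here are made up for illustration.

```python
# Default separator: a single space is placed between the values.
print("Week", 1, "Lab")

# Custom separator: a hyphen is placed between the values.
print("a", "b", "c", sep="-")

# Empty separator: the values are joined with nothing in between.
print("Week", 1, "Lab", sep="")
```

The three calls print `Week 1 Lab`, `a-b-c`, and `Week1Lab`, respectively.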

In [18]:
# print text together with the value of a Python variable
output_number = 3
print("This is the ", output_number, "rd output", sep="")

This is the 3rd output
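Another common way to combine text with the value of a variable is a formatted string literal (f-string), available in Python 3.6 and later. A short sketch, reusing the variable name from the cell above:

```python
output_number = 3

# Curly braces inside an f-string are replaced by the value of the expression they contain.
message = f"This is the {output_number}rd output"
print(message)
```

This prints the same line as the **sep**-based version, without needing to manage separators.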


Usually, we want to write more complex Python code to perform reusable chunks of computation. For this, we use Python **functions**. In the next code cell, we have written a Python function that counts how many times each letter occurs in a given string.

In [19]:
def count_letters(text): # a function is defined using the format def function_name(input1, input2, ...):
    counts = dict() # create an empty dictionary of key-value pairs
    for token in text: # iterate through each character in the string
        if token.isalpha(): # is the character a letter?
            low_char = token.lower() # convert the character to lowercase
            if low_char in counts: # if we have already encountered this letter
                counts[low_char] = counts[low_char] + 1 # increment the count for the letter by 1
            else: # if this is the first time the letter is encountered
                counts[low_char] = 1 # set the count for the letter to 1
    return counts # return the dictionary as output
                

Let's run our function on some string. 

In [25]:
input_string="The university consists of nineteen colleges and offers degree programs at undergraduate, graduate and postdoctoral levels in some 250 disciplines"
count_letters(input_string)

{'t': 9,
 'h': 1,
 'e': 18,
 'u': 4,
 'n': 10,
 'i': 8,
 'v': 2,
 'r': 9,
 's': 12,
 'y': 1,
 'c': 4,
 'o': 9,
 'f': 3,
 'l': 6,
 'g': 5,
 'a': 9,
 'd': 8,
 'p': 3,
 'm': 2}

Note how the value returned by the function is displayed automatically. Jupyter displays only the value of the last expression in a code cell, whereas anything passed to **print** is always shown. If a code cell produces several values we want to view, some of them can be assigned to **variables**, which are containers for values, and displayed later.
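A small sketch of what a cell displays, using made-up variable names. When the lines below are run as a single notebook cell, both **print** calls produce output, while of the two bare expressions only the last one is displayed automatically.

```python
word = "notebook"
length = len(word)

print(word)    # shown: print always writes to the output cell
word.upper()   # not shown: a bare expression that is not the last line of the cell
print(length)  # shown
length * 2     # shown in Jupyter because it is the last expression in the cell
```

Outside Jupyter, for example in a plain Python script, the bare expressions produce no output at all.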

In [26]:
counts = count_letters(input_string)

In [27]:
print(counts)

{'t': 9, 'h': 1, 'e': 18, 'u': 4, 'n': 10, 'i': 8, 'v': 2, 'r': 9, 's': 12, 'y': 1, 'c': 4, 'o': 9, 'f': 3, 'l': 6, 'g': 5, 'a': 9, 'd': 8, 'p': 3, 'm': 2}
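For comparison, Python's standard library offers `collections.Counter`, which does the same kind of tallying in fewer lines. The sketch below is an alternative, not the function used in this lab, and the name `count_letters_with_counter` is hypothetical.

```python
from collections import Counter


def count_letters_with_counter(text):
    """Count how often each letter occurs in text, ignoring case and non-letter characters."""
    letters = (char.lower() for char in text if char.isalpha())
    return dict(Counter(letters))


print(count_letters_with_counter("Hello, World!"))
```

Writing the loop by hand, as in the cell above, makes each step visible, while `Counter` is the more compact choice once the logic is familiar.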


#### Python Console

Sometimes it becomes cumbersome to create (and later delete) multiple new code cells just to look at different variables. Instead, you can use a Python console attached to the notebook. You can open a Python console by right-clicking anywhere on the notebook (except on a selected cell) and selecting **New Console for the Notebook**.

<font color='red'>Caution: </font> Any variables that you create or any changes to variables or functions that you make on the console affect the notebook status. In general, we recommend using the console only to quickly view values of variables. 
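Quick, read-only checks in the console might resemble the sketch below. The dictionary `counts` is a small stand-in defined here so the example runs on its own. In the console you would inspect the variable already created by the notebook instead of redefining it.

```python
counts = {"t": 9, "h": 1, "e": 18}  # stand-in for a variable created in the notebook

print(type(counts))                  # what kind of object is it?
print(len(counts))                   # how many entries does it hold?
print(counts["e"])                   # look up a single value
print(max(counts, key=counts.get))   # which key has the largest value?
```

None of these calls modify `counts`, which keeps the notebook state unchanged.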

### Navigating the Jupyter Environment

#### The Launcher

The Jupyter environment starts up with one default tab called the **Launcher**. As we saw at the beginning of the lab, it can be used to create new files such as a **notebook** or a **markdown** file, or to open a Python **console** or a **terminal** on the server where our environment is hosted.

#### Navigating Folders

The **folder symbol** in the toolbar in the left corner of the Jupyter environment opens a view that shows your folder structure on the server. You can navigate this structure the same way you would navigate files and folders on your personal computer. From this view you can:

1. Create a new file or notebook: Use the "+" at the top left of the folder view. This opens a new launcher.
2. Create a new folder: Use the **folder** symbol with a "+" sign. An untitled folder is created immediately at the current location in the folder structure.
3. Upload a file from your computer: The **upward arrow** symbol opens a file dialog window that allows you to upload a file from your personal computer.

#### Saving Notebook Status 

If a notebook has unsaved changes, there will be a **grey filled-in circle** icon next to its name in the tab. Changes made to a notebook can be saved using the **floppy disk** button at the top left of the notebook tab or using the shortcut **Ctrl + s**.

#### Exporting Notebooks

A notebook can be exported into more accessible forms such as **pdf** and **html** that don't require a Jupyter environment to read. To export the notebook, select the **File** menu at the top left and then choose **Export Notebook as**.

<font color='red'>Caution: </font> Exporting only converts the notebook as it currently appears. It does not execute any unexecuted code cells or rerun the notebook from scratch. Make sure the notebook shows the correct outputs before exporting.

**Note**: The labs ask you to upload your completed notebooks as either **pdf** or **html**. Exporting to **pdf** is known to sometimes fail with errors. If that happens, just export to **html** for the submission.

#### <font color='red'>Managing Active Notebooks</font>

You can open a notebook by double-clicking it in the folder view. When the notebook is opened, it is immediately activated by connecting to a kernel. A Jupyter environment can have multiple simultaneously active notebooks. Active notebooks have a little **green circle icon** next to them in the folder view.

Each open notebook requires a separate kernel, which takes up resources on the server, much like tabs in a web browser. It is good practice to shut down any notebooks that we are not currently using, as the resources of the server are shared among everyone.

There are different ways to shut down notebooks.

1. Right-click the notebook file in the folder view and select **Shut Down Kernel**.

2. Select the **Stop icon** in the toolbar in the left corner of the Jupyter environment. This opens up a view showing open tabs, kernels, and terminals. Even if the corresponding notebook is not open as a tab, the open kernels maintain the status of the notebook.  You can close individual notebooks by hovering over their name and selecting **X** or close all open kernels by using **Shut Down All**. 

<font color='red'>When you are done with your work, make sure to close all notebooks and terminals using the second option above and log out of the environment with **File > Log out**</font>

