---

### If you are reading this notebook on GitHub, please go to [README](./README.md) and follow the installation instructions to set everything up locally. This is an interactive notebook and you need a local setup to execute the cells. 

---

# Welcome to Jupyter!

In this notebook, you will learn what Jupyter notebooks are, get a refresher on Python, and learn some basic `git` operations that may be useful for future assignments. At the end of the notebook you will complete a simple function and make your first submission.

# Table of contents

0. [Verify setup](#setup)
1. [What is Jupyter?](#jupyter)
2. [Interface and Hotkeys](#interface)
3. [Python Refresher](#python)
4. [Git Basics](#git)
5. [Your first Submission](#submission) **(GRADED!)**
6. [Summary](#summary)

<a name="setup"/></a> <!-- link used in table of contents -->
## Verifying local setup

First, let's make sure you have installed the correct versions of all of the libraries.

Simply run the cell below. You can click on the cell and press `⇧↩` (Shift + Enter) to run it.

In [None]:
%run helpers/verify_config.py

If you see any warning/error messages, please make sure you have followed the installation instructions in the [README](./README.md). If you can't resolve them, please check out the Piazza thread dedicated to "Assignment 0".

---

<a name="jupyter"/></a> <!-- <-this link used in table of contents -->
## What are Jupyter notebooks?

![Jupyter Logo](https://jupyter.org/assets/nav_logo.svg)

Jupyter notebooks are an interactive tool for iterative development and prototyping. When learning new concepts it is helpful to be able to look at intermediary results, take notes in Markdown/LaTeX, and have visualizations. All this is possible to achieve using jupyter notebooks.

In some of the future assignments, notebooks will be used to guide you through the assignment. The main purpose of using notebooks is to reduce the steep learning curve, help you get started quickly by walking you through basic examples, and expose you to APIs used in the assignments. This way, you can focus on understanding the concepts rather than digging down into the implementation details of unrelated bits of code. Jupyter is also very useful when you are debugging things, since you can easily check/modify the content of your variables, without having to rerun all of your code.

### Cell types 
Every jupyter notebook consists of cells. Cells come in different types. For the most part, you need to just know two types of cells: Markdown and Code. You can change the type of a cell by clicking on the cell and selecting the type in the toolbar above (Cell>Cell Type). By default all new cells are `Code` type.

In **Markdown** cells you can provide comments, format them, and write LaTeX. This cell you are reading right now is a [Markdown](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) cell. To edit it, you can select it and press 'Enter' or just double click it with your mouse. Once you are done editing it, you can press `⇧↩` (Shift + Enter) to render it as Markdown.

In **Code** cells you can write code. You can write single or multiple lines of code and you can execute them by pressing `⇧↩` (Shift + Enter).

### Basic example

Now, let's start with some simple examples and write some basic python. Use `⇧↩` (Shift + Enter) to run the cells below. You can either use a mouse to select the cells or use arrow keys to move the selection window consecutively over cells (you will learn more about how to navigate soon).

In [None]:
1+1

In [None]:
a = 10

In [None]:
a

In [None]:
a + 5

In [None]:
a

Note that you don't need to write
```python
print(a)
```
Simply executing the variable will output its value. This only works for the last expression in a cell. To print multiple variables, you need to call `print` explicitly.

In [None]:
x = 1
y = 2
z = 3
x
y
z

In [None]:
print(x)
print(y)
print(z)
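
If you would rather not call `print` several times, one option is to make the last expression of the cell a tuple, since the notebook displays the value of whatever expression comes last. A minimal sketch (the name `last_expression_value` is only used for illustration; in a notebook you could simply end the cell with `x, y, z`):

```python
x = 1
y = 2
z = 3
# In a notebook, ending the cell with the bare expression `x, y, z` would
# display the tuple. It is bound to a name here so the snippet also works
# as a plain script.
last_expression_value = (x, y, z)
print(last_expression_value)
```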

One important aspect of jupyter notebooks is that cells can be executed out-of-order. This can be confusing to newcomers. 

Let's walk through a simple example:

In [None]:
# Initialize counter to zero
counter = 0

Run the `counter += 1` cell multiple times:

In [None]:
# Run this cell multiple times
counter += 1

Let's see what the counter is now:

In [None]:
counter

Thus, it is not the order in which the cells are presented, but rather the order in which they are executed, that determines the current variable state. Now let's talk about what the kernel does, and what saving the notebook means.

### Kernels

The Kernel is a computational engine which runs behind the notebook and executes code. Jupyter supports over 40 programming languages, including Python, R, Julia, and Scala. The associated kernels can be set up separately and you can also switch between the different kernels, but we won't need to do it in this class as we will only use the Python kernel.

In the Kernel tab in the toolbar above you should be able to see multiple commands that you can send to the kernel. Here are descriptions of some of them:

* **Interrupt** - stops the execution of the code (helpful when you have a long operation but forget to change some parameter, or in case you have an infinite loop)
* **Restart** - clears up all of the variables and releases memory
* **Restart & Clear Output** - restarts the kernel + clears all of the cell outputs in the notebook
* **Restart & Run All** - restarts the kernel + executes all of the cells

It's important to note that kernel state cannot be saved, i.e. if you stop/restart the kernel you will need to rerun all of the cells to define the variables. **Restart & Clear Output** is often the best option for beginners to use when restarting the kernel, since it is easy to get confused by the previous outputs. 

For example, consider the following situation:

<img src="./misc/uncleared_cell.png" alt="uncleared_cell" width="700"/>

Here, the first cell containing the variable declaration was executed before the kernel was restarted. However, the output associated with the previous kernel state was not cleared when restarting. Once the kernel was restarted, running the second cell produced an error despite the first cell's output indicating that the variable was defined. In order to access any variable you need to make sure the variable is defined in the current kernel state, even if you see some output associated with the cell that defines the variable.
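
If you are unsure whether a name exists in the current kernel state, you can check directly instead of trusting old output. The sketch below uses a hypothetical variable called `demo_variable`; accessing an undefined name raises `NameError`, which is the same error shown in the screenshot. In a notebook, the `%who` magic command lists the variables that are currently defined.

```python
# `demo_variable` is a hypothetical name used only for this illustration.
try:
    demo_variable
    defined_before = True
except NameError:
    # The name does not exist in the current state yet
    defined_before = False

demo_variable = 10  # running the defining code adds the name to the state

try:
    demo_variable
    defined_after = True
except NameError:
    defined_after = False

print(defined_before, defined_after)
```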

---

### Signatures and Docstrings

Here are a couple of other things you can do in Jupyter. You will find these particularly handy when you start working on the assignments and want to learn the signatures, docstrings and/or source code of certain classes or functions.

Run the next two cells:

In [None]:
# Let's define a function
def print_something(n_times=2):
    """This function shows a greeting message!"""
    assert n_times > 0
    for _ in range(n_times):
        print("Hello from Jupyter!")

In [None]:
# Add ? at the end of the function to see its docstring
print_something?

In [None]:
# Same could be achieved if you place the cursor at the end of function name and press ⇧⇥ (Shift + Tab)
print_something

There are times you might want to look into the source code of a class/function. You can do it by adding `??` after the class/function name.

In [None]:
print_something??

It also works for any classes/functions defined outside the notebook.

In [None]:
import numpy as np
np.ones??

In [None]:
# Another way is to place a cursor at the end of the function/class name 
# and press ⇧⇥⇥ (Shift + Tab + Tab) to show you the docstring
np.ones
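
The `?` syntax only works inside Jupyter/IPython. If you ever need the same information from a plain Python script, the standard library `inspect` module offers something similar. A minimal sketch, reusing the `print_something` function from above (the variable names `signature_text` and `docstring_text` are only for illustration):

```python
import inspect


def print_something(n_times=2):
    """This function shows a greeting message!"""
    assert n_times > 0
    for _ in range(n_times):
        print("Hello from Jupyter!")


# Signature of the function, including default values
signature_text = str(inspect.signature(print_something))
# Cleaned-up docstring of the function
docstring_text = inspect.getdoc(print_something)

print(f"print_something{signature_text}")
print(docstring_text)
```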

---

### Magic commands

Jupyter has a number of [magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html) you can use (to list all of the magic functions run: `%lsmagic`). You already used one of them above. It was `%run script_name`, which can be used to run `*.py` scripts inside the notebook. This is similar to running ``python script_name`` in the terminal, except that the variables defined by the script remain available in the notebook afterwards. 

Let's try out the `%time` magic command, which measures the execution time of a cell or line of code. It will be useful in future assignments. For example if you are optimizing the execution speed of your code, you can use this magic command to time different approaches. Here are two ways you can use `%time`:

* `%time` - to time a single line of code; **OR**
* `%%time` - to time the whole cell

Here is an example detailing how `%time` can be used:

In [None]:
import random

Let's create a list of random integers and time it:

In [None]:
%%time 
# will time whole cell
random_integers = []
for _ in range(10000):
    random_integers.append(random.randint(0,100))

Now, let's time the same operation using list comprehension instead:

In [None]:
%time random_integers = [random.randint(0,100) for _ in range(10000)] # will time only this line

In the next example, we use `%timeit` to compare a standard Python implementation and a NumPy implementation of scalar multiplication on a 1D array. Note that `%timeit` and `%time` are slightly different: `%timeit` runs the statement many times and reports averaged statistics, while `%time` runs it once. You can read about the differences and about other magic commands [here](https://ipython.readthedocs.io/en/stable/interactive/magics.html).

In [None]:
n_items = 1000000
# Create two identical arrays
integer_list = list(range(n_items))
integer_array = np.arange(n_items)
# make sure both 1D arrays are exactly the same
assert np.allclose(integer_list, integer_array, rtol=0.0)

In [None]:
def scalar_mult_1D(somelist):
    # inplace operation
    for index, value in enumerate(somelist):
        somelist[index] = 2 * value
    return somelist

In [None]:
def np_scalar_mult_1D(nplist):
    return 2*nplist

In [None]:
%timeit scalar_mult_1D(integer_list)

In [None]:
%timeit np_scalar_mult_1D(integer_array)

In [None]:
integer_list = list(range(n_items))
integer_array = np.arange(n_items)
integer_list_2x = scalar_mult_1D(integer_list)
integer_array_2x = np_scalar_mult_1D(integer_array)
# make sure the resultant arrays are the same
assert np.allclose(integer_list_2x, integer_array_2x, rtol=0.0)

As you can see, for this simple task the pure Python loop over a list is much slower than NumPy (roughly 100x, although the exact ratio depends on your machine). You will learn more about NumPy later on in the class. For now, we will stick to standard Python.
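
The `%time` and `%timeit` magics are only available inside Jupyter/IPython. Outside a notebook, the standard library `timeit` module can be used for the same purpose. The sketch below is only an illustration: `double_with_loop` is a hypothetical helper that returns a new list rather than modifying its input, so repeated timing runs always see the same data.

```python
import timeit


def double_with_loop(values):
    """Return a new list with every value doubled (the input is not modified)."""
    return [2 * value for value in values]


values = list(range(1000))
# number=100 calls the function 100 times and returns the total time in seconds
total_seconds = timeit.timeit(lambda: double_with_loop(values), number=100)
print(f"average per call: {total_seconds / 100:.6f} s")
```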

---

<a name="interface"/></a> <!-- <-this link used in table of contents -->
# Interface and Hotkeys

To learn more about the interface and about jupyter notebooks in general, check out the `Help` section in the menu.

Here is the TL;DR version of what you should know.

---
## Modal editor

There are two modes in Jupyter: edit mode and command mode.


### Edit
Edit mode is indicated by a green cell border and a prompt showing in the editor area:

![edit_image.png](https://nbviewer.jupyter.org/github/ipython/ipython/blob/3.x/examples/Notebook/images/edit_mode.png)

You can press **Enter** to enter edit mode or click with a mouse in the editable area.

### Command
Command mode is indicated by a grey cell border:

![command_image.png](https://nbviewer.jupyter.org/github/ipython/ipython/blob/3.x/examples/Notebook/images/command_mode.png)

Use **Esc** to enable command mode or click outside the editable area of a cell with a mouse.

When you are in command mode, you can edit the notebook as a whole, but you cannot type into individual cells. Most importantly, the keyboard is mapped to a set of shortcuts that let you perform notebook and cell actions efficiently. For example, if you are in command mode and you press `c`, you will copy the current cell - no modifier is needed.


## Hotkeys

There are a lot of useful hotkeys in jupyter. You can see them all in (Help > Keyboard Shortcuts).

Here is a list of the ones that will save you the most time. You can use these in both command and edit mode and they will perform the same action.

* **Shift + Enter** - Run cell + select next cell
* **Alt + Enter** - Run cell + insert new cell below

Here are hotkeys you can use in command mode only.
* **a** - Insert cell above
* **b** - Insert cell below
* **Shift + Up-Down Arrow key** or **Shift + mouse click cells** - select multiple cells.
* **x** - Cut the selected cell(s)
* **c** - Copy the selected cell(s)
* **v** - Paste the copied/cut cell(s) below
* **m** - Convert the cell to markdown (after you run the markdown cell, you can double click it to edit)
* **y** - Convert the cell back to code

<a name="python"/></a> <!-- link used in table of contents -->
## Python

If you are not at all familiar with Python, start learning! Although you can use any tutorial you like, here are a couple to get you started:
* [A beginner’s guide](https://wiki.python.org/moin/BeginnersGuide)
* [An interactive track in Codecademy](https://www.codecademy.com/learn/learn-python-3)

Below are a number of cells you can run to get a quick refresher on Python.

#### int/float operations

In [None]:
some_int = 2
some_float = 4.

In [None]:
some_int + some_float

In [None]:
some_int * some_float

In [None]:
some_int / some_float

In [None]:
2/4 # dividing two ints with / results in a float

#### string operations

In [None]:
some_string = "Hello "

In [None]:
some_string + "world!" # you can concatenate strings with + operator

In [None]:
some_string

In [None]:
some_string[:2]

In [None]:
some_string[-2:]

In [None]:
some_string[-3:-1]

#### Array/List operations

In [None]:
some_array = [] # new empty array/list

In [None]:
some_array

In [None]:
some_array.append(11)
some_array.append(12)
some_array.append(13)
some_array.append(20)

In [None]:
some_array

In [None]:
some_array.pop()

In [None]:
some_array

In [None]:
some_array[1] # indexing lists starts from 0

In [None]:
some_array[1] = 22

In [None]:
some_array # lists are mutable

#### tuples

In [None]:
some_tuple = (1,2,3) 

In [None]:
some_tuple[1]

In [None]:
some_tuple[1] = 123 # tuples are immutable (you will get an error)

In [None]:
# Tuples cannot be modified in place. One way to get a changed copy is to convert to a list and back to a tuple
temp_list = list(some_tuple)
temp_list[1] = 123
some_tuple = tuple(temp_list)
some_tuple
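
Another option, shown here only as a sketch, is to build a new tuple from slices of the old one. The original tuple is left untouched; `updated_tuple` is a name chosen just for this example.

```python
some_tuple = (1, 2, 3)
# Concatenate the part before index 1, a one-element tuple, and the part after
updated_tuple = some_tuple[:1] + (123,) + some_tuple[2:]
print(some_tuple)
print(updated_tuple)
```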

#### dictionaries

In [None]:
some_dict = {}
some_dict['name'] = "Alice"
some_dict['age'] = "21"
some_dict

In [None]:
some_dict['name']

In [None]:
some_dict.keys()

In [None]:
some_dict.values()

In [None]:
some_dict.items() # you can iterate through these

In [None]:
for key, value in some_dict.items():
    print(f"{key} - {value}")

#### loops

In [None]:
# For loops
for i in [10,20,30,40]:
    print(i)

In [None]:
# More for loops
for i in range(5):
    print(f"square of {i} is {i**2}")

In [None]:
# While loops
i=0
while i < 5:
    print(f"square of {i} is {i**2}")
    i+=1

In [None]:
some_array = []
for i in range(5):
    some_array.append(i**3)
some_array

#### list comprehensions

In [None]:
some_array = [i**3 for i in range(5)]
some_array

In [None]:
even_numbers = [i for i in range(25) if i%2 == 0]
even_numbers

#### functions

In [None]:
def some_function(some_arg):
    print(f"Argument is: {some_arg}")

In [None]:
some_function(1)

In [None]:
def some_function(some_arg=42): # default values
    print(f"Argument is: {some_arg}")

In [None]:
some_function()

#### classes

In [None]:
class SomeClass():
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def get_a(self):
        return self.a
    def get_b(self):
        return self.b

In [None]:
my_object = SomeClass(10,100)
my_object.get_b()

#### passing functions as arguments

In [None]:
def some_function():
    return 42

def print_function_output(fn):
    print(fn())

print_function_output(some_function) # note that some_function is passed without parentheses, so it is not called on this line

#### unpacking

In [None]:
# Using asterisk to pack the rest of the output
def foo():
    return 1,2,3,4,5,6,7

a, *_ = foo() # in case you are interested only in the first value returned
a
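
The starred name does not have to come last. As a small sketch, the same function can be unpacked into the first value, the last value, and everything in between (the names `first`, `middle` and `last` are chosen only for this example):

```python
def foo():
    return 1, 2, 3, 4, 5, 6, 7


# The starred name collects whatever is not matched by the other names
first, *middle, last = foo()
print(first, middle, last)
```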

#### Nobody is perfect. A few Python quirks

In [None]:
def foo(a=[]):
    a.append(1)
    print(a)

In [None]:
foo()

In [None]:
foo()

In [None]:
foo()

For more details and how to avoid it, see: [https://docs.python-guide.org/writing/gotchas/](https://docs.python-guide.org/writing/gotchas/)
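
The usual way around this quirk is to use `None` as the default value and create the list inside the function. A minimal sketch, using a hypothetical variant called `foo_fixed` that returns the list instead of printing it:

```python
def foo_fixed(a=None):
    """Variant of foo that creates a fresh list on every call."""
    if a is None:
        a = []  # a new list is created each time no argument is given
    a.append(1)
    return a


first_call = foo_fixed()
second_call = foo_fixed()
print(first_call, second_call)
```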

Here is another quirk:

In [None]:
a = 0
for i in range(10000000):
    a += 0.1
print(a)

The problem is that 0.1 cannot be represented exactly in binary floating point. The exact decimal value of the binary approximation stored by the machine is:
```python
>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
```
The error in the approximation accumulates as you sum up many 0.1s. See here for more detail: https://docs.python.org/3.4/tutorial/floatingpoint.html
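
In practice this means you should avoid comparing floats with `==`. The sketch below shows two standard library tools that may help: `math.isclose` for tolerant comparison and `math.fsum` for more accurate summation. The variable names are only for illustration.

```python
import math

# Add 0.1 ten times in a plain loop; the rounding errors accumulate
naive_total = 0.0
for _ in range(10):
    naive_total += 0.1

# fsum tracks the lost low-order bits while summing
accurate_total = math.fsum([0.1] * 10)

print(naive_total == 1.0)
print(math.isclose(naive_total, 1.0))
print(accurate_total == 1.0)
```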

And yet another quirk, that students have trouble with quite often in the assignments for this class:

In [None]:
a = [1,2]
b = a
b.append(3)
print(a)
print(b)

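The last case happens because assignment does not copy a list: after `b = a`, both names refer to the same list object, so a change made through one name is visible through the other. You can avoid it using the following technique: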

In [None]:
import copy
a = [1,2]
b = copy.copy(a)
b.append(3)
print(a)
print(b)
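
Keep in mind that `copy.copy` makes a shallow copy: the outer list is new, but any lists nested inside it are still shared. For nested structures, `copy.deepcopy` copies everything. A small sketch (the names `nested`, `shallow` and `deep` are only for illustration):

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0].append(99)

print(shallow[0])  # changed, because the inner list is shared with `nested`
print(deep[0])     # unchanged, because the inner list was copied
```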

### Debugger

You can use debuggers within jupyter. There are two available by default: `%pdb` & `%debug`. We won't be covering those here. There are multiple tutorials available online that you can reference.
* https://davidhamann.de/2017/04/22/debugging-jupyter-notebooks/
* https://www.blog.pythonlibrary.org/2018/10/17/jupyter-notebook-debugging/

There are also visual third party debuggers that you can try: 
* https://www.analyticsvidhya.com/blog/2018/07/pixie-debugger-python-debugging-tool-jupyter-notebooks-data-scientist-must-use/

<a name="git"/></a> <!-- <-this link used in table of contents -->
## Git - Keeping your code up to date!

After cloning, we recommend creating a branch and doing your development on that branch:

`git checkout -b develop`

(assuming develop is the name of your branch)

If the TAs push out an update to the assignment, you should commit (or stash if you are more comfortable with git) the changes that are unsaved in your repository:

`git commit -am "<some funny message>"`

Then fetch the latest master branch from the remote:

`git fetch origin master`

This updates the remote-tracking branch `origin/master` (your local `master` branch is not changed by a fetch). Now try to merge it into your development branch:

`git merge origin/master`

(assuming that you are on your development branch)

While on your development branch you can also just run:

`git pull origin master`

which will perform a fetch and then a merge into the branch you are on.

There are likely to be merge conflicts during this step. If so, first check which files are in conflict:

`git status`

The files in conflict are the ones listed under "Unmerged paths". Open these files using your favourite editor and look for the conflict markers `<<<<<<<`, `=======` and `>>>>>>>`. Resolve conflicts in whatever way you deem best. You can use special software like [Sublime Merge](https://www.sublimemerge.com/) to do so. Once you have resolved all conflicts, stage the files that were in conflict:

`git add -A .`

Finally, commit the new updates to your branch and continue developing:

`git commit -am "<funny message vilifying TAs for the update>"`

---

<a name="submission"/></a> <!-- <-this link used in table of contents -->
## First submission

For some (not all) future assignments, you will implement your code using a jupyter notebook. In order for us to grade your code, you must export your code into a submission file and then submit that file to Gradescope. We will see an example of this below.

Once you are done with an Assignment or have reached a submission checkpoint, there are two ways you can export your code to the submission file:
1. Simply run a cell with the following code
```python
%run helpers/notebook2script
```
2. Or run `python helpers/notebook2script.py` from your terminal, while in the assignment repo.

The script will parse the notebook, extract the cells that contain the `#export` comment on top of them and copy them to a `submission.py` file. The jupyter notebooks we provide for any Assignment will already include the `#export` comment at the top of any cells that contain skeleton code for code that needs to be submitted. If you add any cells to the notebook, you will need to add the `#export` line to ensure it gets exported to the submission file. Adding cells is usually not required, but may be useful if you are adding helper functions or additional classes to help decompose your code. You should verify that the final submission file contains all of the functions you were required to implement as well as any helper functions/classes you've added and then submit it to Gradescope for grading.
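
To make the idea more concrete, here is a rough sketch of how cells marked with `#export` could be pulled out of a notebook. This is not the code of `helpers/notebook2script.py`; the function `extract_export_cells` and the tiny `demo_notebook` are hypothetical and only illustrate the principle. It assumes the standard notebook JSON layout, where each cell has a `cell_type` and a `source`; a real `.ipynb` file could be loaded into such a dictionary with `json.load`.

```python
def extract_export_cells(notebook):
    """Return the source of every code cell whose first line is '#export'."""
    exported = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        if lines and lines[0].strip().replace(" ", "") == "#export":
            # Keep everything below the marker line (an assumption of this sketch)
            exported.append("\n".join(lines[1:]))
    return exported


# Tiny in-memory notebook used only for this illustration
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code",
         "source": ["#export\n", "def return_name():\n", "    return None"]},
        {"cell_type": "code", "source": ["print('not exported')"]},
    ]
}

exported_sources = extract_export_cells(demo_notebook)
print(exported_sources)
```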

Let's walk through an example:

In [None]:
#export
def return_name():
    """Returns student name"""
    #TODO return your name as a string
    return None

In [None]:
print(f"Hello {return_name()}")

**Save your notebook!** and run:

In [None]:
%run helpers/notebook2script

Then, run the next cell to see the content of the `submission.py` file. Or simply open it in an editor of your choice. Note that the `%load` magic command will load the contents of the file into the same cell (just beneath the command). It also comments out the load statement itself, which will need to be uncommented to reload the file with new contents.

In [None]:
%load submission.py

Now go ahead and submit the `submission.py` file to [Gradescope](https://www.gradescope.com/). It is worth **1 point of your Final grade**! ;)

---

<a name="summary"/></a> <!-- <-this link used in table of contents -->
# Summary

We highly recommend that you spend some time in this notebook and play around with any new concepts you encountered in this tutorial. If you haven't worked with jupyter notebooks before, experimenting with them here will save you a lot of time in the future.

Here is a list of things to remember:

1. Cells in the notebook can be executed in arbitrary order
1. It might be healthy to restart the kernel from time to time if you are losing track of what you did/didn't execute
1. Use **Restart & Clear Output** to restart the kernel to avoid confusion with old output
1. Feel free to rearrange the cells in the notebook since only **.py** files are submitted
1. `#export` comments will be included by default, but make sure you double check the autogenerated `submission.py` before submitting.

# Contribute to the class

If you find any typos, have some issues or suggestions on how to improve this or any future assignments please feel free to create a Pull Request or make a Piazza post about it.

---
<!-- Hi there! -->