# 🐍 Part 1: Exploring Jupyter Notebooks & Python

**Disclaimer:** This section is largely borrowed from the [Jupyter Notebook tutorial](https://github.com/ABS-Neural-Nets-Tutorial/Intro-To-Neural-Networks/blob/main/notebooks/1_Exploring_Notebooks.ipynb) and the [R tutorial](https://github.com/ABS-Neural-Nets-Tutorial/Intro-To-Neural-Networks/blob/main/notebooks/2_Learning_R.ipynb).

## 📓 About Notebooks
Science is all about recording the methods and results of experiments. Just as a science student uses a lab book to scribble down thoughts and observations, [Jupyter Notebooks](https://jupyterlab.readthedocs.io/en/stable/user/notebook.html) provide a platform for exploration and experimentation -- but with a computer and code.

When to use notebooks: 👍  
 ❤️ Learning new methods and programming languages (ie. right now!)  
 ❤️ Experimenting with methods 

When not to use notebooks: 👎  
🤮 Productionising an application

Notebooks are made up of building blocks called **cells**. There are two types:
1. **Markdown Cells:** You can write stuff here and add images (this current cell is a markdown cell).
2. **Code Cells:** You can write and run code here (these cells usually have `[ ]` next to them, sometimes with a number inside).

For the purpose of this workshop, we only need to run the **code cells**. In order to run a code cell, click on the cell so that you see a blue bar on the left side like so:

![br](../images/blue-ribbon.PNG)

Once selected, press the ▶️ in the top bar. Alternatively you can click on a code cell and press `ctrl+enter` or `shift+enter`.

---
## <font color='#F89536'> **Discussion Question:** </font> 
What is the difference between `ctrl+enter` and `shift+enter`?

---

In [None]:
print("Hello World! 👋")

#### 🎉🎉🎉 Congratulations! You have passed the rite of passage for all programmers -- printing "Hello World" 🎉🎉🎉
There are many tips and tricks of the trade. In an effort to not reinvent the wheel, we won't be going through those details today. If you are interested, please have a look at [this tutorial](https://github.com/ABS-Neural-Nets-Tutorial/Intro-To-Neural-Networks/blob/main/notebooks/1_Exploring_Notebooks.ipynb) by Jack. The interactive version can be found by opening [this binder](https://mybinder.org/v2/gh/ABS-Neural-Nets-Tutorial/Intro-To-Neural-Nets-Env/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252FABS-Neural-Nets-Tutorial%252FIntro-To-Neural-Networks%26urlpath%3Dlab%252Ftree%252FIntro-To-Neural-Networks%252Fnotebooks%252F1_Exploring_Notebooks.ipynb%26branch%3Dmain).

Here's a comprehensive list of [keyboard shortcuts](https://cheatography.com/weidadeyue/cheat-sheets/jupyter-notebook/pdf_bw/) as well.

## 🐍 Python
> Programming = Sending instructions to computers

Python is no different in this regard. If you want to learn more about the nitty-gritty details and how Python differs from R, read the snippet below:

---
### <font color='red'> The Nitty Gritty </font>
[Python](https://www.python.org/doc/essays/blurb/) is:
* **A high (not low) level language**: Meaning the instructions resemble human natural language more than computer/machine language (manipulating 0s and 1s). This means it requires significant 'translation' into machine-level code before the computer executes it. [Assembly](https://en.wikipedia.org/wiki/Assembly_language) is an example of a low-level language.
* **An interpreted (not compiled) language**: Meaning the translation into machine-level code happens while the program runs, roughly *line-by-line*, rather than all at once beforehand. [C++](https://en.wikipedia.org/wiki/C%2B%2B) is an example of a compiled language, where everything is translated at once before running.
* **An object-oriented (not functional) language**: Meaning the core of it is the manipulation of objects (variables/classes etc.) rather than functions and recursion. [LISP](https://en.wikipedia.org/wiki/Lisp_(programming_language)) is an example of a functional programming language. *Note this isn't an exactly correct distinction; see [this article](https://medium.com/@shaistha24/functional-programming-vs-object-oriented-programming-oop-which-is-better-82172e53a526) for more detail.*

[Python vs. R](https://www.ibm.com/cloud/blog/python-vs-r)

| Python | R |
| --- | --- |
| Developed by Software Engineers | Developed by Statisticians \& Researchers | 
| Easy to learn if you have some programming experience | Easy to learn with no programming experience |
| Is a versatile tool which can be used for many things (data science included) | Heavily emphasises statistical models and methods | 
| Easy to scale into production/application | Used more for experimentation \& exploration |

---

### 📝 A note on comments
Lines that begin with `#` are called **comments**. The computer ignores them when running the code, and they are useful for explaining tricky bits of the program to other humans. Text wrapped in `"""` is often used in the same way for larger blocks. Strictly speaking it is a string rather than a comment, and when it is the first thing inside a function it is called a **docstring** and describes that function.

In [None]:
# This is a comment, the computer will completely ignore this
# This is also a comment -- did you know Japan is suffering from a ninja shortage? 🥷🥷🥷
# Maybe it's time for a career change https://www.businessinsider.com.au/iga-japan-is-facing-a-ninja-shortage-2018-7 🤔🤔🤔

### 🔢 Arithmetic & Variables
Python can be used as a simple calculator!

In [None]:
# Addition
3 + 4

In [None]:
# Subtraction
5 - 18

In [None]:
# Multiplication
2 * 8

In [None]:
# Division
7 / 3

In [None]:
# Powers 
2 ** 3

In [None]:
# Boolean (reads, does 3 x 2 equal 4?)
3 * 2 == 4

You may notice the output is automatically displayed. This only happens for the **final line** of each cell. If you want to see something else displayed, you will need to use the built-in function `print()` (don't worry, we will talk about functions in a bit).

In [None]:
print("This will be printed:", 1 + 1)
1 + 1 # This will NOT be printed -- it's not the last line
3 + 5 # This will be printed -- it's the last line

That's great, but we often don't want to write out everything explicitly. **Variables** are a way for us to store data so that we can retrieve it at any point in time. 

In [None]:
# Let's add some numbers 
a = 17
b = 5
a - b

As best practice, avoid ambiguous variable names like these (I may be a hypocrite in this regard 😬😬😬 Do as I say, not as I do). For example, the code below is much clearer to a reader:

In [None]:
cost_of_annual_netflix_subscription = 19.99
number_of_users_i_share_with = 4
cost_per_user = cost_of_annual_netflix_subscription / number_of_users_i_share_with
print("I pay only $", cost_per_user, "for my Netflix!")
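
The cell above prints every decimal place that Python computed. As an optional aside (f-strings are not covered in this tutorial, so treat this as a sketch rather than something you need today), the same value can be shown with two decimal places:

```python
cost_of_annual_netflix_subscription = 19.99
number_of_users_i_share_with = 4
cost_per_user = cost_of_annual_netflix_subscription / number_of_users_i_share_with

# An f-string places a variable inside the text; :.2f keeps two decimal places
message = f"I pay only ${cost_per_user:.2f} for my Netflix!"
print(message)
```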

---
## <font color='#F89536'> **Your turn!** </font>  
Try to make code to compute the following:
    
1. What is $(225 + 751) \times 32$

In [None]:
# Your code here


2. Calculate $1^3 + 2^3 + 3^3 + 4^3$

In [None]:
# Your code here


3. If the base of a triangle is $34$ and the height is $12$, what is the area? (Hint: The area of the triangle is $0.5 \times base \times height$). Fill in the `?` below.

In [None]:
# Your code here
base = ?
height = ?
area = ?
print(area)

4. Is $a = 1^3 + 12^3$ the same as $b = 9^3 + 10^3$? (Hint: Define `a = 1 ** 3 + 12 ** 3` etc. first). Fill in the `?` below.

In [None]:
# Your code here
a = ?
b = ?
print(?)

This is the famous [Ramanujan's taxi number](https://en.wikipedia.org/wiki/1729_(number))!

5. (Bonus) Is $5457$ divisible by $17$? (Hint: `a % b` will give you the remainder -- eg. `5 % 4` will give `1`)

In [None]:
# Your code here


---

### ⌨️ Basic Data Types
Although a computer stores all its data as a series of 0s and 1s, we often work with more than just numbers. Each variable you store has a particular **data type** so that when you retrieve it, the computer knows what to display to you. Some common data types in Python are:

| Data Type | Description | Examples |
| --- | --- | --- |
| Integer | Counting/whole numbers | `2` or `-3` |
| Floats | Decimal numbers | `4.2` or `3.0e8` (the latter is in [scientific notation](https://en.wikipedia.org/wiki/Scientific_notation)) |
| Strings | Sequences of characters (given with quotation marks) | `"I am gr00t"` |
| Boolean | Either `True` or `False` | `True` or `False` |
| List | A collection of items (can be any type) | `["apple", "orange", "banana"]` or `["abc", 34, True, 40, "male"]` |

Let's have a look at some examples.

In [None]:
integer = 3 # this is an integer
floats = 3e8 # this is a float
string = "Woop Woop" # this is a string
boolean = True # this is a boolean

integer_one = 1 # this is an integer
string_one = "1" # this is a string
print("Adding integers together: ", integer_one + integer_one, "...makes sense")
print("Adding strings together: ", string_one + string_one, "...weird!")

You need to be very careful with data types to make sure you don't mix them, for example, **you can't add an integer to a string**:

In [None]:
integer_one + string_one
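
Running the cell above raises a `TypeError`. One way around it, sketched here with the same two variables, is to convert one of the values so that both have the same data type, using the built-in functions `int()` and `str()`:

```python
integer_one = 1
string_one = "1"

# Convert the string to an integer, then add the numbers
numeric_sum = integer_one + int(string_one)
print(numeric_sum)  # 2

# Convert the integer to a string, then join the text
joined_text = str(integer_one) + string_one
print(joined_text)  # 11
```

Which conversion is right depends on whether you want to treat the values as numbers or as text.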

Lists are special because they are a **collection** of items, each of which can be any of the other four (or more) data types. To define a list, you must use square brackets `[ ]`. 

In [None]:
my_list = ["apple", "banana", "cherry", "tomato"]

There are several properties of lists to keep in mind:
* Length `len()`: How many elements are there in total
* Index `[i]`: the i-th element of the list. Note that `i` is called the **index**.

***Note: Indices in Python begin with 0 (so the first element is 0, second is 1 etc.). This differs from R where indexing begins at 1.***

In [None]:
print("There are", len(my_list), "elements in my_list.")
print("The 2nd element of the list is:", my_list[1])

You can also have mixed lists!

In [None]:
mixed_list = ["Apple", 0.5, 2, True]

🔪 You can also access multiple elements of a list at once by putting a `:` inside the square brackets. This is called a **slice**, and the `:` roughly means *onwards* (eg. `1:` means the 2nd element onwards). 

---
## <font color='#F89536'> **Discussion:** </font>  
Try printing `my_list[2:]` and `my_list[:2]` using `print()`. What's the difference?

In [None]:
# Your code here


---
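
A slice can also have a number on both sides of the `:`. The sketch below uses a made-up list called `letters` (it is not part of the original tutorial) to show a slice with both a start and a stop:

```python
letters = ["a", "b", "c", "d", "e"]

# Start at index 1 and stop just before index 4
middle = letters[1:4]
print(middle)       # ['b', 'c', 'd']
print(len(middle))  # 3
```

Note that the element at the stop index is not included in the result.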

When in doubt you can always check the data type using `type()`!


In [None]:
type(my_list)

### 🧐 Specialised Data Types
Beyond the basic data types, users can define their very own data types! We will not be doing that today. Instead, we will be *importing* code that other people have made (aka. *libraries*) and recycle some of their data types. A key one is the concept of an array. Let's have a look at some arrays.

#### ***One Dimensional Arrays***
We talked before about the concept of a list, and how elements of a list can be retrieved by their **index**. For example

My Shopping List:

0. Milk
1. Eggs
2. Carrots
3. Cat food
4. Cat

We can normally access elements (items in a list) by giving an index (the i-th item on the list). For example, `i=2` gives "Carrots". Let's test that out:

In [None]:
# Here it is in Python
my_shopping_list = ["milk", "eggs", "carrots", "cat food", "cat"]
my_shopping_list[2] # this should give carrots!

#### ***Two Dimensional Arrays***
What about something a little bit more complex? What if you want to save data in the form of a **table/matrix**? It turns out there is a data type for this too! Say you are concerned about the difference between [Llamas and Alpacas](https://askanydifference.com/difference-between-alpaca-and-llama/) (as one would):

| Alpaca | Llama |
| --- | --- |
| South America ONLY | All over the world |
| Small pear-shaped ears | Long banana-shaped ears |
| Extremely Jumpy/Nervous | (Strong) independent guard animals for sheep/alpaca |
| Fine hair and good fleece producing abilities | Limited fleece producing abilities | 

Suppose you are really concerned with the fleece producing ability of alpacas, how would you go about extracting that information? Maybe something like this:
```Python
my_alpaca_llama_table[3,0]
```
This requires **two** indices to locate the data. Note that unlike the `my_shopping_list` example, we didn't specify how to define the table, just how to extract elements from it. This is intentional as that is the only functionality we are going to be concerned with here.

---
## <font color='#F89536'> **Discussion:** </font>  
Why is it `[3,0]` -- what do those numbers represent? What if I wanted to learn about llama ears? Which indices do I need to provide?

---
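
For the curious, one way to build such a table using only the lists we have already seen is a list of lists, where each inner list is one row. This is only a sketch: the name `alpaca_llama_rows` is made up, and plain Python lists need two separate pairs of brackets (`[3][0]`), whereas the `[3,0]` form is what array libraries such as NumPy or PyTorch accept.

```python
# Hypothetical table: each inner list is one row, written as [alpaca, llama]
alpaca_llama_rows = [
    ["South America ONLY", "All over the world"],
    ["Small pear-shaped ears", "Long banana-shaped ears"],
    ["Extremely Jumpy/Nervous", "(Strong) independent guard animals for sheep/alpaca"],
    ["Fine hair and good fleece producing abilities", "Limited fleece producing abilities"],
]

# The first index picks the row, the second picks the column
alpaca_fleece = alpaca_llama_rows[3][0]
print(alpaca_fleece)
```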

#### ***Three or more Dimensional Arrays***
Is there anything more complex than a table? Yes! You may have noticed that:
* In the 1D case, we extract elements with **one index** (eg. `a[2]`)
* In the 2D case, we extract elements with **two indices** (eg. `a[5,1]`)

Care to take a guess at how many indices we need for the 3D case? Yep! That's right, 3! These objects are loosely called **tensors**. An example of this is a coloured image. You may want to access not only a particular pixel in 2D (ie. the position of a single pixel given by a table), but also the colour channel (since coloured pictures have Red-Green-Blue channels). So the top left corner of a picture in the green channel may be given by `[1,0,0]`. The main data object we will be using is a **tensor** in the PyTorch library.
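
To make the three indices concrete, here is a minimal sketch using nested lists rather than a real PyTorch tensor. The tiny 2x2 "image" and its pixel values are invented for illustration, and the order `[channel][row][column]` is an assumption chosen to match the `[1,0,0]` example above.

```python
# Hypothetical 2x2 image with three colour channels (values are made up)
red = [[255, 0], [0, 0]]
green = [[10, 20], [30, 40]]
blue = [[0, 0], [0, 255]]

# Channel 0 is red, channel 1 is green, channel 2 is blue
image = [red, green, blue]

# Green channel (1), top row (0), left column (0)
top_left_green = image[1][0][0]
print(top_left_green)  # 10
```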

#### ***Summary***
So these are some examples of arrays.

| Name | Dimensions | Description | Index| 
| --- | --- | --- | --- | 
| Scalar | 0D | A single value | None `[]` |
| Vector | 1D | A list of stuff | One index `[3]` |
| Matrix | 2D | A table of stuff | Two indices `[2,3]` |
| Tensor | 3D+ | - | Three or more indices `[1,28,28]` |

![tensor](../images/tensor.png)  
[source](https://www.researchgate.net/figure/Tensors-as-generalizations-of-scalars-vectors-and-matrices_fig3_332263806) -- note that technically a tensor is *any* array from the diagram, but the term often refers to arrays with three or more dimensions.

### 🔁 Loops
Programming can be quite repetitive: you may find yourself doing the same thing over and over again. Loops make this easier. Let's say you want to see what each element in our fruit list is.

In [None]:
my_fruit_list = ["apple", "banana", "cherry"]

In [None]:
# Method 1: Access each element of the list directly
for fruit in my_fruit_list:
    print(fruit)

In [None]:
# Method 2: Access each element of the list via indices
for i in range(len(my_fruit_list)):
    print(my_fruit_list[i])
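
If you want both the index and the element at the same time, a third option (not covered in the original tutorial, so take it as an optional sketch) is the built-in function `enumerate()`:

```python
my_fruit_list = ["apple", "banana", "cherry"]

# Method 3: get the index and the element together
for i, fruit in enumerate(my_fruit_list):
    print(i, fruit)
```

This avoids having to look up `my_fruit_list[i]` inside the loop.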

---
## <font color='#F89536'> **Discussion:** </font> 
What does the `range(len(my_fruit_list))` actually do?

---

### ➕ Functions
Sometimes we want to run the same block of code over and over again in different parts of our program. We *can* just copy and paste the code, but what happens if we need to make a change to it? We would need to go back to every bit of code we copied and make the change over and over again. This is where functions come in. Functions are defined using the following syntax:
```Python
def my_function(input_1, input_2 = input_2_default_value):
    ## does a thing
    return output
```
Below, we have written a function to read in English words and convert them into [Pig Latin](https://en.wikipedia.org/wiki/Pig_Latin). To convert English words into Pig Latin:
1. Move the first letter of the word to the back of the word
2. Add '-ay' to the end of the word  

For example, 'neural' becomes 'euralnay', or 'cat' becomes 'atcay'.

In [None]:
def pig_latin_translator(english):
    """This function translates English words to Pig Latin.

    Text in triple quotes at the top of a function (a docstring) is typically used to describe the function.
    """
    assert isinstance(english, str) # we need to make sure the input is a string and not any other data type
    first_letter = english[0] # just like lists, you can get elements (characters) of a string
    rest_of_word = english[1:] # here we select the 2nd element onwards
    pig_latin = rest_of_word + first_letter + "ay"
    return pig_latin

We will not focus on how to **write** functions, instead, we will only be looking at how to **use** functions. Functions have two parts:
* **Input**: What arguments are going into the function. In our case, these are English words.
* **Output**: What is the function outputting to us as an answer. In our case, these are Pig Latin words.

In [None]:
# Let's try it
result = pig_latin_translator("cat")
print(result)
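
The function syntax shown earlier also included an input with a default value (`input_2 = input_2_default_value`), which our Pig Latin function does not use. As a minimal sketch, the made-up function `greet` below shows how a default works: the second input is optional, and the default is used whenever it is left out.

```python
def greet(name, greeting="Hello"):
    """Return a greeting for `name`; `greeting` is "Hello" unless another value is given."""
    return greeting + ", " + name + "!"

default_result = greet("Ada")                     # uses the default greeting
custom_result = greet("Ada", greeting="Welcome")  # overrides the default
print(default_result)  # Hello, Ada!
print(custom_result)   # Welcome, Ada!
```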

--- 
## <font color='#F89536'> **Discussion:** </font> 
What happens when you put in a non-string input? What happens when you put in more than a single word?

In [None]:
# Your code here


---

### 📖 Libraries
Functions can be relatively simple (like our example) or quite complex. Often, other people have already written functions that do precisely what we want our code to do. In those cases, we don't need to rewrite the code. Instead, we can **import** libraries (collections of pre-defined functions).

For example, the factorial of a number is the product of all the whole numbers from 1 up to that number, so $5! = 5 \times 4 \times 3 \times 2 \times 1=120$. Instead of writing our own code to implement the factorial, we can use the `math` library to calculate our answer:
* **Input:** The target number for the factorial operation $x$
* **Output:** The calculated factorial $x!$

In [None]:
import math

math.factorial(5)
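
To see what the library is saving us from, here is a rough sketch of how we might write the factorial ourselves using a loop and a function. The name `factorial_with_loop` is made up for this example, and it assumes the input is a positive whole number.

```python
import math

def factorial_with_loop(n):
    """Multiply 1 x 2 x ... x n together using a for loop."""
    product = 1
    for k in range(1, n + 1):
        product = product * k
    return product

print(factorial_with_loop(5))                       # 120
print(factorial_with_loop(5) == math.factorial(5))  # True
```

Both give the same answer, but the library version is already written and tested for us.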