# Python Primer: fundamentals

## <div style="color: #db366d"> Day 1.2 </div>

## Installation
Before you can start writing and running Python code, you need the necessary software tools installed on your machine. There are many ways to set this up, but the simplest is to install the Anaconda distribution:

https://www.anaconda.com/distribution/
(*Download the latest Python 3.x version*)

Follow the installation instructions if required:

https://docs.anaconda.com/anaconda/install/

Then run Jupyter Notebook:

### Windows:
Open the Anaconda Navigator app and click "Launch" in the Jupyter Notebook tab.

![](../images/anaconda-screenshot-win.png)


### macOS:
Open Terminal, `cd` into your project directory and type:
`jupyter notebook`. This is the cleanest way to run it.

![](../images/anaconda-screenshot-mac.png)

<br/>

Optionally, you can also open the Anaconda Navigator app on macOS and "Launch" Jupyter Notebook from there.

![](../images/anaconda-navigator-mac.png)

## Hello World
Let's start with the traditional first program for any programming language.

#### EXERCISE: (1) Install Anaconda, (2) run your Notebook and (3) output the words `Hello World` using Python code in the *code block* below.

In [None]:
# THESE ARE COMMENTS
# - i.e., they are ignored by Python when the code runs

# TODO: type Python code below to produce the output text 'Hello World'

# TODO: once you've written the code above, either
# PRESS CTRL-ENTER to run this cell, or
# PRESS SHIFT-ENTER to run this cell and go to next block
# (see help for full list of shortcuts)

## So why Python for ML?
Python is a modern general-purpose programming language that can be used to build a wide variety of applications.
- it can be as simple or as advanced as you need
- it is highly readable, which makes it a good language to have in anyone's arsenal
- you can develop in Jupyter Notebook (what you are using now), a nice interactive coding environment
- there are a number of free cloud-based IDEs with free compute resources, e.g., [Google Colab](https://colab.research.google.com/)
- it has a lot of really good libraries for ML

In [None]:
# e.g., the math module
import math

print(math.pi)
print(math.e)
print(math.ceil(8.2))
print(math.pow(2,3))
print(math.sin(10))
print(math.log(100,10))
print(math.sqrt(25))

#### Q: So how do we know what functions are available in the math module?

A: Either (1) Google, or (2) use TAB / SHIFT-TAB here
Note that for SHIFT-TAB, you need to run the import statements at least once first.
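
If you prefer to explore from code, here is a minimal sketch using only Python built-ins. The variable name `public_names` is just made up for this illustration.

```python
import math

# list the public names (functions and constants) that the math module provides
public_names = [name for name in dir(math) if not name.startswith('_')]
print(public_names[:10])

# every function carries a short description in its docstring
print(math.sqrt.__doc__)
```

Calling `help(math.sqrt)` in a notebook cell shows similar documentation in a longer format.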

# Programming (bare) fundamentals
If you have not been exposed to any programming, here's a quick overview:

Programming is the act of writing statements that tell the computer how to do things. As with English or any human language, learning to write different types of sentences well requires loads of practice. A few moments ago, we already saw statements that display things on the screen.

Here are more examples:

In [None]:
print('Hello World again')
print("Hello same same World but different way?") # there are multiple ways to say the same thing...
print('Hello pi =', math.pi)
print('Hello {}thon w{}rld'.format(round(math.pi, 2), 0))
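
As one more way to say the same thing, Python 3.6 and later also support f-strings. The sketch below assumes such a version; the names `greeting` and `same_greeting` are made up for illustration.

```python
import math

# an f-string evaluates the expressions inside the braces;
# ':.2f' formats the number with 2 decimal places
greeting = f'Hello pi = {math.pi:.2f}'
print(greeting)

# the same text built with str.format, in the style used above
same_greeting = 'Hello pi = {:.2f}'.format(math.pi)
print(same_greeting)
```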

The "computer" (note the abstract usage of this term) will parse all the statements you make and attempt to make enough sense of them to do what you intended. If you do not know its language well, it will not be able to understand your sentences.

The good part is that it will complain through **warnings** and **errors** that (often) help you with **debugging** the problem. Try running the following:

In [None]:
Hey Mr Computer, please print the words "Hello World"

Different programming languages allow you to "speak" to the computer in different syntactic styles. Beyond the visual differences, the choice of language has implications for performance, ease of use, code complexity, etc. As mentioned, we chose Python as its trade-offs are mostly beneficial to our work in ML.

To write useful Python statements, we need to understand these fundamental programming concepts:
- variables
- conditionals
- loops
- functions

## Variables
Variables are containers to store something in memory, often to be re-used later. Let's see some variable assignment statements below:

In [None]:
x = 888            # store integer 888 into x
y = 1.68           # store float 1.68 into y
z = "lucky eights" # store string "lucky eights" into z

# note that there is a type implicit to each variable
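
To see the implicit types for yourself, the built-in `type()` function can be used. This is a small sketch; the name `value` is hypothetical and only used here.

```python
y = 1.68
z = "lucky eights"

# type() reports the type Python inferred from each assigned value
print(type(y))
print(type(z))

# the same name can later be bound to a value of a different type
value = 888
print(type(value))
value = "now a string"
print(type(value))
```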

However, it is good coding practice to name variables more meaningfully (we call this self-documenting code), e.g.,

In [None]:
# these names are still rather arbitrary as we don't have a real context
huat_num = 888
precision_val = 1.68
huat_str = "lucky eights"

Now we can perform any *valid* operations with these stored values, a.k.a. variables...

In [None]:
# add two compatible types
result = huat_num + precision_val
print("Adding", huat_num, "to", precision_val, "gives", result)

Not all operations can be performed on all variables. For example, there is no common implementation of dividing (`/`) two strings. 

In many other programming languages (C, Java, etc.), the type needs to be declared by the programmer. In Python, we have dynamic typing, i.e., Python determines the type based on the value you've assigned to the variable. Nevertheless, knowing something about variable types is good, as it helps you avoid errors like this one:

In [None]:
# the binary operator "+" does not support adding an integer to a string in Python
result = huat_num + huat_str

#### EXERCISE: 
#### How do we resolve the above error then? (If we really want to append the number to the string...)

In [None]:
# TODO: Write Python code below to get the intended result string


#### EXERCISE: try to make the following work:

In [None]:
# this is a pretty common issue when you are working with datasets
num_int = 88888887
num_str = "1"

# TODO: write Python code below that combines num_int and num_str into an integer stored in results_int

# this checks that results_int is actually an integer type
# assert statements are commonly used when writing unit tests
assert type(results_int) == int

# display the result
print(results_int)

Another important class of variable types is **containers**, i.e., variables that can store multiple values.
The important ones we need to understand now are
- lists
- dictionaries

### Lists
This is the most basic container to store a "list" of values. See how the following common operations work...

In [None]:
# an empty list
mylist = []
print('The type of mylist is', type(mylist))

# add to the list
mylist.append('item1')
print('After adding a string, mylist is', mylist)

# you can even add values of different types
mylist.append(2)
print('After adding a number, mylist is', mylist)

# add a number of things
mylist.extend([3, 'four', 5.555])
print('After adding 3 more things, mylist is', mylist)

# remove an item by value
mylist.remove(2)
print('After removing the number, mylist is', mylist)

# remove the last item
item = mylist.pop()
print('After popping', item, ', mylist is', mylist)

# purge the whole list
mylist.clear()
print('After clearing, mylist is', mylist)

# a list of ingredient names
ingredients = ['eggs', 'flour', 'sugar', 'butter', 'milk', 'baking powder']
print('\nMy ingredients list contains', len(ingredients), 'items')
print('The first ingredient is', ingredients[0])
print('The second ingredient is', ingredients[1])
print('The last ingredient is', ingredients[len(ingredients)-1]) # indices start at 0, so the last one is len - 1
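
Here is a short sketch of a few more ways to read from a list: negative indices, slices and membership checks. The names `last_ingredient`, `first_three` and `has_sugar` are made up for illustration.

```python
ingredients = ['eggs', 'flour', 'sugar', 'butter', 'milk', 'baking powder']

# negative indices count from the end of the list
last_ingredient = ingredients[-1]
print('The last ingredient is', last_ingredient)

# a slice [start:stop] returns a new list, excluding the item at stop
first_three = ingredients[0:3]
print('The first three ingredients are', first_three)

# the "in" operator checks whether a value is in the list
has_sugar = 'sugar' in ingredients
print('Is sugar in the list?', has_sugar)
```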

### Dictionaries
Basically a list on steroids. 

It's like an upgraded list that can contain arbitrary items with labels. Instead of retrieving specific items with integer indices, you can get them using arbitrary labels, a.k.a. *keys*.

These label-item pairs are called key-value pairs, and this kind of container is usually referred to as a hash table in more traditional computer science speak. This may be hard to appreciate now, but retrieving a specific item from a hash table by its key is almost as fast as retrieving an item from a simple list by its index (this is non-trivial to achieve).

See the following dictionary in action.

In [None]:
# note the different type of bracketing for dictionaries vs lists
recipes = {
    'name': 'Creative Cake Recipe',
    'ingredients': ingredients,
    'duration' : 60,
    'instructions' : '1. Preheat oven to 200 degrees.\n2. Mix everything together in your desired fashion\n3. Bake mixture in oven for 30 mins.'
}

# get various info you need from the dict
print('------------------------------')
print('Baking class instructor script')
print('------------------------------')
print('Welcome to baking 101.')
print('Today we are making ***{}***.'.format(recipes['name']))
print('The ingredients are {}'.format(recipes['ingredients']))
print('It will take approximately {}mins to make this cake.'.format(recipes['duration']))
print('Please follow these steps:\n{}'.format(recipes['instructions']))
print('Enjoy.')
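
Dictionaries can also be changed after they are created. The sketch below uses a separate, made-up dictionary called `pantry` (not part of the recipe above) to show adding, updating, checking and looping over key-value pairs.

```python
# 'pantry' and its contents are made up for this illustration
pantry = {'eggs': 12, 'flour': 2}

# add a new key-value pair, then update an existing one
pantry['sugar'] = 1
pantry['eggs'] = 10

# check for a key before using it, or supply a fallback value with get()
has_milk = 'milk' in pantry
milk_count = pantry.get('milk', 0)
print('Has milk?', has_milk, '- count:', milk_count)

# iterate over all key-value pairs
for item, quantity in pantry.items():
    print('{}: {}'.format(item, quantity))
```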

## Conditionals
Conditionals are the foundation for building decision-making into your statements.

In [None]:
# try changing this to various values
time_in_oven = 50

# note: you can omit the elif if there are only 2 paths, and the else as well if there is only 1 path
if (time_in_oven == recipes['duration']):
    print("cake is perfect")
elif (time_in_oven < recipes['duration']):
    print("cake is under-cooked")
else:
    print("chaota liao lah")
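
The one-path and two-path variants can be sketched as follows. The values and the names `duration`, `status` and `minutes_left` are hypothetical and only serve this illustration.

```python
# hypothetical values for illustration
time_in_oven = 50
duration = 60

# a single path: only "if"
if time_in_oven > duration:
    print('take the cake out now')

# two paths: "if" and "else"
if time_in_oven >= duration:
    status = 'done'
else:
    status = 'still baking'
print('cake is', status)

# conditions can be combined with "and" / "or"
minutes_left = duration - time_in_oven
if minutes_left > 0 and minutes_left <= 10:
    print('almost there:', minutes_left, 'mins to go')
```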

## Loops
When we have containers, we usually want to iterate over all the items and do something with each one. Hence most programming languages provide keywords that tell the computer how you wish to perform these loops.

In [None]:
# the loop variable 'ingredient' is assigned each item of the list in turn
# (in Python it stays defined after the loop ends)
count = 0
for ingredient in recipes['ingredients']:
    # note that Python uses indentation to determine code blocks
    # the following 2 lines "belong" to this for loop and will be
    # run once per item until the list is exhausted
    count += 1
    print('Ingredient {} is {}'.format(count, ingredient))
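
Two related loop forms are worth a quick look: `enumerate()`, which does the counting for you, and the `while` loop. This is a small sketch; `sample_ingredients`, `lines` and `minutes` are made-up names.

```python
sample_ingredients = ['eggs', 'flour', 'sugar']  # shortened list for illustration

# enumerate() yields a counter alongside each item (start=1 counts from 1)
lines = []
for count, ingredient in enumerate(sample_ingredients, start=1):
    lines.append('Ingredient {} is {}'.format(count, ingredient))
print(lines)

# a while loop repeats for as long as its condition stays true
minutes = 0
while minutes < 3:
    minutes += 1
print('waited', minutes, 'minutes')
```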

## Functions
We often need to re-use certain chunks of statements. This is accomplished through the use of **functions**.

In [None]:
# define function here
def check_cake_when(time_in_oven):
    """A function to check cake.
    This is an example of a docstring formatted comment, used as a 
    standard coding convention to document functions and other
    code structures in Python.
    
    Args:
        time_in_oven (int): The time (in minutes) the cake has already been in the oven
    
    Returns:
        str: a description of the cake status based on this time
    """
    
    if (time_in_oven == recipes['duration']):
        return "perfect"
    elif (time_in_oven < recipes['duration']):
        return "under-cooked"
    else:
        return "chaota liao lah"

# you can actually display function docstrings like so
print(check_cake_when.__doc__)
    
# use the function here
# this simulates checking the cake every minute
for time in range(55, 65):
    print('cake at {} is {}'.format(time, check_cake_when(time)))
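
The function above reads `recipes['duration']` from outside itself. As a variation, the sketch below passes the duration in as a second parameter with a default value. The name `check_cake`, the default of 60 and the "over-cooked" wording are assumptions for this illustration.

```python
def check_cake(time_in_oven, duration=60):
    """Describe the cake status after time_in_oven minutes.

    duration is the target baking time; the default of 60 is assumed here.
    """
    if time_in_oven == duration:
        return "perfect"
    elif time_in_oven < duration:
        return "under-cooked"
    else:
        return "over-cooked"

# use the default duration
print(check_cake(60))
# override the duration by keyword or by position
print(check_cake(30, duration=45))
print(check_cake(50, 45))
```

Passing values in as parameters makes the function easier to re-use and test, since it no longer depends on a variable defined elsewhere in the notebook.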