# NOTEBOOK 1: INTRO TO PYTHON AND JUPYTER NOTEBOOKS

Python is one of the most widely used languages among data scientists. It is an interpreted language, meaning that code can be executed interactively as soon as it is written. Python can be used in a procedural, object-oriented or functional way.

The advantages of Python include:
1. An intuitive syntax
2. A large and active community of developers
3. Powerful libraries for Data Science and AI

This notebook is not a full introduction to programming in Python. Instead, it simply gives a quick overview (cheatsheet) of **basic Python functionality**, caveats and **what makes Python different** from other languages. A large part of this notebook is trivial...

# 0 Setup

Install Python 3.x and Jupyter.

# 1 Variables

There is no need to declare a variable with a particular type.

In [None]:
a = 5
b = 4
c = a + b
d = a*b
print(c)
print(d)

In [None]:
a ** 2

In [None]:
f = 5.0
type(f)

For a quick test of a single expression, just put it on the last line of a cell: the notebook displays its value without the need for print...

In [None]:
a+b

In Jupyter notebooks, a variable defined in one cell is accessible from all other cells, and cells can be run in any order. Restarting the kernel deletes all variables. Deleting the cells above will not remove the variables a and b.

A variable can easily be bound to objects of a different type...

In [None]:
a = "test"
print(a)
print(type(a))

But the interpreter will not always convert types by itself. Adding a string and an integer raises a TypeError...

In [None]:
a = "test"
b = 5

c = a + b # ERROR: a string and an integer cannot be added
print(c)
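To combine values of different types, the conversion has to be made explicit. A minimal sketch using the built-in str() and int() functions (the variable names are only illustrative):

```python
label = "test"
count = 5

# Convert the integer to a string before concatenating.
combined = label + str(count)
print(combined)  # test5

# Convert a numeric string to an integer before adding.
converted_sum = int("5") + count
print(converted_sum)  # 10
```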

# 2 Lists, tuples and dictionaries

## 2.1 List

Lists can contain elements of any data type. Elements in a list can be retrieved by index.

In [None]:
my_list = [5, 6, 7, "banana"]
type(my_list)

Indexes start at 0. Negative indexes count backwards in the list.

In [None]:
print(my_list[0])
print(my_list[3])
print(my_list[-1])

In [None]:
my_list[1:3] #index 1 included, index 3 not included

In [None]:
print(my_list[1:])
print(my_list[:-1])

Lists are mutable:

In [None]:
my_list

In [None]:
my_list[3] = "apple"
my_list

In [None]:
len(my_list)
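Because lists are mutable, they can also grow and shrink in place. The sketch below uses the built-in list methods append and remove on a separate list (fruit_list is a made-up name), so the cells above are unaffected:

```python
fruit_list = ["apple", "banana"]

fruit_list.append("cherry")   # add an element at the end
fruit_list.remove("banana")   # remove the first matching element

print(fruit_list)             # ['apple', 'cherry']
print("apple" in fruit_list)  # True
print(len(fruit_list))        # 2
```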

## 2.2 Tuple

Tuples are like lists...

In [None]:
my_tuple = (4,5,6,"banana")
type(my_tuple)

In [None]:
my_tuple[0]

...but a tuple is immutable:

In [None]:
my_tuple[3] = "apple"
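Assigning to an element of a tuple raises a TypeError. If a changed version of a tuple is needed, a new tuple has to be built instead. A small sketch (new_tuple is an illustrative name):

```python
my_tuple = (4, 5, 6, "banana")

try:
    my_tuple[3] = "apple"
except TypeError as error:
    # Tuples do not support item assignment.
    print("TypeError:", error)

# Build a new tuple from a slice of the old one plus the new element.
new_tuple = my_tuple[:3] + ("apple",)
print(new_tuple)  # (4, 5, 6, 'apple')
print(my_tuple)   # (4, 5, 6, 'banana')
```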

## 2.3 Dictionary

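A dictionary stores key-value pairs. Values can be of any type and are retrieved by their keys, not by position (since Python 3.7, dictionaries do preserve insertion order). Keys are often strings, but any hashable object (such as a number or a tuple) can be used as a key.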

In [None]:
a = {"key1":"value1","key2":8,"key3":my_tuple}

In [None]:
a["key1"]

In [None]:
a["key3"]
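A few more dictionary operations, sketched on a separate dictionary (mixed_keys is a made-up name): keys of different hashable types, adding an entry, and looking up a key that may be missing with the built-in get method.

```python
# Keys do not have to be strings: any hashable object works.
mixed_keys = {1: "one", (2, 3): "a tuple key", "name": "banana"}
print(mixed_keys[1])       # one
print(mixed_keys[(2, 3)])  # a tuple key

# Add or overwrite an entry by assigning to a key.
mixed_keys["color"] = "yellow"

# get() returns a fallback value instead of raising a KeyError.
print(mixed_keys.get("price", "unknown"))  # unknown
```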

# 2 Functions

## 2.1 Basics

Code blocks are distinguished by indentation, unlike other programming languages where blocks are separated in a different way (e.g. with curly brackets {...} in Java and JavaScript).

In [None]:
# defining a function

def add_variables(a, b):
    result = a + b
    return result

Note: comments in Python code start with the #-symbol. Anything after this symbol on the same line is ignored by the interpreter and will not be run.

In [None]:
# calling a function

sum_numbers = add_variables(6,8)
print(sum_numbers)

## 2.2 Calling a function

When calling a function, a distinction is made between **positional arguments** and **keyword arguments**. **Positional arguments** must always be given first, in the order of the function's parameters. Afterwards, the remaining arguments can be given as key-value pairs in any order.

Examples:

In [None]:
def calculate(a,b,c,d):
    return a + b + c - d

In [None]:
calculate(1,2,3,4)

In [None]:
calculate(a=1,b=2,d=4,c=3)

In [None]:
calculate(1,2,c=3,d=4)

In [None]:
# ERROR
calculate(a=1, b=2,3,4)

When defining a function, **default values** can be set for parameters. When an argument is not given when calling the function, the default value is used. Parameters with defaults always come after parameters without defaults.

In [None]:
def add_variables_2(a,b=10):
    result = a + b
    return result

In [None]:
#ERROR
def add_variables_2(b=10,a):
    result = (a+b)
    return result

In [None]:
add_variables_2(5)

In [None]:
add_variables_2(5,8)

In [None]:
#ERROR
add_variables_2(b=10)

In [None]:
#ERROR
add_variables(a=10,8)

## 2.3 Variable scopes

### 2.3.1 Global variables

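A variable defined inside a function can only be used within that function. Note that loops and if-blocks do not create a new scope in Python: a variable assigned inside a loop remains accessible after the loop ends.

However, a variable defined outside of any function (a **global variable**) can be used both inside and outside of functions.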

In [None]:
my_var = 3

In [None]:
def my_first_func():
    print("my_var is ...",my_var)

In [None]:
my_first_func()

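This variable cannot be reassigned directly from within the function. An assignment to my_var inside the function makes Python treat my_var as a local variable, so reading it before it has a value raises an UnboundLocalError.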

In [None]:
def my_wrong_func(x):
    my_var = my_var + x
    
print(my_var)
my_wrong_func(5) #ERROR
print(my_var)

A way around this is to use a return value. Here, we change the variable "outside" of the function.

In [None]:
def my_correct_func(x):
    result = my_var + x
    return result

print(my_var)
my_var = my_correct_func(5) # here, we change the variable "outside" of the function
print(my_var)

Another method is using the [global statement](https://docs.python.org/2/reference/simple_stmts.html#the-global-statement) (not recommended)

In [None]:
def my_correct_func_2(x):
    global my_var
    my_var = my_var + x

In [None]:
my_var = 5
my_correct_func_2(4)
print(my_var)

### 2.3.2 Local variables

A variable defined inside a function is only accessible from within that function. This is a **local variable**.

In [None]:
def my_second_func():
    my_local_var = 6

In [None]:
my_second_func()
print(my_local_var) #ERROR
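Unlike functions, loops and if-blocks do not create their own scope in Python. The sketch below (with illustrative names) shows that variables assigned inside a for-loop are still available once the loop has finished:

```python
for i in range(3):
    last_square = i * i

# Both names are still available after the loop.
print(i)            # 2
print(last_square)  # 4
```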

A local variable can have the same name as an existing global variable...

In [None]:
my_var = 10

def my_overwriting_func():
    my_var = 6 # notice the difference with my_wrong_func above.
    print("my_var inside function: ",my_var)

In [None]:
print("my_var before function: ",my_var)
my_overwriting_func()
print("my_var after function: ",my_var)

## 2.4 \*\*Kwargs and \*args (advanced)

### 2.4.1 \*\*Kwargs

The author of a function can allow callers to pass their own custom keyword arguments, without specifying the names or the number of these arguments in advance. Note the following syntax.

In [None]:
def kwargs_function(a,b=10,**kwargs):
    print(kwargs)
    print(type(kwargs))
    return a+b

In [None]:
kwargs_function(5,first_kwarg="Hello",second_kwarg=42)

\*\*kwargs always comes last in the parameter list. It collects any additional key-value pairs that the caller passes to the function. Inside the function, these are stored in a dictionary named 'kwargs'.

### 2.4.2 \*Args

A similar thing can be done with \*args, which collects additional positional arguments. Inside the function, args is a tuple, not a dictionary.

In [None]:
def myFun(first_arg, *args):
    print("type of args: ", type(args))
    print("first :", first_arg)
    for arg in args:
        print("Next :", arg)
  

In [None]:
myFun('Example', 'of', '*args', 'parameters') 
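\*args and \*\*kwargs can be combined in one function, and the same \* and \*\* symbols can unpack an existing tuple or dictionary when calling a function. A minimal sketch, in which describe_call is a hypothetical helper that only reports what it received:

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional arguments,
    # kwargs is a dictionary of keyword arguments.
    return len(args), sorted(kwargs)

result_1 = describe_call(1, 2, 3, color="red", size=10)
print(result_1)  # (3, ['color', 'size'])

# Unpack existing collections at the call site.
numbers = (1, 2)
options = {"size": 10}
result_2 = describe_call(*numbers, **options)
print(result_2)  # (2, ['size'])
```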

## 2.5 Passing functions around

A function is an object like any other and can be passed as an argument to another function.

In [None]:
def print_function(message):
    print("message: ",message)
    
def length_function(message):
    print("the length of this message is {0} characters".format(len(message)))

In [None]:
def wrapper_function(function,message):
    function(message)

In [None]:
wrapper_function(print_function,"Hello world")

In [None]:
wrapper_function(length_function, "Hello world")
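Several built-in functions accept a function as an argument in the same way. As an illustration, the built-in sorted takes a key function that decides how elements are ordered; here the built-in len is passed, so the words are sorted by length (the list is made up for this example):

```python
words = ["banana", "fig", "apple"]

# Pass the function itself (len), not the result of calling it.
by_length = sorted(words, key=len)
print(by_length)  # ['fig', 'apple', 'banana']
```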

## 2.6 Type checks

A Python function will not enforce variable types for the parameters.

In [None]:
sum_string = add_variables("hello","world")
print(sum_string)

Note: to make code robust and easy to debug, Python developers often include their own type checks.

In [None]:
test_variable = "5"
type(test_variable)

In [None]:
isinstance(test_variable,str)

In [None]:
isinstance(test_variable,int)

In [None]:
c = isinstance(test_variable,int)
type(c)
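A type check inside a function could look like the sketch below. The function name add_numbers and the choice to raise a TypeError are assumptions made for this example, not a fixed convention of this course:

```python
def add_numbers(a, b):
    # Reject anything that is not an int or a float.
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("add_numbers expects two numbers")
    return a + b

print(add_numbers(2, 3.5))  # 5.5

try:
    add_numbers("hello", "world")
except TypeError as error:
    print("TypeError:", error)
```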

# 3 Logical operators, comparison and conditionals

## 3.1 Operators

In [None]:
a = True # True and False are reserved keywords (note the capital first letter!)
b = False
type(a)

In [None]:
a_or_b = a or b

a_and_b = a and b 

print("a or b:", a_or_b)
print("a and b:", a_and_b)

**IMPORTANT:** Don't confuse the **'and' and 'or' keywords** with the **'&' and '|' symbols**. & and | are bitwise operators. 
Mixing them up leads to unexpected results...

In [None]:
# Nothing unexpected here...
a = True
b = False
a & b 

In [None]:
# But what about this
a = 5
b = 9
print(a & b)
print(a | b)
print(type(a & b))

Rule of thumb: Use '|' and '&' for bitwise operations on integers, 'and' and 'or' otherwise.
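The difference becomes visible when both kinds of operators are applied to the same two integers. A short sketch (x and y are illustrative names, chosen so that no earlier variable is overwritten):

```python
x = 5  # binary 0101
y = 9  # binary 1001

print(x & y)    # 1  (0001: bits set in both numbers)
print(x | y)    # 13 (1101: bits set in either number)

# 'and' and 'or' work on truth values and return one of the operands.
print(x and y)  # 9  (x is truthy, so the result is y)
print(x or y)   # 5  (x is truthy, so the result is x)
```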

In [None]:
c = True
print(not c)

In [None]:
d = False
print(not d)

## 3.2 Comparison

In [None]:
a = 5
b = 6

In [None]:
a == b

In [None]:
a != b 

In [None]:
a > b

In [None]:
a >= 5

In [None]:
c = a == b
print(c)
print(type(c))

What about strings?

In [None]:
a_int = 5
a_string = "5"

In [None]:
a_int == a_string

In [None]:
a_int > a_string # ERROR: an integer and a string cannot be ordered

In [None]:
word_1 = "banana"
word_2 = "apple"

In [None]:
word_1 > word_2 # word_1 comes later in alphabetical order.
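One caveat: strings are compared character by character using their Unicode code points, so all uppercase letters sort before all lowercase letters. A small sketch with made-up words:

```python
print("banana" > "apple")  # True: 'b' comes after 'a'
print("Zebra" < "apple")   # True: uppercase 'Z' sorts before lowercase 'a'

# Compare case-insensitively by converting both strings to lowercase first.
print("Zebra".lower() < "apple".lower())  # False
```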

**IMPORTANT:** In this course, we will use the '==' and '!=' operators. However, in some code examples online, you might see '**is**' instead of '==' and '**is not**' instead of '!='.

In [None]:
a = 7
b = 8

In [None]:
a is b

In [None]:
a is not b

There is an **important difference** between these operators! '==' and '!=' compare equality, while 'is' and 'is not' compare identity. This is nicely explained in [this blog...](https://dbader.org/blog/difference-between-is-and-equals-in-python)

In [None]:
a = [1,2,3]
b = [1,2,3]

In [None]:
# Comparing equality
a == b # True

In [None]:
# Comparing identity
a is b # False: variables a and b are pointing to two different objects.

In [None]:
c = [1,2,3]
d = c
print(c)
print(d)

In [None]:
c is d # c and d point to the same object 
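Identity matters in practice because two names for the same list see each other's changes. The sketch below (with illustrative names) contrasts an alias with a copy made using the built-in list function:

```python
original = [1, 2, 3]
alias = original              # a second name for the same list
alias.append(4)
print(original)               # [1, 2, 3, 4]

independent = list(original)  # a new list with the same contents
independent.append(5)
print(original)               # [1, 2, 3, 4]
print(independent == original, independent is original)  # False False
```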

## 3.3 Conditionals

Note the indents!

In [None]:
a = False
eight = 8
seven = 7

In [None]:
if a:
    print("hello")

In [None]:
if a: 
    print("hello")
else:
    print("no hello")

In [None]:
if (eight != seven):
    print("not equal")

In [None]:
if a:
    print("hello")
elif (eight == seven):
    print("equal")
else: 
    print("not equal")
    

In [None]:
if a:
    print("hello")
elif (eight == seven):
    print("equal")
elif (eight != seven): 
    print("not equal")
elif (9 == 9):
    print("9 equals 9")
else:
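    print("nothing")

"9 equals 9" did not print because an earlier branch (eight != seven) had already matched. In an if/elif/else chain, only the first branch whose condition is true is executed.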

# 4 Loops

## 4.1 For-loop

A for loop iterates over all elements of an iterable object. Common iterable objects are lists, tuples, ranges and strings.

In [None]:
my_list = ["Iterate","over","all","elements"]
for word in my_list:
    print(word)

In [None]:
total = 0
my_tuple = (1,2,3,4)
for number in my_tuple:
    total += number
    
print(total)

In [None]:
for i in range(10):
    print(i)

In [None]:
for i in "Hello":
    print(i)
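Two more loop patterns that come up often, sketched with made-up data: the built-in enumerate yields the index together with each element, and the items method of a dictionary yields key-value pairs.

```python
fruits = ["apple", "banana", "cherry"]
for index, fruit in enumerate(fruits):
    print(index, fruit)

prices = {"apple": 0.5, "banana": 0.25}
for name, price in prices.items():
    print(name, price)
```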

## 4.2 While-loop

In [None]:
import random

value = 0
while value < 1:
    value += random.randint(0,10) # add a random integer between 0 and 10 (both included)
    print(value)

# 5 Classes

Instead of using built-in Python classes or classes from third-party libraries, a Python user can also define their own classes. A class is basically a **type of object** which has certain **attributes** and some functions (**methods**) that can be performed on the object. Without going into too much detail, below is an example of a class and how to use it. Does the syntax make sense?

In [None]:
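class team():

    def __init__(self, initial_members=None): # the __init__ function is the constructor, it decides how a new 'team' object is created.
        # None is used as the default: a mutable default such as [] would be shared by all team objects.
        self.members = list(initial_members) if initial_members is not None else []
        self.count = len(self.members)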
        
    def starting_with(self,letter): #returns all team members with a name starting with 'letter'.
        result = []
        for member in self.members:
            if (member[0] == letter):
                result.append(member)
        return result
    
    def add_member(self,name):
        self.count += 1
        self.members.append(name)


In [None]:
team1 = team()
team1.add_member("John")
print("members: ",team1.members)
print("count: ",team1.count)
print("starting with J: ",team1.starting_with("J"))
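A caveat worth knowing when writing constructors or functions with default values: a mutable default such as a list is created only once, when the function is defined, and is then shared between calls. A minimal sketch with two hypothetical functions that shows the pitfall and the usual workaround with None:

```python
def append_item_shared(item, items=[]):
    # The default list is created once and reused on every call.
    items.append(item)
    return items

def append_item_fresh(item, items=None):
    # A new list is created on every call that omits 'items'.
    if items is None:
        items = []
    items.append(item)
    return items

print(append_item_shared("John"))  # ['John']
print(append_item_shared("Mary"))  # ['John', 'Mary']
print(append_item_fresh("John"))   # ['John']
print(append_item_fresh("Mary"))   # ['Mary']
```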

# 6 Importing libraries

Python comes with a standard library of built-in modules for specific tasks. Before using a library, it must be imported. The cells below demonstrate a few ways to import libraries.

Third-party libraries must be installed before they can be imported. Installing libraries is easy with the **pip** package manager. We will demonstrate this in the next notebook. When using Colaboratory, many popular third-party packages are already installed in the hosted environment.

In [None]:
import random
import math

value = random.randint(0,10)
print(value)

four = math.sqrt(16)
print(four)

In [None]:
#import specific functions from a library
from random import randint, randrange

value = randint(0,10)
print(value)

In [None]:
# import all functions from the library. What's the difference with "import random"?
from random import *

In [None]:
# import with custom name
import random as r

value = r.randint(0,10)
print(value)

In [None]:
from random import randint as r
value = r(0,10)
print(value)

# 7 Exercise

Create a function "highest_value" that takes two parameters: 
1. a list 'my_list' of numerical values
2. a threshold (a numerical value with a default value of 100)

The goal of the function is to find the highest value in the list. If the highest value is lower than or equal to the threshold, return the highest value. If it is higher than the threshold, return the threshold value.

In [None]:
# SOLUTION
def highest_value(my_list, threshold=100):
    # Find the highest value by walking through the list.
    highest = my_list[0]
    for value in my_list[1:]:
        if value > highest:
            highest = value

    # Cap the result at the threshold.
    if highest > threshold:
        return threshold
    else:
        return highest



Test the function by running the code below.

In [None]:
numbers_list = [1,2,45,78,369,15]

In [None]:
# should output 100
highest_value(numbers_list)

In [None]:
# should output 369
highest_value(numbers_list, 400)
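For comparison, the same behaviour can be written more compactly with the built-in max and min functions. This is an alternative sketch, not the reference solution, and the name highest_value_builtin is made up for this example:

```python
def highest_value_builtin(my_list, threshold=100):
    # max() finds the highest value, min() caps it at the threshold.
    return min(max(my_list), threshold)

numbers_list = [1, 2, 45, 78, 369, 15]
print(highest_value_builtin(numbers_list))       # 100
print(highest_value_builtin(numbers_list, 400))  # 369
```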

# 8 Documentation

The [official documentation](https://docs.python.org/3.5/) is the place to go to find out more about Python's functionality and built-in functions.


