# Exercise 1: Introduction to Python

Welcome to the first exercise of Introduction to Machine Learning. Today we will get familiar with Python, the language we will use for all the exercises of this course. 

This week we will cover the basics of Python. Next week, you will learn how to work with NumPy, a popular Python library used for scientific computing. 

Python is a popular language for machine learning, largely because of its selection of **libraries and frameworks** developed specifically for machine learning and scientific computing. To name a few: Keras, TensorFlow, and PyTorch for developing neural networks, SciPy and NumPy for scientific computing, and Pandas for data analysis. (You might also get to dabble in PyTorch in the upcoming weeks.)

Python also allows you to write quick, readable, high-level code. It's great for fast prototyping. 

Let's get into it!

## Miniconda and Jupyter Notebook

If you're reading this Jupyter notebook as it is intended, chances are you already installed Miniconda. Miniconda is a small version of Anaconda, which is a Python distribution that comes with its own package management system, `conda`. Using `conda`, you can install and upgrade software packages and libraries. It will make managing the versions of the libraries you use very convenient.

In these exercises we use the web application Jupyter Notebook, which allows us to create exercises that combine Python code, text explanations, and visuals. 

The Jupyter Notebook document (such as the one you are looking at right now) consists of cells containing Python code, text or other content. You can run each cell by clicking on the button `Run` in the top toolbar, or you can use a keyboard shortcut `Ctrl` + `Enter` (run current cell) or `Shift` + `Enter` (run current cell and move to the cell below).

## Running Python 

Jupyter Notebook is not the only way to write Python code. You can run Python in your terminal for some very quick coding. Try it by typing `python` into your terminal. This launches the Python interpreter. Try writing the following lines: 

`print("Hello world!")`

To exit, press Ctrl-D (Linux/macOS), press Ctrl-Z followed by Enter (Windows), or type `quit()`.

You can also save Python scripts as files and run them from the terminal. We have a script for you in this folder. Navigate to this folder in your terminal and run the script by typing `python first_script.py`. Python scripts typically have the extension `.py`.


## Indentation and Control Flow

Finally we get to start doing some coding!

First thing to know: Python does not need a semicolon `;` at the end of each line of code. Just move to the next line with no worries.

In [None]:
# This is a Python comment. Start the line with `#` for a comment
print("First line of code. I will declare some variables")
a = 1 #second line!!
b = 2
c = "Fish"
print("My variables are: a =", a, ", b =", b, ", c =", c)

Easy! However, in Python you have to be careful with indentation (one reason why Python code is so readable). Python uses indentation to keep track of which lines belong to an if statement, a loop, or a function. This is different from Java (assuming you know Java), where curly brackets `{ }` serve this purpose. 

Let's start with the if statement.

#### If Statement

The rule is, all indented parts after the `if condition :` belong to that branch of the if statement. 

`if condition :
     inside the statement
     still inside the statement
 elif condition:
     inside the else-if part of the statement
 else:
     inside the else part of the statement
 outside the statement`
 
Let's see it in action:

In [None]:
if a+b == 3:
    print("It's me again! We are inside the first if statement")
    print("It's optional to use parentheses for the condition a+b ==3")
    print("Don't forget to put a `:` at the end of the condition!!")
    if (c == "Fish"):
        print("This is a second if statement inside the first one")
    print("I'm out of the second if statement, but still inside the first one")
else:
    print("This is the else part of the first if statement.")
    print("These lines will never be printed!")
print("I'm not inside any of the if statements")

#### Exercise:

Let's see another if statement example. Try to figure out what the output will be **BEFORE** running the cell below.

Reminder, we declared

`a = 1
 b = 2
 c = "Fish"`

In [None]:
#Don't run me until you find the output first!
if a == 5:
    print ("1")
    if b == 1:
        print("2")
# here comes an else-if 
elif a == 2 or c == "Fish":
    print("3")
    
    if b == 1:
        print("4")
        if b == 2:
            print("5")
    if b == 2:
        print("6")
    if c == "Fish":
        if a == 1:
            if b == 100:
                print("7")
            else:
                print("8")
    elif a == 1:
        print("9")
print ("10")

#### Loops

Let's talk about loops. The syntax for a while loop is:

`while condition:
     inside the loop
     inside the loop
     inside the loop
 outside the loop`
 
 A small example:

In [None]:
count = 0
while count < 3:
    count += 1 #this is the same as count = count +1
    print("Count is", count)
print("Left the loop!")

For loops iterate through sequences, in this way:

`for x in sequence:
     inside the loop
     inside the loop
     inside the loop
 outside the loop`
 
 An example is shown below:

In [None]:
#Here is a basic list of strings
fish_list = ["salmon", "trout", "parrot", "clown", "dory"]

#The for loop:
for fish in fish_list:
    print(fish)
    print("*")
print("fish list over!")

An incredibly useful built-in function to use in for loops is `range()`. It creates a sequence of integers from the start (default is 0) up to, but not including, the stop, with a given step size (default is 1). We can use `range()` in for loops as shown in the example below.

In [None]:
#default start is 0, default step size is 1
for number in range(7):
    print (number)
print("**")

#now we also provide the start as 2.
#default step size 1 is still used.
for number in range(2,7):
    print(number)
print("**")

#now we also provide the step size as 2.
for number in range(2,7,2):
    print(number)
print("**")    

#what happens if step size is -1?
for number in range(6,-1,-1):
    print(number)

One more useful built-in function will be `enumerate()`. Let's go back to the fish list.


In [None]:
for fish in fish_list:
    print(fish)

What if I also want to keep track of the index of each list element? You can use `enumerate()`, which creates a sequence of 2-tuples, where each tuple contains an integer index and the corresponding element of the original list. Here is what it looks like:

In [None]:
for item_index, fish in enumerate(fish_list):
    print(item_index, ":", fish)
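
`enumerate()` also accepts an optional `start` argument if you would rather count from 1 than from 0. A small sketch, using a hypothetical `pets` list rather than the fish list above:

```python
# Hypothetical example list (not part of the exercise data)
pets = ["cat", "dog", "hamster"]

# start=1 makes the counter begin at 1 instead of the default 0
numbered = []
for position, pet in enumerate(pets, start=1):
    numbered.append((position, pet))
    print(position, ":", pet)
```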

### Data Types and Basic Operations:

Python is a **dynamically typed** language. This means that types are determined at run-time, and a variable can be reassigned to a value of a different type while the program runs. To check the type of a variable you can use the function `type()`.

In [None]:
#var_1 is first defined as an integer
var_1 = 1
print(var_1, "is", type(var_1))

#var_1's type is changed to string
var_1 = "hi!"
print(var_1, "is", type(var_1))

#more types
var_1 = 0.312
print(var_1, "is", type(var_1))
var_1 = 3.
print(var_1, "is", type(var_1))
var_1 = 3+2j
print(var_1, "is", type(var_1))
var_1 = True
print(var_1, "is", type(var_1))

#### Type Casting

Some examples of type casting in Python:

In [None]:
# From int to float
var_1 = 42
print(var_1, "is", type(var_1))
var_1 = float(var_1)
print(var_1, "is", type(var_1))
print ("**")

# From float to int
var_2 = 3.14
print(var_2, "is", type(var_2))
var_2 = int(var_2)
#This operation TRUNCATES (drops the fractional part), it does not round!
print(var_2, "is", type(var_2))
print ("**")

# From string to int
var_3 = "100"
print(var_3, "is", type(var_3))
var_3 = int(var_3)
print(var_3, "is", type(var_3))
print("**")

# From float to string
var_4 = 1.23
print(var_4, "is", type(var_4))
var_4 = str(var_4)
print(var_4, "is", type(var_4))
print("**")
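
The cell above only casts a positive float. Here is a short sketch of how `int()` compares with `math.floor()` and `round()` on a negative number, where the results differ (the variable names are ours, chosen for illustration):

```python
import math

value = -3.7

truncated = int(value)         # drops the fractional part (moves toward zero)
floored = math.floor(value)    # largest integer less than or equal to value
rounded = round(value)         # nearest integer

print("int(-3.7) =", truncated)
print("math.floor(-3.7) =", floored)
print("round(-3.7) =", rounded)
```

For positive numbers `int()` and `math.floor()` agree, which is why the difference is easy to miss.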


#### Basic Operations

Arithmetic operations are fairly standard. There are some examples below. 
* Look out for the difference between `/` (true division, always returns a float) and `//` (floor division, often called integer division).
* `**` is used for power.
* `%` is modulo.

In [None]:
a = 50
b = 7

print("a+b=", a+b)
print("a-b=", a-b)
print("a*b=", a*b)
print("a/b=", a/b)
print("a//b=", a//b) #integer (floor) division
print("a**b=", a**b) #power
print("a%b=", a%b) #modulo
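
With negative operands, `//` rounds down (toward negative infinity) rather than toward zero, and `%` takes the sign of the divisor. A minimal sketch with illustrative values; `divmod()` is a built-in that returns both results at once:

```python
numerator = -50
denominator = 7

quotient = numerator // denominator    # floor division: rounds toward negative infinity
remainder = numerator % denominator    # result has the same sign as the divisor

print("numerator // denominator =", quotient)
print("numerator % denominator =", remainder)

# The two results always satisfy:
# numerator == denominator * quotient + remainder
print("check:", denominator * quotient + remainder == numerator)

# divmod returns (quotient, remainder) as a tuple
print("divmod =", divmod(numerator, denominator))
```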

Boolean operations are also fairly standard:

In [None]:
print("(True and False)=", True and False)
print("(True or False)=", True or False)
print("((True and False) or True) =", (True and False) or True)

You can declare strings with single quotes `'`, double quotes `"`, or triple double quotes `"""`. A string declared with `"""` can span multiple lines. When such a string is the first statement of a function or class, it is known as a *docstring* and serves as its documentation.

In [None]:
a = 'Life\'s but a walking shadow, a poor player,' 
print(a)
a = "That struts and frets his hour upon the stage,"
print(a)
a = """And then is heard no more. It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing."""
print(a)

In [1]:
#The types of quotes do not change anything!
a = "fish" #double quote
b = 'fish' #single quote
c = """fish""" #three double quotes
print(a==b, b==c) #the string is the same!

True True
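
A triple-quoted string becomes a docstring only when it is the first statement of a function, class, or module. Here is a minimal sketch with a hypothetical function `greet` (functions are covered later in this notebook):

```python
def greet(name):
    """Return a short greeting for the given name.

    Because this string is the first statement in the function body,
    Python stores it as the function's docstring.
    """
    return "Hello, " + name + "!"


print(greet("world"))
# The docstring is available through the __doc__ attribute
print(greet.__doc__)
```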


### Lists

A list is a data type that holds an ordered sequence of values. The size of the list can change during run-time, as you add and remove elements from the list. 

Here is how you can create lists:

In [None]:
list_a = []                        # empty
print("list_a", list_a)

list_b = [1, 2, 3, 4]              # 4 elements
print("list_b", list_b)

list_c = [1, 'cat', 0.23]          # mixed types
print("list_c", list_c)

list_d = [1, ['cat', 'dog'], 2, 3] # list in list
print("list_d:", list_d)

list_e = [1]*10 #a list of 1s of length 10
print("list_e", list_e)

list_f = list(range(5)) #turns range object into a list
print("list_f", list_f)

Below we introduce some common operations with lists.
* Use `len(list1)` to find the length of the list.
* `list1.append(element)` to add an element to the end of the list.
* `list1.insert(index, element)` to add an element to an index in the list
* `list1.extend(list2)` to extend the elements of list1 with the elements of list2
* `list1.pop()` removes last element from the list
* `list1.pop(index)` removes the element at the given index
* `list1.remove(element)` removes the first instance of the given element

In [None]:
#Some common operations
b = ["great", "minds", "think", "alike"]
print("b:", b)

#finding the length
print("length of b is", len(b))

#append element to list
b.append("sometimes")
print("b.append(\"sometimes\")=",b)

#extend list
c = ["-", "Abraham", "Lincoln"]
b.extend(c)
print("c:", c)
print("b.extend(c)=", b)

#remove the element at a specific index
b.pop(6)  
print("b.pop(6)=", b)

#remove specific element
b.remove("Lincoln")  
b.remove("-")
print("b.remove(\"Lincoln\"); b.remove(\"-\")=", b)
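
The cell above does not use `insert()` or `pop()` without an argument. Here is a small sketch of both, on a hypothetical list:

```python
# Hypothetical example list
letters = ["a", "c", "d"]

# insert(index, element) places the element before the given index
letters.insert(1, "b")
print("after insert:", letters)

# pop() with no argument removes and returns the last element
last = letters.pop()
print("popped:", last)
print("after pop:", letters)
```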


You can also check whether an element is in a list in the following way:

In [None]:
list_1 = ["a", "b", "c"]
if "b" in list_1:
    print("\"b\" is in list")
else:
    print("\"b\" is not in list")

#### List Indexing and Slicing:

You can extract a single element from a list in the following way:
`list1[index]`

In lists, the indices start from 0. You can also index elements from the end of the list to the beginning by $-1, -2, -3...$. Check out the image below for the example list:

`list_1 = ["a", "b", "c", "d", "e"]`

<img src="img/list_indices.png" width=400/>

* You can extract multiple elements by slicing. This will give you elements from the start up to **(but not including)** the end index.

  `list1[start_index:end_index]`


* If you do not specify the `start_index`, you will retrieve the elements from index $0$ up to the `end_index`.

  `list1[:end_index]` is the same as `list1[0:end_index]`


* If you do not specify the `end_index`, you will retrieve the elements from the `start_index` up to (and **including**) the end of the list.

  `list1[start_index:]`


* You can provide a step size.
  `list1[start_index:end_index:step_size]`
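
As a quick illustration of these rules, here is a sketch on a hypothetical list of numbers (deliberately different from the list used in the exercise that follows):

```python
numbers = [10, 20, 30, 40, 50, 60]

print(numbers[1:4])    # elements at indices 1, 2, 3
print(numbers[:2])     # from the beginning up to (not including) index 2
print(numbers[4:])     # from index 4 to the end of the list
print(numbers[1:6:2])  # every second element from index 1 up to index 5
print(numbers[-2:])    # the last two elements
```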
  

#### Exercise:

Try to write the output of the following code **BEFORE** running the cell.

In [None]:
# Don't run BEFORE you solve it!
list_1 = ["a", "b", "c", "d", "e"]

print("list_1[-3] =", list_1[-3])
print("list_1[0:2] =", list_1[0:2])
print("list_1[:4:2] =", list_1[:4:2])
print("list_1[::-1] =", list_1[::-1])
print("list_1[-4:-1] =", list_1[-4:-1])

You can also assign new values to indices using slicing. Here is an example:

In [None]:
list_1 = ["a", "b", "c", "d", "e"]

list_1[-1]= "<3"
print(list_1)

list_1[0:2] = ["x", "y"]
print(list_1)

list_1[::2] = [":)",":(", ":O"]
print(list_1)

#### Copying

We have one last thing to say about lists. Observe the behaviour of the following code:

In [None]:
#Case 1:

list_1 = ["a", "b", "c", "d", "e"]
print("list_1 before", list_1)

list_2 = list_1
list_2.append("Z")

print("list_1 after", list_1)

In [None]:
#Case 2:

list_1 = ["a", "b", "c", "d", "e"]
print("list_1 before function", list_1)

def function_that_changes_list(input_list):
    input_list.append("Z")

function_that_changes_list(list_1)

print("list_1 after function", list_1)

We never changed list_1 explicitly, but the values changed anyway. What's going on?

Well, in Python, when you say `list_2 = list_1`, you are not actually creating a new list, you are only copying the **reference** to the same list. This means that they are actually two variables pointing to the same list! So when you change the values of `list_2`, the values of `list_1` also change (since they are referring to the same list). Something similar is at play when you pass this list to a function. So be careful!

If you do not want this to happen, you can use the list method `.copy()` to create a new list object with the same values. 
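
You can check whether two names refer to the same list with the `is` operator. Note also that `.copy()` makes a *shallow* copy: the outer list is new, but any lists nested inside it are still shared. A minimal sketch with hypothetical variable names:

```python
original = [1, 2, [3, 4]]

alias = original             # same object, no copy is made
shallow = original.copy()    # new outer list, same inner objects

print(alias is original)     # both names point to one list
print(shallow is original)   # the copy is a separate list

shallow.append(5)            # does not affect original
shallow[2].append(99)        # the nested list is shared, so original sees this

print("original:", original)
print("shallow:", shallow)
```

If you need the nested lists to be independent as well, the standard library function `copy.deepcopy()` is one option.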

#### Exercise:

Change the code below and fix the two cases given above using the `.copy()` function. Make sure the contents of `list_1` do not change.

In [None]:
#Case 1:
list_1 = ["a", "b", "c", "d", "e"]
print("list_1 before", list_1)

list_2 = list_1
list_2.append("Z")

print("list_1 after", list_1)
print("**")

#Case 2
list_1 = ["a", "b", "c", "d", "e"]
print("list_1 before function", list_1)

def function_that_changes_list(input_list):
    input_list.append("Z")

function_that_changes_list(list_1)

print("list_1 after function", list_1)

#### Exercise:

Now that we know how lists work, here is a quick exercise for you. Fill in the function below that takes a list and returns True if it is a palindrome, False if it is not. Palindromes are defined as sequences that read the same forwards and backwards.

Examples of palindrome lists:
* `["cat", "dog", "fish", "dog", "cat"]`
* `[0, 1, 2, 3, 3, 2, 1, 0]`
* `[1]`
* `[]`

You may use a for loop in this exercise. However, if you're feeling ambitious, try to do it in 1 line, without using a for loop (hint: use slicing).

In [None]:
def function_is_palindrome(input_list):
    is_palindrome = True
    #your code here
    return is_palindrome

In [None]:
test_list_1 = ["cat", "dog", "fish", "dog", "cat"]
res_1 = function_is_palindrome(test_list_1)

test_list_2 = ["cat", "dog", "fish",  "bird", "dog", "cat"]
res_2 = function_is_palindrome(test_list_2)

test_list_3 = ["cat"]
res_3 = function_is_palindrome(test_list_3)

test_list_4 = ["cat", "cat"]
res_4 = function_is_palindrome(test_list_4)

if not (res_1 and not res_2 and res_3 and res_4):
    print("Test failed")
else:
    print("Correct! :)")


### Tuples

Tuples are similar to lists, but they are fixed in size and **immutable**: once a tuple is created, its elements cannot be changed.
We declare tuples using parentheses `()`:

In [None]:
tuple_1 = ("wash", "your", "hands", "with", "soap")

print("tuple_1=", tuple_1)

Since change is not allowed, observe the result of the following piece of code.

In [None]:
tuple_1[2] = ("face")
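
Running the cell above raises a `TypeError`. If you want to see the error message without stopping the notebook, one option is to catch the exception with `try`/`except`. Here is a sketch, using a hypothetical tuple:

```python
colors = ("red", "green", "blue")

try:
    colors[1] = "yellow"     # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# The tuple is unchanged
print(colors)
```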

You can typecast from list to tuple and vice versa! 

In [None]:
sequence_1 = ["here", "comes", "the", "sun"]
print(sequence_1, "is", type(sequence_1))


# from list to tuple
sequence_1 = tuple(sequence_1)
print(sequence_1, "is", type(sequence_1))

#from tuple to list
sequence_1 = list(sequence_1)
print(sequence_1, "is", type(sequence_1))

### Dictionaries

Dictionaries are an incredibly useful data type; you might also know them as "hash maps". Dictionaries are collections of "key: value" pairs. You can access the values using the keys in $O(1)$ time on average.

The keys of a dictionary must be **unique** and hashable, which in practice means **immutable** values such as strings, numbers, or tuples. Below we show how to define a dictionary.


In [None]:
shopping_list = {"apples": 3, "pears":2, "eggs":6, "bread":1, "yogurt":1}
print("shopping_list=", shopping_list)
print("**")


book_dict = {}
print("book_dict=", book_dict)
#add key value pairs
book_dict["vonnegut"] = "cat\'s cradle"
book_dict["ishiguro"] = "never let me go"
print("book_dict=", book_dict)
print("**")

# we can retrieve the dict keys:
print(book_dict.keys())
# and the dict values:
print(book_dict.values())
print("**")

#we can also iterate through the dict keys and values with a for loop
for key, value in book_dict.items():
    print(key, ":", value)

print("**")
#we can modify the value of a key
book_dict["ishiguro"] = "a pale view of hills"
print("modified book_dict=", book_dict)
print("**")

#and we can remove a key completely
removed_value = book_dict.pop("ishiguro")
print("book_dict with removed value =", book_dict)
print("removed_value=", removed_value)
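
Two more things are worth knowing when working with dictionaries. Asking for a missing key with square brackets raises a `KeyError`, while `in` and `.get()` let you check safely. A mutable object such as a list cannot be used as a key. A minimal sketch with hypothetical dictionaries:

```python
prices = {"apple": 0.5, "bread": 2.0}

# Membership test looks at the keys
print("apple" in prices)
print("milk" in prices)

# .get() returns a default instead of raising KeyError for a missing key
milk_price = prices.get("milk", 0.0)
print("milk price:", milk_price)

# Tuples are immutable, so they can serve as keys
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])

# Lists are mutable and therefore not allowed as keys
try:
    bad = {[0, 0]: "origin"}
except TypeError as error:
    key_error_message = str(error)
    print("TypeError:", key_error_message)
```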

### Functions

You can define a function in Python in the following way:

In [None]:
def multiply(a,b):
    return a*b


print("multiply(100,2) =", multiply(100,2))

You can have default arguments by specifying their default value in the parameters.

In [None]:
def add(a, b, c=0, d=1):
    return a+b+c+d

#use no default arguments
print("add(1,2,100,1000) =", add(1,2,100,1000))

#use the default value of d
print("add(1,2,100) =", add(1,2,100))

#use the default value of c and d
print("add(1,2) =", add(1,2))

#use the default value of c
print("add(1,2,d=1000) =", add(1,2,d=1000))

A function can return multiple values in a tuple. You can assign the values of the tuple to separate variables. This is called **tuple unpacking**.

In [None]:
def min_max(input_list):
    return min(input_list), max(input_list)


test_list = [1,2,3,4]
min_val, max_val = min_max(test_list)
print("min_val:", min_val, ", max_val:", max_val)

Note: You have already seen tuple unpacking when using the function `enumerate()` in a for loop.
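
Tuple unpacking is not limited to function return values. Here is a short sketch of two other common uses, with hypothetical values:

```python
# Unpack a tuple directly into separate variables
point = (3, 4)
x_coord, y_coord = point
print("x_coord:", x_coord, ", y_coord:", y_coord)

# Swap two variables without a temporary one
first, second = "left", "right"
first, second = second, first
print("first:", first, ", second:", second)
```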

### Common Built-in Functions

Here we introduce some nifty commonly used built-in functions. 

* You already learned `range()`, `enumerate()`!
* We have also seen `type()` to return the type of the object. We use `str()`, `int()`, `float()`, `list()`, `tuple()` for typecasting.
* The functions `len()`, `sum()`, `min()`, `max()`, `any()`, `all()`, `sorted()`, `zip()` are useful for lists and tuples.

Let's see them in action below

In [None]:
list_1 = list(range(5))
print("list_1 =", list_1)

print("len(list_1) =", len(list_1))
print("sum(list_1) =", sum(list_1))
print("min(list_1) =", min(list_1))
print("max(list_1) =", max(list_1))
print("**")


list_2 = [5,3,1,2,0,6]
print("list_2 =", list_2)
print("sorted(list_2) =", sorted(list_2))
print("**")


#any checks whether at least one element is truthy (OR)
#all checks whether every element is truthy (AND)
#in Python, 1 counts as True and 0 counts as False
list_3 = [1, 1, 1]
print("list_3 =", list_3)
print("any(list_3) =", any(list_3))
print("all(list_3) =", all(list_3))

list_4 = [0, 1, 1]
print("list_4 =", list_4)
print("any(list_4) =", any(list_4))
print("all(list_4) =", all(list_4))

list_5 = [0, 0, 0]
print("list_5 =", list_5)
print("any(list_5) =", any(list_5))
print("all(list_5) =", all(list_5))
print("**")

# zip function:
x = [1,2,3]
y = [4,5,6]
zipped = zip(x,y)
print("x", x)
print("y", y)
print("zipped", list(zipped))
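
`zip()` is often used directly in a for loop to walk through two lists in parallel. A minimal sketch with hypothetical lists; note that `zip()` stops at the end of the shorter input:

```python
names = ["salmon", "trout", "dory"]
weights = [4.2, 1.1, 0.3, 9.9]   # one extra element on purpose

pairs = []
for name, weight in zip(names, weights):
    pairs.append((name, weight))
    print(name, "weighs", weight)

# Only three pairs are produced; the extra weight 9.9 is ignored
print(len(pairs))
```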

### List Comprehensions

One of the most practical things about Python is that you can do many things on a single line. One popular example is the so-called *list comprehension*, a compact syntax to create and initialize lists of objects.

The basic syntax of a list comprehension is shown below:
`[thing for thing in list]`

Let's make it more concrete with an example.

In [None]:
list_of_numbers = [1, 2, 3, 101, 102, 103]
print("list_of_numbers =", list_of_numbers)

#I want to create a new list with all these items doubled.
doubled_list = [2*elem for elem in list_of_numbers]
print("doubled_list =", doubled_list)

#A new list with all these items as floats
float_list = [float(elem) for elem in list_of_numbers]
print("float_list =", float_list)

Let's make it more interesting by adding an if in there:

`[thing for thing in list if condition]`


In [None]:
#I want to create a new list with all these items doubled
#IF the element is above 100
conditional_doubled_list = [2*elem for elem in list_of_numbers if elem>100]
print("conditional_doubled_list =", conditional_doubled_list)
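
A list comprehension with a condition is equivalent to a for loop that appends to a list. Here is a sketch of the same kind of computation written both ways, on a hypothetical list so that it runs on its own:

```python
values = [5, 150, 20, 300]

# For-loop version
doubled_loop = []
for elem in values:
    if elem > 100:
        doubled_loop.append(2 * elem)

# List-comprehension version
doubled_comp = [2 * elem for elem in values if elem > 100]

print(doubled_loop)
print(doubled_comp)
print(doubled_loop == doubled_comp)
```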

**Exercise**

You will be given a list of vocabulary words. Your task is to use list comprehensions to iterate through a document and create a new list including the words that are included in the vocabulary. You don't need to worry about duplicates.

Example: `vocabulary = ["a", "c", "e"]
         document = ["a", "b", "c", "d"]
         new_list = ["a", "c"]`

In [None]:
vocabulary = ['epfl', 'europe', 'swiss', 'switzerland', 'best', 'education', 'high', 'higher', 'research', 'school', 'science', 'students', 'technology', 'top-tier', 'university']

document = """The École polytechnique fédérale de Lausanne (EPFL) is a research institute
and university in Lausanne, Switzerland, that specializes in natural sciences and engineering.
It is one of the two Swiss Federal Institutes of Technology, and it has three main missions: 
education, research and technology transfer at the highest international level. EPFL is widely regarded 
as a world leading university. The QS World University Rankings ranks EPFL 12th in the world 
across all fields in their 2017/2018 ranking, whilst Times Higher Education World 
University Rankings ranks EPFL as the world's 11th best school for Engineering and Technology."""
document_parsed = document.split()
document_parsed = [word.lower() for word in document_parsed]
new_list = []

#your code here
new_list = ...

#We convert the list to a set and then back to a list. Converting to a set automatically
#removes duplicates (sets are unordered collections of unique elements). Afterwards we sort it.
new_list = sorted(list(set(new_list)))


In [None]:
correct_result = ['best', 'education', 'epfl', 'higher', 'research', 'school', 'swiss', 'technology', 'university']

if new_list == correct_result:
    print ("Correct! :)")
else:
    print ("Incorrect :(")


### Matplotlib (Optional)

Perhaps the most widely used plotting library in Python is Matplotlib. If you've ever used MATLAB, you'll find that the functions look pretty similar. 

In the following exercise sessions, we won't ask you to do any plotting. So this part is optional for those who are interested in having a short introduction.

First, we will import Matplotlib.

#### Importing in Python

* A short note on importing: to be able to use modules in our code, we import them. 
  
  example: `import numpy`
  

* We can also select a name for the imported module.
  
  example: `import numpy as np`. Now when we call numpy functions, we will always use `np.` as a prefix, i.e. `np.zeros()`
  

* You can also choose to only import selected functions/variables/classes from the module. 
  
  example: `from numpy import arange`. Now you can use this function as `arange(5)`. You cannot use any other functions from the numpy module as you did not import them.
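
The three import styles can be tried with `math`, a module from the Python standard library, so the sketch below runs even before NumPy is installed:

```python
# 1. Import the whole module
import math
print(math.sqrt(16))

# 2. Import the module under a shorter name
import math as m
print(m.floor(2.7))

# 3. Import only selected names from the module
from math import pi, ceil
print(ceil(pi))
```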

In [None]:
# To import Matplotlib we do:

import matplotlib.pyplot as plt

Let's do some plotting! 

Let's start with the simplest of plots, the good old line-plot. The function we will use is `plot()`

In [None]:
#Let's create some data first to plot
x = list(range(10))
y = [2,3,5,1,0,2,3,0,0,1]

#first create a figure
fig = plt.figure()

#now do the plotting
#specifying a color and marker are optional.
#check out the documentation to see what else you can do with the plot function
plt.plot(x, y, marker="*", color="r")

#axis labels and title
plt.xlabel("x")
plt.ylabel("y")
plt.title("just a random plot")

#so that we see the plot
plt.show()

#close the plot
plt.close(fig)

#### Exercise

You can plot two lines on top of one another by calling the `plt.plot()` function consecutively. Try to implement this! Also, specify the parameter `label` of the `plt.plot()` function and call the function `plt.legend()` to create a legend for your graph. It should look like the figure shown below.

![result](img/two_lines_plot.png)

In [None]:
#Let's create some data first to plot
x = list(range(10))
y1 = [2,3,5,1,0,2,3,0,0,1]
y2 = [1,2,3,5,1,0,2,3,0,0]

#first create a figure
fig = plt.figure()
#your code here


You can create scatter plots (line plots without lines) with `scatter()`

In [None]:
#Let's create some data first to plot
x = list(range(10))
y1 = [2,3,5,1,0,2,3,0,0,1]

#first create a figure
fig = plt.figure()

#now do the plotting
p1 = plt.scatter(x, y1, marker="*", color="r")

#axis labels and title
plt.xlabel("x")
plt.ylabel("y")
plt.title("just a random scatter plot")

#so that we see the plot
plt.show()

#close the plot
plt.close(fig)

And you can read and display images with `imread()` and `imshow()`

In [None]:
#first create a figure
fig = plt.figure()

#now do the plotting
im = plt.imread("img/krabby_patty.jpg")
plt.imshow(im)

#axis labels and title
plt.xlabel("x")
plt.ylabel("y")
plt.title("krabby patty")

#so that we see the plot
plt.show()

#close the plot
plt.close(fig)

And that's all for this exercise! If you run into any problems, just ask us (or even Google them). You can check out the official Python tutorials for further learning.

https://docs.python.org/3/tutorial/