# Python basics 3: Flow Control

This notebook contains more basics of Python. Use it as a reference whenever needed.

## Conditional statements, loops and functions

Up till now, you've used Python as a fancy calculator. Most likely you felt that you needed additional elements to avoid repetition in your code, or saw that you needed a conditional statement that only executes your code when a particular variable or expression is `True` or `False`.

Here, we introduce the building blocks for writing such code blocks. Let's take a look at conditional statements first.

## Conditional Statements

A lot of programming has to do with executing a block of code only if a certain condition is met.

In Python, the `if-then-else` construct has the form:

```python
if condition1:
    # statements
elif condition2:
    # statements
elif condition3:
    # statements
else:
    # statements
```

Note that the `elif` and `else` clauses are optional. A conditional statement can contain a single `if` block, and nothing else.

All methods from the previous notebook that return a boolean (True/False) can be used after an `if` statement. You can of course combine these with the `and` and `or` operators. 
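As a minimal sketch of combining conditions with `and` and `or`, consider the snippet below. The names `basket`, `new_item` and `max_items` are hypothetical and only serve this illustration.

```python
# Hypothetical values, chosen only for this illustration
basket = ["Apple", "Pear"]
new_item = "Kiwi"
max_items = 3

# 'and': both conditions must be True
if new_item not in basket and len(basket) < max_items:
    basket.append(new_item)
    message = "Added " + new_item
# 'or': one True condition is enough
elif new_item in basket or len(basket) >= max_items:
    message = "Nothing added"

print(message)
print(basket)
```

Try changing `max_items` to 2 and see which branch runs.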

Inspect the code in the `fruit` cell below. Do you think you can change the `elif` statement into an `if` statement? Would this alter the behaviour of the code? What is the difference? Try playing with the code by adding some more `Pear` entries to the `fruit` list.

Can you describe what the code is doing?

In [1]:
fruit = ["Apple", "Pear", "Banana", "Orange"]

item = "Pear"

# To remind you that you can check for elements in a list
if item in fruit:
    print("Already on the list!")
    
    if fruit.count(item) >= 3:
        print("It's on the list three times or more!")
    elif fruit.count(item) > 1:
        print("It's on there twice!")
    # You don't have to finish with an 'else'
    
else:
    print("Not on the list yet, adding it!")
    fruit.append(item)


Already on the list!


### Quiz

Finish the following code block. The `input()` function asks the user in the Python interpreter to input some text. Implement the following:

* Print a line with a friendly message, telling the user what their input is
* Check if the input is a digit
* Check if the input meets the > 0 condition
* If the input is a digit and meets the condition, tell the user if it is even or odd
* Otherwise, print another informative message




In [2]:
user_digit = input("A number > 0 please: ")
if user_digit.isdigit():
    # Checking if the input is > 0
    if int(user_digit) > 0:
        # Checking if the input is even or odd
        if int(user_digit) % 2 == 0:
            print("Your number is even!")
        else:
            print("Your number is odd!")
    else:
        print("Your number is not > 0!")
else:
    print("Your input is not a digit!")

Your number is odd!


## Flow Control / Looping

### For Loops

Programming is of little use if we cannot repeat an instruction for an intended number of times. In the previous examples, you had to change the code and re-run it if you wanted to play with, for instance, the value of the `item` variable.

The `for` statement allows us to define **iterations** (i.e. taking items from an iterable one at a time) by following this template:

```python
for variable in sequence:
    # statements
    print(variable)
```

A little-known fact is that the `for` loop also has an `else` clause, although it is rarely used:

```python
for number in [0, 1, 2, 3, 4, 5]:
    # statement
    print(number)

    if number > 4:
        break

else:
    # statement (skipped here, because the loop ends with a break at 5)
    print("All done!")
```

The code in the optional `else` clause is executed if and only if the loop terminates successfully (i.e. without a **`break`**). 
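One situation where the `else` clause can be handy is a search: the `else` block handles the "nothing found" case. The sketch below assumes a hypothetical list `odd_numbers` and looks for the first even number in it.

```python
# Hypothetical list, used only for this illustration
odd_numbers = [1, 3, 5, 7]

for number in odd_numbers:
    if number % 2 == 0:
        found = number
        break
else:
    # Runs only because the loop was never interrupted by a break
    found = None

print(found)
```

If you add an even number to the list, the `break` is executed and the `else` block is skipped.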

In [7]:
# Let's iterate over our list of fruit

for item in fruit:
    print(item)

In [3]:
# Or from another function we recall

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"
words = text.split()

for word in words:
    print(word.title())

Lorem
Ipsum
Dolor
Sit
Amet,
Consectetur
Adipiscing
Elit


In [4]:
# You can also nest for loops:
text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit"
words = text.split()

for word in words:
    for character in word:
        print(character, end='-')  # What is the 'end' argument doing?
    # The end argument specifies what print() writes after its output, instead of the default newline.

L-o-r-e-m-i-p-s-u-m-d-o-l-o-r-s-i-t-a-m-e-t-,-c-o-n-s-e-c-t-e-t-u-r-a-d-i-p-i-s-c-i-n-g-e-l-i-t-

#### The enumerate function

`enumerate()` is probably the most used of the functions that support iterating over an iterable. It returns the current item plus **its index** in the iteration process.

You can see that we assign two variable names, `i` and `word` to the outcome of `enumerate()`. 
What is the datatype that this function is returning? 
And what are the datatypes of its elements?

In [1]:
# Use enumerate in the iteration over a list of words
text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
words = text.split()

for i, word in enumerate(words):
    print (i, "-->", word)

0 --> Lorem
1 --> ipsum
2 --> dolor
3 --> sit
4 --> amet,
5 --> consectetur
6 --> adipiscing
7 --> elit.


In [6]:
# What is the datatype of the returning value of enumerate(words)?
index_words = enumerate(words)
print(type(index_words))
index_words = list(index_words)
index_words


<class 'enumerate'>


[(0, 'Lorem'),
 (1, 'ipsum'),
 (2, 'dolor'),
 (3, 'sit'),
 (4, 'amet,'),
 (5, 'consectetur'),
 (6, 'adipiscing'),
 (7, 'elit.')]

Optionally, you can make the `enumerate()` function start at another number:

In [10]:
for i, word in enumerate(words, 1):
    
    if i > 100:
        break
    # No else here?
    # No else is needed: break leaves the loop immediately, so the print below only runs when i <= 100.
    
    print(i, word)

1 Lorem
2 ipsum
3 dolor
4 sit
5 amet,
6 consectetur
7 adipiscing
8 elit.


#### The range construct

The `range()` construct can be used to control the iteration. It generates a sequence of integers on the basis of the following three arguments:

- `start`: the first integer of the sequence (default is 0)
- `stop`: the sequence ends before this value is reached (with the default step, the last integer is `stop - 1`)
- `step`: the increment between consecutive integers (default is 1)

In [15]:
# Let's play with range
print(list(range(0,10)))
print(list(range(10)))
print(list(range(1,10,2)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 3, 5, 7, 9]


Printing a `range` object directly, for example with `print(range(10))`, only shows the function name with its arguments: `range(0, 10)`. Not very informative at this point...

That is why the calls above are wrapped in `list()`: we first have to transform the outcome of `range()` into a list to make its contents printable. This happens for some efficient functions that don't put everything in memory at once, but only produce each value during the iteration (when you run the code). So `range()` does not build a list of numbers up front; it describes a sequence of numbers that is generated on demand.
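A small sketch of what a `range` object can do without being converted to a list first. The variable name `evens` is just an assumption for this example.

```python
evens = range(0, 10, 2)

print(evens)        # shows the range object itself, not its elements
print(len(evens))   # the length is known without building a list
print(evens[1])     # items can be retrieved by index
print(6 in evens)   # membership tests work too
print(list(evens))  # convert to a list only when you need all elements
```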

In [12]:
print(list(range(100, 1, -10)))  # You can make it reverse with a negative step!

[100, 90, 80, 70, 60, 50, 40, 30, 20, 10]


In [13]:
# Let's use range in a for loop

for i in range(1, 10, 2):
    print(i)

1
3
5
7
9


### While Loops

The `while` statement allows us to control a loop on the basis of a condition. 

A `while` loop runs as long as its condition holds.

It has the following general form:

```python
while condition:
    # statement
else:
    # statement
```

The code in the optional `else` clause is executed if and only if the loop terminates successfully (i.e., without a **`break`**).
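As a minimal sketch, the loop below counts down a hypothetical `attempts` variable. The condition eventually becomes `False` without any `break`, so the `else` block runs.

```python
# Hypothetical countdown, used only for this illustration
attempts = 3

while attempts > 0:
    print("Attempts left:", attempts)
    attempts -= 1
else:
    # Reached because the condition became False without a break
    print("No attempts left")
```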

### Quiz

_Think before doing._ What do you think happens when you execute the following code?

```python

n = 1

while n:
    print(n)
    n += 1
```

If you dare, you can execute this code.

The code below is safer, because it will not leave you stuck in an _infinite loop_. Try executing the code a couple of times and see what it does. What is your longest streak of odd numbers?

In [23]:
import random

n = 1
while n % 2 != 0:
    n = random.randrange(99)
    print(n)

49
21
24


---

### Break and Continue

The clauses `break` and `continue` are two statements that allow for a more flexible control of a loop. Intuitively:

- `continue` is used to pass to the next iteration of the loop
- `break` is used to interrupt the loop abruptly

In [None]:
# When we encounter 7 we skip to the next step
for el in range(1, 10, 2):
    if el == 7:
        continue
    print(el)

In [None]:
# When we encounter 7 we stop our loop 
for el in range(1, 10, 2):
    if el == 7:
        break
    print(el)

`break` influences the execution of the loop in yet another way: when a loop terminates due to a `break` statement, the code in the optional `else` clause is skipped.

In [None]:
# The continue statement does not influence the execution of the else block
for el in range(1, 10, 2):
    if el == 7:
        print ("(let's ignore the " + str(el) + ")")
        continue
    print(el)
else:
    print (">>> the iteration ended with the number " + str(el))

In [None]:
# What if we replace continue with break
for el in range(1, 10, 2):
    if el == 7:
        print ("(we encountered the number " + str(el) + ", let's break the loop)")
        break
    print(el)
else:
    print (">>> the iteration ended with the number " + str(el))

### The Pass Statement

Given the importance of indentation for Python, sometimes we may need a placeholder that allows us to write down a condition for an `if-then-else` construct or for a `while` loop without writing any statement (maybe just a comment). This is the case in which the `pass` statement comes in handy. 

In what follows, **nothing happens**:

```python
if condition1:
    pass
else:
    pass
```


In [24]:
numbers = range(10)

# Handy if you're quickly typing a loop
# or function (see further on in this notebook)

for n in numbers:
    if n % 2 == 0:
        pass  # TODO
    else:
        print("Odd!")

Odd!
Odd!
Odd!
Odd!
Odd!


---

### Quiz

The following list contains 100 random extractions (with replacement) of numbers between 1 and 15. 

Find the number that has never been extracted.

In [10]:
random_numbers = [1, 2, 1, 1, 9, 13, 15, 5, 9, 8, 12, 14, 3, 2, 8, 10, 3, 12, 15, 13, 5, 3, 7, 5, 2, 13, 12, 8, 10, 5, 15, 8, 2, 8, 5, 12, 9, 2, 3, 5, 1, 4, 5, 9, 13, 2, 12, 5, 10, 8, 1, 15, 15, 6, 12, 3, 1, 3, 7, 14, 15, 10, 15, 7, 10, 12, 1, 2, 13, 7, 9, 6, 6, 7, 4, 12, 10, 8, 8, 3, 8, 4, 6, 14, 10, 5, 2, 3, 15, 4, 9, 3, 7, 7, 2, 4, 4, 1, 7, 15]

In [12]:
# Making a unique set of the list
unique_numbers = set(random_numbers)
print(unique_numbers)
for n in range(1, 16):
    if n not in unique_numbers:
        print(n)



{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15}
11


---

## Functions

We have now discussed the art of looping, which automates bits of your code and prevents repetition. If you know that you're going to use parts of your code more than once, you can make a function out of it. Functions are constructs that allow us to reuse a portion of code more than once in a program. The alternative way to obtain the same results without functions would be to copy the same portion of code every time it is needed. Organizing your code with functions also makes it easier to split your problem into smaller subproblems.

We have seen plenty of built-in functions so far. If you used them, you could have recognized them by the parentheses after their name. The `print()` function is such an example.

A function **takes optional parameters** (inside the parentheses) and **optionally returns a value** after it has done something.

Functions in Python are defined by a `def` statement, following this template:

```python

def function_name(parameters):
    """
    Documentation on what the function is doing
    """
    
    # your statements
    result = True
    
    return result
```

> The list of the parameters required by the function is given between round brackets right after the name of the function. Each function may have **zero or more** parameters. When a function is called, the values passed for its parameters are called **arguments**.
>
> The (optional) documentation string should be placed immediately after the function definition line. There are many ways to format your **docstring**: [PEP 257](https://www.python.org/dev/peps/pep-0257/) describes the general conventions and PEP 287 recommends reStructuredText, but more formats are available. See [this tutorial](http://daouzli.com/blog/docstring.html) for an introduction to the topic.
>
> The **indented** function body contains all the statements that are executed every time the function is called. When a `return` statement is executed, the function exits and its output is the value given to the `return` statement.
>
> When there is no `return` statement in the function body, or when a `return` statement with no value is executed, the function returns `None`.

For instance, the following function calculates the number of characters in a string:

In [25]:
def chars(s):
    """
    Calculate the number of characters in a string
    """
    
    if type(s) != str:
        return "This is not a string!"
    
    r = len(s)
    
    return r

The docstring is saved into a  `__doc__` variable and can be accessed by using the `help()` function or the IPython `?`

In [26]:
# Don't use this, it is just to make the point
print(chars.__doc__)


    Calculate the number of characters in a string
    


In [27]:
# Use one of these two
help(chars)
chars?

Help on function chars in module __main__:

chars(s)
    Calculate the number of characters in a string



Or keep using the `shift + tab` shortcut in Jupyter!

In [28]:
chars('test')

4

In order to execute the code included in a function, you have to **call the function**, either in your script or in the interactive shell. For instance:

In [29]:
chars("voodoo")

6

In [30]:
chars(1979)

'This is not a string!'

The `return` is also optional. What happens when you assign the result of a function call to a variable, and the function does not have a `return` statement?

In [31]:
def tokenize(text):
    words = text.split()
    
    print(words)
    
    # Forgot something?
    
words = tokenize("What happens when there is no return keyword?")

print(type(words))

['What', 'happens', 'when', 'there', 'is', 'no', 'return', 'keyword?']
<class 'NoneType'>


Sometimes you don't need a `return` statement, for instance when you're writing a wrapper function that ties several methods together, or a function that writes its output to a file instead of to the console. It's best to make clear in the docstring of the function (using the `""" """` docstring style) what the function takes in, processes, and spits out again.

### Parameters

A function can receive any number of parameters:

In [32]:
def higher(n1, n2, n3):
    """
    Find the highest of three numbers
    """
    
    # n1 wins only if it is at least as large as both other numbers
    if n1 >= n2 and n1 >= n3:
        return n1
    
    if n2 >= n3:
        return n2
    
    else:
        return n3

In [33]:
# A parameter can be passed either by position
higher(4, 2, 8)

8

In [34]:
# Or by name/keyword
higher(n3=8, n1=4, n2=2)

8

#### Optional Parameters

In some situations it may be useful to have a default parameter value, which is used when a call leaves an argument **unspecified**.

In [35]:
def higher(n1, n2=0, n3=0):
    """
    Find the highest of up to three numbers
    """
    
    # n1 wins only if it is at least as large as both other numbers
    if n1 >= n2 and n1 >= n3:
        return n1
    if n2 >= n3:
        return n2
    else:
        return n3

In [36]:
higher(9,4)

9

But what happens now:

In [46]:
higher(-6, -3)  # How to fix this?
# The unspecified n3 falls back to its default of 0, which is higher than both
# negative arguments, so 0 is returned. One possible fix is to use a default
# that can never win the comparison, such as negative infinity:
def higherFix(n1, n2=float("-inf"), n3=float("-inf")):
    """
    Find the highest of up to three numbers
    """

    if n1 >= n2 and n1 >= n3:
        return n1
    if n2 >= n3:
        return n2
    else:
        return n3

higherFix(-6, -3)

-3

#### Arbitrary Number of Parameters

A different situation is when we want our function to accept an unspecified number of arguments. Python functions allow this through so-called "tuple references": an asterisk `*` in front of the last positional parameter makes it collect all remaining positional arguments into a `tuple`.

In [47]:
def print_params(*params):
    print ("your input:")
    print (params)

In [48]:
print_params("Down from my ceiling", "Drips great noise", "It drips on my head through a hole in the roof") 

your input:
('Down from my ceiling', 'Drips great noise', 'It drips on my head through a hole in the roof')
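Because the starred parameter is an ordinary tuple, you can loop over it like any other sequence. The function `total()` below is a hypothetical helper, written only to illustrate this.

```python
def total(*numbers):
    """
    Add up any number of arguments (hypothetical helper for illustration)
    """
    result = 0
    # numbers is a tuple, so we can iterate over it
    for n in numbers:
        result += n
    return result

print(total())         # no arguments: numbers is an empty tuple
print(total(1, 2, 3))  # three arguments: numbers is (1, 2, 3)
```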


---

#### Quiz

Remember the grocery list. Can you write a function that:
* Takes *one or multiple* items as arguments
* Takes in an existing dictionary of groceries (keys) and counts (values)
* Adds the item (or items) to this dictionary if it is not yet in there, and updates the count if it is
* Returns this (updated) dictionary

In [31]:
groceries = {
    'kiwi': 8,
    'bread': 2,
    'banana': 3,
    'soy sauce': 1,
    'red wine': 1,
    'soup': 1
}

def add_groceries(items, groceries):
    """
    Add items to the grocery list

    Arguments:
        items(str or list): A single item, or a list of items.
        groceries(dict): Dictionary of groceries
    Returns:
        dict: groceries
    """
    # Wrap a single item in a list, so the loop below does not iterate over its characters
    if isinstance(items, str):
        items = [items]
    for thing in items:
        if thing in groceries:
            groceries[thing] += 1
        else:
            groceries[thing] = 1

    return groceries

add_groceries('cola', groceries)
groceries

---

## Modules and Packages

Python modules are groupings of related code that are structured so as to facilitate their reuse.

Physically, modules are `.py` files implementing a set of **functions, classes or variables**, as well as **executable statements**, that can be accessed from other modules by using the `import` command.

The `import` command can be used either to import **the whole code** of a module, using the following syntax:

```python
import module
```

or just **specific attributes** (one or more functions, variables, classes or a combination of these) with the following syntax:

```python
from module import name1, name2, name3
```

For example, in order to find out what our current working directory is, we can use the function `getcwd()` available in the `os` module (see below) in two different ways:

In [32]:
import os
os.getcwd()

'/Users/george/Desktop/Uni/Development/georges-2023-coding-the-humanities/notebooks'

In [33]:
from os import getcwd
getcwd()

'/Users/george/Desktop/Uni/Development/georges-2023-coding-the-humanities/notebooks'

You can think of a **package** as a structured collection of Python modules.

Some modules/libraries come built in with Python, as part of its standard library. Take a closer look at three of them:
    
- Math ([manual](https://docs.python.org/3/library/math.html))
- Collections ([manual](https://docs.python.org/3/library/collections.html))
- Itertools ([manual](https://docs.python.org/3/library/itertools.html))

Can you figure out when to use the `defaultdict` and `Counter` from the Collections library? Can you use them in any of the functions/code you wrote above?

In [36]:
# Example

from collections import Counter
countedNumbers = Counter(random_numbers)


for n in range(1, 16):
    if n not in countedNumbers:
        print(n)

# Your code here

11
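As a hint for `defaultdict`, here is a minimal counting sketch. The `purchases` list is made up for this example; `defaultdict(int)` gives every missing key a starting value of 0, so no `if`/`else` check is needed before incrementing.

```python
from collections import defaultdict

# Hypothetical shopping sequence, used only for this illustration
purchases = ["kiwi", "bread", "kiwi", "soup", "kiwi"]

# int() returns 0, so every missing key starts with a count of 0
counts = defaultdict(int)
for thing in purchases:
    counts[thing] += 1

print(dict(counts))
```

Compare this with the `if thing in groceries` check from the grocery quiz: which version do you find easier to read?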


# Exercises

## Tasks

### Exercise 1

Write a function that:
* Takes a string as input
* Calculates the frequency of every token/word (separated by white space) in the string
* Returns a dictionary of tokens as keys and their frequency in the text as values.
* Implement an optional function parameter to make the function ignore capital letters

You can use everything you think you need and that you've learnt so far. Test it on the `text` variable value from the previous cell.

In [5]:
# Defining the function that takes a string as input
from collections import Counter

def word_freq(text, ignore_case=False):
    # Optionally lowercase the text so that capital letters are ignored
    if ignore_case:
        text = text.lower()
    # Split the text into tokens separated by white space
    words = text.split()
    # Return a Counter (a dict subclass) of tokens and their frequencies
    return Counter(words)
# Testing the function on the text variable value from the previous cell
word_freq(text)

Counter({'Lorem': 1,
         'ipsum': 1,
         'dolor': 1,
         'sit': 1,
         'amet,': 1,
         'consectetur': 1,
         'adipiscing': 1,
         'elit': 1})

### Exercise 2

Adapt the function in the previous exercise so that it also ignores punctuation. 

Hint: you can find all punctuation characters by calling:
        
```python
import string
print(string.punctuation)
```
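Two possible ways of using `string.punctuation` are sketched below; which one fits best depends on whether punctuation inside a word should be kept. The example tokens are made up for this illustration.

```python
import string

# Hypothetical token, used only for this illustration
token = "amet,"

# Option 1: strip punctuation from both ends of the token only
stripped = token.strip(string.punctuation)
print(stripped)

# Option 2: delete every punctuation character, wherever it occurs
table = str.maketrans("", "", string.punctuation)
cleaned = "don't".translate(table)
print(cleaned)
```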

In [39]:
# Function that optionally ignores capital letters and punctuation
import string
from collections import Counter

def word_freq(text, ignore_case=False, ignore_punctuation=False):
    # Optionally lowercase the text so that capital letters are ignored
    if ignore_case:
        text = text.lower()
    # Optionally remove every punctuation character from the text
    if ignore_punctuation:
        text = text.translate(str.maketrans('', '', string.punctuation))
    # Split the text into tokens separated by white space
    words = text.split()
    # Return a Counter (a dict subclass) of tokens and their frequencies
    return Counter(words)
# Testing the function on the text variable value from the previous cell while ignoring punctuation
word_freq(text, ignore_punctuation=True)

### Exercise 3

The [factorial](https://en.wikipedia.org/wiki/Factorial) of an integer $n$, defined as:

$$
n! = \begin{cases}
               1               & n = 1\\
               n * (n-1)! & n > 1
           \end{cases}
$$

is the product of all positive integers less than or equal to $n$. For example:

$$4! = 4 * 3 * 2 * 1$$

$$3! = 3 * 2 * 1$$

The factorial operation can be implemented in Python both as a recursive function and as an iterative function.

Write one factorial function picking the approach you prefer.

In [44]:
# Writing the factorial function using the recursive approach
def factorialRecursive(n):
    if n == 1:
        return 1
    else:
        return n * factorialRecursive(n-1)
# Writing the factorial function using the iterative approach
def factorialIterative(n):
    result = 1
    for i in range(1, n+1):
        result *= i
    return result
print(factorialIterative(500))
print(factorialRecursive(500))

1220136825991110068701238785423046926253574342803192842192413588385845373153881997605496447502203281863013616477148203584163378722078177200480785205159329285477907571939330603772960859086270429174547882424912726344305670173270769461062802310452644218878789465754777149863494367781037644274033827365397471386477878495438489595537537990423241061271326984327745715546309977202781014561081188373709531016356324432987029563896628911658974769572087926928871281780070265174507768410719624390394322536422605234945850129918571501248706961568141625359056693423813008856249246891564126775654481886506593847951775360894005745238940335798476363944905313062323749066445048824665075946735862074637925184200459369692981022263971952597190945217823331756934581508552332820762820023402626907898342451712006207714640979456116127629145951237229913340169552363850942885592018727433795173014586357570828355780158735432768888680120399882384702151467605445407663535984174430480128938313896881639487469658817504506926365338175
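One practical difference between the two approaches: every recursive call adds a level to Python's call stack, and the interpreter limits how deep that stack may grow. The sketch below redefines a recursive factorial under the hypothetical name `factorial_recursive` and deliberately asks for more levels than `sys.getrecursionlimit()` allows.

```python
import sys

def factorial_recursive(n):
    # Same recursive definition as above, under a hypothetical name
    if n == 1:
        return 1
    return n * factorial_recursive(n - 1)

limit = sys.getrecursionlimit()

try:
    # Needs more nested calls than the interpreter allows
    factorial_recursive(limit + 100)
    outcome = "finished"
except RecursionError:
    outcome = "too deep"

print(outcome)
```

An iterative version has no such limit, since it uses a single function call no matter how large `n` is.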

---
