![Cloud-First](https://github.com/tulip-lab/sit742/blob/main/Jupyter/image/CloudFirst.png?raw=1)


# SIT742: Modern Data Science
**(Module: Python Foundations for Big Data)**

---
- Materials in this module include resources collected from various open-source online repositories.
- You are free to use, change and distribute this package.
- If you find any issue/bug in this document, please submit an issue at [tulip-lab/sit742](https://github.com/tulip-lab/sit742/issues)


Prepared by **SIT742 Teaching Team**

---


## Session 2D: Control Flow

Normally,
*Python* executes a series of statements in exact top-down order.
What if you want to change the order in which statements are executed?
As you might have guessed, in this prac we will look at **control flow**
 statements. We are going to practice three control flow statements, i.e.,
**if**, **for** and **while**, to see how they
determine which statement is executed next in a program.



## Table of Contents

### Control Flow

1.1 [**If** statements](#cell_if)

1.2 [**For** statements](#cell_for)

1.3 [**While** statements](#cell_while)

1.4 [**Break** statements](#cell_break)

1.5 [Notes on Python 2](#cell_note)



## Control flow

<a id = "cell_if"></a>

### 1.1 **If** statements

The **if** statement is used to check a condition: **if** the condition is true, we run a block of statements (the **if-block**), **else** we run another block of statements (the **else-block**). The **else** clause is optional. The condition is usually a boolean expression, which has a value of either **True** or **False**.

Often we have a block of statements inside either the **if-block** or the **else-block**. In this case, you need to pay special attention to the indentation of the statements in the block. Indenting starts a block and unindenting ends it. As a general practice, the Python style guide (PEP 8) recommends using 4 spaces per indentation level, and not using tabs. If you get an indentation error message from the Python interpreter, you will need to check your code carefully.
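To see what an indentation error looks like without breaking the notebook, here is a minimal sketch. It assumes a deliberately mis-indented snippet stored in a string (the name `bad_source` is made up for illustration) and asks Python to compile it, so the error can be caught and printed.

```python
# A deliberately mis-indented snippet, kept in a string so this cell still runs.
# The body of the if statement is NOT indented, which Python rejects.
bad_source = "if True:\nprint('not indented')\n"

error_name = None
try:
    # compile() parses the source text without executing it
    compile(bad_source, "<example>", "exec")
except IndentationError as err:
    error_name = type(err).__name__
    print("Python reported: %s - %s" % (error_name, err.msg))
```

The exact wording of the message depends on the Python version, but the error type is `IndentationError` in each case.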


In [4]:
x = 16
if x % 2 == 0:
    print('%d is even' % x)
else:
    print('%d is odd' % x)

print('This is always printed')

16 is even
This is always printed



Try changing **x** to an odd number and run the program again.

The else-block in an **if** statement is optional. If the **else** block is omitted, the statements in the **if**-block are executed when the condition evaluates to **True**. Otherwise, the flow of execution continues to the statement after the **if** structure.

Try the following code:    

In [2]:
x = -2
if x < 0:
    print("The negative number %d  is not valid here." % x)

print("This is always printed")

The negative number -2  is not valid here.
This is always printed


What will be printed if the value of **x** is not negative?


An **if** statement can also be nested within another. A further **if** structure can be nested in either the **if-block** or the **else-block**. Here is an example with another **if** structure nested in the **else-block**.

This example assumes we have two integer variables, **x** and **y**. The code shows how we might decide how they are related to each other.

In [5]:
x = 10
y = 10

if x < y:
    print("x is less than y")
else:
    if x > y:
        print("x is greater than y")
    else:
        print("x and y must be equal")

x and y must be equal


Here we can see that the indentation pattern tells the Python interpreter exactly which **else** belongs to which **if**.

Python also provides an alternative way to write nested **if** statements, using the keyword **elif**. The above example is equivalent to:

In [6]:
x = 10
y = 10

if (x < y):
    print("x is less than y")
elif (x > y):
    print("x is greater than y")
else:
    print("x and y must be equal")

x and y must be equal


**elif** is an abbreviation of **else if**. With the above structure, each condition is checked in order. If one of them is **True**, the corresponding branch executes. Even if more than one condition is **True**, only the first **True** branch executes.

There is no limit on the number of **elif** clauses, but only a single final **else** clause is allowed. The **else** clause must be the last branch in the statement.
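The "only the first **True** branch executes" rule can be checked with a small sketch. The variable `score`, its value and the labels below are hypothetical, chosen only so that two of the conditions are true at the same time.

```python
score = 95  # hypothetical value chosen for illustration

if score >= 80:
    label = "high"
elif score >= 50:   # also True for 95, but skipped because an earlier branch ran
    label = "medium"
else:
    label = "low"

print("%d is rated %s" % (score, label))   # prints: 95 is rated high
```

Swapping the order of the first two conditions would label 95 as "medium", so the order of the branches matters.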

<a id = "cell_for"></a>

### 1.2 **For** statements
Computers are often used to automate repetitive tasks. Repeated execution of a sequence of statements is called iteration. Python provides two language features for iteration: the **while** and **for** statements. We first take a look at an example of the **for** statement:

In [7]:
for name in ["Joe", "Amy", "Brad", "Angelina", "Zuki"]:
    print("Hi %s Please come to my party on Saturday!" % name)

Hi Joe Please come to my party on Saturday!
Hi Amy Please come to my party on Saturday!
Hi Brad Please come to my party on Saturday!
Hi Angelina Please come to my party on Saturday!
Hi Zuki Please come to my party on Saturday!


This example assumes we have some friends, and we would like to send them an invitation to our party. With all the names in the list, we can print a message for each friend.

This is how the **for** statement works:

1. **name** in the **for** statement is called the loop variable, and the names in the square brackets form a **list** in Python. We will cover lists in more detail in the next prac. For now, you just need to know how to use a simple list in a **for** loop.

2.  The second line in the program is the **loop body**. All the statements in the loop body are indented.

3.  On each iteration of the loop, the loop variable is updated to refer to the next item in the list. In the above case, the loop body is executed 5 times, once for each name in the list, and each time **name** refers to a different friend.

4. At the end of each execution of the body of the loop, Python returns to the **for** statement to handle the next item. This continues until there are no items left. Then program execution continues at the next statement after the loop body.

One function commonly used in loop statements is **range()**.
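The steps above can be traced with a small sketch that counts how often the loop body runs. The names `friends` and `count` are introduced here only for illustration.

```python
friends = ["Joe", "Amy", "Brad", "Angelina", "Zuki"]

count = 0
for name in friends:
    count = count + 1   # runs once per item in the list
    print("Iteration %d: name refers to %s" % (count, name))

# Execution continues here once no items are left
print("The loop body ran %d times" % count)
```

After the loop finishes, **name** still refers to the last item it was given, which is "Zuki" here.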

Let us first have a look at the following example:


In [8]:
for i in [0, 1, 2, 3, 4 ]:
    print( 'The count is %d' % i)

The count is 0
The count is 1
The count is 2
The count is 3
The count is 4


Generating a list with a specific number of integers is a very common task, especially in **for** loops. For this purpose, Python provides a built-in **range()** function to generate a sequence of values. An alternative way of performing the above counting using **range()** is as follows.

In [9]:
for i in range(5):
    print( 'The count is %d' % i)
print('Good bye!')

The count is 0
The count is 1
The count is 2
The count is 3
The count is 4
Good bye!


Notice that **range(5)** generates a sequence of $5$ values starting with 0 instead of 1. In addition, 5 is not included in the sequence.
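Both observations can be checked directly. This is a minimal sketch using the built-in **len()** function and the **in** operator; the variable name `values` is only illustrative.

```python
values = range(5)

print(len(values))    # prints: 5     (five values in total)
print(0 in values)    # prints: True  (counting starts at 0)
print(5 in values)    # prints: False (the stop value is excluded)
```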


Here is a note on **range()** function: a strange thing happens if you just print a range:

In [10]:
range(5)

range(0, 5)

In [11]:
print(range(5))

range(0, 5)


In many ways the object returned by **range()** behaves as if it is a list, but in fact it isn’t. It is an object which returns the successive items of the desired sequence when you iterate over it, but it doesn’t really make the list, thus saving space. We say such an object is *iterable*. In the following example, **rangeA** is **iterable**.

Some functions and constructs expect an iterable, from which they obtain successive items until the supply is exhausted. The **list()** function can be used to create lists from iterables:

In [12]:
rangeA = range(5)
list(rangeA)

[0, 1, 2, 3, 4]

In this way, we can print the list generated by **range(5)** to check the values closely.
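To illustrate what "obtaining successive items until the supply is exhausted" means, here is a rough sketch that does by hand what a **for** loop does automatically. It uses the built-in functions **iter()** and **next()**; the names `range_a`, `it` and `exhausted` are chosen only for this example.

```python
range_a = range(3)
it = iter(range_a)    # ask the iterable for an iterator

print(next(it))       # prints: 0
print(next(it))       # prints: 1
print(next(it))       # prints: 2

exhausted = False
try:
    next(it)          # nothing left, so Python raises StopIteration
except StopIteration:
    exhausted = True
    print("No items left")
```

The range object itself is not used up by this: calling `list(range_a)` afterwards still gives `[0, 1, 2]`.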

To count from 1 to 5, we need the following:

In [13]:
list(range(1, 6))

[1, 2, 3, 4, 5]


We can also add another parameter, **step**, to the **range()** function. For example, a step of **2** starting from 0 can be used to produce a sequence of even numbers.

Look at the following examples. Think about what the output will be before you run the code to check your understanding.

    


In [16]:
list(range(0, 19, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [17]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [18]:
list(range(10, 0, -1))

[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

Let us return to the previous counting example. When **range()** generates the sequence of numbers, each number is assigned to the loop variable **i** in turn, one per iteration. Then the block of statements is executed for each value of **i**. In the above example, we just print the value in the block of statements.
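The loop body can do more than print. As a small sketch, the code below combines **range()** with a step of 2 to add up the even numbers from 0 to 10; the variable name `total` is illustrative.

```python
total = 0
for i in range(0, 11, 2):   # i takes the values 0, 2, 4, 6, 8, 10
    total = total + i

print('The sum is %d' % total)   # prints: The sum is 30
```

The stop value is 11 rather than 10 because the stop value itself is never included.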

<a id = "cell_while"></a>

### 1.3  **While** statements

The **while** statement provides a much more general mechanism for iteration. It allows you to repeatedly execute a block of statements as long as a condition is **True**.

Similar to the **if** statement, the **while** statement uses a boolean expression to control the flow of execution. The body of the **while** loop is repeated as long as the boolean expression evaluates to **True**.

Let us see how the previous counting program can be implemented with a **while** statement.

In [19]:
i = 0
while (i < 6):
    print('The count is %d' % i)
    i = i + 1

print('Good bye!')

The count is 0
The count is 1
The count is 2
The count is 3
The count is 4
The count is 5
Good bye!


How the **while** statement works:
1. The **while** block consists of the print and increment statements. They are executed repeatedly until **i** is no longer less than $6$. With each iteration, the current value of the counter **i** is displayed and then increased by $1$.

2. As with the **for** loop, this type of flow is called a loop because the statements in the **while** block are executed repeatedly. Notice that if the condition is **False** the first time through the loop, the statements inside the loop are never executed. Try changing the first line to **i = 6**. What is the output?

3. In this example, we can prove that the loop terminates because **i** starts from $0$ and increases by $1$ on each iteration. Eventually, **i** will be greater than 5. When the condition becomes **False**, the loop stops.

4. Sometimes we will have a loop that repeats forever. This is called an infinite loop. Although this kind of loop might be useful sometimes, it is often caused by a programming mistake. Try changing the first two lines of the previous program to the following code and see what happens.

Note that if you run the following cell, the code will run indefinitely. You will need to go to the menu: **Kernel->Restart**, and then restart the kernel.

Also note that if you run such code in Python at command line or script mode, you will need to use **CTRL-C** to terminate the program.

In [20]:
i =  6
while i > 5 :
    print('The count is %d' % i)
    i = i + 1

print('Good bye!')

Streaming output truncated to the last 5000 lines.
The count is 5342113
The count is 5342114
The count is 5342115
...
The count is 5342156

KeyboardInterrupt: 
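While experimenting with a loop whose condition might never become **False**, one possible safeguard is to add a second condition that limits the number of iterations. This is only a sketch: the names `steps` and `max_steps`, and the limit of 3, are assumptions made for illustration.

```python
i = 6
steps = 0
max_steps = 3   # hypothetical safety limit, not part of the original program

# i > 5 never becomes False here, so the step limit is what stops the loop
while i > 5 and steps < max_steps:
    print('The count is %d' % i)
    i = i + 1
    steps = steps + 1

print('Stopped after %d steps' % steps)
```

This version prints the counts 6, 7 and 8 and then stops, instead of running until the kernel is interrupted.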

<a id = "cell_break"></a>

### 1.4 **Break** statements

The **break** statement can be used to break out of a loop statement. It can be used in both **while** loops and **for** loops.

An alternative version of the previous counting example using the **break** statement is as follows:

In [21]:
i =  0
while True :
    print( 'The count is %d' % i)
    i = i + 1
    if i > 6:
        break

print('Good bye!')


The count is 0
The count is 1
The count is 2
The count is 3
The count is 4
The count is 5
The count is 6
Good bye!


In this program, we repeatedly print the value of **i** and increase it by 1 each time. We provide a special condition to stop the loop by checking whether **i** is greater than 6. When it is, we break out of the loop and continue executing the statement after the loop.

Note that it is important to decide what the termination condition should be. We can see from the previous counting examples that the termination condition might differ between loop structures.
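Since **break** also works in a **for** loop, here is a short sketch of that case: it stops at the first number in a list that is greater than 10. The list `numbers` and the name `first_big` are made up for this example.

```python
numbers = [3, 8, 15, 4, 23]   # hypothetical data

first_big = None
for n in numbers:
    if n > 10:
        first_big = n
        break    # leave the loop; 4 and 23 are never examined

print('First number greater than 10: %d' % first_big)   # prints 15
```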

<a id = "cell_note"></a>