# SMU Master of Science (Economics) Programming Workshop in Python


## Introduction
In today's class, we will be learning more about Loops, Conditional Statements, Functions and Classes. Previously, we have seen some simple applications of loops and functions. Today, we will learn about them in more detail.

We begin our discussion with loops.

### Loops

There are typically 2 types of loops: "For" and "While" loops.

"For" loops are traditionally used when you have a block of code which you want to repeat a fixed number of times. The Python "for" statement iterates over the members of a sequence in order, executing the block each time. "While" loops, in contrast, are typically used when a condition needs to be checked on each iteration, or to repeat a block of code forever.

We open the section with a few examples for both types of loops to get a feel of what programmers aim to do when they use such loops.

In [1]:
# Suppose we have a list of names, and we want to find out how many names have more than 4 letters
names = ['James', 'Daniel', 'Jane', 'Ahmad', 'Tan', 'Clarissa', 'Douglas', 'Gina']

names_more_than_4_letters = 0
for name in names:
    if len(name) > 4: names_more_than_4_letters += 1

print('There are %d names with more than 4 letters.' % names_more_than_4_letters)

There are 5 names with more than 4 letters.


In [2]:
# We can loop over a dictionary too
# Suppose we are interested in finding out the names of those who are taller than 160cm
height = {'James': 180, 'Jane': 157, 'Clarissa': 162, 'Douglas': 172, 'Gina': 175}

# for name in height.keys():
#     print(name)
    
for name in height.keys():
    if height[name] > 160:
        print(name, 'has a height of more than 160cm.')

James has a height of more than 160cm.
Clarissa has a height of more than 160cm.
Douglas has a height of more than 160cm.
Gina has a height of more than 160cm.
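
As a small variation on the cell above, a dictionary's `.items()` method yields each key together with its value, so the lookup `height[name]` is not needed. The sketch below reuses the same `height` data; the names `tall_names` and `cm` are illustrative and not from the original notebook.

```python
height = {'James': 180, 'Jane': 157, 'Clarissa': 162, 'Douglas': 172, 'Gina': 175}

tall_names = []
for name, cm in height.items():  # each iteration unpacks one (key, value) pair
    if cm > 160:
        tall_names.append(name)

print(tall_names)
```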


In [3]:
# Suppose we want to sum integers from 1 to n
n = 235
count = 0

total = 0
while count <= n: # the loop keeps running as long as this condition is True
    total += count
    count += 1

total

27730

In [4]:
# Of course, this can be done with a for-loop as well
n = 235

total = 0
for num in range(n+1):
    total += num

total

27730

In [5]:
# A more compact way to do this uses a list comprehension (not discussed)
n = 235
sum([num for num in range(n+1)])

27730
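
For this particular task the list comprehension is optional: `sum` accepts the `range` object directly, and the closed-form formula n(n+1)/2 gives an independent check. A minimal sketch (the variable names are illustrative):

```python
n = 235

total_builtin = sum(range(n + 1))   # sum consumes the range directly, no intermediate list
total_formula = n * (n + 1) // 2    # closed form for 1 + 2 + ... + n

print(total_builtin, total_formula)
```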

Besides lists, one can also loop through strings; each character in the string will be read once. In addition, we can use the range function on an integer (say, n) to loop through the numbers 0 to n-1. Below, we show how both methods are done:

In [6]:
new_string = 'Betty'
for char in new_string:
    print(char)

B
e
t
t
y


In [7]:
num = 10
for i in range(num):
    print(i)

0
1
2
3
4
5
6
7
8
9


At times, it is useful to keep track of each element's index while looping over a list. While one can maintain a counter manually, the built-in function enumerate does this for us: on each iteration, it returns the index together with the element.

In [8]:
name_list = ['Arthur', 'Bob', 'Charlie', 'Dennis']
for idx, name in enumerate(name_list):
    print(name, 'is index number %d in the class list.' % idx)

Arthur is index number 0 in the class list.
Bob is index number 1 in the class list.
Charlie is index number 2 in the class list.
Dennis is index number 3 in the class list.


In [9]:
count = 0
for name in name_list:
    print(name, 'is index number %d in the class list.' % count)
    count += 1

Arthur is index number 0 in the class list.
Bob is index number 1 in the class list.
Charlie is index number 2 in the class list.
Dennis is index number 3 in the class list.
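
`enumerate` can also be inspected outside of a loop by converting its result to a list, and it accepts an optional `start` argument when counting should begin at a value other than 0. A short sketch using the same `name_list` (the names `pairs` and `pairs_from_one` are illustrative):

```python
name_list = ['Arthur', 'Bob', 'Charlie', 'Dennis']

pairs = list(enumerate(name_list))                     # (index, element) tuples, counting from 0
pairs_from_one = list(enumerate(name_list, start=1))   # same elements, counting from 1

print(pairs)
print(pairs_from_one)
```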


### Conditional Statements

Conditional statements are executed whenever a condition is satisfied. We typically do this in the form of an if-else statement, where a specific chunk of code is executed only if the condition for it to be executed is met. Otherwise, the chunk of code is skipped. 

Here's an image representation of an if-only statement:

![](images/flowchart_if_only.png "If-Only Statements")


Here's an image representation of an if-else statement:

![](images/flowchart_if_else.png "If-Else Statements")

Application:
Given some data, one might, for example, be interested in subsetting the data based on each person's test result to identify who might require additional tuition. One way to do so is to use conditional statements.

In [10]:
# In this example, we are given a dictionary of data where the keys are the names of the students and the values are
# the test results of the students. As the teacher, you are interested in those that score below x marks 
# which can be defined by user.

test_results = {
    'Amy': 68,
    'Jason': 41,
    'Priscilla': 48,
    'Daniel': 89,
    'Katie': 95,
    'Ryan': 28,
    'Dennis': 64
}

In [11]:
# One way to find out who scored below x is to use conditional statements. Below, we show one way to do this.
pass_score = 50

for name in test_results.keys():
    if test_results[name] < pass_score:
        print(name, 'has scored below %d marks.' % pass_score)
    else:
        print(name, 'has scored above %d marks. Good job!' % pass_score)

Amy has scored above 50 marks. Good job!
Jason has scored below 50 marks.
Priscilla has scored below 50 marks.
Daniel has scored above 50 marks. Good job!
Katie has scored above 50 marks. Good job!
Ryan has scored below 50 marks.
Dennis has scored above 50 marks. Good job!


It appears that Jason, Priscilla and Ryan require additional attention and possibly tuition. Suppose we want to give more tuition to students who are failing, and reward students who are performing well (scoring 80 and above). How can we write this in 1 block of code? We can use the 'elif' conditional statement, which stands for else-if.

In [12]:
good_score = 80

for name in test_results.keys():
    if test_results[name] < pass_score:
        print(name, 'has scored below %d marks. Please come back during summer for tuition.' % pass_score)
    elif test_results[name] >= good_score:
        print(name, 'has scored above %d marks. Well done!' % good_score)
    else:
        print(name, 'has scored %d marks.' % test_results[name])

Amy has scored 68 marks.
Jason has scored below 50 marks. Please come back during summer for tuition.
Priscilla has scored below 50 marks. Please come back during summer for tuition.
Daniel has scored above 80 marks. Well done!
Katie has scored above 80 marks. Well done!
Ryan has scored below 50 marks. Please come back during summer for tuition.
Dennis has scored 64 marks.


There's no limit to how many 'elif' statements we can include in a single if-else block. One can add as many as one needs, but this comes at the expense of readability.

Another useful pair of concepts is 'continue' and 'break'. Both are statements rather than functions. The continue statement skips the rest of the current iteration and moves on to the next one, while the break statement exits the loop entirely once some condition is met. The following example illustrates continue.

In [13]:
# Suppose we are interested in converting the dates to YYYY format from YY format
# We also know that the list contains data from 1990 - 2019
dates = ['1996', '2019', '18', '16', '94', '1997', '08', '92', '90', '2011', '09', '1993', '95']

for idx, date in enumerate(dates):
    if len(date) == 4: continue # Go to the next loop; there's nothing to do here (what happens if we omit this?)
    else:
        if int(date) >= 90: newdate = '19' + date # 90-99 can only mean 1990-1999
        else: newdate = '20' + date # 00-19 can only mean 2000-2019
    dates[idx] = newdate
    
dates

['1996',
 '2019',
 '2018',
 '2016',
 '1994',
 '1997',
 '2008',
 '1992',
 '1990',
 '2011',
 '2009',
 '1993',
 '1995']
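
The example above only uses `continue`. To illustrate `break`, the sketch below searches a list for the first score under a cutoff and stops looping as soon as one is found. The data and the names `scores`, `cutoff`, `first_fail` and `checked` are made up for illustration.

```python
scores = [68, 72, 41, 89, 28]
cutoff = 50

first_fail = None
checked = 0
for score in scores:
    checked += 1
    if score < cutoff:
        first_fail = score
        break  # exit the loop immediately; the remaining scores are never checked

print('First failing score:', first_fail)
print('Scores checked before stopping:', checked)
```

Because of the `break`, the loop never reaches the last two scores, even though 28 is also below the cutoff.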

### Loops + Conditional Statements

Loops can be combined with conditional statements to build more complicated structures that cannot be achieved by using either one alone. 

For example, consider the case where we have 2 lists: a list containing the gender of the students and another list containing the scores obtained by the students on a test, and we are interested in finding out which gender performed better on the test. It is easy to see that neither loops nor conditional statements alone give us the result we want. The same approach generalizes to other two-group comparisons, such as treatment versus control groups when evaluating a policy change.

In [14]:
# Suppose we are given 2 lists, one with the gender of the student and 1 with the score of the student
# We are trying to find the mean score for both genders
genders = ['F', 'F', 'M', 'F', 'M', 'F', 'M']
scores = [36, 44, 32, 31, 45, 41, 27]

count_M, count_F = 0, 0
total_M, total_F = 0, 0

for gender, score in zip(genders, scores):
    if gender == 'M':
        count_M += 1
        total_M += score
    else:
        count_F += 1
        total_F += score
        
print('The average score for males is %f.' % (total_M/count_M))
print('The average score for females is %f.' % (total_F/count_F))

The average score for males is 34.666667.
The average score for females is 38.000000.
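
The same pattern extends to more than two groups (for example, several treatment arms) if the running totals are kept in dictionaries keyed by group label rather than in separate variables. This is only a sketch of that idea, reusing the data above; `totals`, `counts` and `means` are illustrative names.

```python
genders = ['F', 'F', 'M', 'F', 'M', 'F', 'M']
scores = [36, 44, 32, 31, 45, 41, 27]

totals, counts = {}, {}
for gender, score in zip(genders, scores):
    if gender not in totals:      # first time this group label is seen
        totals[gender] = 0
        counts[gender] = 0
    totals[gender] += score
    counts[gender] += 1

# One mean per group label found in the data
means = {group: totals[group] / counts[group] for group in totals}
print(means)
```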


In-class assignment:

You are given the following dataset:

|Name  | Height | Gender| Weight| School | Scores |
|------|--------|-------|-------|--------|--------|
|Jane  | 155.0  |   F   |  42.8 |  SMU   |  87.5  |
|John  | 181.2  |   M   |  85.1 |  NTU   |  81.2  |
|Tina  | 172.6  |   F   |  52.7 |  NUS   |  76.1  |
|Lena  | 162.3  |   F   |  49.0 |  SMU   |  88.3  |
|Kane  | 174.8  |   M   |  77.1 |  NUS   |  91.2  |
|Ryan  | 172.3  |   M   |  82.6 |  NTU   |  77.6  |
|Kate  | 151.8  |   F   |  45.2 |  SMU   |  82.5  |

1. Using loops and conditional statements, find the difference between the average weights and heights of males and females.
2. Using loops and conditional statements, find the difference between the average scores of males and females.
3. Using loops and conditional statements, find the difference between the average scores of students who studied in SMU and NTU.

### Functions
Now that we have a working understanding of how loops and conditional statements work, we are ready to move on to functions and classes. 


We begin this module with a simple question: what exactly are functions?

From [w3schools](https://www.w3schools.com/python/python_functions.asp), a function is a block of code which only runs when it is called. While this doesn't actually say much, it does imply that a function consists of ordinary code that can do whatever you want it to do, so long as the task is computationally feasible. 

So far, we've seen quite a few functions:

1. print( )
2. range( )
3. list( )
4. dict( )
5. len( )
6. zip( )
7. int( )
8. float( )

What do they have in common? Apart from the fact that they turn green when you type them in the Jupyter Notebook, they are all followed by round brackets ( ), inside which we supply the arguments for the function. This raises the question: what exactly is an argument, then? 

Before we answer that question, we turn to the '?' help feature of the Jupyter Notebook.

In [15]:
?int

The '?' feature displays the documentation string of an object. In the previous example, we used '?' on int, so it displays the documentation string for the int function. From the documentation string, we know that int converts a number or a string to an integer, and returns 0 when no arguments are given. For floating point numbers (decimals), it truncates towards 0: positive numbers are rounded down and negative numbers are rounded up, so int(6.9) gives 6 and int(-6.9) gives -6. Let's give it a try.

In [16]:
print(int(6.1))
print(int())

6
0


It worked! This is one way to find out what a function does without reading its source code. The next time you have problems with a function, be sure to use this feature to find out exactly what the function takes as arguments, and what it outputs.
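
The `?` syntax is a feature of IPython and Jupyter rather than of the Python language itself. In a plain Python script or interpreter, the built-in `help` function and the `__doc__` attribute give access to the same documentation string. A brief sketch (the variable name `doc` is illustrative):

```python
# __doc__ holds the documentation string of an object as ordinary text
doc = int.__doc__
print(doc.splitlines()[0])   # first line of the documentation for int

# help(int) prints the full documentation; it is commented out here because the output is long
# help(int)
```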

In what follows, I provide an example of a function. Do try to follow what's going on, as functions are pretty central to programming. Do note how loops and conditional statements are used (although not all functions require them) in this context.

In [17]:
def get_mean(lst):
    '''
    Given a list, returns x, the mean of the values.
    
    Parameters: lst (list) containing only integers and/or float values
    Output: x (float value).
    '''
    only_int_or_float = True
    if type(lst) != list: return 'Please provide a list.'
        
    for num in lst:
        if type(num) != float and type(num) != int: only_int_or_float = False
    
    if only_int_or_float is True: 
        return sum(lst)/len(lst)
    else: 
        return 'Please provide a list with integers and/or float values.'

In [18]:
names = ['Jane', 'John', 'Jack']
get_mean(names)

'Please provide a list with integers and/or float values.'

In [19]:
height = [136, 246, 250]
get_mean(height)

210.66666666666666

In [20]:
k = 36
get_mean(k)

'Please provide a list.'

What do we notice?
1. def - def is short-form for the word, define. This tells Python that we are going to define a function.
2. get_mean( ) - get_mean is the name of our function, which takes only 1 argument.
3. lst - lst is the sole argument used for the function, get_mean( ). In our case, we require lst to be a list.
4. return - returns the output of interest (in this case, the mean of the list)

We can include more than 1 argument for the function but in this case, it is not required.
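
As an illustration of a function with more than one argument, the sketch below adds a second parameter with a default value. The function name `get_rounded_mean` and its `decimals` parameter are hypothetical and not part of the original notebook; input checking is left out to keep the example short.

```python
def get_rounded_mean(lst, decimals=2):
    '''
    Returns the mean of the values in lst, rounded to the given number of decimals.

    Parameters: lst (list of int/float), decimals (int, defaults to 2)
    '''
    return round(sum(lst) / len(lst), decimals)

height = [136, 246, 250]
print(get_rounded_mean(height))      # uses the default of 2 decimals
print(get_rounded_mean(height, 0))   # overrides the default
```

A default value makes the second argument optional, so callers who are happy with 2 decimals can keep calling the function with a single argument.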

Here's another example. This function acts almost exactly the same as the condition *num* > 5, except that it first checks that the input is a number.

In [21]:
def larger_than_five(num):
    '''
    Returns True if the number provided is larger than 5 in value.
    '''
    if type(num) != int and type(num) != float: return 'Please input a number.'
    
    if num > 5:
        return True
    else:
        return False

In [22]:
print(larger_than_five('six'))
print(larger_than_five(6))

Please input a number.
True


In-class assignment

In what follows, you are required to design a function that receives a list of names and outputs the most popular character in the list. For example, the list could be ['Jack', 'Jason', 'Gina']. In this case, the most popular character is 'a', which appears 3 times. I have provided starter code for this assignment.

In [23]:
import operator

name_list = ['Jack', 'Jason', 'Gina']

def most_popular_character(lst):
    '''
    Returns the most popular character from the names provided in the list
    '''
    if type(lst) != list: return 'Please provide a list.'
    
    character_count = {}
    for name in lst:
        for char in name: # You can loop through strings too!
            
            # Your code here
            
            
    sorted_character_count = sorted(character_count.items(), key=operator.itemgetter(1), reverse=True)
    return sorted_character_count[0]

most_popular_character(name_list)

# Robustness tests
bad_name_list = 'sngoe'
most_popular_character(bad_name_list) # Always assume users are trying to make your system look bad.

IndentationError: expected an indented block (<ipython-input-23-149a28062e63>, line 18)

### Advanced Application: Recursive Programming with Functions
In this section, we will be using recursive programming to solve problems. Recursive programming refers to the idea of having a function call itself repeatedly, yielding a recursive structure. This can be used to solve many difficult problems by breaking each problem down into smaller versions of itself.

There will not be any questions on this segment, since it is pretty difficult even for experienced programmers (aside: I'm not an experienced programmer). Nonetheless, it **might** be useful, since it leads into Dynamic Programming (DP), one of the best approaches to tackling difficult problems.

You have been warned: this sub-module is difficult. With that in mind, let's proceed.

---

#### Recursive Programming
One way to see what Recursive Programming is doing is through the factorial problem. Suppose you wish to define a function that takes an integer as its argument and returns the factorial of the integer. How do you do this? One way is the following:

In [24]:
def factorial(n):
    if n == 0: return 1
    elif n == 1: return 1
    else:
        return n * factorial(n-1)
    
factorial(5)

120

While this is not the only way to solve this problem, it gives a glimpse of what Recursive Programming aims to do. By repeatedly calling itself on a smaller input, the function reduces a potentially difficult problem to a chain of simple steps. Here the problem becomes "linear": computing `factorial(n)` takes about n function calls, so the running time is O(n), where n is the integer.
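
For comparison, the same result can be obtained with a plain loop. The sketch below (the name `factorial_loop` is illustrative) multiplies a running product by each integer from 2 to n and should agree with the recursive version.

```python
def factorial_loop(n):
    result = 1
    for k in range(2, n + 1):  # empty range when n is 0 or 1, so result stays 1
        result *= k
    return result

print(factorial_loop(5))
```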


However, while Recursive programming is one way to solve complicated problems such as the Fibonacci problem, it is by no means the only way. Sometimes, it can even be the "bad" way.

Consider the following problem:

The Fibonacci sequence is the integer sequence 0, 1, 1, 2, 3, 5, 8, ..., where each succeeding integer is the sum of the previous 2 integers. Suppose we are trying to find the n-th number in the sequence, counting from 1 (in a zero-indexed Python list, that number would sit at index n-1). How can we do this? 

In [25]:
# First, the Recursive Programming method.
def fibonacci(n):
    if n == 0: return 0
    elif n == 1: return 0
    elif n == 2: return 1
    else: return fibonacci(n-1) + fibonacci(n-2)

n = 30

%timeit fibonacci(30)

286 ms ± 4.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [26]:
# As it turns out, the naive recursive version is slow because it recomputes the same values many times.
# How can we solve this problem *more* efficiently?
# One way is Dynamic Programming: build the sequence from the bottom up and store every value already computed

def fast_fibonacci(n):
    if n == 0: return 0
    elif n == 1: return 0
    elif n == 2: return 1
    else:
        start = [0, 0, 1]
        for num in range(2, n+1):
            start.append(start[num] + start[num-1])
    return start[n]

%timeit fast_fibonacci(100)

17.7 µs ± 497 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [27]:
fibonacci(35)

5702887

In [28]:
fast_fibonacci(35)

5702887

Memoization refers to the idea of storing the results the computer has previously calculated so that it does not have to calculate them from scratch over and over again. In the first scenario, nothing is stored between calls. For example, `fibonacci(5)` calls `fibonacci(4)` and `fibonacci(3)`, and `fibonacci(4)` in turn calculates `fibonacci(3)` from scratch all over again. The number of repeated calculations grows rapidly with n.

On the other hand, the second scenario stores every value in the list `start` as soon as it is computed, so each new value is obtained with a single addition of two stored values.
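
Memoization can also be added to the recursive version directly, by caching results in a dictionary so that each value is computed only once. The sketch below keeps the same indexing convention as `fibonacci` above; the name `memo_fibonacci` and the `cache` argument are illustrative and not part of the original notebook. (The standard library's `functools.lru_cache` decorator offers a similar effect.)

```python
def memo_fibonacci(n, cache=None):
    if cache is None:
        cache = {0: 0, 1: 0, 2: 1}   # base cases, matching fibonacci above
    if n not in cache:
        # each value is computed once, stored, and reused by later calls
        cache[n] = memo_fibonacci(n - 1, cache) + memo_fibonacci(n - 2, cache)
    return cache[n]

print(memo_fibonacci(35))
```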

---
In-class assignment:

Define a **recursive** function that takes an argument, n, and returns the sum of the first n integers.

In [29]:
def recursive(n):
    '''
    This function takes an integer, n, as its input and returns the sum of the first n integers.
    
    Input: n, int
    '''
    # Your code here

Suppose you are interested in finding the square root of a number. Without a calculator, how can you do so? One way is to use recursive programming: repeatedly halve the range in which the square root must lie until the guess is close enough.

In [31]:
def approximate(guess, n, threshold):
    # True when guess**2 is within `threshold` of n
    return abs(n - guess**2) < threshold

def square_root(n, guess=None, current_range=None):
    # Defaults are set inside the function: writing guess=n/2 in the signature
    # would silently use whatever global n existed when the function was defined
    if current_range is None: current_range = [0, max(n, 1)]
    if guess is None: guess = n/2
    if approximate(guess, n, threshold = 0.01): return guess
    
    # Bisection: keep the half of the range that still contains the square root
    if n > guess**2: 
        current_range = [guess, current_range[1]]
    else:
        current_range = [current_range[0], guess]
    new_guess = (current_range[0] + current_range[1])/2
    return square_root(n, new_guess, current_range)

square_root(60, 0)

7.745361328125

#### Dynamic Programming
It turns out that recursive programming is not always the most efficient approach. If you don't believe me, try running `fibonacci(40)`. In this case, Dynamic Programming is the superior method. This raises the question: what is dynamic programming? 

Dynamic Programming is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing (**memoizing**) their solutions using a memory-based data structure (taken from the [Use Journal Blog](https://blog.usejournal.com/top-50-dynamic-programming-practice-problems-4208fed71aa3)).

Due to its difficulty (and the fact that this is only an introductory workshop), we will refrain from discussing Dynamic Programming further.

### Classes

Python classes provide all the standard features of Object Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name. Objects can contain arbitrary amounts and kinds of data. As is true for modules, classes partake of the dynamic nature of Python: they are created at runtime, and can be modified further after creation.

---

In layman's terms, a class is a blueprint for objects that a programmer can create. Objects built from a class carry attributes (such as name, height, weight etc.) and **methods** (which are essentially functions that belong to the class). It might help if we were to look at a trivial example. To define a class, we use the keyword *class*:

In [32]:
class Robot:
    '''
    Creates an instance of the robot when called. 
    
    Robot has the following attributes:
    1. Name
    2. Favourite Drink
    3. Favourite Activity
    4. Location
    
    Robot has the following methods (functions it can call):
    1. what: Returns the class of the robot
    2. fav_drink: Returns the favorite drink of the robot
    3. cur_location: Returns the location of the robot
    4. cur_name: Returns the name of the robot
    5. fav_activity: Returns the favorite activity of the robot
    '''
    def __init__(self, name, drink, act, location):
        # Attributes of the robots
        self.name = name
        self.drink = drink
        self.activity = act
        self.location = location
        
    # Methods of robots (what functions it can call)
    def what(self):
        return "I am a robot."
    
    def fav_drink(self):
        return self.drink
    
    def cur_location(self):
        return self.location
    
    def cur_name(self):
        return self.name
    
    def fav_activity(self):
        return self.activity

In what follows, we instantiate (create an instance of) a robot named Julian. Our robot loves drinking diesel, likes to party with friends, and is located in Japan. The `__init__` method of our class tells us what we need to provide to instantiate the class: a name, a drink, an activity and a location.

After the creation of the robot, we have 5 methods that we can call. Methods are essentially functions that belong to a class. The first argument of a method, conventionally named self, is the instance on which the method is called. 

In [33]:
# Instantiate a robot and call one of its methods
robot_1 = Robot("Julian", "Diesel", "Party", "Japan")
robot_1.what()

'I am a robot.'

In [34]:
robot_1.fav_drink()

'Diesel'
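
To see what `self` refers to, note that calling a method on an instance is equivalent to calling the function through the class and passing the instance explicitly. The sketch below uses a cut-down class, `MiniRobot`, which is a hypothetical stand-in for `Robot` so that the block runs on its own.

```python
class MiniRobot:
    def __init__(self, name, drink):
        self.name = name
        self.drink = drink

    def fav_drink(self):
        return self.drink

robot = MiniRobot("Julian", "Diesel")

via_instance = robot.fav_drink()          # Python passes robot as self automatically
via_class = MiniRobot.fav_drink(robot)    # the same call with self written out

print(via_instance, via_class)
```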

Consider a new example, which is the same as the previous example except for a new attribute, `friends`, and two new methods, `made_new_friends` and `update_location`. The method `made_new_friends` adds n new friends to the robot's total and reports the new total, while `update_location` moves the robot to a new location. By default, the robot is initialised with 0 friends.

In [35]:
class Robot:
    '''
    Creates an instance of the robot when called. 
    
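    Robot has the following attributes:
    1. Name
    2. Favourite Drink
    3. Favourite Activity
    4. Location
    5. Number of Friends (defaults to 0)
    
    Robot has the following methods (functions it can call):
    1. what: Returns the class of the robot
    2. fav_drink: Returns the favorite drink of the robot
    3. cur_location: Returns the location of the robot
    4. cur_name: Returns the name of the robot
    5. fav_activity: Returns the favorite activity of the robot
    6. made_new_friends: Adds n friends and returns the new total as a message
    7. update_location: Prints the old and new locations and updates the location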
    '''
    def __init__(self, name, drink, act, location, friends=0):
        # Attributes of the robots
        self.name = name # Name of robot
        self.drink = drink # Name of fav. drink
        self.activity = act # Name of fav. activity
        self.location = location # Name of current location
        self.friends = friends
        
    # Methods of robots (what functions it can call)
    def what(self):
        return "I am a robot."
    
    def fav_drink(self):
        return self.drink
    
    def cur_location(self):
        return self.location
    
    def cur_name(self):
        return self.name
    
    def fav_activity(self):
        return self.activity
    
    def made_new_friends(self, n):
        self.friends += n
        return "I have %d friends now." % self.friends
    
    def update_location(self, new_location):
        print('I was at %s.' % self.location)
        self.location = new_location
        print("I am currently at %s." % self.location)

In [36]:
# Instantiate a new robot; friends is not given, so it defaults to 0
robot_1 = Robot("Julia", "Beer", "Party", "Robobar")
robot_1.what()

'I am a robot.'

In [37]:
print(robot_1.friends)

robot_1.made_new_friends(5)

0


'I have 5 friends now.'

In [38]:
robot_1.friends

5

In [39]:
robot_1.update_location("Chomp Chomp Hawker Center")

I was at Robobar.
I am currently at Chomp Chomp Hawker Center.


It turns out that after attending some parties, Julia the Robot was able to make 5 friends! After drinking, she moved to a hawker center for some supper. 
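
The description of classes at the start of this section mentions that a derived class can override methods of its base class. A minimal sketch of that idea follows; `BaseRobot` and `CleaningRobot` are hypothetical classes, kept small so that the block is self-contained.

```python
class BaseRobot:
    def __init__(self, name):
        self.name = name

    def what(self):
        return "I am a robot."

class CleaningRobot(BaseRobot):   # inherits __init__ and what from BaseRobot
    def what(self):               # overrides the inherited method
        return super().what() + " I clean floors."

helper = CleaningRobot("Rosie")
print(helper.name)
print(helper.what())
```

Here `super().what()` calls the base class version of the method, so the derived class extends the original behaviour instead of rewriting it.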

Hopefully, this gives you a better idea of what classes are.

---

In-class assignment:

Write a class, "Class", that has the following attributes:
1. Course Name, course_name
2. Course Code, course_code
3. Instructor Name, instructor_name
4. Number of students, num_students
5. Class Timings, class_timing

The class should have the following methods:
1. add_new_students (with argument: n (number of new students), int)
2. remove_students (with argument: n (number of students removed), int)
3. change_class_timings (with argument: new_timing, str)

Instantiate the following classes, using the "Class" class:

|   Course Name    | Code    |  Instructor Name   | No. of Students | Class Timings |
|------------------|---------|--------------------|-----------------|---------------|
|Microeconomics I  | ECON601 |  Takashi Kunimoto  |      20         | Mon 12:00pm   |
|Econometrics I    | ECON611 |  Su Liangjun       |      18         | Tue 08:30am   |    
|Mechanism Design  | ECON714 |  Shurojit Chatterji|       9         | Thu 03:30pm   |
|Macroeconomics I  | ECON602 |  Nicolas Jacquet   |      11         | Wed 09:00am   |

The following starter code has been provided. Do note that your class should have 5 attributes and 8 methods (they are listed in the docstring of the starter code below).

In [40]:
class Class:
    '''
    Creates an instance of Class when called. 
    
    Each class has the following attributes:
    1. Course Name
    2. Course Code
    3. Instructor Name
    4. Number of Students
    5. Class Timings
    
    Class has the following methods (functions it can call):
    1. name: Returns the course name
    2. code: Returns the course code
    3. instructor_name: Returns the instructor name
    4. no_of_students: Returns the number of students in the class
    5. timing: Returns the class timing
    6. add_new_students: Takes an input, n, and returns the total number of students in the class.
    7. remove_students: Takes an input, n, and returns the total number of students left in the class.
    8. change_class_timings: Takes an input, time, and returns the new timing of the class
    '''
    ## Your code here

Suppose there are some changes to the classes after the semester has commenced. Please reflect these changes:

1. 5 students from ECON601 decided to switch to ECON714. Update this change using the methods, add_new_students and remove_students from the appropriate classes.
2. Due to conflicting schedules, Professor Su Liangjun decided to change his classes from Tuesdays, 08:30am to Mondays, 03:30pm.
3. 2 new students have joined Professor Nicolas Jacquet's class.

In [None]:
## Your code here