### GESIS Fall Seminar in Computational Social Science 2022
### Introduction to Computational Social Science with Python
# Day 2-1: Understanding Control Flow

## Overview

* Conditionals
* Iteration
   * List comprehensions
* Functions
* Modules and libraries

## So Far, We Learned How to Write Straight-Line Programs

In [116]:
s = 'All animals are equal, but some animals are more equal than others.'
s = s.rstrip('.').lower()
s_tokens = s.split()
print('There are', len(s_tokens), 'words in the sentence.')


There are 12 words in the sentence.


In straight-line programs, code is executed line by line from top to bottom and, within a line, from left to right (unless parentheses override the order).

Statements can be executed in more complex order, however, and the control flow determines how this is done.

## [Control Flow](https://www.youtube.com/watch?v=k0xgjUhEG3U)

* Control flow is the order in which statements are executed or evaluated
* In Python, there are three main categories of control flow:
  * **Branches** (conditional statements) – execute only if some condition is met
  * **Loops** (iteration) – execute repeatedly 
  * **Function calls** – execute a set of statements defined elsewhere, then return to the point of the call

![Three categories of control flow](figs/control_flow.png "Three categories of control flow")


# Conditional Statements

![Conditional statements](figs/conditional_statements.png "Conditional statements")

## Conditional Statements

```
if *Boolean expression*:
    *block of code*
```

```
if *Boolean expression*:
    *block of code*
else:
    *block of code*
```

```
if *Boolean expression*:
    *block of code*
elif *Boolean expression*:
    *block of code*
else:
    *block of code*
```

In [122]:
x = 2

if x > 0:
    print('Positive')
elif x < 0:
    print('Negative')
else:
    print('Zero')
    

Positive


## Indentation in Python Code

* Indentation is semantically meaningful in Python
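* You can use [tabs or spaces](https://www.youtube.com/watch?v=SsoOG6ZeyUI), as long as you do not mix them inconsistently within a block
* Obviously(!), tabs are preferable (although PEP 8, the official style guide, recommends four spaces per indentation level)
* However, it does not really matter in Jupyter, as Jupyter converts tabs to spaces by default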
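
To see why indentation is semantically meaningful, consider the sketch below. The two versions differ only in the indentation of one line; the variable names are made up for illustration.

```python
x = -5
messages = []

# Version 1: both statements belong to the `if` block,
# so neither runs when the condition is False.
if x > 0:
    messages.append('x is positive')
    messages.append('checked x (inside the block)')

# Version 2: the second statement is not indented, so it is
# outside the `if` block and always runs.
if x > 0:
    messages.append('x is positive')
messages.append('checked x (outside the block)')

print(messages)
```

Only the unindented statement of the second version is executed, because `x > 0` is `False`.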

## You Can Nest Conditional Statements


In [123]:
x = -100

if type(x) == int or type(x) == float:
    if x >= 0:
        print('This is a nonnegative number.')
    else:
        print('This is a negative number.')
elif type(x) == str:
    print('This is a string.')
else:
    print("I don't know what this is.")
    

This is a negative number.
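
As a side note, type checks like the one above are often written with the built-in `isinstance()`, which accepts a tuple of types. The following is a sketch of the same nested logic; the variable `kind` is introduced here only for illustration. One difference to be aware of: `isinstance(True, (int, float))` is `True`, because `bool` is a subclass of `int`.

```python
x = -100

if isinstance(x, (int, float)):
    # Nested conditional: only reached for numeric values
    if x >= 0:
        kind = 'nonnegative number'
    else:
        kind = 'negative number'
elif isinstance(x, str):
    kind = 'string'
else:
    kind = 'unknown'

print('This is:', kind)
```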


# Iteration

![Iteration](figs/iteration.png "Iteration")

## Iteration: `while` vs. `for`

```
while *Boolean expression*:
    *block of code*
```

```
for *element* in *sequence*:
    *block of code*
```

## Iteration: `while` with decrementing function

A decrementing function maps the program variables to an integer that is non-negative when the loop is entered and decreases with every pass through the loop; the loop ends when the integer reaches 0 or below.

In [125]:
# decrementing function: 5 - x
x = 0
while x < 5: 
    print(x)
    x += 1
    

0
1
2
3
4


## Iteration: `while` with conditional statements


In [126]:
correct = 25
repeat = True

while repeat:
    guess = int(input("Guess which number from 1 to 100 I'm thinking of? "))
    
    if guess > correct + 10 or guess < correct - 10:
        print("You are quite far. Try again.")
    elif guess != correct:
        print("You are very close. Try again.")
    else:
        print("That's right!")
        repeat = False
        

Guess which number from 1 to 100 I'm thinking of? 59
You are quite far. Try again.
Guess which number from 1 to 100 I'm thinking of? 26
You are very close. Try again.
Guess which number from 1 to 100 I'm thinking of? 25
That's right!


## Iteration: `for` with sequences

In [127]:
for elem in [1, 2, 3, 4]:
    print(elem)
    

1
2
3
4


## Iteration: `for` with `range()`

* Built-in function that produces an immutable, ordered, non-scalar object of type `range`
* Call as `range([start], stop, [step])`. If omitted, `start = 0` and `step = 1`.
* Produces the progression of integers `[start, start + step, start + 2*step, ..., start + i*step]`, stopping before `stop` is reached (`stop` itself is excluded)

In [134]:
for i in range(5):
    print(i+1)
    
list(range(1, 5, 2))

1
2
3
4
5


[1, 3]
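
Two details of `range()` are easy to miss: `stop` itself is never included, and `step` may be negative to count down. A small sketch, with variable names chosen for illustration:

```python
up = list(range(2, 10, 3))      # 2, 5, 8 (the next value, 11, would pass stop)
down = list(range(5, 0, -1))    # counts down; 0 is excluded
empty = list(range(5, 0))       # default step is 1, so nothing is produced

print(up)
print(down)
print(empty)
```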

## Indexing Lists with `range(len(L))`

In [135]:
mylist = ['a', 'b', 'c', 'd']
for i in range(len(mylist)):
    print('index', i, '-', mylist[i])
        

index 0 - a
index 1 - b
index 2 - c
index 3 - d


* This is especially useful when you need to iterate over two lists of the same length in parallel

In [136]:
mylist1 = ['a', 'b', 'c', 'd']
mylist2 = [1, 2, 3, 4]
for i in range(len(mylist1)):
    print(mylist1[i] + str(mylist2[i]))

a1
b2
c3
d4
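
The same loops can also be written without explicit indices. The sketch below uses the built-in functions `enumerate()` and `zip()`, which are not covered in the cells above; it collects the results in lists (names chosen for illustration) so they can be compared with the printed output.

```python
mylist1 = ['a', 'b', 'c', 'd']
mylist2 = [1, 2, 3, 4]

# enumerate() yields (index, element) pairs
indexed = []
for i, elem in enumerate(mylist1):
    indexed.append('index ' + str(i) + ' - ' + elem)

# zip() pairs up the elements of two sequences position by position
combined = []
for letter, number in zip(mylist1, mylist2):
    combined.append(letter + str(number))

print(indexed)
print(combined)
```

Note that `zip()` stops at the end of the shorter sequence, whereas `range(len(mylist1))` would raise an `IndexError` if `mylist2` were shorter.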


## Iteration: `break` and `continue`

* Use `break` to exit a loop 
* Use `continue` to go directly to next iteration

In [138]:
for i in range(5):
    if i == 2:
        break
    print(i)
    

0
1
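
For comparison, here is a sketch of the same loop with `continue` instead of `break`. The list `printed` is introduced only to make the result easy to inspect.

```python
printed = []
for i in range(5):
    if i == 2:
        continue   # skip the rest of this iteration, but keep looping
    printed.append(i)

print(printed)
```

With `break` the loop stopped at 2; with `continue` only the value 2 is skipped and the loop carries on with 3 and 4.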


## 🏋️‍♀️ PRACTICE

**Q1**: Using loops, write a program to print the following pattern:

![Iteration exercise](figs/iteration_exercise.png "Iteration exercise")

## 🏋️‍♀️ PRACTICE

In [None]:
# Q2: Sum the even integers from the list below.
lst = [1, 3, 2, 4.5, 7, 8, 10, 3, 5, 4, 7, 3.33]

# Hint: A number is even if, when we divide it by 2, the remainder is 0.
# Use the modulo operator % to get the remainder when dividing by an integer


In [None]:
# Q3: Create a list that contains all integers from 1 to 100 (inclusive), 
# except that it has the string 'boo' for every integer that is divisible by 3 
# Your list should look like: [1, 2, 'boo', 4, 5, 'boo', 7, 8, 'boo', 10, ...]
# Hint: Use the modulo operator % to check if a number is divisible by 3


# List Comprehensions

```
L = [*object, expression, or function* for *element* in *sequence*]
L = [*object, expression, or function* for *element* in *sequence* if *Boolean expression*]
L = [*object, expression, or function* for *element* in *sequence* for *element2* in *sequence2*]
```

* Provide a concise way to create lists
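* Often faster than the equivalent `for` loop with `append()`, because the interpreter builds the list with specialized instructions instead of repeated method calls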
* Nested list comprehensions can be somewhat confusing
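
One way to read a comprehension with two `for` clauses is to rewrite it as nested loops: the first `for` clause becomes the outer loop. The sketch below checks that both forms give the same list; the variable names are chosen for illustration.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

# Comprehension with two `for` clauses
combos = [x + str(y) for x in letters for y in numbers]

# Equivalent nested loops: the first `for` clause is the outer loop
combos_loop = []
for x in letters:
    for y in numbers:
        combos_loop.append(x + str(y))

print(combos)
print(combos == combos_loop)
```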


## List Comprehensions

In [145]:
ans = []
for x in range(1, 11):
    ans.append(x**2)
print(ans)

ans2 = [x**2 for x in range(1, 11)]
print(ans2)

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


In [148]:
[x**2 for x in range(1, 11) if x%2 == 0]  # evaluates to [4, 16, 36, 64, 100], but a cell displays only its last expression
[x + str(y) for x in ['a', 'b', 'c'] for y in [1, 2, 3]]

['a1', 'a2', 'a3', 'b1', 'b2', 'b3', 'c1', 'c2', 'c3']

## Dictionary and Set Comprehensions

In [149]:
print( {x: x**2 for x in range(1, 11)} )

print( {x.lower(): y for x, y in [('A', 1), ('b', 2), ('C', 2)]} )

print( {x.lower() for x in 'SomeRandomSTRING'} )


{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
{'a': 1, 'b': 2, 'c': 2}
{'r', 'e', 'o', 'g', 't', 'i', 'n', 'm', 'a', 's', 'd'}


## 🏋️‍♀️ PRACTICE

**Q4**: Rewrite the following code using a list comprehension:

```
sentence = "the quick brown fox jumps over the lazy dog"
words = sentence.split()
word_lengths = []
for word in words:
    if word != "the":
        word_lengths.append(len(word))
print(word_lengths)
```

## 🏋️‍♀️ PRACTICE

In [None]:
# Q5: Using a list comprehension, create a new list containing 
# the squares of the integers in the list below
lst = [1, 3, 2, 4.5, 7, 8, 10, 3, 5, 4, 7, 3.33]


In [None]:
# Q6: Consider the lists x and y below. Using a list comprehension,
# create a list that contains all combinations of (elem_x, elem_y) 
# such that elem_x + elem_y = 6
# Your answer should look as follows: [(0, 6), (1, 5), (2, 4), (3, 3)]
x = [0, 1, 2, 3]
y = [3, 4, 5, 6]


# Functions

![Functions](https://drive.google.com/uc?id=16PwluRAedCvTnbFylD8Cy4Fq7NEca1W1 "Functions") 


* Built-in
  * `len()`, `max()`, `range()`, `open()`, etc.
* User-defined
  * By you, collaborators, or the open-source community

## Defining and Calling Functions

**Defining a function**

```
def *function_name*(*list of parameters*):
    *body of function*
```

**Calling a function**

```
*function_name*(*arguments*)
```


## When the Function is Used, the Parameters are Bound to the Arguments

```
def *function_name*(*list of parameters*):
    *body of function*

*function_name*(*arguments*)
```


In [152]:
def get_larger(x, y):
    """Assumes x and y are of numeric type.
    Returns the larger of x and y.
    """
    if x > y:
        # The execution of a `return` statement terminates the function call
        return x
    else:
        return y
    
m = get_larger(3, 4)
m

4

## A Function Call Always Returns a Value

* The execution of a `return` statement terminates the function call
* The function call also terminates when there are no more statements to execute
* If no expression follows `return` or there is no `return` statement, the function returns `None`       

In [156]:
def get_larger(x, y):
    if x > y:
        return x
    if y > x:
        return y

print(get_larger(7,7))


None


## Functions Can Return Multiple Values

In [158]:
def double_one(a):
    return 2*a

def double_two(a, b):
    return 2*a, 2*b

x, y = double_two(2, 3)
x

4
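
Strictly speaking, a function that "returns multiple values" returns a single tuple, which the caller may unpack. A short sketch, reusing `double_two` from above (the name `result` is introduced for illustration):

```python
def double_two(a, b):
    return 2*a, 2*b

result = double_two(2, 3)   # no unpacking: the whole tuple is kept
print(result)
print(type(result))

x, y = result               # unpacking assigns each element to a name
print(x, y)
```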

## 🏋️‍♀️ PRACTICE

In [None]:
# Q7: Write a function that reverses a string, e.g. "now" -> "won".
# Then call the function to reverse each of the strings in the list.

to_reverse = ['doc', 'keep', 'lap', 'lever', 'nap', 'nip', 'war']


## 🏋️‍♀️ PRACTICE

In [None]:
# Q8: Rewrite the code below using a function and a suitable data structure.

# Print the name and profession of famous dead scientists:
print('Alan Turing was a mathematician.')
print('Richard Feynman was a physicist.')
print('Marie Curie was a chemist.')
print('Charles Darwin was a biologist.')
print('Ada Lovelace was a mathematician.')
print('Werner Heisenberg was a physicist.')


# Answer: Use a dictionary to store the data and a function 
# that reads the dictionary and prints each sentence. There is 
# less of a chance to make a typo if you carefully write 
# the function once instead of copying-pasting-and-modifying 
# each print statement.

scientists = {'Alan Turing': 'mathematician', 'Richard Feynman': 'physicist',
              'Marie Curie': 'chemist', 'Charles Darwin': 'biologist',
              'Ada Lovelace': 'mathematician', 'Werner Heisenberg': 'physicist'}

def print_professions(dic):
    """Takes a dictionary of {Name: profession} and prints
    'Name was a profession.'
    """
    for i in dic:
        print(i + ' was a ' + dic[i] + '.')
        
print_professions(scientists)

## Positional vs. Keyword Arguments

* Keyword arguments cannot come before positional arguments

In [None]:
def print_reverse(first, second, third):
    print(third, second, first)
    
print_reverse(1, 2, 3)
print_reverse(third=3, second=2, first=1)
print_reverse(1, second=2, third=3)
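
To illustrate the rule about ordering, the sketch below keeps an invalid call in a string and passes it to the built-in `compile()`, so that the syntax error can be caught without stopping the rest of the example. The names `bad_call` and `is_valid` are made up for illustration.

```python
def print_reverse(first, second, third):
    print(third, second, first)

# Valid: positional arguments first, keyword arguments after (in any order)
print_reverse(1, third=3, second=2)

# Invalid: a positional argument after a keyword argument is a syntax error.
bad_call = "print_reverse(first=1, 2, 3)"
try:
    compile(bad_call, "<example>", "eval")
    is_valid = True
except SyntaxError:
    is_valid = False

print("Is the second call valid Python?", is_valid)
```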
 

## Default Parameter Values

* Default values allow you to call a function with fewer arguments than it has parameters
* Parameters with default values cannot come before parameters without default values

In [159]:
def pretty_print(lst, sep, fullstop=True, capitalize=True):
    toprint = sep.join(lst)
    if fullstop:
        toprint += '.'
    if capitalize:
        toprint = toprint.capitalize()
    print(toprint)

wordlst = ['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']  # an English pangram

pretty_print(wordlst, ' ', True, True)
pretty_print(wordlst, ' ')
pretty_print(wordlst, ' ', False)


The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog
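
Default values combine well with keyword arguments: you can override a later default while keeping an earlier one. The sketch below uses `pretty_format`, a hypothetical variant of `pretty_print` that returns the string instead of printing it, and a shortened word list.

```python
def pretty_format(lst, sep, fullstop=True, capitalize=True):
    """Like pretty_print above, but returns the string instead of printing it."""
    text = sep.join(lst)
    if fullstop:
        text += '.'
    if capitalize:
        text = text.capitalize()
    return text

wordlst = ['the', 'quick', 'brown', 'fox']

# Keep the default fullstop=True but switch off capitalization
no_caps = pretty_format(wordlst, ' ', capitalize=False)

# Keyword arguments can be given in any order
plain = pretty_format(wordlst, sep='-', capitalize=False, fullstop=False)

print(no_caps)
print(plain)
```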


## A Function Defines a New Scope

* Scope = namespace: names defined inside a function are visible only within it
* This means you can reuse your favorite variable names in different functions

In [161]:
def func(x, y):
    x += 1
    # x is a parameter, z is a local variable
    z = x + y   # z, x, and y exist only in the scope of the definition of func
    return z

x = 1
res = func(x, 5)

z

NameError: name 'z' is not defined
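
The separate scope also means that changing the parameter `x` inside `func` does not affect the variable `x` outside it. A sketch using the same function:

```python
def func(x, y):
    x += 1          # rebinds only the local x
    z = x + y
    return z

x = 1
res = func(x, 5)

print(res)   # value returned by the function
print(x)     # the x defined outside the function is unchanged
```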

## The Global Scope

In [162]:
GLOBVAR = 3 # By convention, global constants are named in CAPITALS

def print_global():
    # Since GLOBVAR is not defined in the function, it is treated as global
    print(GLOBVAR)  

print_global()

3
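
Reading a global variable works as shown above, but assigning to the same name inside a function creates a new local variable unless the name is declared with the `global` statement. The sketch below goes slightly beyond the example above; `COUNTER`, `shadow`, and `modify` are names made up for illustration.

```python
COUNTER = 3

def shadow():
    COUNTER = 10      # creates a new local variable; the global is untouched
    return COUNTER

def modify():
    global COUNTER    # assignments now refer to the global name
    COUNTER = 10

local_value = shadow()
after_shadow = COUNTER
modify()
after_modify = COUNTER

print(local_value, after_shadow, after_modify)
```

Modifying global variables from inside functions makes programs harder to follow, so passing values in as arguments and returning results is usually the safer design.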


## Modules

* For large programs, store different parts in `.py` files
* Get access using `import` statements

In [1]:
import module

module.my_func('Hello!')


She said: "Hello!"


In [2]:
import module as md

md.my_func('Hello there!')


She said: "Hello there!"


In [3]:
# You should be careful with this one: there will be a conflict if you
# import a different module that also has a function called my_func()
from module import *

my_func('HELLO! DO YOU HEAR ME?')


She said: "HELLO! DO YOU HEAR ME?"
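
The kind of name conflict mentioned in the comment above can be reproduced with two standard-library modules that both define `sqrt`. This is a sketch with explicit `from ... import` statements; the effect is the same with `import *`.

```python
from math import sqrt
real_root = sqrt(4)        # math.sqrt returns a float

from cmath import sqrt     # silently replaces the name sqrt
complex_root = sqrt(4)     # cmath.sqrt returns a complex number

print(real_root, complex_root)

# Importing the modules themselves keeps the two functions apart
import math
import cmath
print(math.sqrt(4), cmath.sqrt(4))
```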


## Useful Python Modules

https://docs.python.org/3/library/

* `re` – Regular expression operations
* `datetime` – Basic date and time types
* `math` – Mathematical functions
* `random` – Generate pseudo-random numbers
* `os.path` – Common pathname manipulations
* `pickle` – Python object serialization
* `csv` – CSV file reading and writing
* `json` – JSON encoder and decoder
* ...
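
As a small taste, the sketch below combines three of these modules on the sentence from the start of this session. The variable names and the regular expression are chosen for illustration only.

```python
import json
import math
import re

text = 'All animals are equal, but some animals are more equal than others.'

# re: find all runs of lowercase letters (punctuation is dropped)
tokens = re.findall(r'[a-z]+', text.lower())

# math: square root of the number of tokens
root = math.sqrt(len(tokens))

# json: serialize a dictionary to a string and read it back
record = {'n_tokens': len(tokens), 'first': tokens[0]}
restored = json.loads(json.dumps(record))

print(tokens)
print(root)
print(restored)
```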

## Useful Python Packages

* `numpy` – Scientific computing with multi-dimensional arrays
* `pandas` – Data analysis with table-like structures (R, pretty much)
* `statsmodels` – Statistical data analysis with linear models
* `scikit-learn` – Data mining and machine learning
* `networkx` – Network analysis
* `matplotlib` – Plotting
* ...