<center><img src="http://i.imgur.com/sSaOozN.png" width="500"></center>

## Course: Computational Thinking for Governance Analytics

### Prof. José Manuel Magallanes, PhD 
* Visiting Professor of Computational Policy at Evans School of Public Policy and Governance, and eScience Institute Senior Data Science Fellow, University of Washington.
* Professor of Government and Political Methodology, Pontificia Universidad Católica del Perú. 

_____

# Session 1:  Programming Fundamentals

## Part B: Control of Execution in Python

<a id='beginning'></a>

You cannot be an effective programmer unless you master control of execution when writing code. I will introduce three main schemes:

1. [Conditional Execution.](#part1) 
2. [Loops.](#part2) 
3. [Error Handling.](#part3) 

I will also introduce **[comprehensions](#comprehension)**, which Python supports but R does not.

----

<a id='part1'></a> 
## Conditional Execution

This is how you tell the computer which part of the code to execute, depending on whether a condition is true or false.

In [1]:
from math import sqrt

value=-100

#condition
if value >= 0: 
    # what to do if condition is true:
    rootValue=sqrt(value)
    print (rootValue)
else:
    # what to do if condition is false:
    print('Sorry, I do not compute square roots of negative numbers')

Sorry, I do not compute square roots of negative numbers


Notice that the condition follows *if* immediately. Notice also the use of **indentation** to indicate the group of instructions under the effect of the condition. This is very different from *R*. If you omit the whole **else** section, the program will still run, but it will neither print a message nor return a value when the input is invalid.

When the condition is complex, besides using **&**/**|**/**~** as in pandas, you can use **and** / **or** / **not**:

In [2]:
value=8

if (value <= 10) & (value%2==0) : 
    print('This is an even number less than 11')
elif (value <= 10) & (value%2>0) : 
    print('This is an odd number less than 11')
elif (value > 10) & (value%2>0) : 
    print('This is an odd number greater than 10')
else:
    print('This is an even number greater than 10')

This is an even number less than 11


In [3]:
value=8

if value <= 10 and value%2==0 : 
    print('This is an even number less than 11')
elif value <= 10 and value%2>0 : 
    print('This is an odd number less than 11')
elif value > 10 and value%2>0 : 
    print('This is an odd number greater than 10')
else:
    print('This is an even number greater than 10')

This is an even number less than 11


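Notice what happens if you omit the parentheses when using '&' (or the other operators in that family). Because '&' is evaluated before the comparisons, the conditions no longer mean what you intended: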

In [4]:
value=8

if value <= 10 & value%2==0 : 
    print('This is an even number less than 11')
elif value <= 10 & value%2>0: 
    print('This is an odd number less than 11')
elif value > 10 & value%2>0: 
    print('This is an odd number greater than 10')
else:
    print('This is an even number greater than 10')

This is an even number greater than 10
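
The surprising result above comes from operator precedence: `&` is a bitwise operator, and Python evaluates it before comparisons such as `<=` and `==`. The following is a minimal sketch of how Python reads the first condition when the parentheses are missing; the intermediate names (`bitwisePart`, `stepByStep`, and so on) are only for illustration.

```python
value = 8

# Without parentheses, & is evaluated before the comparisons:
# value <= 10 & value % 2 == 0   is read as   value <= (10 & (value % 2)) == 0
bitwisePart = 10 & (value % 2)      # 10 & 0 gives 0

# What remains is a chained comparison: (value <= 0) and (0 == 0)
unparenthesized = value <= 10 & value % 2 == 0
stepByStep = (value <= bitwisePart) and (bitwisePart == 0)

# With parentheses, each comparison is evaluated first
parenthesized = (value <= 10) & (value % 2 == 0)

print(unparenthesized, stepByStep, parenthesized)
```

With plain Python values, **and** / **or** / **not** avoid this problem because they are evaluated after the comparisons.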


[Go to page beginning](#beginning)

----

<a id='part2'></a> 

## Loops

This is how you tell the computer to do something many times (and to stop when it has to):

In [5]:
from math import sqrt  # no need for this in R

values=[9,25,100]

for value in values:  # for each value in values...
    print(sqrt(value)) # do this

3.0
5.0
10.0


Notice that Python has no built-in *sqrt* function. The module **math**, part of the standard library, provides it.

You do not need to show each result; you can save the results instead.

In [6]:
values=[9,25,100]
rootValues=[] # empty list, we will populate it later!

for value in values:
    rootValues.append(sqrt(value))  # appending an element to the list (populating the list)

# This list started empty, now see what its elements are:
rootValues

[3.0, 5.0, 10.0]

By combining *loops* and *conditionals* we can make better programs. This code does NOT control the process well:

In [None]:
values=[9,25,-100]
rootValues=[]

for value in values:
    rootValues.append(sqrt(value))

# to see the results:
rootValues

Above, you saw that Python raises an error ('ValueError') because _sqrt_ is not defined for negative values, so the process ended abruptly. The code below controls the execution better:

In [None]:
values=[9,25,-100]
rootValues=[]

for value in values:
    if value >=0:
        rootValues.append(sqrt(value))
    else:
        print('We added a missing value (None) when we received a negative input')
        rootValues.append(None)
        
# to see the results:
rootValues

The output now has the same size as the input. If we omit the **else** block, the output will be smaller than the input.

You can also use **break** when you consider the execution should stop:

In [None]:
values=[9,25,-100,144,-72]
rootValues=[]

for value in values:
    # checking the value:
    if value <0:
        print('We need to stop, invalid value detected')
        break
    # you will get here if the value is not negative
    rootValues.append(sqrt(value))
        

# to see the results:
rootValues

The code above stopped the loop at the first negative value, so the values after it (144 and -72) were never visited.

You can use **continue** when the execution should not halt: it skips the rest of the current iteration and moves on to the next value:

In [None]:
import numpy as np

values=[9,None,np.nan, '1000',-100, 144,-72]
for value in values: # notice the order of 'IFs'
    if value is None: # condition1 ('is' is the idiomatic way to compare with None)
        print ('missing values as input')
        continue
    if isinstance(value, str): #condition2
        print ('string as input')
        continue
    if value < 0: # condition3
        print ('negative value as input')
        continue
    print (sqrt(value), 'is the root of ',value)            

_None_ and _NaN_ are different kinds of objects:

In [None]:
type(None),type(np.nan)

You can use both to denote a missing value, but NaN is common in structures containing only numbers, while None can appear in any structure. Be careful when doing math:

In [None]:
10 + None

In the previous case, Python complains because '+' cannot be used to add those two different data types. It is like trying this:

In [None]:
10 + '10'

As previously mentioned, nan is used with numerical data to denote missing values, so this operation is allowed (the result is also nan):

In [None]:
10 + np.nan
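
Because `np.nan` is a float, it passes all three checks in the **continue** example above and reaches `sqrt`. One possible way to catch it is sketched below with `isnan` from the standard **math** module. The sketch uses `float('nan')`, which behaves like `np.nan`, so that it runs without numpy; the list name `cleanRoots` is just for illustration.

```python
from math import sqrt, isnan

values = [9, None, float('nan'), -100, 144]
cleanRoots = []

for value in values:
    if value is None:      # check None first: isnan(None) raises a TypeError
        continue
    if isnan(value):       # NaN is not equal to anything, so use isnan instead of ==
        continue
    if value < 0:
        continue
    cleanRoots.append(sqrt(value))

print(cleanRoots)
```

The order of the checks matters here as well: `isnan` only accepts numbers, so the `None` case has to be filtered out before it.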


_Loops_ are also needed when you want to count the presence of a particular value:

In [None]:
values=[9,25,-100,144,-72]

counterOfInvalids=0 # counter 

for value in values:
    if value <0:
        counterOfInvalids +=1 #updating counter

# to see the results:
counterOfInvalids

You may want to save particular positions (here is another difference with R: Python starts counting positions at 0, while R starts at 1):

In [None]:
values=[9,25,-100,144,-72]
positionInvalids=[]
currentPosition=0 # this is the 'accumulator' initial position

for value in values:
    if value <0:
        positionInvalids.append(currentPosition)
    currentPosition+=1 # be careful where you put the 'accumulator': it must update on every iteration

# to see the results:
positionInvalids 

In [None]:
# testing:
for pos in positionInvalids:
    print (values[pos])
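
The manual position counter above can also be written with the built-in `enumerate`, which hands you the position together with the value. This is a minimal sketch of that alternative, reusing the same data:

```python
values = [9, 25, -100, 144, -72]
positionInvalids = []

# enumerate yields (position, value) pairs, with positions starting at 0
for position, value in enumerate(values):
    if value < 0:
        positionInvalids.append(position)

print(positionInvalids)
```

With this version there is no counter to update, so there is no risk of placing the update on the wrong line.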

If the values are boolean, you can use them directly as the condition:

In [None]:
bvalues=[True,False,True,True]

for element in bvalues:
    if element:
        print('this guy is True')

In [None]:
bvalues=[True,False,True,True]

for element in bvalues:
    print (element)
    if element:
        print('this guy is True',type(element))

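Notice that the following do not work as boolean negation: **~** is a bitwise operator that returns an integer, and **!** is not valid Python syntax: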

In [None]:
# this is wrong
for element in bvalues:
    if ~element:
        print('this guy is True')

In [None]:
for element in bvalues:
    print (element)
    if ~element:
        print('this guy is True',~element,type(~element))

In [None]:
# this is wrong
for element in bvalues:
    if !element:
        print('this guy is True')
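
To negate a plain Python boolean, use **not**. The sketch below counts the `False` elements that way, and then shows why **~** misleads: `True` and `False` behave as the integers 1 and 0, and inverting those bitwise gives nonzero integers, which always count as true. The name `countFalse` is only for illustration.

```python
bvalues = [True, False, True, True]

countFalse = 0
for element in bvalues:
    if not element:        # 'not' is the logical negation
        countFalse += 1

print(countFalse)

# Bitwise inversion of 1 and 0 gives -2 and -1; both are nonzero, so both are 'true'
print(~1, ~0, bool(~1), bool(~0))
```

The **~** operator does act as an element-wise negation on pandas and numpy boolean structures, which is why it appears in that context.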


[Go to page beginning](#beginning)

----

<a id='part3'></a> 

## Error Handling

We have controlled errors before, using *if-else*; however, Python has a dedicated structure for that, **try** / **except**:

In [None]:
# what kind of error you get:
print (sqrt(-10))

In [None]:
# what kind of error you get:
print (sqrt('10'))

Python raises different types of **errors** (*ValueError* and *TypeError*); let's use that:

In [None]:
values=[10,-10,'10']

In [None]:
for value in values:
    try:
        print (sqrt(value))
    except ValueError:
        print (value,'is a Wrong number!')
    except TypeError:
        print (value,'is Not even a number!!')
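
The same idea can be combined with the earlier strategy of storing `None` for invalid inputs. The following is a sketch under that assumption; the function name `safeRoot` is hypothetical, and it catches both error types in a single **except** clause:

```python
from math import sqrt

def safeRoot(value):
    """Return the square root of value, or None if it cannot be computed."""
    try:
        return sqrt(value)
    except (ValueError, TypeError) as error:
        # 'error' holds the exception object; its type tells us what went wrong
        print(repr(value), 'skipped:', type(error).__name__)
        return None

values = [10, -10, '10', 16]
roots = []
for value in values:
    roots.append(safeRoot(value))

print(roots)
```

Listing only the errors you expect is a deliberate choice: an unexpected error will still stop the program, which is usually what you want while developing.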
        

[Go to page beginning](#beginning)
____

<a id='comprehension'></a>
### Comprehensions

Python offers a compact way to create data structures, called comprehensions (R does not have them).

As lists are mutable, this operation creates a list on the fly.

In [7]:
from math import sqrt

values=[9,25,49,121]
rootsInList=[sqrt(value) for value in values]  #List comprehension
rootsInList

[3.0, 5.0, 7.0, 11.0]

As tuples are immutable, this operation does not build a tuple on the fly. The expression inside the parentheses is a generator, which produces the values that **tuple()** then collects into a tuple.

In [8]:
values=[9,25,49,-121]
rootsInTuple=tuple(sqrt(value) for value in values if value > 0)  # generator expression passed to tuple()
rootsInTuple

(3.0, 5.0, 7.0)
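
To see the two steps separately, the sketch below first builds the generator on its own and then hands it to `tuple()`. The name `rootsGenerator` is only for illustration.

```python
from math import sqrt

values = [9, 25, 49, -121]

# Parentheses alone do not build a tuple: they build a generator
rootsGenerator = (sqrt(value) for value in values if value > 0)
print(type(rootsGenerator).__name__)

# tuple() consumes the generator and stores the values
rootsInTuple = tuple(rootsGenerator)
print(rootsInTuple)

# A generator can be consumed only once: nothing is left now
leftover = tuple(rootsGenerator)
print(leftover)
```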

Dicts can also be created that way:

In [None]:
values=[9,25,49,-121]
rootsInDict={value:(sqrt(value) if value > 0 else None) for value in values}  # dict comprehension
rootsInDict
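
The dict comprehension above packs a loop and an *if-else* into one line. As a sketch of what it is doing, here is the same result built with an explicit loop and then compared with the one-line version (the name `fromComprehension` is only for illustration):

```python
from math import sqrt

values = [9, 25, 49, -121]

# Explicit loop: one key per input value, None when the root is not defined
rootsInDict = {}
for value in values:
    if value > 0:
        rootsInDict[value] = sqrt(value)
    else:
        rootsInDict[value] = None

# One-line version, using the conditional expression 'A if condition else B'
fromComprehension = {value: (sqrt(value) if value > 0 else None) for value in values}

print(rootsInDict)
print(rootsInDict == fromComprehension)
```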

When you have a dict as input in a comprehension, you can visit its keys and values using _items()_ like this:

In [None]:
newDict={'name':'John', 'age':40, 'State':'WA'}
[[key,value] for key,value in newDict.items()]

The function **zip** pairs up the elements that share the same position in several lists, producing tuples:

In [None]:
letters=['a','b','c']
numbers=[10,20,30]
list(zip(letters,numbers))

_Zipped_ lists are common in comprehensions:

In [None]:
[(number,square) for number,square in zip(numbers,np.array(numbers)**2)]  # **2 squares each number
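
**zip** is not limited to comprehensions: it also works in a regular *for* loop when you need to walk two lists in parallel. A minimal sketch with the same toy lists, where the name `lettersOver15` is only for illustration:

```python
letters = ['a', 'b', 'c']
numbers = [10, 20, 30]

lettersOver15 = []
# Each iteration receives one letter and the number in the same position
for letter, number in zip(letters, numbers):
    if number > 15:
        lettersOver15.append(letter)

print(lettersOver15)
```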

## Class exercises:

Make a function that:

1. Create a data frame with the lists given below (names, woman, ages, country, education).

2. Create a list of tuples, where each tuple is a pair (name,country), using comprehensions

In [9]:
names=["Tomás", "Pauline", "Pablo", "Bjork","Alan","Juana"]
woman=[False,True,False,False,False,True]
ages=[32,33,28,30,32,27]
country=["Chile", "Senegal", "Spain", "Norway","Peru","Peru"]
education=["Bach", "Bach", "Master", "PhD","Bach","Master"]

3. Implement a _for_ loop to count how many Peruvians there are in the data frame. Try using **not** in one solution and **~** in another one.

4. Implement a _for_ loop to get the count of men. Try using **not** in one solution and **~** in another one.

Solve this in a new Jupyter notebook, and then upload it to GitHub. Name the notebook as 'ex_controlOfEx'.

## Homework

1. Implement a _for_ loop to get the count of men that have a Bach degree in the data frame. I recommend the use of **zip** (somewhere).

2. Implement a _for_ loop to get the count of people whose current age is an even number.

Solve this in a new Jupyter notebook, and then upload it to GitHub. Name the notebook as 'hw_controlOfEx'.

_____

* [Go to page beginning](#beginning)
* [Go to REPO in Github](https://github.com/EvansDataScience/ComputationalThinking_Gov_1)
* [Go to Course schedule](https://evansdatascience.github.io/GovernanceAnalytics/)