# Data Structures, Control Structures and Exception Handling


November 2015

by Daniel Lins da Silva - daniel.lins (at) gmail.com

---
- [Data Structures](#Data-Structures)
    - [Strings](#Strings)
    - [Lists](#Lists)
    - [Tuples](#Tuples)
    - [Sets](#Sets)
    - [Dictionaries](#Dictionaries)
- [Control Structures](#Control-Structures)
    - [Indentation](#Indentation)
    - [if, else, elif](#if,-else,-elif)
    - [for loops](#for-loops)
    - [while loops](#while-loops)
- [Exception Handling](#Exception-Handling)
- [References](#References)

## Data Structures
---
In addition to the atomic data types, Python has a number of very powerful built-in collection classes. **Strings**, **Lists** and **Tuples** are ordered collections that are very similar in general structure but have specific differences that must be understood for them to be used properly. **Sets** and **Dictionaries** are unordered collections.

## Strings
---
Strings are the data type that is used for storing text messages. 

Since strings are considered to be sequentially ordered, they support a number of operations that can be applied to any Python sequence. Table 1 reviews these operations and the following session gives examples of their use.

Table 1: Operations on Any Sequence in Python

 
| Operation Name 	| Operator 	| Explanation                             |
|----------------	|----------	|-----------------------------------------|
| indexing       	| [ ]      	| Access an element of a sequence         |
| concatenation  	| +        	| Combine sequences together              |
| repetition     	| *        	| Concatenate a repeated number of times  |
| membership     	| in       	| Ask whether an item is in a sequence    |
| length         	| len      	| Ask the number of items in the sequence |
| slicing        	| [ : ]    	| Extract a part of a sequence            |



In [109]:
var_a = "Puma concolor"
print type(var_a)

<type 'str'>


In [41]:
print len(var_a)

13


In [42]:
var_b = var_a.replace("concolor","pardoides")
var_c = var_b.replace("pardoides","yagouaroundi")
print var_b
print var_c

Puma pardoides
Puma yagouaroundi


We can index a character in a string using square brackets `[]`:

In [43]:
print var_a[5] #Puma concolor

c


To slice a string we can use square brackets `[]` with colon `:` :

In [44]:
print var_a[5:11],var_a[5:], var_a[-2:], var_a[:4]

concol concolor or Puma


We can also define the step size using the syntax `[start:end:step]` (the default value for `step` is 1, as we saw above):

In [45]:
print var_a[::2], var_a[0:4:2]

Pm oclr Pm


String concatenation:

In [46]:
var_concat = var_a + " or " + var_b
print var_concat

Puma concolor or Puma pardoides
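
Table 1 also lists repetition (`*`) and membership (`in`), which the session above does not exercise. The short sketch below applies both to a string. The variable names are chosen here for illustration only, and `print(...)` is called with a single value so the snippet should run under both Python 2 and Python 3.

```python
# Illustrative sketch; the variable names below are not from the lecture.
genus = "Puma"

repeated = genus * 2      # repetition: the string concatenated with itself
has_um = "um" in genus    # membership: is "um" a substring of "Puma"?
has_x = "x" in genus      # membership: "x" does not occur in "Puma"

print(repeated)
print(has_um)
print(has_x)
```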


Find the index of the first occurrence of a substring (`find` returns -1 when the substring is not present):

In [114]:
print var_a.find("concolor")

5


Split a string into substrings:

In [117]:
print var_a
print var_a.split(" ")

Puma concolor
['Puma', 'concolor']


Python has a very rich set of functions for text processing. See http://docs.python.org/2/library/string.html for more information, or call `help(str)` to list the available string methods.
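
As a small sample of that set of functions, the sketch below chains a few common string methods (`strip`, `upper`, `startswith`, `count`, `join`). The input string and variable names are assumptions made for this illustration.

```python
# Illustrative sketch; the input string is made up for this example.
raw_name = "  Puma concolor  "

cleaned = raw_name.strip()                  # remove leading/trailing whitespace
shouted = cleaned.upper()                   # convert to upper case
is_puma = cleaned.startswith("Puma")        # test the prefix
o_count = cleaned.count("o")                # count occurrences of "o"
joined = "_".join(cleaned.split(" "))       # split on spaces, re-join with "_"

print(cleaned)
print(shouted)
print(joined)
```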

## Lists
---
Lists are very similar to strings, except that each element can be of any type and the list itself can be modified after it is created.

The syntax for creating lists in Python is `[...]`:

In [48]:
list_number = [1,2,3,4,5]

print type(list_number)
print list_number

<type 'list'>
[1, 2, 3, 4, 5]


We can use the same slicing techniques to manipulate lists as we could use on strings:

In [49]:
print(list_number)

print(list_number[1:3])

print(list_number[::2])

[1, 2, 3, 4, 5]
[2, 3]
[1, 3, 5]


Note: Indexing starts at 0!

In [113]:
list_mixed = [1, 2.5, "hello", True]
print type(list_mixed), type(list_mixed[0]), type(list_mixed[1]), type(list_mixed[2]), type(list_mixed[3])
print list_mixed
print list_mixed[0]

<type 'list'> <type 'int'> <type 'float'> <type 'str'> <type 'bool'>
[1, 2.5, 'hello', True]
1


Inserting, modifying and removing elements from lists:

In [76]:
# create a new empty list
list_names = []

# add elements using `append`
list_names.append("Ana")
list_names.append("Maria")
list_names.append("Carla")

# insert an element at a specific index
list_names.insert(1, "Pedro")


print list_names

['Ana', 'Pedro', 'Maria', 'Carla']


In [77]:
# modifying a list
list_names[2] = 'Daniel'
print list_names

list_names.sort()
print list_names

['Ana', 'Pedro', 'Daniel', 'Carla']
['Ana', 'Carla', 'Daniel', 'Pedro']


In [78]:
# removing an element
del list_names[2]
print list_names

['Ana', 'Carla', 'Pedro']
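
Besides `del`, lists offer other ways to remove elements. The sketch below, which is an addition to the session above and starts from its final list, uses `remove` (delete by value) and `pop` (delete the last element and return it), followed by a membership test.

```python
# Illustrative sketch starting from the final list of the session above.
list_names = ["Ana", "Carla", "Pedro"]

list_names.remove("Carla")        # remove the first element equal to "Carla"
last_name = list_names.pop()      # remove and return the last element
still_there = "Ana" in list_names # membership test on the remaining list

print(list_names)
print(last_name)
print(still_there)
```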


## Tuples
---
Tuples are like lists, except that they cannot be modified once created; that is, they are *immutable*.

In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:

In [86]:
tuple_values = (10,40.0,"A")
print type(tuple_values), len(tuple_values), tuple_values

<type 'tuple'> 3 (10, 40.0, 'A')


We can unpack a tuple by assigning it to a comma-separated list of variables:

In [84]:
x,y,z = tuple_values #Unpacking
print z

A


Because tuples are immutable, trying to assign to one of their elements raises a `TypeError`:

In [118]:
tuple_values[1] = "B"

TypeError: 'tuple' object does not support item assignment
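
Since item assignment is not allowed, a "changed" tuple has to be built as a new object. The sketch below shows two possible ways of doing this; neither is taken from the original session, and the variable names are illustrative.

```python
# Illustrative sketch: producing a modified copy of an immutable tuple.
tuple_values = (10, 40.0, "A")

# Option 1: build a new tuple from slices of the old one
new_values = tuple_values[:1] + ("B",) + tuple_values[2:]

# Option 2: convert to a list, modify the list, convert back
as_list = list(tuple_values)
as_list[1] = "B"
from_list = tuple(as_list)

print(new_values)
print(from_list)
print(tuple_values)   # the original tuple is unchanged
```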

## Sets
---
A set is an unordered collection of zero or more immutable Python data objects. Sets do not allow duplicates and are written as comma-delimited values enclosed in curly braces. The empty set is represented by `set()`. Sets are heterogeneous, and the collection can be assigned to a variable as below.

In [142]:
set_value = set()
print type(set_value)
print set_value

set_value = {12, 10, True, "House"}
print set_value

<type 'set'>
set([])
set(['House', 10, 12, True])


Even though sets are not considered to be sequential, they do support a few of the familiar operations presented earlier, such as membership (`in`) and length (`len`).
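
A minimal sketch of these shared operations, using made-up values: `len` and `in` work on sets just as on sequences, and duplicates are dropped when the set is created.

```python
# Illustrative sketch; the values are made up for this example.
animals = {"cat", "dog", "cat", "rabbit"}   # "cat" appears twice

animal_count = len(animals)      # duplicates are stored only once
has_dog = "dog" in animals       # membership test
has_puma = "puma" in animals

# Removing duplicates from a list by passing it through a set
unique_numbers = sorted(set([3, 1, 3, 2, 1]))

print(animal_count)
print(has_dog)
print(unique_numbers)
```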

Sets support a number of methods that should be familiar to those who have worked with them in a mathematics setting. Table 2 provides a summary. Examples of their use follow. Note that union, intersection, issubset, and difference all have operators that can be used as well.

Table 2: Operations on a Set in Python

| Method Name  	| Use                         	| Explanation                                                    	|
|--------------	|-----------------------------	|----------------------------------------------------------------	|
| union        	| aset.union(otherset)        	| Returns a new set with all elements from both sets             	|
| intersection 	| aset.intersection(otherset) 	| Returns a new set with only those elements common to both sets 	|
| difference   	| aset.difference(otherset)   	| Returns a new set with all items from first set not in second  	|
| issubset     	| aset.issubset(otherset)     	| Asks whether all elements of one set are in the other          	|
| add          	| aset.add(item)              	| Adds item to the set                                           	|
| remove       	| aset.remove(item)           	| Removes item from the set                                      	|
| pop          	| aset.pop()                  	| Removes an arbitrary element from the set                      	|
| clear        	| aset.clear()                	| Removes all elements from the set                              	|

In [149]:
set1 = {1,3,5,7,9}
set2 = {2,4,6,8,10}
print set1.union(set2)
print set1.intersection(set2)

set1.add(2)
print set1.intersection(set2)
print set1.difference(set2)

set([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
set([])
set([2])
set([1, 3, 9, 5, 7])
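
The operator forms mentioned before Table 2 can be sketched as follows, reusing the same two sets: `|` for union, `&` for intersection, `-` for difference and `<=` for the subset test. The results are passed through `sorted` only to make the printed order predictable, since sets are unordered.

```python
# Illustrative sketch: operator equivalents of the set methods in Table 2.
set1 = {1, 3, 5, 7, 9}
set2 = {2, 4, 6, 8, 10}
set1.add(2)

union = set1 | set2             # same as set1.union(set2)
common = set1 & set2            # same as set1.intersection(set2)
only_in_first = set1 - set2     # same as set1.difference(set2)
is_subset = {1, 3} <= set1      # same as {1, 3}.issubset(set1)

print(sorted(union))
print(sorted(common))
print(sorted(only_in_first))
print(is_subset)
```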


## Dictionaries
---
Our final Python collection is an unordered structure called a **Dictionary**. Dictionaries are also like lists, except that each element is a key-value pair. They are called *associative arrays* or *hash maps* in other languages. The syntax for dictionaries is `{key1 : value1, ...}`:

In [90]:
import datetime as dt

occurrence = {}

occurrence["BasisOfRecord"] = "HumanObservation"
occurrence["ScientificName"] = "Puma concolor"
occurrence["EventDate"] = dt.datetime.today().strftime("%Y/%m/%d")

print occurrence

{'ScientificName': 'Puma concolor', 'EventDate': '2015/10/30', 'BasisOfRecord': 'HumanObservation'}


In [92]:
print len(occurrence)
print type(occurrence)

3
<type 'dict'>


Inserting, modifying and removing elements from Dictionaries:

In [99]:
# create a new empty dictionary
project = {}

# add new entries
project["name"] = "Name of project"
project["budget"] = 100000
project["team"] = ["Pedro", "Daniel", "Andre", "Andreiwid", "Suelane", "Jorge"]
project["StartDate"] = dt.datetime.today().strftime("%Y/%m/%d")

# replace the budget value with a nested dictionary
project["budget"] = {"HR":50000, "Hardware":30000, "Software":20000}

# modify an element
project["name"] = "Project Management Application"


# remove an element
del project["StartDate"]

print project

{'team': ['Pedro', 'Daniel', 'Andre', 'Andreiwid', 'Suelane', 'Jorge'], 'name': 'Project Management Application', 'budget': {'HR': 50000, 'Hardware': 30000, 'Software': 20000}}


We can also pretty-print a dictionary, for example with the `json` or `pprint` modules:

In [104]:
import json
print json.dumps(project,indent=4)

{
    "team": [
        "Pedro", 
        "Daniel", 
        "Andre", 
        "Andreiwid", 
        "Suelane", 
        "Jorge"
    ], 
    "name": "Project Management Application", 
    "budget": {
        "HR": 50000, 
        "Hardware": 30000, 
        "Software": 20000
    }
}


In [105]:
import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(project)

{   'budget': {   'HR': 50000, 'Hardware': 30000, 'Software': 20000},
    'name': 'Project Management Application',
    'team': ['Pedro', 'Daniel', 'Andre', 'Andreiwid', 'Suelane', 'Jorge']}


When getting the value of an element, we can use a default value if the element does not exist.

In [108]:
params = {"parameter1" : 1.0,
          "parameter2" : 2.0,
          "parameter3" : 3.0,}

value = params.get("parameter4",5.0)
value2 = params.get("parameter2", 5.0)

print value, value2

5.0 2.0
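
To make clear what `get` does, the sketch below writes the same lookup out by hand with a membership test. This longer form is shown only for comparison and is not part of the original session. Note that `get` returns the default without adding the key to the dictionary.

```python
# Illustrative sketch: dict.get compared with an explicit membership test.
params = {"parameter1": 1.0, "parameter2": 2.0, "parameter3": 3.0}

# Explicit version: check for the key, then index
if "parameter4" in params:
    value = params["parameter4"]
else:
    value = 5.0   # fallback, because the key is missing

# Equivalent one-liner using get with a default
value_get = params.get("parameter4", 5.0)

print(value)
print(value_get)
print("parameter4" in params)   # get did not insert the key
```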


## Control Structures
---
Algorithms require two important control structures: iteration and selection. Both of these are supported by Python in various forms. The programmer can choose the statement that is most useful for the given circumstance.

### Indentation
Python programs get structured through indentation and not by the use of braces `{}` , i.e. code blocks are defined by their indentation. This principle makes it easier to read and understand other people's Python code.

So, how does it work? All statements with the same distance to the right belong to the same block of code, i.e. the statements within a block line up vertically. The block ends at a line less indented or the end of the file. If a block has to be more deeply nested, it is simply indented further to the right.
<img src="pythonblocks.png">

Example `if` statement in `C++`:

```cpp
/* C++ uses braces {} to denote the start and end of a block*/
if( value < 0){
    std::cout << value << std::endl;
}
std::cout << "done" << std::endl;
```

Example `if` statement in `Python`:

```python
# Python uses a colon (`:`) and the indentation to denote the start and end of a block.
if value < 0:
    print value
print "done"
```
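
The rule that deeper nesting simply means deeper indentation can be sketched with a runnable example. The value and the list `labels` are assumptions introduced here so that the effect of each block can be inspected afterwards.

```python
# Illustrative sketch: block membership is decided purely by indentation.
value = -3
labels = []

if value < 0:
    labels.append("negative")              # outer block
    if value % 2 != 0:
        labels.append("odd")               # inner block, indented further
    labels.append("back in outer block")   # inner block has ended
labels.append("done")                      # outside both blocks, always runs

print(labels)
```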

### if, else, elif
The Python syntax for conditional execution of code uses the keywords `if`, `else` and `elif` (else if):

In [158]:
value1 = False
value2 = True

# Simple if statement
if value1:
    print "value1 is True!"

if value2:
    print "value2 is True!"

# if-else statement
if value1:
    print "value1 is True!"
else:
    print "value1 is NOT True!"

value2 is True!
value1 is NOT True!


In [159]:
# if-elif-else statement
if value1:
    print "value1 is True"
    
elif not value2:
    print "value2 is False"
else:
    print "value1 is False and value2 is True!"

value1 is False and value2 is True!


In [160]:
# nested if statement
value1 = value2 = True
if value1:
    if value2:
        print "value1 and value2 are True!"

value1 and value2 are True!


In [163]:
value1 = True

if value1:
    print "Inside the if statement"
    print "Still inside the if statement"
print "Outside the if statement"

Inside the if statement
Still inside the if statement
Outside the if statement


Another option is the conditional (ternary) expression, which selects one of two values on a single line:

In [165]:
tax = 100

action = 'go by car' if tax < 50 else 'go walking' #one-liner

print action

go walking
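
For comparison, the sketch below writes the same decision as a regular `if`/`else` statement next to the one-line form. The names `action_long` and `action_short` are introduced here for illustration.

```python
# Illustrative sketch: if/else statement versus conditional expression.
tax = 100

# Statement form
if tax < 50:
    action_long = "go by car"
else:
    action_long = "go walking"

# Expression form, equivalent to the statement above
action_short = "go by car" if tax < 50 else "go walking"

print(action_long)
print(action_short)
```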


### for loops
The `for` statement can be used to iterate over the members of many Python collections: sequences such as lists, strings and tuples, and also other iterable objects such as sets and dictionaries.

In [28]:
list_item = [1,2,3,4,5]

for item in list_item:
    print item

1
2
3
4
5


In [33]:
import datetime as dt

occurrence = {}
occurrence["BasisOfRecord"] = "HumanObservation"
occurrence["ScientificName"] = "Puma concolor"
occurrence["EventDate"] = dt.datetime.today().strftime("%Y/%m/%d")

for k,v in occurrence.items():
    print k, "-->", v

ScientificName --> Puma concolor
EventDate --> 2015/10/31
BasisOfRecord --> HumanObservation


A common use of the for statement is to implement definite iteration over a range of values.

In [171]:
# The range function will return a list object representing the sequence 0,1,2,3,4.
for item in range(5):
    print item

0
1
2
3
4
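
`range` also accepts a start value and a step. The sketch below is an addition to the session above; it wraps `range` in `list(...)` where a list is needed, so that it should behave the same under Python 2, where `range` already returns a list, and Python 3, where it returns a lazy range object.

```python
# Illustrative sketch: range(start, stop, step); stop is exclusive.
evens = []
for item in range(2, 11, 2):
    evens.append(item)

countdown = list(range(5, 0, -1))   # a negative step counts downwards

print(evens)
print(countdown)
```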


The other very useful version of this iteration structure is used to process each character of a string. The following code fragment iterates over a list of strings and for each string processes each character by appending it to a list. The result is a list of all the letters in all of the words.

In [172]:
wordlist = ['cat','dog','rabbit']
letterlist = [ ]
for aword in wordlist:
    for aletter in aword:
        letterlist.append(aletter)
print(letterlist)

['c', 'a', 't', 'd', 'o', 'g', 'r', 'a', 'b', 'b', 'i', 't']
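
The inner loop of that fragment can also be used on its own. As a minimal sketch with a made-up word, the code below iterates over the characters of a single string and counts how often one letter occurs.

```python
# Illustrative sketch: iterating over the characters of one string.
word = "rabbit"
b_count = 0

for letter in word:
    if letter == "b":
        b_count = b_count + 1

print(b_count)
```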


### while loops
The `while` statement repeats a body of code as long as its condition is `True`.

In [174]:
counter = 1
while counter <= 5:
    print("Hello, world")
    counter = counter + 1

Hello, world
Hello, world
Hello, world
Hello, world
Hello, world


In [6]:
done = False
step  = 1
while step <= 10 and not done:
    print step
    if step > 5:
        done = True
        print "Job Done!"
    step = step + 1

1
2
3
4
5
6
Job Done!
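
One alternative to the `done` flag, not used in the original session, is the `break` statement, which leaves the loop immediately. The sketch below visits the same steps as the flag-based loop above and records them in a list, a name introduced here for illustration.

```python
# Illustrative sketch: leaving a while loop early with break.
step = 1
visited = []

while step <= 10:
    visited.append(step)
    if step > 5:
        break            # exit the loop as soon as the job is done
    step = step + 1

print(visited)
```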


## Exception Handling
An exception is an error that happens during the execution of a program. Error handling is generally resolved by saving the state of execution at the moment the error occurred and interrupting the normal flow of the program to execute a special function or piece of code, which is known as the exception handler. Depending on the kind of error ("division by zero", "file open error" and so on), the exception handler can "fix" the problem so that the program can continue afterwards with the previously saved data.

Exception handling in `Python` is very similar to `Java`. The code that carries the risk of an exception is placed in a `try` block. But whereas in Java exceptions are caught by `catch` clauses, in Python they are caught by clauses introduced by the `except` keyword.

In [23]:
number = ""
print float(number)

ValueError: could not convert string to float: 

In [24]:
try:
    print float(number)
except (ValueError, TypeError) as e:
    print e
    print "Do something!"
    

could not convert string to float: 
Do something!


A `try` statement may have more than one `except` clause for different exceptions, but at most one `except` clause will be executed.

In [25]:
values = [1,2,3]
try:
    print float(values)
except ValueError:
    print "Error"
except TypeError:
    print float(values[0])

1.0
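
To see which clause runs for which input, the sketch below feeds three made-up values through the same `try` statement and records the outcome of each: a successful conversion, a `ValueError` for a string that is not a number, and a `TypeError` for a list.

```python
# Illustrative sketch: only the matching except clause is executed.
results = []

for value in ["3.5", "abc", [1, 2, 3]]:
    try:
        results.append(float(value))
    except ValueError:
        results.append("ValueError")   # right type (str), invalid content
    except TypeError:
        results.append("TypeError")    # wrong type (list)

print(results)
```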


So far the `try` statement has always been paired with `except` clauses. But there is another way to use it as well. The `try` statement can be followed by a `finally` clause. Finally clauses are called clean-up or termination clauses, because they are executed under all circumstances, whether or not an exception occurred.

In [1]:
try:
    x = float(input("Your number: "))
    inverse = 1.0 / x
finally:
    print("There may or may not have been an exception.")
print("The inverse: ", inverse)

Your number: 10
There may or may not have been an exception.
('The inverse: ', 0.1)
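
The run above shows only the case without an error. The sketch below forces a division by zero instead of reading input, an assumption made so that it runs without user interaction, and records the order of events: the `finally` clause runs first, and because it does not handle the exception, the error then continues to the outer `except` clause.

```python
# Illustrative sketch: finally runs even when an exception is raised.
events = []

try:
    try:
        inverse = 1.0 / 0                    # raises ZeroDivisionError
    finally:
        events.append("finally ran")         # executed before the error propagates
except ZeroDivisionError:
    events.append("exception propagated")    # finally did not swallow the error

print(events)
```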


## References
---
- Python Tutorial: https://docs.python.org/2/tutorial/index.html
- Problem Solving with Algorithms and Data Structures (By Brad Miller and David Ranum, Luther College): http://interactivepython.org/runestone/static/pythonds/index.html
- Introduction to Python for Data Analysis (University of Colorado Computational Science and Engineering): https://github.com/ResearchComputing/Meetup-Fall-2013
- Lectures on scientific computing with Python (Robert Johansson): https://github.com/jrjohansson/scientific-python-lectures

## Next Lecture
In the next lecture we will see functions and classes in Python.