# Python part 4 #
## Looping through data structures ##
Iterating through data structures is a very common programming task. Let's think about printing each value in a list:

In [1]:
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

We could loop over the list like this:

In [2]:
for i in range(0, len(my_list)):
    print(my_list[i])

0
1
2
3
4
5
6
7
8
9


though we've already seen a nicer way of doing this using `in`:

In [3]:
for x in my_list:
    print(x)

0
1
2
3
4
5
6
7
8
9


The `x` here is a variable that gets assigned for each item: we can call it whatever we like. We can do the same thing with a set:

In [4]:
my_set = {"Auric", "Hugo", "Julius"}
for item in my_set:
    print(item)

Hugo
Auric
Julius


What about a dictionary?

In [5]:
my_dict = {"Helium" : 4, "Lithium" : 6, "Beryllium" : 8}
for x in my_dict:
    print(x)

Helium
Lithium
Beryllium


See how by default, we just iterate over the ***keys***, not the ***values***. To iterate over keys and values at the same time, use the `items` method:

In [6]:
for key, value in my_dict.items():
    print(key, ":", value)

Helium : 4
Lithium : 6
Beryllium : 8


Again, `key` and `value` are variables that are assigned automatically for each entry in the dictionary. Their names don't matter, but it is common to use `key` and `value`/`item`, or `k` and `v`/`i`.
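
If only the values are needed, dictionaries also provide a `values` method. Here is a minimal sketch reusing `my_dict` from above; the variable name `total` is just illustrative.

```python
my_dict = {"Helium": 4, "Lithium": 6, "Beryllium": 8}

# iterate over the values only
total = 0
for value in my_dict.values():
    total += value

print(total)          # 18
print(list(my_dict))  # iterating a dict directly gives its keys
```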

### <span class="girk">Ex 4.1</span> ###
Each item in the dictionary `stocklist` describes an item of a retailer's stock; the key is the price of the item, and the value is the number of items in stock.

In [7]:
stocklist = {
         1.23 : 586,
         8.99 : 1080,
         2.68 : 2997,
         20.71 : 10,
         5.99 : 1007,
         0.43 : 3021
}

Use the `items` method to calculate the total value of all stock.

In [8]:
total = 0
for price, num in stocklist.items():
    total += price * num
total

26000.0

## Zip ##
Imagine you have two lists of some personal measurements: one is a list of heights in metres and another a list of weights in kilograms, both ordered by person.

In [9]:
weights = [77.9, 84.5, 56.4, 90.2, 88.4]
heights = [1.61, 1.84, 1.70, 1.56, 1.86]

Let's calculate the BMIs for each person:

In [10]:
for i in range(0, len(weights)):
    weight = weights[i]
    height = heights[i]
    print(weight / height ** 2)

30.052852899193702
24.958648393194707
19.515570934256058
37.06443129520052
25.5520869464678


Iterating over two lists at the same time is a common task, so there is a function to help: `zip`. This saves us dealing with indices, and a couple of lines of code.

In [11]:
for weight, height in zip(weights, heights):
    print(weight / height ** 2)

30.052852899193702
24.958648393194707
19.515570934256058
37.06443129520052
25.5520869464678


<div class="mark">
You can think of `zip` as zipping together two lists: it pairs the first item of each list, then the second item of each, and so on.</div><i class="fa fa-lightbulb-o "></i>
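
One way to see the pairing is to turn the result of `zip` into a list. This is a small sketch using shortened versions of the lists above; `pairs` and `short` are illustrative names.

```python
weights = [77.9, 84.5, 56.4]
heights = [1.61, 1.84, 1.70]

# zip pairs the first items, then the second items, and so on
pairs = list(zip(weights, heights))
print(pairs)  # [(77.9, 1.61), (84.5, 1.84), (56.4, 1.7)]

# if the lists differ in length, zip stops at the end of the shorter one
short = list(zip([1, 2, 3], ["a", "b"]))
print(short)  # [(1, 'a'), (2, 'b')]
```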

### <span class="girk">Ex 4.2</span> ###
Write a single line of code that calculates a list of BMIs using the lists `weights` and `heights` above. *Hint:* use a list comprehension.

In [12]:
bmis = [weight / height ** 2 for weight, height in zip(weights, heights)]
bmis

[30.052852899193702,
 24.958648393194707,
 19.515570934256058,
 37.06443129520052,
 25.5520869464678]

## Sorting and reversing lists ##
Let's write some code to print the list of weights, but sorted. We could use the `sort` method on the list, but that would alter the order of the list itself, so it would no longer correspond to the heights. Instead, create a copy, sort it, and then iterate:

In [13]:
new_weights = weights.copy()
new_weights.sort()
for weight in new_weights:
    print(weight)

56.4
77.9
84.5
88.4
90.2


The `sorted` function saves us from manually creating a copy of our list and sorting it. It returns a new, sorted list and leaves the original untouched, so now our code becomes:

In [14]:
new_weights = sorted(weights)
for weight in new_weights:
    print(weight)

56.4
77.9
84.5
88.4
90.2


or even more neatly

In [15]:
for weight in sorted(weights):
    print(weight)

56.4
77.9
84.5
88.4
90.2


Similarly, `reversed` lets us step through a list in reverse order without altering it. We can combine `reversed` with `sorted` to list the weights from heaviest to lightest:

In [16]:
for weight in reversed(sorted(weights)):
    print(weight)

90.2
88.4
84.5
77.9
56.4


Note that sets and dictionaries have no `sort` method, so they can't be sorted in place, although `sorted` will accept them and return a list. You can't sort a tuple in place either (tuples can't be altered), but `sorted` returns a sorted list of its items:

In [17]:
my_tuple = (1,3,2,4,0)
sorted_tuple = sorted(my_tuple)
sorted_tuple

[0, 1, 2, 3, 4]
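
The same function accepts other collections too. The sketch below reuses collections from earlier in this part and shows that `sorted` hands back a list each time, which can be converted back to a tuple if needed.

```python
my_tuple = (1, 3, 2, 4, 0)
my_set = {"Auric", "Hugo", "Julius"}
my_dict = {"Helium": 4, "Lithium": 6, "Beryllium": 8}

# sorted always returns a new list, whatever it is given
print(sorted(my_set))   # ['Auric', 'Hugo', 'Julius']
print(sorted(my_dict))  # keys only: ['Beryllium', 'Helium', 'Lithium']

# convert back if a tuple is wanted
sorted_tuple = tuple(sorted(my_tuple))
print(sorted_tuple)     # (0, 1, 2, 3, 4)
```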

### More control over sorting ###
If you sort a list of numbers (ints or floats), it's obvious that they should be sorted by value. Similarly, if we sort a list of strings, they are sorted alphabetically. Let's suppose we have something more complicated, for example a list of tuples:

In [18]:
films = [
    # title, year, box office
    ("A View to a Kill", 1985, 275.2),
    ("Licence to Kill", 1989, 250.9),
    ("Goldfinger", 1964, 820.4),
    ("Dr No", 1962, 448.8),
    ("Thunderball", 1965, 848.1),
    ("You Only Live Twice", 1967, 514.2),
    ("From Russia with Love", 1963, 543.8)
]

What happens if we sort this list?

In [19]:
films.sort()
films

[('A View to a Kill', 1985, 275.2),
 ('Dr No', 1962, 448.8),
 ('From Russia with Love', 1963, 543.8),
 ('Goldfinger', 1964, 820.4),
 ('Licence to Kill', 1989, 250.9),
 ('Thunderball', 1965, 848.1),
 ('You Only Live Twice', 1967, 514.2)]

By default the tuples are compared by their first element, i.e. alphabetically by film name (later elements are only used to break ties). What if we want to sort by year, using the second element in each item? The `sort` method takes an optional argument `key` to specify what to sort on:

In [20]:
films.sort(key = lambda film : film[1])
films

[('Dr No', 1962, 448.8),
 ('From Russia with Love', 1963, 543.8),
 ('Goldfinger', 1964, 820.4),
 ('Thunderball', 1965, 848.1),
 ('You Only Live Twice', 1967, 514.2),
 ('A View to a Kill', 1985, 275.2),
 ('Licence to Kill', 1989, 250.9)]

Don't worry too much about how this works: the code after `key =` is somewhat advanced. By copying this pattern you can sort items by things other than the first element, which is something you will often need to do. Let's sort by box office:

In [21]:
films.sort(key = lambda film : film[2])
films

[('Licence to Kill', 1989, 250.9),
 ('A View to a Kill', 1985, 275.2),
 ('Dr No', 1962, 448.8),
 ('You Only Live Twice', 1967, 514.2),
 ('From Russia with Love', 1963, 543.8),
 ('Goldfinger', 1964, 820.4),
 ('Thunderball', 1965, 848.1)]
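
To make the `key` pattern less mysterious: `lambda film : film[2]` is a short way of writing a small function that takes one item and returns the value to sort on. The sketch below uses an ordinary named function instead (`box_office` is a name chosen here for illustration, and the list is shortened) and also shows the optional `reverse` argument.

```python
films = [
    ("Goldfinger", 1964, 820.4),
    ("Dr No", 1962, 448.8),
    ("Thunderball", 1965, 848.1),
]

def box_office(film):
    # return the value we want to sort on: the third element
    return film[2]

# equivalent to films.sort(key = lambda film : film[2])
films.sort(key=box_office)
print([film[0] for film in films])  # ['Dr No', 'Goldfinger', 'Thunderball']

# reverse=True sorts from largest to smallest
films.sort(key=box_office, reverse=True)
print([film[0] for film in films])  # ['Thunderball', 'Goldfinger', 'Dr No']
```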

### <span class="girk">Ex 4.3</span> ###
Here is our list of records of name and year of birth from part 3:

In [22]:
records = [
    [ "Maryam d'Abo" , 1960 ] ,
    [ "Claudine Auger" , 1941 ],
    [ "Barbara Bach" , 1947 ],
    [ "Daniela Bianchi" , 1942 ],
    [ "Carole Boquet" , 1957 ],
    [ "Lois Chiles" , 1947 ],
    [ "Britt Ekland" , 1942 ],
    [ "Carey Lowell" , 1961 ],
    [ "Tanya Roberts" , 1955 ],
    [ "Jill St. John" , 1940 ],
]

Practise sorting this list:
- alphabetically by name
- chronologically
- reverse chronologically

In [23]:
records.sort()
print(records)
records.sort(key = lambda x : x[1])
print(records)
records = records[::-1]
print(records)

[['Barbara Bach', 1947], ['Britt Ekland', 1942], ['Carey Lowell', 1961], ['Carole Boquet', 1957], ['Claudine Auger', 1941], ['Daniela Bianchi', 1942], ['Jill St. John', 1940], ['Lois Chiles', 1947], ["Maryam d'Abo", 1960], ['Tanya Roberts', 1955]]
[['Jill St. John', 1940], ['Claudine Auger', 1941], ['Britt Ekland', 1942], ['Daniela Bianchi', 1942], ['Barbara Bach', 1947], ['Lois Chiles', 1947], ['Tanya Roberts', 1955], ['Carole Boquet', 1957], ["Maryam d'Abo", 1960], ['Carey Lowell', 1961]]
[['Carey Lowell', 1961], ["Maryam d'Abo", 1960], ['Carole Boquet', 1957], ['Tanya Roberts', 1955], ['Lois Chiles', 1947], ['Barbara Bach', 1947], ['Daniela Bianchi', 1942], ['Britt Ekland', 1942], ['Claudine Auger', 1941], ['Jill St. John', 1940]]


## Formatting strings ##
Suppose you want to take a numeric variable `weight`, and create a string to display it of the form:

weight: 70kg

We can do this in two stages:
   - use the `str` function to turn the variable into a string;
   - join the separate strings into a single string using `+`.

In [24]:
weight = 70
string = "weight: " + str(weight) + "kg"
print(string)

weight: 70kg


However this is a bit unwieldy. It is better to use a formatted string, or *f-string*. This is very easy to use:
- put `f` in front of the opening quote of your ordinary string;
- if you want the values of any variables to appear in your string, put the variable name inside `{}`.

Here is our example using an f-string:

In [25]:
string = f"weight: {weight}kg"
print(string)

weight: 70kg


This makes it very easy to format more complex strings without using lots of `+` signs and having to think about adding spaces:

In [26]:
for weight, height in zip(weights, heights):
    string = f"weight: {weight}kg, height: {height}m, BMI = {weight / height ** 2}"
    print(string)

weight: 77.9kg, height: 1.61m, BMI = 30.052852899193702
weight: 84.5kg, height: 1.84m, BMI = 24.958648393194707
weight: 56.4kg, height: 1.7m, BMI = 19.515570934256058
weight: 90.2kg, height: 1.56m, BMI = 37.06443129520052
weight: 88.4kg, height: 1.86m, BMI = 25.5520869464678


There are also lots of options available for formatting variables more precisely.
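
As a brief illustration of those options, a format specification can follow the variable name after a colon. The sketch below uses one weight and height from the lists above; the names `bmi`, `line` and `padded` are illustrative.

```python
weight = 88.4
height = 1.86
bmi = weight / height ** 2

# .1f means: show as a fixed-point number with 1 decimal place
line = f"weight: {weight:.1f}kg, height: {height:.2f}m, BMI = {bmi:.1f}"
print(line)    # weight: 88.4kg, height: 1.86m, BMI = 25.6

# a number before the dot sets a minimum width, padding with spaces
padded = f"[{bmi:8.2f}]"
print(padded)  # [   25.55]
```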

## IO with files ##
Input and output (IO) of data to and from your program can be achieved in many ways, but a common method is using a *file*. File IO is very easy in Python.
### Basics ###
Open a new file using `open`

In [27]:
f = open("file1.txt", "w+")

There are two things to notice here:
- we used an extra argument `"w+"`, which tells `open` to open the file for both writing and reading. Any existing file of that name is overwritten; if the file does not exist, a new file is created.
- `open` returns a special object which represents the open file.

Close the file using `close`

In [28]:
f.close()

Open the file again in the same way (it will be overwritten) and use `write` to write a string to the file.

In [29]:
f = open("file1.txt", "w+")
f.write("testing, testing, 1, 2, 3.\n")
f.close()

Now open the file again, but this time for reading and writing ***without overwriting it***: we do this using the `"r+"` option:

In [30]:
f = open("file1.txt", "r+")

Finally, read the file and see what we get:

In [31]:
string = f.read()
print(string)
f.close()

testing, testing, 1, 2, 3.



### Remembering to close ###
Whenever you have finished with a file you need to close it, and this is easy to forget. To avoid this pitfall, you can use the **with open as** syntax:

In [32]:
with open("file1.txt", "r+") as f:
    string = f.read()
    
print(string)

testing, testing, 1, 2, 3.



Now, all our dealings with the file object are ***in the indented block***. As soon as the end of the indented code is reached, the file is closed automatically.

We can check this using the attribute `closed`, in the following code which reads our file, and writes another string to it in addition.

In [33]:
print(f.closed) # True: file starts closed

with open("file1.txt", "r+") as f:
    
    print(f.closed) # False: file now open
    
    string = f.read()
    f.write("testing, testing, 4, 5, 6\n")
    
    print(f.closed) # False: file still open
    
print(f.closed) # True: file has closed automatically

True
False
False
True
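
Roughly speaking, the `with` block behaves like a `try`/`finally` that always calls `close`, even if an error occurs part-way through. The sketch below illustrates this. It writes to a temporary directory (an assumption made here so that no existing file is touched), and the file name `demo.txt` is made up.

```python
import os
import tempfile

# work in a temporary directory so we don't overwrite anything
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo.txt")

# manual version: close the file even if something goes wrong
f = open(path, "w+")
try:
    f.write("testing, testing, 1, 2, 3.\n")
finally:
    f.close()
print(f.closed)  # True

# the with version gives the same guarantee with less code
try:
    with open(path, "r+") as g:
        raise ValueError("something went wrong while the file was open")
except ValueError:
    pass
print(g.closed)  # True: the file was closed despite the error
```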


### <span class="girk">Ex 4.4</span> ###
- Open a file `test.txt` for writing (use `w+`);
- write some lines to the file;
- close the file;
- open the file for reading (use `r+`);
- read the file into a variable `text`;
- close the file.
- print the contents of `text`

In [34]:
f = open("test.txt", "w+")
f.write("I'll buy you a delicatessen...")
f.write("in stainless steel.")
f.close()
f = open("test.txt", "r+")
text = f.read()
print(text)
f.close()

I'll buy you a delicatessen...in stainless steel.


## Reading a line at a time ##
The `read` method reads the whole file into a single string. You often need to read a line at a time instead, which is easy using a `for` loop with `in`:

In [35]:
with open("file1.txt", "r+") as f:
    for line in f:
        print(line)

testing, testing, 1, 2, 3.

testing, testing, 4, 5, 6
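
The blank lines in the output appear because each `line` still ends with its newline character, and `print` adds another. One possible fix, sketched here with a temporary file standing in for `file1.txt`, is to remove the trailing newline with the string method `rstrip`. The names `lines` and `clean` are illustrative.

```python
import os
import tempfile

# create a small two-line file to read back (a stand-in for file1.txt)
path = os.path.join(tempfile.mkdtemp(), "file1.txt")
with open(path, "w+") as f:
    f.write("testing, testing, 1, 2, 3.\n")
    f.write("testing, testing, 4, 5, 6\n")

lines = []
with open(path, "r+") as f:
    for line in f:
        clean = line.rstrip("\n")  # remove the trailing newline
        lines.append(clean)
        print(clean)
```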



### <span class="girk">Ex 4.5</span> ###
The files `films.txt` and `directors.txt` contain a list of films and their directors. Use the data in these files to produce a list of tuples of the form (film, director), by doing the following:
- create two empty lists, one for films and one for directors;
- open each file and read it a line at a time, appending each line to the corresponding list;
- loop over both lists together using `zip()`, adding individual tuples to a new list;
- can you create your list in a single statement using the `zip` and `list` functions together?

In [36]:
# empty lists
films, directors = [], []

#read films from file to list
with open("films.txt", "r+") as f:
    for film in f:
        films.append(film)
        
# read directors from file to list
with open("directors.txt", "r+") as f:
    for director in f:
        directors.append(director)
        
# create empty list and populate
bond = []
for film, director in zip(films, directors):
    bond.append((film, director))

# do the same in a single statement
bond2 = list(zip(films, directors))

# print the first few items of each list to compare
print(bond[0:5])
print(bond2[0:5])

[('Dr. No\n', 'Terence Young\n'), ('From Russia with Love\n', 'Terence Young\n'), ('Goldfinger\n', 'Guy Hamilton\n'), ('Thunderball\n', 'Terence Young\n'), ('You Only Live Twice\n', 'Lewis Gilbert\n')]
[('Dr. No\n', 'Terence Young\n'), ('From Russia with Love\n', 'Terence Young\n'), ('Goldfinger\n', 'Guy Hamilton\n'), ('Thunderball\n', 'Terence Young\n'), ('You Only Live Twice\n', 'Lewis Gilbert\n')]
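
Notice that every string in the output still ends in `\n`. A possible variation, assuming we would rather store clean strings, is to strip each line as it is read. The sketch below builds two small stand-in files in a temporary directory (their contents are copied from the output above) so that it runs on its own; `read_lines` is a helper name invented here.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
films_path = os.path.join(folder, "films.txt")
directors_path = os.path.join(folder, "directors.txt")

# stand-in data files
with open(films_path, "w+") as f:
    f.write("Dr. No\nFrom Russia with Love\nGoldfinger\n")
with open(directors_path, "w+") as f:
    f.write("Terence Young\nTerence Young\nGuy Hamilton\n")

def read_lines(path):
    # return the lines of a file as a list, without surrounding whitespace
    with open(path, "r+") as f:
        return [line.strip() for line in f]

bond = list(zip(read_lines(films_path), read_lines(directors_path)))
print(bond)
```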


## Object-oriented programming ##
### Why are objects useful? ###
Python is an *object-oriented (OO) language*: it lets us organise programs around objects. We've already met several types of object:
- list
- dict
- set
- file

There are many more. Recall that an object can have:
- methods (functions that belong to the object): e.g. `list.append(x)`, `set.clear()`, `file.close()`
- member variables (data that belongs to the object): e.g. `file.closed` (either `True` or `False`)

Objects are useful for more than one reason:
- they keep the data that is being stored together with the functions that are useful for working with it. For example, a list object looks after its own data (the members of the list) and gives us all the functions we need for manipulating the list (we just do `list.function_name`);
- they prevent the user having to worry too much about the internal structure of the data: we don't need to know how a list or a dictionary is actually arranged in memory: we just use the convenient methods for dealing with it.

### Terminology ###
There are two more important terms:
- The type of an object is more properly called its **class**; so *list*, *dict*, *set*, *file* are all **classes**.
- When we actually create an object of a particular class (e.g. of the list class by entering `my_list = [1, 2, 3]`) we create an **instance** of the class.

You might write a program that creates several *instances* of a class (e.g. more than one list), but there is *only ever one class of a particular name*.
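
The distinction can be checked with the built-in functions `type` and `isinstance`. In this small sketch, `a` and `b` are illustrative names for two separate instances of the list class.

```python
a = [1, 2, 3]
b = ["x", "y"]

# two different instances...
print(a is b)               # False
# ...of the same class
print(type(a))              # <class 'list'>
print(type(a) is type(b))   # True
print(isinstance(a, list))  # True
```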

### Creating your own class ###
The power of OO languages comes from ***defining your own classes***. Here's a very simple class that defines one method and one member variable:

In [37]:
class simpleclass:
    
    def do_something(self):
        self._greeting = "hello"
        print(self._greeting)

The class is defined using the keyword `class` followed by the name of the class; the rest of the class is indented.

Creating an object is easy: let's create a simpleclass object and check its type:

In [38]:
s = simpleclass()
type(s)

__main__.simpleclass

The class has been given one method, `do_something`. It takes a very special argument called `self`, which is a reference to the object itself. Python always supplies this argument: you don't have to enter it when using any method:

In [39]:
s.do_something()

hello


The class has been given one member variable, `_greeting`, simply by assigning `self._greeting`. We can check the value of this variable:

In [40]:
s._greeting

'hello'
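
One consequence of creating the variable inside `do_something` is that it does not exist until that method has been called. The sketch below checks this with the built-in `hasattr` function on a fresh object (`t` is an illustrative name).

```python
class simpleclass:

    def do_something(self):
        self._greeting = "hello"
        print(self._greeting)

t = simpleclass()
print(hasattr(t, "_greeting"))  # False: not assigned yet

t.do_something()                # prints hello
print(hasattr(t, "_greeting"))  # True: created by do_something
```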

### A more useful example ###
Let's write a class to define objects that serve a very simple purpose. Suppose we want to store some patient data (name, height and weight) and work out the BMI. We will create a patient object that will do this. Each object will have:
- `_name`, `_weight` (Kg) and `_height` (m) member variables;
- a method called `bmi` which calculates the patient's BMI from the current values (so the BMI stays correct even if height and weight change);
- a way of making sure that when an object is created, the name, weight and height are all populated.

Here's how the class for this simple object looks:

In [41]:
class patient:
    
    def __init__(self, name, weight, height):
        self._name = name
        self._weight = weight
        self._height = height
        print(f"created patient {name} with weight {weight}Kg and height {height}m.")
        
    def bmi(self):
        return self._weight / self._height ** 2

The `__init__` method is a special one. It is called automatically whenever an object is created. It can take arguments which can be used to set initial values in the object. Let's see it work:

In [42]:
p = patient(name = "Messervy", height = 1.72, weight = 88.4)

created patient Messervy with weight 88.4Kg and height 1.72m.


We see output from the `print` statement in the `__init__` method, which shows it was called automatically. Now we can check Messervy's details

In [43]:
print(p._name, p._weight)

Messervy 88.4


and BMI

In [44]:
p.bmi()

29.881016765819368
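
Because `bmi` is calculated each time it is called, rather than stored, it follows any change to the weight or height. The sketch below repeats the class (without the `print` line, to keep it short) and changes the weight; the new weight of 80.0 is an invented value for illustration.

```python
class patient:

    def __init__(self, name, weight, height):
        self._name = name
        self._weight = weight
        self._height = height

    def bmi(self):
        return self._weight / self._height ** 2

p = patient(name="Messervy", height=1.72, weight=88.4)
before = p.bmi()

p._weight = 80.0  # the patient loses some weight
after = p.bmi()   # recalculated from the current values

print(round(before, 1), round(after, 1))  # 29.9 27.0
```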

### <span class="girk">Ex 4.6</span> ###
- Create a `film` class that has three member variables: `_title`, `_year` and `_rating`, where rating is a percentage score. 
- Give your class an init method that takes a title, year and rating
- Add a method `star_rating` that returns the film's rating as a score out of 5. *Hint:* use the `round` function.

In [45]:
class film:
    
    def __init__(self, title, year, rating):
        self._title = title
        self._year = year
        self._rating = rating
        
    def star_rating(self):
        return round(self._rating / 20)
    
my_film = film("skyfall", 2012, 92)
my_film.star_rating()

5

### Editing objects on the fly ###
The class definition does not set in stone what the object should do. Python allows us to tinker with objects once they have been created. We can add a variable to an object

In [46]:
p._nickname = "M"
p._nickname

'M'

or even get fancy and define a function and add it to the object:

In [47]:
def what_is_this():
    print("this is a patient record")
    
p.method2 = what_is_this

Let's call this function now:

In [48]:
p.method2()

this is a patient record


It's not so simple to add "proper" methods that use `self`, but it can be done.
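
For the curious, one way to do it is with `MethodType` from the standard `types` module, which binds a function to a particular object so that `self` is supplied automatically. This is only a sketch: the function name `describe` is invented here, and the class is repeated in shortened form so the example runs on its own.

```python
import types

class patient:

    def __init__(self, name, weight, height):
        self._name = name
        self._weight = weight
        self._height = height

def describe(self):
    # an ordinary function whose first argument will be the object
    return f"{self._name}: {self._weight}Kg, {self._height}m"

p = patient("Messervy", 88.4, 1.72)

# bind the function to this one object so that self is supplied automatically
p.describe = types.MethodType(describe, p)
print(p.describe())  # Messervy: 88.4Kg, 1.72m
```

Only `p` gains the new method; other patient objects are unaffected.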

### Class variables ###
The member variables such as `_name` are unique to each object instance. Let's create another patient to show this.

In [49]:
p2 = patient("Moneypenny", 60, 1.55)

created patient Moneypenny with weight 60Kg and height 1.55m.


Sometimes however, it's useful to have a variable that is ***shared*** among all objects of a class. Let's create a very simple class that does this.

In [50]:
class simple2:
    i = 0

That's it. The variable `i` is a **class variable**. It applies to the whole class and all objects that are created from it. Access it from any object:

In [51]:
bib = simple2()
bib.i

0

Or even from the class itself:

In [52]:
simple2.i

0

Because it is shared, any changes that are made to it through the class will be seen from every object. Let's create another object:

In [53]:
bob = simple2()

and change the value of `i` through the class:

In [54]:
simple2.i = 1
print(bib.i, bob.i, simple2.i)

1 1 1
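
Note that the shared value only changes when the assignment goes through the class. Assigning through an individual object creates a separate member variable on that object instead. Here is a sketch of the difference, assuming `bib` and `bob` are created as objects with `simple2()`:

```python
class simple2:
    i = 0

bib = simple2()
bob = simple2()

# assigning through an object creates a variable on that object only
bob.i = 1
print(bib.i, bob.i, simple2.i)  # 0 1 0

# assigning through the class changes the shared value
simple2.i = 5
print(bib.i, bob.i, simple2.i)  # 5 1 5
```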


The change is seen everywhere. Suppose we want to assign a sequential number to each of the patient records we created earlier. We can do this with a class variable:
- have a class variable keep a count of the number of patient objects that are created; this number will be shared across the class
- increment the counter each time an object is created, by using the init function
- use the counter to set a member variable - the item number - for the individual object

Here is how our new class might look

In [55]:
class patient:
    
    count = 0
    
    def __init__(self, name, weight, height):
        self._name = name
        self._weight = weight
        self._height = height
        
        self._record_number = patient.count # get the record number from the count
        patient.count += 1                  # increment the count since we have created an object
        
        print(f"created patient {name} with weight {weight}Kg and height {height}m.")
        
        print(f"record number = {self._record_number}") # print the record number
        
    def bmi(self):
        return self._weight / self._height ** 2

Let's see this in action:

In [56]:
gloria = patient(name = "Messervy", height = 1.72, weight = 88.4)
dolores = patient(name = "Boothroyd", height = 1.83, weight = 70.2)
jim = patient(name = "Tanner", height = 1.74, weight = 76.1)


created patient Messervy with weight 88.4Kg and height 1.72m.
record number = 0
created patient Boothroyd with weight 70.2Kg and height 1.83m.
record number = 1
created patient Tanner with weight 76.1Kg and height 1.74m.
record number = 2


### <span class="girk">Ex 4.7</span> ###
A simple random number generator can be defined using the recurrence relation

$x_{n + 1} = \left[1103515245 \times (x_n + 12345)\right] \bmod 2^{31}.$

We start with a value $x_1$ and generate the next in the sequence $x_2$, then use this to generate $x_3$, and so on. 

Write and test a class `random` that has a method `generate` that returns the next value in the sequence each time it is called. To do this:

- use a class variable `x` with a starting value of 1, which will store the newly calculated value of $x$ each time;
- add a method `generate` to generate the new value and store it in `x`;
- create a `random` object, call the `generate` method several times and see what values you get.


In [57]:
class random:
    
    x = 1
    
    def __init__(self):
        pass
        
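    def generate(self):
        # update the class variable itself; assigning to self.x would
        # create a separate variable on this one object instead
        random.x = 1103515245 * (random.x + 12345) % 2 ** 31
        return random.x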
    
r = random()

print(r.generate())
print(r.generate())
print(r.generate())
print(r.generate())
print(r.generate())

362951858
1772524047
1457694888
2075190733
767571086
