# A Crash Course in Python (Part 2): SOLUTIONS
In the second part of this crash course in Python, we will be focussing on more complex data structures such as tuples, lists and dictionaries.  

## Tuples

Tuples allow us to store a fixed number of values as a single data object.  2-tuples (tuples with 2 values) are often referred to as pairs and 3-tuples (tuples with 3 values) are often referred to as triples.  Possibly the most obvious example of data which might be stored in tuple form is co-ordinates.

A tuple of values can be defined using ().  Note that the values in a tuple do not all need to be of the same type.


In [24]:
my_position=(4,3,-2)
print(type(my_position))
my_student=("Daniel",10)
print(type(my_student))

<class 'tuple'>
<class 'tuple'>


Individual elements of the tuple can be accessed through their indices using [] notation.  

In [25]:
print(my_position[0])
print(my_position[1])
print(my_position[2])

4
3
-2
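
Because a tuple holds a fixed set of values, its elements cannot be reassigned once it has been created.  The following is a minimal sketch (the name `point` is just an illustrative choice, not part of the exercises) which shows tuple unpacking as an alternative to indexing, and what happens if you try to change an element:

```python
point = (4, 3, -2)

# Unpacking assigns each element to its own variable in one step
x, y, z = point
print(x, y, z)

# Tuples are immutable: assigning to an index raises a TypeError
try:
    point[0] = 10
except TypeError as err:
    print("Cannot modify a tuple:", err)

# The tuple is unchanged
print(point)
```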


### Exercise 1a:
Complete the following function to compute the distance of a point from the Origin.

In [26]:
def find_distance(position):
    """
    function to find the distance from the Origin of a point 
    position: 3-tuple (x,y,z)
    returns: float
    """
   
    x=position[0]
    y=position[1]
    z=position[2]

    distance=(x**2+y**2+z**2)**0.5
    return distance
    
    

In [27]:
find_distance(my_position)

5.385164807134504

### Exercise 1b
Write a function which takes two triples (p1 and p2) and calculates their vector difference p1-p2

Remember, if p1 = (x1,y1,z1) and p2 = (x2,y2,z2) then p1-p2 = (x1-x2,y1-y2,z1-z2)

In [28]:
def diff(p1,p2):
    mydiff=(p1[0]-p2[0],p1[1]-p2[1],p1[2]-p2[2])
    return mydiff

In [29]:
print(diff(my_position,(4,-2,6)))

(0, 5, -8)


### Exercise 1c
Write a function which takes two triples (p1 and p2) and calculates the distance between the points in Cartesian space.

NB// Good programming tip: don't copy and paste and then combine the code from exercises 1a and 1b.  Write a **new function which calls each of your functions from 1a and 1b.**

In [30]:
def distance(p1,p2):
    return find_distance(diff(p1,p2))

In [31]:
print(distance(my_position,(4,-2,6)))

9.433981132056603


## Lists
Often we want to store lists of values.  These are similar to tuples but they do not have a fixed length.  In particular, it is common for them to grow over time.

A list can be initialised using [].  The length of a list can be found using the *len()* function.  Note the use of the *str()* function to convert the integer result into a string which can be concatenated to the end of the statement.

In [32]:
list_a=[4,6,"Adam","hello",10]
print("The length of the list is " +str(len(list_a)))


The length of the list is 5
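
As an alternative to *str()* and +, Python 3.6 and later also support formatted string literals (f-strings), which insert the value of an expression directly into a string.  A small sketch, using the hypothetical names `example_list` and `message`:

```python
example_list = [4, 6, "Adam", "hello", 10]

# The expression inside {} is evaluated and converted to a string automatically
message = f"The length of the list is {len(example_list)}"
print(message)
```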


It is also often useful to initialise a variable as the empty list (which items will be added to later). 

In [33]:
another_list=[]
print("The length of the list is "+str(len(another_list)))

The length of the list is 0


Individual elements can be accessed via their indices in the same way as tuples.  

In [34]:
print("The first element is "+str(list_a[0]))

The first element is 4


Note what happens if you try to access an element which is not there (i.e., *list index is out of range*)

In [35]:
print("The sixth element is "+str(list_a[5]))

IndexError: list index out of range

A useful feature (sometimes) is that you can also index backwards from the end of a list

In [36]:
print("The last element is "+str(list_a[-1]))

The last element is 10


Sublists can also be *sliced* from lists using the colon notation.  Note that the element at the first index is included and then all elements up to but not including the second index.

In [37]:
print(list_a[1:4])

[6, 'Adam', 'hello']


In [38]:
print(list_a[1:])

[6, 'Adam', 'hello', 10]


In [39]:
print(list_a[:4])

[4, 6, 'Adam', 'hello']


Lists can also grow over time by appending elements to the end.  What happens if you run the following cell multiple times?


In [40]:
print("Length of list_a is "+str(len(list_a)))
list_a.append(3.14)
print("Length of list_a is now "+str(len(list_a)))


Length of list_a is 5
Length of list_a is now 6


We also sometimes want to concatenate two (or more) lists.  We can do this using +

In [41]:
list_b=[3,5,"data","science"]
list_c=list_a+list_b
print(list_c)

[4, 6, 'Adam', 'hello', 10, 3.14, 3, 5, 'data', 'science']
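
One difference worth noticing is that + builds a new list and leaves its operands untouched, whereas append() modifies the existing list in place.  A minimal sketch with made-up lists (`first`, `second` and `combined` are illustrative names only):

```python
first = [1, 2, 3]
second = ["a", "b"]

combined = first + second   # builds a new list; first and second are unchanged
print(combined)
print(first)

first.append(4)             # modifies first in place
print(first)
print(combined)             # not affected by the later append
```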


### List iteration
Frequently, we want to take a list and do something with every element in the list in turn.

Imagine we have a list of names and the function say_hello from the first part of the course


In [42]:
my_names=["Adam","Daniel","Bob","Esmerelda","Montgomery"]
def say_hello(mystring):
    if len(mystring)< 5:
        result="Hi "+mystring
    elif len(mystring)< 10:
        result="Hello "+mystring
    else:
        result="Good day "+mystring
    return result

How do we run this function for every element in the list?  We use the following construction to iterate over a list.

In [43]:
for thing in my_names:  
    print(say_hello(thing))

Hi Adam
Hello Daniel
Hi Bob
Hello Esmerelda
Good day Montgomery


Note that it does not matter what variable name we use to refer to each element of the list in turn.  After the list iteration has finished, this variable will still hold the last element of the list.

In [44]:
thing

'Montgomery'

A very useful function on lists is the sorted() function.  This returns a list which is the sorted version of the input list (alphabetical sorting for Strings, numeric for integers and floats) 

In [45]:
print(sorted(my_names))

['Adam', 'Bob', 'Daniel', 'Esmerelda', 'Montgomery']


Note that this does not affect the original list.

In [46]:
print(my_names)

['Adam', 'Daniel', 'Bob', 'Esmerelda', 'Montgomery']


We could, of course, reassign my_names to be the sorted version.  We can also use the optional keyword argument reverse=True to get the reverse order.

In [47]:
my_names=sorted(my_names,reverse=True)
print(my_names)

['Montgomery', 'Esmerelda', 'Daniel', 'Bob', 'Adam']
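
Lists also have a .sort() method, which behaves differently from sorted(): it sorts the list in place and returns None.  The sketch below contrasts the two, using a short hypothetical list called `names`:

```python
names = ["Adam", "Daniel", "Bob"]

new_list = sorted(names)   # returns a new sorted list; names is unchanged
print(new_list, names)

result = names.sort()      # sorts names in place and returns None
print(result, names)
```

A common mistake is to write something like names=names.sort(), which replaces the list with None.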


### Exercise 2:
Write a function which takes a list and returns the sum of the integers in that list.


In [48]:
def sumlist(alist):
    mysum=0
    for thing in alist:
        if isinstance(thing,int):
            mysum+=thing
    return mysum

In [49]:
sumlist([1,2,3,4])

10

In [50]:
sumlist([1.0,1,2,3,4,'thing'])

10
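
The same filtering and summing can be written more compactly with the built-in sum() function and a generator expression.  This is only a sketch of an alternative, and `sumlist_compact` is a name introduced here for illustration:

```python
def sumlist_compact(alist):
    # The generator expression keeps only the elements that are integers,
    # and sum() adds them up
    return sum(thing for thing in alist if isinstance(thing, int))

total = sumlist_compact([1.0, 1, 2, 3, 4, "thing"])
print(total)
```

Note that in both versions the values True and False would be counted as 1 and 0, because bool is a subclass of int in Python.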

## Dictionaries

Dictionaries are a very useful structure for storing key-value pairs, where you want to be able to look up values for a given key.  They are initialised using {}.  Inside the {} is a list of key-value pairs where keys and values are separated by : and pairs are separated by ,


In [6]:
my_dict={"daniel":89,"evelyn":98,"bob":50}

In the example above, keys were of *type* __str__ and values were of *type* __int__.  Values can be of any type, and keys can be of any immutable type such as strings, numbers or tuples (keys and values can also be of mixed types but that is quite unusual).
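
For instance, a tuple of co-ordinates can serve as a key, whereas a mutable list cannot.  A minimal sketch, with made-up names and values:

```python
# Tuples can be used as dictionary keys
grid_labels = {(0, 0): "origin", (4, 3): "treasure"}
print(grid_labels[(4, 3)])

# Lists cannot, so trying to use one as a key raises a TypeError
try:
    bad_dict = {[0, 0]: "origin"}
except TypeError as err:
    print("Lists cannot be keys:", err)
```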

It is also quite common to start with an empty dictionary where you will store things as you go along

In [7]:
another_dict={}

We can look things up in a dictionary using []

In [8]:
name="daniel"
print("The score for "+name+" is "+str(my_dict[name]))

The score for daniel is 89


We can add items to dictionaries and replace values in the same way.

In [9]:
my_dict["george"]=78
my_dict["daniel"]=83


In [10]:
name="daniel"
print("The score for "+name+" is "+str(my_dict[name]))

The score for daniel is 83


If you try to look up a key that does not exist using [] then you will get an error:

In [11]:
name="beatrice"
print("The score for "+name+" is "+str(my_dict[name]))

KeyError: 'beatrice'

In this case it is better to use the dictionary's .get() method.  

In [12]:
name="beatrice"
print("The score for "+name+" is "+str(my_dict.get(name,0)))

The score for beatrice is 0


.get() is a method of the dictionary class.  It takes the particular dictionary which is given before the . and looks up the value of the first argument.  However, the second argument (0 in this case) is a default value which will be used if the key is not present in the dictionary.
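
Two related points may be useful here.  If the second argument is omitted, .get() returns None for a missing key, and the `in` operator can be used to test whether a key is present at all.  A short sketch using a hypothetical dictionary called `scores`:

```python
scores = {"daniel": 89, "evelyn": 98, "bob": 50}

# "in" tests whether a key is present
print("daniel" in scores)
print("beatrice" in scores)

# Without a default value, .get() returns None for a missing key
missing = scores.get("beatrice")
print(missing)
```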


Often we want to iterate over all of the items in a dictionary.  We might want to do something with the keys or values which match a certain condition.  

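.keys() returns a view of the dictionary's keys

.values() returns a view of the dictionary's values

.items() returns a view of the key,value pairs

These views can be iterated over directly, or converted to standard lists using list().  Note that since Python 3.7, the order of the items is guaranteed to be the order in which the keys were first added to the dictionary (in earlier versions, the order was **not** _necessarily_ preserved).  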

In [13]:
print(my_dict.keys())

dict_keys(['daniel', 'evelyn', 'bob', 'george'])


In [14]:
print(my_dict.values())

dict_values([83, 98, 50, 78])


In [15]:
print(my_dict.items())

dict_items([('daniel', 83), ('evelyn', 98), ('bob', 50), ('george', 78)])
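
When both the key and the value are needed, iterating over .items() lets you unpack each pair directly in the for statement.  The sketch below collects the names whose score matches a condition; the dictionary `scores` and the threshold of 80 are arbitrary choices for illustration:

```python
scores = {"daniel": 83, "evelyn": 98, "bob": 50, "george": 78}

high_scorers = []
for name, score in scores.items():  # each item is a (key, value) pair
    if score > 80:
        high_scorers.append(name)

print(high_scorers)
```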


So now we can straightforwardly iterate over the dictionary

In [17]:
def update_scores(dict_of_scores):
    """
    update_scores increases the scores in a dictionary based on the length of the key
    dict_of_scores: a dictionary of name-score pairs
    returns: a dictionary of name_score pairs    
    """
    for key in dict_of_scores.keys():  #take each key in the dict in turn
        if len(key)>5:  #check whether the length is greater than 5
            dict_of_scores[key]+=5  #if so the current value should be increased by 5
            
    return dict_of_scores

print(update_scores(my_dict))

{'daniel': 93, 'evelyn': 108, 'bob': 50, 'george': 88}


What happens if you run this again?  Do you get the same values or do they all increase by 5 again?

The dictionary is actually modified in place, which means you don't need to return it.  You would get the same effect with the following code.  Note, however, that now you do not print the result of update_scores2 (because it doesn't return anything).  You just print my_dict which has been modified.

In [41]:
def update_scores2(dict_of_scores):
    """
    update_scores2 increases the scores in a dictionary based on the length of the key
    dict_of_scores: a dictionary of name-score pairs
    """
    for key in dict_of_scores.keys():  #take each key in the dict in turn
        if len(key)>5:  #check whether the length is greater than 5
            dict_of_scores[key]+=5  #if so the current value should be increased by 5
            

update_scores2(my_dict)
print(my_dict)

{'daniel': 93, 'evelyn': 108, 'bob': 50, 'george': 88}


If you don't want to modify the argument to the function in place, you need to explicitly make a copy of it at the start of the function and then return that.  Use dict() to copy a dictionary and list() to copy a list.

Running the cell below repeatedly will print the same result every time: my_dict itself is never modified, because the update is applied to a copy and nothing is done with the return value other than printing it out.

In [18]:
def update_scores3(dict_of_scores):
    """
    update_scores3 increases the scores in a dictionary based on the length of the key
    dict_of_scores: a dictionary of name-score pairs
    returns: a dictionary of name_score pairs    
    """
    new_dict=dict(dict_of_scores)
    
    for key in new_dict.keys():  #take each key in the dict in turn
        if len(key)>5:  #check whether the length is greater than 5
            new_dict[key]+=5  #if so the current value should be increased by 5
            
    return new_dict

print(update_scores3(my_dict))

{'daniel': 98, 'evelyn': 113, 'bob': 50, 'george': 93}
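
The difference between copying a dictionary and simply giving it a second name can be seen in the following sketch (all of the names here are illustrative).  Assigning with = does not make a copy, so changes made through the new name also affect the original:

```python
original = {"daniel": 83, "bob": 50}

copy_of_original = dict(original)  # a separate dictionary with the same contents
copy_of_original["daniel"] += 5    # only the copy changes

alias = original                   # NOT a copy: both names refer to the same dictionary
alias["bob"] += 5                  # this changes original as well

print(original)
print(copy_of_original)
```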


### Exercise 3
Write a function which takes a dictionary such as my_dict and finds the (mean) average score.  There are library functions which will help you do this efficiently, but for now, just add up all of the values and divide by the length of the list of values.

In [44]:
def find_average(adict):
    values=list(adict.values()) #list() function turns the dict_values object into a standard list
    mymean=sumlist(values)/len(values)
    return mymean

In [45]:
find_average(my_dict)

84.75
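
One of the library functions alluded to above is mean() from the standard library's statistics module.  A minimal sketch, assuming a dictionary of scores like the one used here (redefined as `scores` so that the example is self-contained):

```python
from statistics import mean

scores = {"daniel": 93, "evelyn": 108, "bob": 50, "george": 88}

# mean() accepts any iterable of numbers, including a dictionary's values
average = mean(scores.values())
print(average)
```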

### Exercise 4
Write a function which takes a dictionary such as my_dict and prints out a message for each person "X is above/below/equal to average" which depends on their score.

In [46]:
def print_message(adict):
    
    mymean=find_average(adict)
    for key in adict.keys():
        if adict[key]==mymean:
            print(key+" is equal to average")
        elif adict[key]<mymean:
            print(key+" is below average")
        else:
            print(key+" is above average")

In [47]:
print_message(my_dict)

daniel is above average
evelyn is above average
bob is below average
george is above average


## Files


We also often want to read from and write to files.  Depending on the file type, there are lots of ways of doing this, but for plain text the simplest way is probably to read (and write) a line at a time.  Here we calculate the length of each line and store it in a list.


In [1]:
inputfile="sometext.txt"


lengths=[]
with open(inputfile) as infile:  # "input" would shadow the built-in input() function
    for line in infile:
        print(line)
        lengths.append(len(line))
        
print(lengths)


This is a textfile.

There are 3 lines of text.

This is the last one.
[20, 27, 21]
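
The blank lines in the output, and the fact that the first two lengths are one larger than you might expect, are both due to the newline character at the end of each line read from the file.  The sketch below removes it with rstrip("\n") before printing and counting.  So that it can be run on its own, it first writes the same three lines to a temporary file (`example.txt` is a made-up name):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create a small throwaway file so that the example is self-contained
    path = os.path.join(tmpdir, "example.txt")
    with open(path, "w") as outfile:
        outfile.write("This is a textfile.\nThere are 3 lines of text.\nThis is the last one.")

    stripped_lengths = []
    with open(path) as infile:
        for line in infile:
            text = line.rstrip("\n")  # drop the trailing newline, if there is one
            print(text)
            stripped_lengths.append(len(text))

print(stripped_lengths)
```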


In a similar way we can write the lengths to a file.  Note that to use the write method, we need a String and we need to concatenate "\n" so that we get a newline for each length.

In [2]:
outputfile="results.txt"
with open(outputfile,"w") as output: # opens outputfile in writing mode
    for item in lengths:
        output.write(str(item)+"\n")  
        

### Exercise 5
There are lots of interesting built-in methods for Strings.  Set somestring="My name is X\n" and then do the following:

1. somestring.split(" ")
2. somestring.lower()
3. somestring.upper()
4. somestring.strip()
5. somestring.startswith("My")
6. somestring.endswith("Adam")
7. "is" in somestring
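
For reference, the following sketch runs each of these in turn and prints the results using repr(), so that the newline character is visible.  Note that Python strings do not have a .contains() method, so the `in` operator is used for the last check.  The variable names holding the results are arbitrary:

```python
somestring = "My name is X\n"

words = somestring.split(" ")         # split on single spaces into a list of strings
lowered = somestring.lower()          # all letters converted to lower case
uppered = somestring.upper()          # all letters converted to upper case
stripped = somestring.strip()         # leading/trailing whitespace (including "\n") removed
starts = somestring.startswith("My")  # True or False
ends = somestring.endswith("Adam")    # True or False
has_is = "is" in somestring           # substring test using the in operator

for result in (words, lowered, uppered, stripped, starts, ends, has_is):
    print(repr(result))
```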

It's also possible to iterate over a String as a sequence of characters.  Can you write a function which takes a String and returns a dictionary which consists of the letters it contains and their frequencies?  For example, count_letters("evelyn") should return something like {'e':2,'v':1,'l':1,'y':1,'n':1}

In [52]:
def count_letters(astring):
    count_dict={}
    for letter in astring:
        count_dict[letter]=count_dict.get(letter,0)+1
    return count_dict

In [53]:
count_letters("evelyn")

{'e': 2, 'l': 1, 'n': 1, 'v': 1, 'y': 1}
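
Counting like this is common enough that the standard library provides a ready-made tool for it, collections.Counter.  The sketch below is an alternative to the solution above rather than a replacement for it; `letter_counts` is an illustrative name:

```python
from collections import Counter

# Counter behaves like a dictionary which maps each item to its frequency
letter_counts = Counter("evelyn")
print(letter_counts["e"])
print(letter_counts["z"])   # missing keys count as 0 rather than raising a KeyError
print(dict(letter_counts))
```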

That's all for this crash course in Python - however, it's just the tip of the iceberg when it comes to what you can do.  To improve your programming skills, keep practising!  Try things out and see what happens.  If it all goes horribly wrong, you can always restart the kernel :)  

There are also loads of good resources around - for example,
1. the official Python tutorial at https://docs.python.org/3/tutorial/
2. *Data Science from Scratch* (Joel Grus, 2015), Chapter 2 comprises a crash course in Python (although it does assume some programming experience ... which you now have)