<div class="clearfix" style="padding: 10px; padding-left: 0px">
<img src="unilogoblack.jpeg" width="250px" style="display: inline-block; margin-top: 5px;">
<a href="http://www.python.org"><img src="Python-logo-notext.svg" width="70px" class="pull-right" style="display: inline-block; margin: 0px;"></a>
</div>


# Basic Scientific Python

J. D. Nichols 2015


## Section 3: Container objects, or Sequences

Crunching numbers generally requires doing calculations on bunches of values, and it is often useful to be able to collect numbers together in a single container.  However, Python allows more than simple numbers to be bunched together; it possesses a variety of so-called *container objects* or *sequences* (sometimes also called *iterables* depending on the context) that allow different data types to be lumped together in different ways.  We have met one of these already: the `string` type above, in which we saw that we could access part of each string by following the variable name with some numbers enclosed within square brackets.  Other container objects are *lists*, *tuples*, and *dictionaries*. Let's take a look at this further...  

### Strings

Execute the following line:

In [21]:
polly = 'Pining for the fjords!'

We saw previously that we could access the first letter by typing 

```python
print polly[0]
```
Do this in the following cell and note the result.

In [2]:
print polly[0]
#print the first letter of string

P


This tells us that the first letter in this string is accessed using the value 0. What about the second letter?  This is accessed by typing 

```python
print polly[1]
```

...and so on.  We also saw that we could retrieve a range of letters from the string in one go by including a colon and a second number.  Execute the following cell:

In [3]:
print polly[0:3]
print polly[3:12]
print polly[3:]
print polly[:-1]
print polly[:-4]
print polly[3:10:2]
print polly[::-1]

Pin
ing for t
ing for the fjords!
Pining for the fjords
Pining for the fjo
igfr
!sdrojf eht rof gniniP


What's going on here?  The string can be thought of as being a list of characters, each with a number representing its place in the order, called its *index*, starting from zero, i.e.:

| Letter        |  P  |  i  |  n  |  i  |  n  |  g  |     |  f  |  o  |  r  |     |  t  |  h  |  e |    |  f |  j |  o |  r |  d |  s |  ! |
|---------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| Index         |  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  |  10 |  11 |  12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| Reverse index | -22 | -21 | -20 | -19 | -18 | -17 | -16 | -15 | -14 | -13 | -12 | -11 | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |



<div class="alert alert-warning" style="width: 60%; margin-left: 20%; margin-top:20px">
<p>**IMPORTANT!**</p>
<p>Indices in Python start at zero!  This is not the case in all languages!</p>
</div>

Each index can be accessed using the square bracket syntax, i.e. 

```python
print polly[0]
```

prints `P`, since this is the letter at the first index, 0. This is called *indexing* the sequence. A string containing a range of indices is returned using a colon to separate the start and end indices. This is called *slicing* the sequence.

<div class="alert alert-warning" style="width: 60%; margin-left: 20%; margin-top:20px">
<p>**IMPORTANT!**</p>
<p>A range [m:n] is **inclusive** of the value at index m and **exclusive** of the value at index n, i.e. the last index extracted is (n - 1)</p>
</div>

Hence, the range of `polly[0:3]` starts at the first letter, `'P'`, and finishes at index 3 - 1 = 2 (letter `'n'`). Likewise, `polly[3:12]` begins at index 3 (`'i'`) and finishes at index 12 - 1 = 11 (`'t'`).  If no second index is given, it is assumed that the range continues to the end of the sequence, and likewise if no first index is given, it is assumed that the range begins at the start.  Values at the end of the sequence can be accessed easily by counting down from the last value at index -1, such that `polly[:-4]` starts at the beginning, and finishes at index -4 - 1 = -5, i.e. the fifth letter from the end (`'o'`).  A second colon followed by a third number sets the step, such that e.g. a value of 2 means "take every second letter". Hence, `polly[3:10:2]` means "start at index 3 and take every other letter until you get to index 9", returning the values at indices 3, 5, 7, and 9 (`'igfr'`).  Finally, a negative step means count backward, so `polly[::-1]` provides a quick and convenient method of reversing a sequence.
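
The slicing rules described above can be checked directly. The following is a minimal sketch rather than part of the original exercises; it writes `print(...)` with a single argument in parentheses, which behaves the same way in Python 2 and Python 3.

```python
polly = 'Pining for the fjords!'

# A slice [m:n] contains n - m characters (when 0 <= m <= n <= len(polly))
m, n = 3, 12
print(len(polly[m:n]) == n - m)

# Because the end index is exclusive, two adjacent slices rejoin exactly
print(polly[:m] + polly[m:] == polly)

# Negative indices count from the end: -1 is the same place as len(polly) - 1
print(polly[-1] == polly[len(polly) - 1])

# A negative step walks backward, so [::-1] reverses the string
print(polly[::-1])
```

Each of the first three lines prints `True`, and the last prints the reversed string.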

#### Task 3.1
In the cell below, write and execute code which does the following:

* Take every character, from start to finish.
* Take every other character, starting at the first.
* Take every other character starting at the second.
* Take every 5th character, starting from the fourth and finishing with the third from the end.
* Starting from the fourth letter from the end, take every second letter backward, finishing with the third letter from the beginning.
* Print one string containing `'Knights who say '` followed by the third, fourth and last character of `polly`.
* Extract from `polly` the string `'ni! '`, similar to the above, and assign it to another identifier. Create a string with this repeated 200 times.


In [26]:
print polly[:]
print polly[0::2]
print polly[1::2]
print polly[3:-2:5]
print polly[-4:1:-2]
print "Knights who say " + polly[2:4] + polly[-1]
ni = polly[2:4] + polly[-1] + polly[6]
print ni*200
#any of the three places in [start:stop:step] can be left blank
#[i:] means from index i to the end
#[:i] means from the beginning up to, but not including, index i
#[::x] takes every x-th character; a negative x steps backward through the string
#when stepping backward, the start index must lie to the right of the stop index
#the stop index is excluded, so stopping at 1 makes index 2 the last character taken
#+ joins two strings together

Pining for the fjords!
Pnn o h jrs
iigfrtefod!
ioer
rj h o nn
Knights who say ni!
ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! ni! 


Uses of string manipulation in scientific coding include tasks such as building filepaths for data files or output plots, building labels and titles for plots, and working with date strings. 
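
As a rough illustration of the first two uses, the sketch below builds a file path and a plot title from smaller strings using only indexing, slicing, and `+`. The directory name, station name, and date are hypothetical values chosen purely for the example.

```python
# Hypothetical values, chosen purely for illustration
data_dir = 'data'
station = 'halley'
date = '2015-03-21'

# Pull the parts of the date string out by slicing
year = date[:4]
month = date[5:7]
day = date[8:]

# Build a file path and a plot title by joining strings with +
filepath = data_dir + '/' + station + '_' + year + month + day + '.txt'
title = station.title() + ' station, ' + day + '/' + month + '/' + year

print(filepath)
print(title)
```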

We have seen here some simple ways of manipulating strings using indexing and slicing.  There are, however, other ways of manipulating strings in a way unique to strings, using what are called their *methods*. These are pieces of code that come along with any Python object, and perform operations associated with it.  To find out what methods are associated with an object, in the interpreter (or in the following cell) type the name of the object followed by a full stop '.', and press `<TAB>`.  The list of methods will appear.  If you wish to find out details of one, type its name, followed by a `'?'`, and its docstring will appear.  


<div class="alert alert-info" style="width: 60%; margin-left: 20%; margin-top:20px">
<p>**HINT**</p>
<p>You can save some typing by using tab-completion: type enough letters to uniquely identify a method, press TAB and its name will complete automagically </p>
</div>

Press `<TAB>` in the following cell, complete some of the names of the methods and take a look at what they do:

In [48]:
polly.find("i")
#the search is case-sensitive: capitals are different to lowercase

1

You can see that there are a number of methods that operate on strings - the effects of some are (hopefully) obvious.  For example:

In [49]:
print polly
print polly.upper()
print polly.lower()
print polly.islower()
print polly.lower().islower()
print polly.title()
print polly.swapcase()

Pining for the fjords!
PINING FOR THE FJORDS!
pining for the fjords!
False
True
Pining For The Fjords!
pINING FOR THE FJORDS!


Some methods take *arguments* inside the brackets, that is some extra information that the code requires to do its job, e.g.:

In [50]:
print polly.count('f')
print polly.replace('f', 'g')
print polly.replace('or', 'um')

2
Pining gor the gjords!
Pining fum the fjumds!


A particularly useful method is `format`.  This lets you insert objects into the string, with a specified format.  For example:

In [56]:
instructions = \
"""{} shalt thou {} count, 
neither count thou {}, 
excepting that thou then proceed to {:.2f}.
{:0>8.4f} is right out.""".format(4, 'not', 2., 2.995, 5)
print instructions
#.format(x, y, z) replaces {} in a string with x, y, then z
#must be at least as many x, y, z's as {}

4 shalt thou not count, 
neither count thou 2.0, 
excepting that thou then proceed to 3.00.
005.0000 is right out.


It is apparent from the above that each object is being inserted into the string in the order (4, 'not', 2., 2.995, 5) at the locations denoted by the curly braces `{}`.  Optional format specifiers may be given inside the braces.  The format syntax is, to be honest, its own mini-language and, although details of its syntax can be found [here](https://docs.python.org/2/library/string.html#format-string-syntax), it's more typical to look up examples that look like what you want.  In our example, above, the first three objects (4, 'not', and 2.) are inserted with no format specifiers, so take the default for their type and value.  The latter two take format specifiers denoted by a colon followed by a specification: `{:.2f}` means "a fixed point number rounded to two decimal places", and `{:0>8.4f}` means "a fixed point number rounded to four decimal places, with a total width of eight characters (including the decimal point), with the number being padded with zeros at the start".
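
A few more specifiers are sketched below as a starting point for looking things up. The number and the strings are made up for the example, and the square brackets are only there to make the padding visible.

```python
value = 1234.5678   # hypothetical number to format

# Fixed point, two decimal places
print('{:.2f}'.format(value))
# Minimum width of 12 characters, right-aligned and padded with spaces
print('[{:>12.2f}]'.format(value))
# Same width, left-aligned
print('[{:<12.2f}]'.format(value))
# Scientific notation with three decimal places
print('{:.3e}'.format(value))
# Numbered fields allow one object to be inserted more than once
print('{0}, {0}, {1} and {0}'.format('spam', 'eggs'))
```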

#### Task 3.2

Create a `float` called `pi`, with the value 3.14159. Using the format method in each case, print:

* `"Pi = 3.14159"`
* `"Pi = 03.14"`
* `"Pi = 3.14 and 2Pi = 6.28"`
* `"Pi = 003"`


In [94]:
pi = 3.14159
print "Pi = {}".format(pi)
print "Pi = {:0>5.2f}".format(pi)
print "Pi = {:.2f} and 2Pi = {:.2f}".format(pi, 2*pi)
print "Pi = {:0>3.0f}".format(pi)
#{:0>y.xf}
#f means a fixed point number
#x is the number of decimal places to be printed
#y is the minimum total width in characters, including the decimal point
#> right-aligns the number, so the fill character (here 0) is added before it
#< left-aligns the number, so the fill character is added after it
#to do arithmetic on a number, do it inside the brackets of format

Pi = 3.14159
Pi = 03.14
Pi = 3.14 and 2Pi = 6.28
Pi = 003

### Lists

We have described the strings above as lists of characters, but the word *list* has a specific meaning in Python.  A list is a collection of objects in one container. Lists can contain any type of Python object, be it `int`, `string`, `float`, etc.  We have actually already met a list when we used some simple loops.  The object
```python 
range(10)
```
is a list of integers ranging from 0-9, as we saw.  Execute the following cell.

In [95]:
count = [1,2,5]
thingsdone = 'aquaduct sanitation roads irrigation medicine education health'
romans = thingsdone.split()
circus = [1., 'And now for something completely different', True]
print count
print romans
print circus
#showing that lists can be made of one or more objects of any type

[1, 2, 5]
['aquaduct', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health']
[1.0, 'And now for something completely different', True]


The identifier `count` points to a list of `int`s, `romans` points to a list of `string`s created by using the `split` method on the `string` `thingsdone`, and `circus` points to a list that holds a `float`, a `string`, and a `bool`.  As is evident, there is no requirement for the list to contain one type of object. This makes lists useful and versatile tools, but not particularly good for crunching large data sets. The overheads required for checking each object type mean that iterating over large Python lists is extremely slow compared to, say, iterating over arrays of numbers in C. This is where NumPy, which we'll meet later, comes in.

We have discussed above how to extract particular elements from a `string` using indexing and slicing. Lists behave in exactly the same way:
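
As a minimal sketch, using a made-up list `knights` that is not part of the exercises, indexing and slicing a list looks like this:

```python
# Hypothetical example list
knights = ['Arthur', 'Lancelot', 'Galahad', 'Robin', 'Bedevere']

print(knights[0])     # the first element
print(knights[-1])    # the last element
print(knights[1:3])   # a new list holding the elements at indices 1 and 2
print(knights[::2])   # a new list holding every other element
print(len(knights))   # the number of elements in the list
```

Note that indexing a list returns a single element, whereas slicing always returns another list.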

#### Task 3.3
Write code that will print the `romans` list in reverse order.  Then, create a new list containing the 1st, 2nd, and 3rd objects in the lists `romans`, `count`, and `circus`, respectively, assign it to an identifier of your choice and print it.  Now, create and print a list that contains the lists `romans`, `count`, and `circus`. That's right - lists can be lists of lists! (of lists of lists of lists of lists...)

In [122]:
print romans[::-1]
joint = [romans[0], count[1], circus[2]]
print joint
lstlst = [romans, count, circus]
print lstlst

#lists are indexed and sliced in the same way as strings

['health', 'education', 'medicine', 'irrigation', 'roads', 'sanitation', 'aquaduct']
['aquaduct', 2, True]
[['aquaduct', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health'], [1, 2, 5], [1.0, 'And now for something completely different', True]]


#### Task 3.4
Take the following string and:
* split it into a list using '/' as the delimiter (look at the docstring for `str.split`)
* take every 2nd element in reverse order
* join the list using a space between each word (look at the docstring for `str.join`)
* Capitalize the first letter
* print the result

You could in principle do this all in one line!

In [113]:
words = \
"""TIP/STEP/RUBBISH/ALTERNATE/A/EVERY/ON/
TURN/TANK/HALF/WATER/AERIAL/OLD/FORWARD/AN/
A/IN/ONLY/LIVE/SILLY/TO/VERY/USED/NOT/WE"""

In [198]:
lst_wrd = words.split("/")
print lst_wrd

bck_lst_wrd = lst_wrd[::-2]
print bck_lst_wrd

bck_str_wrd = " ".join(bck_lst_wrd)
print bck_str_wrd

sentence = bck_str_wrd.capitalize()
print sentence

print " ".join(words.split("/")[::-2]).capitalize()

#split("x") finds each x, removes it, and splits the string into a list at those points
#"x".join(y) inserts the string x between each element of the list y
#capitalize() makes the first letter uppercase and the rest lowercase
#methods can be chained one after another, but this may reduce readability

['TIP', 'STEP', 'RUBBISH', 'ALTERNATE', 'A', 'EVERY', 'ON', '\nTURN', 'TANK', 'HALF', 'WATER', 'AERIAL', 'OLD', 'FORWARD', 'AN', '\nA', 'IN', 'ONLY', 'LIVE', 'SILLY', 'TO', 'VERY', 'USED', 'NOT', 'WE']
['WE', 'USED', 'TO', 'LIVE', 'IN', 'AN', 'OLD', 'WATER', 'TANK', 'ON', 'A', 'RUBBISH', 'TIP']
WE USED TO LIVE IN AN OLD WATER TANK ON A RUBBISH TIP
We used to live in an old water tank on a rubbish tip
We used to live in an old water tank on a rubbish tip


A sequence can be tested to see whether it contains an object using the `in` keyword, and the index of a particular member can be obtained using the `index` method of a list:

In [None]:
romans = ['aquaduct', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health']
print 'sanitation' in romans
print 'semi-conductor lasers' in romans
print romans.index('irrigation')

An important property of lists is that they can be changed.  The technical term for this is that they are *mutable*.  This simply means that individual elements can be modified, and elements can be added or removed. This is done by assigning an object to that particular element, e.g. in the following task:  

#### Task 3.5
Execute the following cell and write down what is happening in each line.

In [201]:
romans = ['aquaduct', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health']
#assigning a list to the variable romans
circus = [1., 'And now for something completely different', True]
#assigning a list to the variable circus
print romans
#print the whole list romans
print 'medicine' in romans
#print whether it is true or false if medicine is in romans
romans[romans.index('aquaduct')] = 'medicine'
#search romans for aquaduct and return the index
#change the value of this index to the string medicine
print romans
#print the new romans list
romans[2:4] = ['education', 'health']
#change the values in index 2 - 3 inclusive to education and health respectively
print romans
#print new romans
print romans + circus
#combine the lists romans and circus then print the list

['aquaduct', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health']
True
['medicine', 'sanitation', 'roads', 'irrigation', 'medicine', 'education', 'health']
['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health']
['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health', 1.0, 'And now for something completely different', True]



#### Task 3.6

Write a `for` loop that iterates over `romans`, and prints a `string` that says `'The Romans have given us X'` where `X` is each element of `romans` in turn. (Remind yourself how we wrote `for` loops in Section 2 if necessary)

In [208]:
for item in romans:
    print "The Romans have given us " + item

#iterate directly over the elements of romans
#on each pass, item is the next element of the list
#print the string 'The Romans have given us ' followed by that element

The Romans have given us medicine
The Romans have given us sanitation
The Romans have given us education
The Romans have given us health
The Romans have given us medicine
The Romans have given us education
The Romans have given us health


Now take a look at the following:

In [209]:
print romans
romans.append('wine')
print romans
romans.extend(['public baths', 'peace'])
print romans
del romans[-1]
print romans
romans.insert(3, 'public order')
print romans
romans.remove('sanitation')
print romans

['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health']
['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health', 'wine']
['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health', 'wine', 'public baths', 'peace']
['medicine', 'sanitation', 'education', 'health', 'medicine', 'education', 'health', 'wine', 'public baths']
['medicine', 'sanitation', 'education', 'public order', 'health', 'medicine', 'education', 'health', 'wine', 'public baths']
['medicine', 'education', 'public order', 'health', 'medicine', 'education', 'health', 'wine', 'public baths']


Notice again our use of some methods of lists in the above cell, in this case appending individual items or lists, or inserting and removing elements at certain indices. 
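
One point worth checking for yourself is that these methods change the list in place rather than handing back a new one. The sketch below uses a made-up list `spam` and contrasts `append` with the `+` operator, which does build a new list.

```python
spam = ['egg', 'bacon']        # hypothetical example list

result = spam.append('spam')   # modifies spam in place...
print(result)                  # ...and returns None
print(spam)

# By contrast, + builds a new list and leaves the original untouched
longer = spam + ['sausage']
print(spam)
print(longer)
```

A consequence is that writing `spam = spam.append('spam')` would replace the list with `None`, which is rarely what is wanted.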

### Tuples

Tuples are similar to lists in most ways, but the key difference is that they are *immutable*.  What does this mean? Simply that, unlike lists, they cannot be changed once created.  Strings are also immutable.  Tuples are denoted by round parentheses (), rather than square brackets, and in some cases we don't need the brackets at all.

#### Task 3.7
Create a tuple using round brackets and see what happens when you attempt to modify one of the elements. 

In [12]:
tup = (1,2.0,"graham")
tup[0] = "bob"
print tup
#you get an error saying that you cannot change what is in a tuple

TypeError: 'tuple' object does not support item assignment

Tuples are useful for assigning scalar values to multiple identifiers in one go, and for passing parameters and results between functions.  For example, consider the following:

In [211]:
(x,y,z) = 1,2.,'Caerbannog'
print x
print y
print z

1
2.0
Caerbannog


Here, we could have omitted the brackets completely:

In [210]:
x,y,z = 1,2.,'Caerbannog'
print x
print y
print z

1
2.0
Caerbannog


The tuple is being "unpacked" for us into the individual variables.  An advantage of this construct is that we can easily swap identifiers.  So instead of the usual:

In [212]:
print x, y
temp = x
x = y
y = temp
print x, y

1 2.0
2.0 1


We can simply write:

In [213]:
print x, y
y, x = x, y
print x, y

2.0 1
1 2.0


One further thing to note is that it is the comma, rather than the parentheses, that creates a tuple. If we need to create a tuple with only one element we therefore keep a trailing comma; without it, `(1)` is simply the integer `1` wrapped in parentheses:

In [214]:
a = (1)
print a
a = (1,)
print a

1
(1,)
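
Returning to the earlier point about passing results between functions, the sketch below shows a function handing back two values as a tuple, which is then unpacked by the caller. The function name `min_and_max` and the data are hypothetical, chosen only for illustration.

```python
def min_and_max(values):
    """Return the smallest and largest of values as a tuple."""
    return min(values), max(values)   # the comma builds a tuple

readings = [3.2, 1.5, 4.8, 2.9]       # hypothetical data

# The returned tuple is unpacked into two identifiers in one go
lowest, highest = min_and_max(readings)
print(lowest)
print(highest)
```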


### Dictionaries

Dictionaries, or `dict`s, are surprisingly powerful tools, not only for organising data, but also for conditional execution of code. They are like lists, except that each element is accessed not by an index, but by referring to its *key*, which is some other object associated with that element. Consider the following examples:

In [216]:
questions = {'favourite colour': 'red', 'quest': 'holy grail', 
             'v_swallow0': ['african?', 'european?']}
print questions['quest']
print questions['v_swallow0']

holy grail
['african?', 'european?']


The `dict` `questions` is being accessed via its keys, in this case the `string` `'quest'`, which then returns `'holy grail'`, and `'v_swallow0'`, which returns the list `['african?', 'european?']`.

<div class="alert alert-success" style="width: 60%; margin-left: 20%; margin-top:20px">
<p>**INFO**</p>
<p>Although not enforced by Python, it is conventional to keep lines shorter than 80 characters in length. This can be done by splitting a line with a '\' character, or if you are inside a pair of brackets, simply continuing on the next line with a carriage return. It is also conventional to start continued lines aligned with the start of the code in the brackets. Many modern text editors will do this for you.</p>
</div>

There are a few other ways to create `dict`s.  Here is another one:

In [None]:
argument = dict(one = 5, two = 'the full half hour')
print argument['one']
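
Like lists, `dict`s can be changed after they are created. The following is a minimal sketch of a few common operations on the `argument` dictionary; the extra keys and values are made up for the example.

```python
argument = dict(one=5, two='the full half hour')

# Add a new key, or overwrite an existing one, by assignment
argument['three'] = 'abuse'
argument['one'] = 10

# Test for a key with `in`, and loop over the keys in sorted order
print('two' in argument)
for key in sorted(argument):
    print(key + ': ' + str(argument[key]))

# get() returns a default value instead of raising a KeyError
print(argument.get('four', 'no such key'))
```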

#### Task 3.8

Create dictionaries containing three pieces of data of your choice from [this website](http://nssdc.gsfc.nasa.gov/planetary/factsheet/) for the planets Mercury, Earth, and Jupiter. Demonstrate how you access the data using the dictionary's keys.


In [29]:
jupiter = {"mass":1898, "diameter": 142984, "density":1326}
earth = {"mass":5.97, "diameter":12756, "density":5514}
mercury = {"mass":0.330, "diameter":4879,"density":5427}
planets = {"jupiter":jupiter, "earth":earth, "mercury":mercury}

mercury = dict(mass = 0.330, diameter = 4879, density = 5427)
planets = dict(jupiter = jupiter, earth = earth, mercury = mercury)
print planets["mercury"]["diameter"]

#dictionaries can be nested, and accessed by writing one square bracket after another
#the different ways of creating dictionaries produce the same thing and can be used interchangeably,
#but it is clearer to stick to one of them, as their syntax differs

4879


Keys do not have to be `string`s - in fact, they can be any immutable (strictly, hashable) object, such as a number, a `bool`, or a tuple, although not a list.  In the next example, the keys are `bool`s, which then creates a very convenient way of executing code conditionally:

In [237]:
a = 3
b = 4
result = {True: 'a is greater than b', False: 'a is less than or equal to b'}
print result[a > b]
a = 5
b = 4
print result[a > b]

a is less than or equal to b
a is greater than b
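
The same idea extends from choosing a string to choosing which piece of code to run, because the values of a `dict` can be functions. The sketch below is one possible illustration; the function names and the `'C'`/`'K'` keys are assumptions made for the example.

```python
def celsius_to_kelvin(t):
    return t + 273.15

def kelvin_to_celsius(t):
    return t - 273.15

# Hypothetical lookup table: the key selects which function is called
convert = {'C': celsius_to_kelvin, 'K': kelvin_to_celsius}

unit = 'C'
temperature = 20.0
print(convert[unit](temperature))
```

Here `convert[unit]` looks up a function and the following brackets call it, so no `if` statement is needed.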


#### Task 3.9

Modify the code you wrote for Task 2.5 to make use of a `dict` to print `'Messiah'` if y is both even and greater than or equal to 10, or `'Very naughty boy'` otherwise. 

In [32]:
monty = {True: "Messiah", False: "Very naughty boy"}
for i in range(21):
    print monty[(i >= 10) and (i % 2 == 0)]

#the condition evaluates to True or False, which is then used as the key
#{} creates an empty dictionary
#monty['mynewkey'] = 'mynewvalue' adds a new key-value pair to an existing dictionary

Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Very naughty boy
Messiah
Very naughty boy
Messiah
Very naughty boy
Messiah
Very naughty boy
Messiah
Very naughty boy
Messiah
Very naughty boy
Messiah


<div class="alert alert-danger" style="width: 60%; margin-left: 20%; margin-top:20px">
<p>**Checkpoint: Please have the above marked and signed off by a demonstrator before continuing**</p>
</div>