# Chapter 5: Lists

*Data Processing with Python, a course for Communication and Information Sciences*

<a href=mailto:s.wubben@tilburguniversity.edu>s.wubben@tilburguniversity.edu</a>

-------------------------------

## Variables and collections

As we have learned before, a variable holds a single value. When we put a new value in the variable, the old value is overwritten. 

In this chapter we will have a first look at collections. A collection allows us to put many values in a single 'box'.
A collection is nice because we can carry lots of values around in one convenient package. It is very straightforward to store a series of items as a collection. Not surprisingly, we call this a list.


In [None]:
friends = [ 'John', 'Bob', 'Mary' ]
stuff_to_pack = [ 'socks', 'shirt', 'toothbrush' ]

* Lists are surrounded by square brackets and the elements in the list are separated by commas
* A list element can be any Python object - even another list
* A list can be empty


In [None]:
#list of integers
print( [1, 24, 76] )

#list of strings
print( ['red', 'yellow', 'blue'] )

#mixed list
print( ['red', 24, 98.6] )

#list with a list included
print( [ 1, [5, 6], 7] )

#empty list
print( [] )


## Looking inside lists

Python has a range of functions that operate on lists. We can easily get some simple calculations done with these functions:

In [None]:
nums = [3, 41, 12, 9, 74, 15]
print (len(nums))
print (max(nums))
print (min(nums))
print (sum(nums))
print (sum(nums)/len(nums))


Remember the example below? 

In [None]:
for i in [5, 4, 3, 2, 1] :
    print(i)
print('Blastoff!')

Lists and definite loops go really well together. That's why we sneakily used lists earlier. With a loop we can easily access all the elements in a list, one element at a time! 


Lists and loops are best friends!

In [None]:
friends = [ 'John', 'Bob', 'Mary' ]

for buddy in friends:
    print("Happy Halloween,", buddy)


Every item in the list has its own index number. Do you remember how Python is similar to an elevator? We start counting at 0! The index for our friends list is as follows:


John|Bob|Mary
---|---|---
0|1|2
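
As a small sketch of how these index numbers are used, the cell below reads individual elements with square brackets. The negative index is an extra that is not in the table: with a negative number, Python counts backwards from the end of the list. The variable names are only for illustration.

```python
friends = ['John', 'Bob', 'Mary']

first_friend = friends[0]    # index 0 is the first element
third_friend = friends[2]    # index 2 is the third element
last_friend = friends[-1]    # a negative index counts from the end

print(first_friend)
print(third_friend)
print(last_friend)
```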

### Mutability

Lists are mutable, which means that you can change the contents of a list, just as you can change the content of a variable. We can easily replace one friend with a new one:

In [None]:
friends = [ 'John', 'Bob', 'Mary' ]
print(friends[1])
friends[1] = 'Billy'
print (friends)

## Strings
A nice feature of Python is that we can read strings just like we do with lists. There is one important difference however: strings are immutable. This means we cannot change a part of a string like we can change an element of a list.

In [None]:
hero = 'batman'
#print first character
print(hero[0])
#but we can't change it (this line raises a TypeError):
hero[0] = 'c'
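
Running the cell above stops with a `TypeError`. The sketch below catches that error and shows the usual alternative: build a new string instead of changing the old one. Note that `try`/`except` is used here only to keep the cell from stopping, and the names `new_hero` and `error_message` are made up for this example.

```python
hero = 'batman'

try:
    hero[0] = 'c'            # strings do not support item assignment
except TypeError as error:
    error_message = str(error)
    print("Not allowed:", error_message)

# Build a new string from a new first character and the rest of the old one
new_hero = 'c' + hero[1:]
print(new_hero)
```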

In [None]:
sentence = "Python's name is derived from the television series Monty Python's Flying Circus."

Words are made up of characters, and so are strings in Python, like the string stored in the variable `sentence` in the block above. For the sentence above, it might seem more natural for humans to describe it as a series of words, rather than as a series of characters. Say we want to access the first word in our sentence. If we type in:

In [None]:
first_word = sentence[0]
print(first_word)

Python only prints the first *character* of our sentence. (Think about this if you do not understand why.) We can transform our sentence into a `list` of words (represented by strings) using the `split()` function as follows: 

In [None]:
words = sentence.split()
print(words)

Make sure that you understand the syntax of this code! Here, we apply the `split()` function to the variable `sentence` and we assign the result of the function (we call this the 'return value' of the function) to the new variable `words`.

By default, the `split()` function splits a string on whitespace (such as the spaces between consecutive words) and returns a list of words. However, we can pass an argument to `split()` that specifies explicitly the string we would like to split on. In the code block below, we will split a string on commas, instead of spaces. Do you get the syntax?

In [None]:
fruitstring = "banana,pear,apple"
fruitlist = fruitstring.split(",")
print(fruitlist)
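
The difference between calling `split()` without an argument and with an explicit separator shows up when a string contains several spaces in a row. A minimal sketch, with a made-up example string:

```python
spaced = "red  green   blue"      # note the repeated spaces

# Without an argument, split() treats any run of whitespace as one separator
default_split = spaced.split()
print(default_split)

# With an explicit " " argument, every single space is a separator,
# so the gaps produce empty strings in the result
space_split = spaced.split(" ")
print(space_split)
```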

The reverse of the `split()` function can be accomplished with `join()`. It turns a list of strings into a single string, placing a 'delimiter' (the string you want to use to join the items) between them.

In [None]:
fruitlist = ['banana', 'pear', 'apple']
delimiter = ","
fruitstring = delimiter.join(fruitlist)
print(fruitstring)

**Exercise:** The above four lines can be accomplished in a single line of code. Can you figure out how? (Tip: replace all variables by their values)

In [None]:
# insert your oneliner here!

### ``replace()``

The `replace()` function is another function which can be called on a string. It will replace all occurrences of a specified substring with another string. Consider the lines in the code block below - and mind the order in which you pass the arguments to the function!



In [None]:
text = "You can not compare apples and pears"
text = text.replace("pears", "apples")
text = text.replace("not ", "")
print(text)
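
Because strings are immutable, `replace()` does not change the original string; it returns a new one. That is why the cell above assigns the result back to `text`. A small sketch, where `original` and `changed` are illustrative names:

```python
original = "You can not compare apples and pears"

changed = original.replace("pears", "apples")

print(original)   # the original string is untouched
print(changed)    # the new string contains the replacement

# replace() changes every occurrence, not just the first one
print("banana".replace("a", "o"))
```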

Python has functions for changing the case of a string. Two common ones are `lower()`, which returns a lowercased version of a string, and `upper()`, which returns an uppercased version:

In [None]:
my_string = "AllCaps"
print(my_string)
my_string_upper = my_string.upper()
print(my_string_upper)
my_string_lower = my_string.lower()
print(my_string_lower)
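
One common use of `lower()` is comparing strings without regard to case. A minimal sketch with made-up values:

```python
typed_name = "PYTHON"
stored_name = "Python"

# A direct comparison is case-sensitive
exact_match = typed_name == stored_name
print(exact_match)

# Lowercasing both sides first ignores differences in case
loose_match = typed_name.lower() == stored_name.lower()
print(loose_match)
```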

**Exercise:**
-  Can you come up with your own sentence `my_sentence` and split it into words along spaces? Print the new list of words.
-  You can recognize functions because they are always followed by (round brackets). Apart from the `split()` function, we already encountered other functions, also in the previous chapter. Which ones? Can you describe their functionality? Are there any differences in terms of syntax when you compare these to `split()`?
-  Is there a difference in length between the variables `sentence` and `words`? (Use functions to find this out!)

In [None]:
# your code goes here...

In many ways, list variables are very similar to strings. We can for example access their components using indexes and we can use slice indexes to access parts of the list. Let's try this out.

Write a small program that defines a variable `first_word` and assign to it the first word of our word list `words` from above. Do the same for the fifth word, the last word and the last but one word. Also, try to extract a slice from `words` and isolate the string of words between `derived` and `Flying` (the slice should not include `derived` and `Flying`). Also, make a slice of words that is identical to the title of the television series in `words`.   

In [None]:
# insert your code here

## List operations

A `list` acts like some kind of container where we can store all kinds of information. We can access a list using indexes and slices. We can also add new items to a list. For that you use the function `append()`. Let's see how that works. 

### ``append()``
Say we want to keep a list of all our good reads. We first declare an empty list using square brackets. Next, we add some good books to the list:

In [None]:
# start with an empty list
good_reads = []
good_reads.append("The Hunger games")
print(good_reads)
good_reads.append("A Clockwork Orange")
print(good_reads)

Do you get the syntax that goes with the ``append()`` function? The list we wish to append the item to goes first, and we attach the ``append()`` function to this list using a dot (`.`). Between the round brackets that go with the function name, we place the actual string that we wish to add to the list. We call such an input value an 'argument' or a 'parameter' that we 'pass' to a function. A function can also hand a result back to us; we call that its 'return value'. Make sure that you are familiar with this terminology because you will often come across such terms when you look for help online!
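
One detail worth checking for yourself: `append()` changes the list in place, and its return value is the special value `None`. The sketch below, with the illustrative names `shopping` and `result`, shows why you should not assign the result of `append()` back to your list.

```python
shopping = ["milk"]

result = shopping.append("bread")   # the list itself is changed

print(shopping)   # the new item is in the list
print(result)     # append() returns None, not the list
```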

Now, if for some reason we don't like a particular book anymore, we can change it as follows using the old item's index:

In [None]:
good_reads[0] = "Pride and Prejudice"
print(good_reads)

As you see, it is no problem to reset or update an individual item in a list. This is different, however, for strings: trying to assign a new character to a position in a string (for example `name[3] = "t"`) raises an error, which is your computer signalling that something is wrong. This is because strings (and some other types) are *immutable*. That means that they cannot be changed using an index, as opposed to `list`s, which *are* mutable. The following code shows a way around this: we turn the string into a list of characters, change one character in that list, and join the characters back into a string.

In [None]:
name = "Bonny"
print(name)
list_chars = list(name)
print(list_chars)
list_chars[3] = "t"
print(list_chars)
delimiter = ""
print(delimiter.join(list_chars))

**Exercise:** Here's another small exercise! Add two new titles to the list of `good_reads`. Then, try to change the title of the second book in our good reads collection:

In [None]:
# insert your code here

Lists are a really powerful way of dealing with your data in Python. Let's explore some other ways in which we can manipulate lists.

### ``remove()``

Let's assume our good read collection has grown a lot and we would like to remove some of the books from the list. Python provides the function `remove()` that you can call on a list and which takes as argument the item we would like to remove. 

In [None]:
good_reads = ["The Hunger games", "A Clockwork Orange", 
             "Pride and Prejudice", "Water for Elephants", "Illias"]
print(good_reads)
good_reads.remove("Water for Elephants")
print(good_reads)

If we try to remove a book that is not in our collection, Python raises an error to signal that something is wrong.

In [None]:
good_reads.remove("White Oleander")
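
To avoid this error, you can first check whether the item is in the list with the `in` operator. A minimal sketch, assuming a shortened version of the collection; the variable `book` is made up for this example:

```python
good_reads = ["The Hunger games", "A Clockwork Orange", "Pride and Prejudice"]
book = "White Oleander"

if book in good_reads:
    good_reads.remove(book)
    print("Removed:", book)
else:
    print(book, "is not in the list, so nothing was removed")

print(good_reads)
```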

Note, however, that `remove()` will only delete the *first* item in the list that is identical to the argument which you passed to the function. Execute the code in the block below and you will see that only the first instance of "Pride and Prejudice" gets deleted.

In [None]:
good_reads = ["The Hunger games", "A Clockwork Orange", 
             "Pride and Prejudice", "Water for Elephants", "Pride and Prejudice"]
good_reads.remove("Pride and Prejudice")
print(good_reads)

Just as with strings, we can concatenate two lists using the `+` operator. Here is an example:

In [None]:
# first we specify two lists of strings:
good_reads = ["The Hunger games", "A Clockwork Orange", 
              "Pride and Prejudice", "Water for Elephants",
              "The Shadow of the Wind", "Bel Canto"]
bad_reads = ["Fifty Shades of Grey", "Twilight"]

# then we combine them
all_reads = good_reads + bad_reads
print(all_reads)

### ``sort()``

It is always nice to organise your bookshelf. We can sort our collection alphabetically with the following expression:

In [None]:
good_reads.sort()
print(good_reads)
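
`sort()` reorders the list itself. If you want to keep the original order as well, the built-in function `sorted()` returns a new sorted list and leaves the original alone. A small sketch with a shortened book list (`books` and `ordered` are illustrative names); the last line shows that uppercase letters sort before lowercase ones:

```python
books = ["Twilight", "Bel Canto", "A Clockwork Orange"]

# sorted() returns a new list; books keeps its original order
ordered = sorted(books)
print(ordered)
print(books)

# sort() changes the list in place
books.sort()
print(books)

# Sorting is case-sensitive: uppercase letters come before lowercase ones
print(sorted(["pear", "apple", "Banana"]))
```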

### An overview of list operations

There are many more operations which we can perform on lists. Here is an overview of some of them.

In [None]:
#define some lists and variables
a = [1,2,3]
b = 4
c = [5,6,7]
x = 1
i = 1           # position used by insert() and pop() below

#do some operations 
a.append(b)     # Add item b to the end of a
a.extend(c)     # Add the elements of list c at the end of a
a.insert(i,b)   # Insert item b at position i
a.pop(i)        # Remove from a the i'th element and return it. If i is not specified, remove the last element
a.index(x)      # Return the index of the first element of a with value x. Error if it does not exist
a.count(x)      # Return how often value x is found in a
a.remove(x)     # Remove from a the first element with value x. Error if it does not exist
a.sort()        # Sort the elements of list a
a.reverse()     # Reverses list a (no return value!)

print (a)

### Slicing lists

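You can access subsections of a list using slices. A slice `a[start:end]` contains the items from index `start` up to, but not including, index `end`. If you leave out `start`, the slice begins at the first item; if you leave out `end`, it runs to the end of the list.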





In [None]:
a = ['red','green','yellow','orange','blue','pink']
print(a)
b = a[1:3]
c = a[:3]
d = a[3:]
print(b, c, d)
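
Slices also accept negative indexes and an optional third number, the step. A minimal sketch with the same colours; the variable names are illustrative:

```python
colours = ['red', 'green', 'yellow', 'orange', 'blue', 'pink']

middle = colours[1:3]       # indexes 1 and 2; index 3 is not included
last_two = colours[-2:]     # from the second-to-last element to the end
every_other = colours[::2]  # the whole list in steps of 2

print(middle)
print(last_two)
print(every_other)

# Slicing creates a new list; the original is unchanged
print(colours)
```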

## Nested lists

A list can contain all kinds of data types, such as integers and even lists! Do you understand what is happening in the following example? Have a close look at the square brackets used.

In [None]:
nested_list = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(nested_list[0])
print(nested_list[0][0])
print(nested_list[1][2])
print(nested_list[0][:-2])

We can traverse a multidimensional list by using nested loops:

In [None]:
nested_list = [['dragon','troll','ogre'],['elf','dwarf','hobbit']]

for l in nested_list:
    for i in l:
        print (i)
    print()

We can put nested lists to use to enhance our good read collection with a score for every book we have. An entry in our collection will consist of the title of our book and a score between `1` and `10`. The first element is the title and the second is the score: `[title, score]`. We initialize an empty list:

In [None]:
good_reads = []

And add two books to it:

In [None]:
good_reads.append(["Pride and Prejudice", 8])
good_reads.append(["A Clockwork Orange", 9])
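
Once the collection is filled, a loop can go through the entries one by one. A minimal sketch, using the two books added above; `entry`, `total` and `average` are illustrative names:

```python
good_reads = [["Pride and Prejudice", 8], ["A Clockwork Orange", 9]]

total = 0
for entry in good_reads:
    title = entry[0]    # first element of the inner list
    score = entry[1]    # second element of the inner list
    print(title, "scored", score)
    total = total + score

average = total / len(good_reads)
print("Average score:", average)
```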

## Tuples
A data structure similar to a list is a tuple. A tuple can be used to group any number of items into a single compound value. We have already used tuples to retrieve multiple values from a function.
Tuples consist of comma separated values and are normally enclosed in round brackets ( ).

In [None]:
year_born = ("Paris Hilton", 1981)
print (year_born)

We can access the individual items in a tuple but, unlike lists, tuples are immutable!

In [None]:
#works fine
print (year_born[1])


In [None]:
#produces error
year_born[0] ="Natalie Portman"
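
Since a tuple cannot be changed, the way to 'update' one is to build a new tuple and assign it to the variable. A minimal sketch, reusing the example above:

```python
year_born = ("Paris Hilton", 1981)

# Build a new tuple that reuses the year from the old one
year_born = ("Natalie Portman", year_born[1])
print(year_born)
```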

We can pack and unpack variables into tuples:

In [None]:
b = ("Bob", 19, "CS")
(name, age, studies) = b  
print (name)

In [None]:
a = 1
b = 0
(a,b) = (b,a)
print (a,b)
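
Returning several values from a function relies on the same packing and unpacking. A minimal sketch; the function `smallest_and_largest` is made up for this example:

```python
def smallest_and_largest(numbers):
    # The two values are packed into one tuple and returned together
    return min(numbers), max(numbers)

result = smallest_and_largest([3, 41, 12, 9, 74, 15])
print(result)

# Unpack the tuple into two separate variables
low, high = result
print(low, high)
```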

### What we have learnt

To finish this section, here is an overview of the new concepts you have learnt. Go through them and make sure you understand them all.

-  lists
-  nested lists
-  *mutable* versus *immutable*
-  string operations
-  tuples
-  `.split()` vs. `.join()`
-  `.append()`
-  `.remove()`
-  `.sort()`
-  `.upper()` vs. `.lower()`


-------------

## Exercises

- **Exercise 1:** Update the `good_reads` collection with some of your own books and give them all a score and a publication year by nesting lists. Can you print out the score you gave to the first book in the list? And the publication year of the third item in your list? (Hint: you can pile up indexes using square brackets!)

In [None]:
# insert your code here

- **Exercise 2:** Consider the following strings `sentence1 = "Bert and Ernie are good friends"` and `sentence2 = "Bonny and Clyde are famous criminals"`. Split these strings into words and create the following strings via list manipulation: `sentence3 = "Bert and Ernie are famous criminals"` and `sentence4="Bonny+and+Clyde+are+good+friends"` (mind the plus signs!).

In [None]:
# sentences

- **Exercise 3:**

In [None]:
text = "It is often still possible to understand text even if all vowels are removed"
# insert your code here.. I suppose it's obvious what we want you to do ;-)

-----------------------------------------------------------------

You've reached the end of Chapter 5! 