## Data Structures

Python includes built-in variable types for a number of fundamental data structures, including lists, tuples, sets, and dictionaries (maps).

### Data Structures: Lists

A *list* is an ordered collection of other variables. These variables can have different types. List definitions are enclosed within square brackets [ and ].

In [None]:
mylist = []                         # an empty list
numbers = [12, 108, 21]             # a list of 3 integers
somedata = ["text", 7, 0.34, True]  # a list containing 4 different variables of different types
print(mylist)
print(numbers)
print(somedata)

In [None]:
full_list = [9, 12, 23, 18, 21]
print(full_list)

Values in a list are accessed by specifying the *index* in square brackets - i.e. the position of the value in the list. Note: We always count from 0 in Python, so the first value in a list has index 0.

In [None]:
values = [34, 9, 12, 34]
values[0]  # the first value in the list

We can count from the end of the list backwards by using negative index values. Index -1 is the last value, index -2 is the second last value, and so on.

In [None]:
values[-2]

If we try to access a value at an index that is beyond the length of the list, we will get an error message.

In [None]:
values[50]

**Slicing**: Lists can also be *sliced* to access subsets of that list. The notation is [i:j], where *i* is the start index inclusive and *j* is the end index exclusive. Remember that we always count from index 0.

In [None]:
fulllist = [9, 12, 23, 18, 21]
fulllist[0:2] # start at 1st item, end before 3rd item

In [None]:
fulllist[0:3]  # first three items

When slicing, the default for *i* is 0 and the default for *j* is the end of the list.

In [None]:
fulllist[1:] # all items from the 2nd one onwards

In [None]:
fulllist[:4] # start at 1st item, end before 5th item
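
As a quick check of the slicing defaults described above, the following minimal sketch compares slices that omit *i* or *j* with their explicit equivalents. The list name `items` is only illustrative and does not appear elsewhere in this notebook.

```python
items = [9, 12, 23, 18, 21]

# Omitting i starts at index 0; omitting j runs to the end of the list
first_two = items[:2]      # same as items[0:2]
from_third = items[2:]     # same as items[2:len(items)]
everything = items[:]      # omitting both gives a copy of the whole list

print(first_two)
print(from_third)
print(everything)
```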

**Modifying lists**: Values in a list can be changed after the list is created by specifying the index and performing assignment.

In [None]:
values = [34, 9, 12, 34]
values[2] = 5000
print(values)

If we try to assign a value to an index that is beyond the length of the list, we will get an error message.

In [None]:
values[99] = 343

Instead, we can add a value to the end of a list by calling the *append()* method on the list:

In [None]:
values.append("extra")
print(values)

We can also concatenate two or more lists together using the plus + operator:

In [None]:
values = values + [11, 27]
print(values)

In [None]:
["A","B"] + ["Y","Z"]
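
The difference between appending and concatenating is easy to mix up, so here is a small sketch of the two side by side. The names `base` and `combined` are purely illustrative.

```python
base = [1, 2]

# Concatenation builds a new list and leaves the original unchanged
combined = base + [3, 4]
print(base)
print(combined)

# append() modifies the list in place and adds its argument as ONE item,
# so appending a list produces a nested list
base.append([3, 4])
print(base)
```

If the goal is to add several values to the end of an existing list, concatenation (as above) is usually what is wanted, rather than appending a list.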

**Membership operators**: The special 'in' keyword can be used to test if a value is contained in a list.

In [None]:
mylist = [3,6,9,12]

In [None]:
27 in mylist

In [None]:
9 in mylist

The logical 'not in' operator can be used to test if a value is missing from a list.

In [None]:
3 not in mylist

**Related functions:** A variety of built-in functions can be used with lists.

We can check the length of a list using the built-in *len()* function:

In [None]:
print(values)
print(len(values))

We can sort the items in a list by calling the *sort()* method on the list. Note that this sorts the list "in place" - i.e. the list itself is modified, rather than copied.

In [None]:
letters = ["b","d","a","c", "B"]
print(letters)
letters.sort()
print(letters)

In [None]:
numbers = [100, 65, 23, 87, 34]
print(numbers)
numbers.sort(reverse = True)
print(numbers)
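
To contrast with in-place sorting, the sketch below uses the built-in *sorted()* function, which returns a new sorted list and leaves the original untouched. The name `ordered_copy` is illustrative. Note also that uppercase letters sort before lowercase letters when comparing strings.

```python
letters = ["b", "d", "a", "c", "B"]

# sorted() returns a new list; the original list keeps its order
ordered_copy = sorted(letters)
print(letters)
print(ordered_copy)

# sort() changes the list itself and returns None
result = letters.sort()
print(letters)
print(result)
```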

**Nesting**: Lists can also be contained within other lists, which allows the construction of hierarchical data structures.

In [None]:
child1 = [12, 108, 23]
child2 = [99, 4]
child3 = ["a","b","c"]
parent = [ child1, child2, child3 ]
print(parent)

Values in nested lists can be accessed using multiple indexes in square brackets:

In [None]:
parent[0][2]  # third value of the first child list

In [None]:
parent[2][1]

### Data Structures: Tuples

Tuples are like lists but are "immutable" - this means that once they are created, they cannot be modified. Tuples are created using parentheses ( and ).

In [None]:
suits = ("hearts", "diamonds", "spades", "clubs")
suits

Values in tuples are also accessed using the same square bracket index notation that we saw for lists.

In [None]:
suits[0]

However, unlike lists, we cannot modify the tuple once it has been defined. If we try to assign a new value to an index in the tuple, we will get an error message.

In [None]:
suits[0] = "cups"

Like lists, different types of variables can be contained within the same tuple.

In [None]:
t = (123, True, "UCD", 123.23)
t

Tuples and lists are very similar, but in general we will use lists rather than tuples. Here are some reasons for using a tuple rather than a list:
* **Tuples can be slightly faster and more memory-efficient than lists.** If you're defining a constant set of values and all you're ever going to do with it is iterate through it, use a tuple instead of a list.
* **It makes your code safer if you “write-protect” data that does not need to be changed.** Using a tuple instead of a list is like having an implied assert statement that this data is constant, and that special thought (and a specific function) is required to override that.
* **Some tuples can be used as dictionary keys (specifically, tuples that contain immutable values like strings, numbers, and other tuples).** Lists can never be used as dictionary keys, because lists are not immutable. 
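
The last point can be illustrated with a short sketch. The dictionary `grid` and its contents are hypothetical, chosen only to show a tuple used as a key and a list being rejected.

```python
# Hypothetical lookup table keyed by (row, column) positions
grid = {(0, 0): "start", (2, 3): "finish"}
print(grid[(2, 3)])

# A list cannot be used as a key because it is mutable (unhashable),
# so this assignment raises a TypeError
try:
    grid[[1, 1]] = "middle"
except TypeError as err:
    print("TypeError:", err)
```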

### Data Structures: Sets

Sets are unordered collections which contain no duplicate values. They can be written directly using curly brackets { and }, or created from lists, strings or any other iterable value using the *set()* function. Sets do not have an order, so we cannot index into them by position; trying to do so gives an error message.

In [None]:
myset = {4, 6, 7, 8, 1, 78, 12}
print(myset)

In [None]:
myset[4]

We often convert lists to sets - this can be an easy way to find the unique values in a list.

In [None]:
mylist = [1,3,1,4,3,8,6,1,4,4]
print(mylist)
s = set(mylist)
print(s)

We can convert a *set* back to a *list* by calling the built-in *list()* function:

In [None]:
s_l = list(s)
s_l[3]

**Note:** Sets are unordered, so the order in which elements are printed is not guaranteed and should not be relied upon. Use the built-in *sorted()* function if a sorted list of the elements is needed.

In [None]:
set("abcddabcdaacbcc")
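
Since a set has no defined order, one way to get a predictable ordering is to pass it to the built-in *sorted()* function, as in this minimal sketch. The names `unique_letters` and `ordered` are illustrative.

```python
unique_letters = set("abcddabcdaacbcc")

# sorted() accepts any iterable and always returns a new sorted list
ordered = sorted(unique_letters)
print(ordered)
```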

The 'in' membership operator also works on sets:

In [None]:
names = {'Bill','Lisa','Ted'}

In [None]:
'Bill' in names

In [None]:
'Sharon' in names

We can calculate unions, intersections and differences between pairs of sets.

In [None]:
x = {1,2,3,4}
y = {3,4,5}

In [None]:
x.intersection(y)  # what values are in both x and y?

In [None]:
x.union(y)   # what values are in either x or y, or both?

In [None]:
x.difference(y)    # what values are in x but not in y?

In [None]:
y.difference(x)    # what values are in y but not in x?
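
Python sets also support operator shorthand for these methods. The sketch below repeats the calculations above using the `&`, `|` and `-` operators; the result variable names are illustrative.

```python
x = {1, 2, 3, 4}
y = {3, 4, 5}

in_both = x & y      # same as x.intersection(y)
in_either = x | y    # same as x.union(y)
only_in_x = x - y    # same as x.difference(y)
only_in_y = y - x    # same as y.difference(x)

print(in_both)
print(in_either)
print(only_in_x)
print(only_in_y)
```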

### Data Structures: Dictionaries

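A *dictionary* (sometimes called a *map*) is a data structure containing a collection of *(key,value)* pairs. Each *key* is unique and is linked to a *value*. Values can be any Python variable, while keys must be immutable (hashable) values such as strings, numbers, or tuples. Since Python 3.7, dictionaries preserve the order in which pairs were inserted, but values are looked up by key rather than by position.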

Dictionaries can be created using curly bracket notation { }, and can either be initially empty or populated with one or more pairs.

Create a dictionary to look up capital city names for countries.

In [None]:
country_capitals = {"Ireland":"Dublin", 
      "France":"Paris"}    # create a dictionary containing two pairs 
print(country_capitals)

We can access a value in a dictionary by using the square bracket notation and specifying the corresponding key:

In [None]:
country_capitals["Ireland"]

If we try to access a value for a non-existent key in a dictionary, we will get an error message:

In [None]:
country_capitals["Sweden"]

To avoid this type of error, check for the presence of a key in a dictionary using the **in** operator:

In [None]:
"Sweden" in country_capitals

In [None]:
"Ireland" in country_capitals
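
Putting the check and the lookup together might look like the following sketch. It redefines a small `country_capitals` dictionary so that it runs on its own, and also shows the dictionary *get()* method, which returns a default value instead of raising an error. The variable names `country` and `capital` are illustrative.

```python
country_capitals = {"Ireland": "Dublin", "France": "Paris"}

country = "Sweden"
if country in country_capitals:
    print(country_capitals[country])
else:
    print("No capital stored for", country)

# get() returns the second argument when the key is missing
capital = country_capitals.get(country, "unknown")
print(capital)
```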

We can easily add new values to a dictionary using square bracket notation and assignment. If a value does not already exist for a given key, it will be added.

In [None]:
country_capitals["Germany"] = "Berlin"
country_capitals

If a value for the key already exists, the previous value will be overwritten.

In [None]:
country_capitals["Ireland"] = "Galway"
country_capitals

Dictionaries are an easy way to create a simple data structure. For example, to store the details of a person.

In [None]:
person = {"age": 22, 
      "name": "alice", 
      "employed" : False} # create a dictionary containing three pairs 
print(person)
person['age']

Note that the types of keys and values in a dictionary can be mixed:

In [None]:
mixedmap = {1: "ucd",
            0.8: False,
            "b": 10,
            "c": "d"}   # no line-continuation backslashes are needed inside brackets
print(mixedmap)

In [None]:
print(mixedmap[0.8])
print(mixedmap[1])

Dictionaries have various associated methods to access the keys and/or values.

In [None]:
country_capitals.keys() # get only the keys from a dictionary

In [None]:
country_capitals.values() # get only the values from a dictionary

In [None]:
country_capitals.items() # get all (key,value) pairs as tuples
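
A common use of *items()* is to step through every pair in a dictionary with a `for` loop. The following is a minimal sketch, assuming a small dictionary defined here so that the example runs on its own; the names `pairs_seen` and `key_list` are illustrative.

```python
country_capitals = {"Ireland": "Dublin", "France": "Paris", "Germany": "Berlin"}

# Each item is a (key, value) tuple, unpacked here into two variables
pairs_seen = []
for country, capital in country_capitals.items():
    print(country, "->", capital)
    pairs_seen.append((country, capital))

# keys() and values() can be converted to lists when indexing is needed
key_list = list(country_capitals.keys())
print(key_list)
```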

We can check the number of key-value pairs in a dictionary using the built-in *len()* function:

In [None]:
len(country_capitals)