# Data Structures

Python includes built-in variable types for a number of fundamental data structures, including lists, tuples, sets, and dictionaries (sometimes called maps). 

In this notebook we will look at creating and using variables for each of these types.

## Lists

A *list* is an ordered collection of values. These values can have different types. List definitions are enclosed within square brackets.

In [None]:
# create an empty list
mylist = []       
mylist

In [None]:
# create a list of 3 integers
numbers = [12, 108, 21]
numbers

In [None]:
# a list containing 4 values of different types
somedata = ["text", 7, 0.34, True]
somedata

Entries in a list are accessed by specifying the *index* in square brackets - i.e. the position of the value in the list. 

Note: We always count from 0 in Python, so the first entry in a list has index 0.

In [None]:
values = [34, 9, 12, 34]
values[0]

We can count from the end of the list backwards by using negative index numbers. The index -1 is the last entry in the list, the index -2 is the second last entry, and so on.

In [None]:
values[-2]
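
As a small sketch of how negative indexes relate to ordinary ones, the index `-k` refers to the same entry as `len(values) - k`. The variable names `last`, `same_as_last` and `second_last` are illustrative only.

```python
values = [34, 9, 12, 34]

# A negative index -k refers to the same entry as index len(values) - k
last = values[-1]
same_as_last = values[len(values) - 1]
second_last = values[-2]

print(last, same_as_last, second_last)  # 34 34 12
```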

**Nesting**: 

Lists can also be contained within other lists, which allows the construction of hierarchical data structures.

In [None]:
child1 = [12, 108, 23]
child2 = [99, 4]
child3 = ["a","b","c"]
parent = [ child1, child2, child3 ]
print(parent)

Values in nested lists can be accessed using multiple indexes in square brackets:

In [None]:
parent[0][2]

In [None]:
parent[2][1]

**Slicing**: 

Lists can also be *sliced* to access subsets of that list. The notation is [i:j], where *i* is the start index inclusive and *j* is the end index exclusive. Remember that we always count from index 0.

In [None]:
fulllist = [9, 12, 23, 18, 21]
fulllist[0:2] # start at 1st item, end before 3rd item

In [None]:
fulllist[0:3]  # first three items

When slicing, the default for *i* is 0 and the default for *j* is the end of the list.

In [None]:
fulllist[1:] # all items from the 2nd one onwards

In [None]:
fulllist[:4] # start at 1st item, end before 5th item
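
One point worth illustrating is that a slice produces a new list, so changing the slice leaves the original list untouched. The following is a minimal sketch, where the name `subset` is made up for the example.

```python
fulllist = [9, 12, 23, 18, 21]

subset = fulllist[1:4]   # items at indexes 1, 2 and 3
subset[0] = 0            # change the new list only

print(subset)            # [0, 23, 18]
print(fulllist)          # [9, 12, 23, 18, 21]
print(fulllist[:])       # omitting both i and j gives a copy of the whole list
```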

**Modifying lists**: 

Entries in a list can be changed after the list is created by specifying the index and performing assignment.

In [None]:
values = [34, 9, 12, 34]
values[2] = 5000
print(values)

If we try to assign a value to an index that is beyond the length of the list, we will get an error message. Instead, we can add a value to the end of a list using the *append()* function:

In [None]:
values.append("extra")
print(values)
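
To make the difference concrete, the sketch below first tries to assign to an index that does not exist and catches the resulting `IndexError`, then uses *append()* instead. The value 77 is arbitrary.

```python
values = [34, 9, 12, 34]

try:
    values[4] = 77          # index 4 does not exist: valid indexes are 0 to 3
except IndexError as err:
    print("Error:", err)

values.append(77)           # append() grows the list by one entry instead
print(values)               # [34, 9, 12, 34, 77]
```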

We can also concatenate two or more lists together using the plus + operator:

In [None]:
values + [11, 27]

In [None]:
["A","B"] + ["Y","Z"]
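
Note that concatenation builds a new list and does not change either of the lists being joined. A short sketch, using the illustrative name `combined`:

```python
values = [34, 9, 12, 34]

combined = values + [11, 27]   # a new list holding the entries of both

print(combined)   # [34, 9, 12, 34, 11, 27]
print(values)     # [34, 9, 12, 34]
```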

We can insert a new value at a particular location in a list by using its associated *insert()* function. We specify the position and the value to insert as arguments. All the entries after that position are shifted to the right.

In [None]:
values.insert(2, 88)
values

**Checking lists**: 

The *in* membership operator can be used to test if a value is contained in a list. The result is a boolean value.

In [None]:
mylist = [3,6,9,12]

In [None]:
3 in mylist

In [None]:
27 in mylist

The *not in* operator can be used to test if a value is missing from a list.

In [None]:
27 not in mylist

**Related functions:** 

A variety of built-in functions can be used with lists.

For instance, we can check the length of a list using the built-in *len()* function:

In [None]:
len(values)

We can sort the items in a list by calling the *sort()* function on the list. Note that this sorts the list "in place" - i.e. the list itself is modified, rather than copied.

In [None]:
letters = ["b", "d", "a", "c"]
letters.sort()
print(letters)

We can also use the Python *sorted()* function, which returns a new sorted list, leaving the original list unchanged:

In [None]:
grades = ["B", "A", "C", "C", "A", "E"]
sorted(grades)
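
The two approaches can be compared side by side. In this sketch the names `ordered` and `result` are illustrative. Note that *sort()* returns `None`, so its result should not be assigned to a variable in the hope of getting the sorted list back.

```python
grades = ["B", "A", "C", "C", "A", "E"]

ordered = sorted(grades)     # new list; grades is left as it was
print(ordered)               # ['A', 'A', 'B', 'C', 'C', 'E']
print(grades)                # ['B', 'A', 'C', 'C', 'A', 'E']

result = grades.sort()       # sorts grades itself and returns None
print(result)                # None
print(grades)                # ['A', 'A', 'B', 'C', 'C', 'E']
```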

## Tuples

*Tuples* are like lists but are "immutable". This means that they cannot be modified after creation. 

Tuples are created using parenthesis notation, with values separated by commas.

In [None]:
suits = ("hearts", "diamonds", "spades", "clubs")
suits

Values in tuples are also accessed using the same square bracket index notation that we saw for lists.

In [None]:
suits[0]

In [None]:
suits[-1]

Like lists, different types of variables can be contained within the same tuple.

In [None]:
t = (123, True, "UCD", 123.23)
t

However, unlike lists, we cannot modify the tuple once it has been defined. If we try to assign a new value to an index in the tuple, we will get an error message.

In [None]:
t[3] = 3435
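
The error raised here is a `TypeError`. If a changed version of a tuple is needed, one option is to build a new tuple from the old one. The sketch below assumes this is acceptable for the task at hand, and the name `t2` is made up for the example.

```python
t = (123, True, "UCD", 123.23)

try:
    t[3] = 3435              # tuples do not support item assignment
except TypeError as err:
    print("Error:", err)

# Build a new tuple instead of modifying the existing one
t2 = t[:3] + (3435,)
print(t2)    # (123, True, 'UCD', 3435)
print(t)     # (123, True, 'UCD', 123.23)
```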

## Sets

Sets are unordered collections which contain no duplicate values. Because sets do not have an order, we cannot index into them by position.

We can create a new set using curly brackets notation:

In [None]:
countries = {"Ireland", "Spain", "Italy", "Croatia"}
countries

In [None]:
# a set with 4 different types
mix = {"UCD", 2000, True, 15.6}
mix

To make a set without any elements, we call the *set()* function without any arguments:

In [None]:
elements = set()
elements
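
The reason *set()* is needed here is that empty curly brackets create an empty dictionary rather than an empty set. A quick check, with illustrative variable names:

```python
empty_set = set()
empty_braces = {}

print(type(empty_set))      # <class 'set'>
print(type(empty_braces))   # <class 'dict'>
```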

We can also create sets from lists, strings or any other iterable value, using the *set()* function:

In [None]:
mylist = [1, 3, 1, 4, 3, 6, 8, 1, 4, 4]
set(mylist)

Note that only the unique values from the original list are retained:

In [None]:
winners = ["Brazil", "Germany", "Argentina", "Italy", "Argentina", "Germany", "Brazil", "France", "Brazil", "Italy"]
set(winners)

The *in* membership operator also works on sets:

In [None]:
names = {"Bill", "Lisa", "Ted"}

In [None]:
'Bill' in names

In [None]:
'Sharon' in names

**Modifying sets:** 

To add a single value to an existing set, we call its associated *add()* function:

In [None]:
names.add("Catherine")
names

Note that sets do not allow duplicates, so adding the same value multiple times has no effect:

In [None]:
names.add("Olivia")
names.add("Olivia")
names.add("Olivia")
names

We can add multiple values to an existing set using its *update()* function. This function can take tuples, lists, strings or other sets as its argument.

In [None]:
names.update(["Bob", "Alice", "John"])
names
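
One detail to watch for: when *update()* is given a string, it adds each character separately, whereas *add()* adds the whole string as a single value. A minimal sketch, using a made-up set called `letters`:

```python
letters = {"x"}

letters.add("abc")       # add() inserts the string as one value
print(len(letters))      # 2

letters.update("abc")    # update() iterates over the string, adding 'a', 'b' and 'c'
print(len(letters))      # 5
print("abc" in letters, "a" in letters)   # True True
```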

**Comparing sets:** 

We can calculate unions, intersections and differences between pairs of sets.

In [None]:
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}

In [None]:
# which values are in both x and y?
x.intersection(y)

In [None]:
# which values are in either x or y, or both?
x.union(y)

In [None]:
# which values are in x but not in y?
x.difference(y)

In [None]:
# which values are in y but not in x?
y.difference(x)    
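
These operations can also be written with the operators `&`, `|` and `-`. In both forms a new set is returned and the original sets are not changed. The names `both`, `either` and `only_x` are illustrative.

```python
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}

both = x & y       # same result as x.intersection(y)
either = x | y     # same result as x.union(y)
only_x = x - y     # same result as x.difference(y)

print(both == x.intersection(y))   # True
print(either == x.union(y))        # True
print(only_x == x.difference(y))   # True
```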

We can convert a *set* to a *list* by calling the built-in *list()* function:

In [None]:
list(x)
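
Since sets are unordered, the order of the entries in the resulting list should not be relied upon. If a particular order matters, one option is to use *sorted()*, which accepts a set and returns a sorted list. A short sketch with illustrative names:

```python
x = {1, 2, 3, 4}

as_list = list(x)       # a list with the same values; order is not guaranteed
in_order = sorted(x)    # a new list in ascending order

print(len(as_list))     # 4
print(in_order)         # [1, 2, 3, 4]
```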

## Dictionaries

A *dictionary* (sometimes called a *map*) is a data structure consisting of *(key:value)* pairs. Each *key* is linked to a *value*, and keys are unique. 

Dictionaries can be created using curly bracket notation, and can either be initially empty or populated with one or more pairs.

In [None]:
# create an empty dictionary
d0 = {}
d0

In [None]:
# create a dictionary containing two pairs 
d1 = {"Ireland":"Dublin", "France":"Paris"}
d1

In [None]:
# create a dictionary containing three pairs 
d2 = {"age":22, "name":"alice", 100:False}
d2

We can check the number of key-value pairs in a dictionary using the built-in *len()* function:

In [None]:
len(d2)

Note that the types of keys and values in a dictionary can be mixed:

In [None]:
mixedmap = {1:"ucd", 0.8:False, "b":10, "c":"d"}
mixedmap

We can access a value in a dictionary by using the square bracket notation and specifying the corresponding key:

In [None]:
d1["Ireland"]

In [None]:
d2["name"]

In [None]:
mixedmap[1]

If we try to access a value for a non-existent key in a dictionary, we will get an error message:

In [None]:
d1["Sweden"]
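
The error raised in this case is a `KeyError`. As a sketch of what happens, the code below catches the error rather than letting it stop the program. The variable `capital` and the fallback value `None` are choices made for this example only.

```python
d1 = {"Ireland": "Dublin", "France": "Paris"}

try:
    capital = d1["Sweden"]       # there is no "Sweden" key in d1
except KeyError as err:
    capital = None               # fall back to a placeholder value
    print("No entry for key:", err)

print(capital)   # None
```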

To avoid this type of error, check for the presence of a key in a dictionary using the **in** operator:

In [None]:
"Ireland" in d1

In [None]:
"Sweden" in d1

We can easily add new values to a dictionary using square bracket notation and assignment. If a value does not already exist for a given key, it will be added. 

In [None]:
d1["Germany"] = "Berlin"
d1

If a value for the key already exists, the previous value will be overwritten.

In [None]:
d1["Ireland"] = "Cork"
d1

Dictionaries have various associated functions to access the keys and/or values.

In [None]:
# get only the keys from a dictionary
d2.keys()

In [None]:
# get only the values from a dictionary
d2.values()

In [None]:
# get all (key:value) pairs as tuples
d2.items()
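
A common use of *items()* is to step through a dictionary one pair at a time, and the results of *keys()* and *values()* can be converted to lists with *list()*. The sketch below uses a `for` loop and the illustrative names `key_list` and `value_list`. It relies on dictionaries keeping their insertion order, which holds in Python 3.7 and later.

```python
d2 = {"age": 22, "name": "alice", 100: False}

# each item is a (key, value) tuple, unpacked here into two variables
for key, value in d2.items():
    print(key, "->", value)

key_list = list(d2.keys())
value_list = list(d2.values())
print(key_list)     # ['age', 'name', 100]
print(value_list)   # [22, 'alice', False]
```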