# More Data Structures

Term 1 2019 - Instructor: Teerapong Leelanupab

Teaching Assistant: Suttida Satjasunsern

***

Python includes built-in variable types for a number of fundamental data structures, including lists, tuples, sets, and dictionaries (maps).

## Data Structures: Lists

A *list* is an ordered collection of other variables. These variables can have different types. List definitions are enclosed within square brackets [ and ].

In [43]:
mylist = []                         # an empty list
numbers = [12, 108, 21]             # a list of 3 integers
somedata = ["text", 7, 0.34, True]  # a list containing 4 different variables of different types
somedata

['text', 7, 0.34, True]

Values in a list are accessed by specifying the *index* in square brackets - i.e. the position of the value in the list. Note: We always count from 0 in Python, so the first value in a list has index 0.

In [44]:
values = [34, 9, 12, 35]
values[0]

34

We can count from the end of the list backwards by using negative index values. Index -1 is the last value, index -2 is the second last value, and so on.

In [45]:
values[-1]

35

If we try to access a value at an index that is beyond the length of the list, we will get an error message.

In [46]:
values[50]

IndexError: list index out of range
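As a minimal sketch of the indexing rules above, the following cell reads the first and last values and then guards against an out-of-range index in two ways. The fallback value `None` and the names `index`, `item` and `caught` are assumptions made for illustration only.

```python
values = [34, 9, 12, 35]

first = values[0]    # index 0 is the first item
last = values[-1]    # index -1 is the last item

# Valid non-negative indexes run from 0 to len(values) - 1,
# so we can check the index before using it.
index = 50
if index < len(values):
    item = values[index]
else:
    item = None      # hypothetical fallback chosen for this sketch

# Alternatively, attempt the access and handle the error.
try:
    values[index]
    caught = False
except IndexError:
    caught = True

print(first, last, item, caught)
```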

***

### Nesting lists
Lists can also be contained within other lists, which allows the construction of hierarchical data structures.

In [None]:
child1 = [12, 108, 23]
child2 = [99, 4]
child3 = ["a","b","c"]
parent = [ child1, child2, child3 ]
print(parent)

Values in nested lists can be accessed using multiple indexes in square brackets:

In [None]:
parent[0][2]

In [None]:
parent[2][1]
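To make the double indexing more concrete, here is a small sketch that breaks the lookup into two steps. The intermediate names `inner`, `value`, `same_value` and `letter` are introduced only for illustration.

```python
child1 = [12, 108, 23]
child2 = [99, 4]
child3 = ["a", "b", "c"]
parent = [child1, child2, child3]

inner = parent[0]          # the first index selects an inner list
value = inner[2]           # the second index selects an item in that list
same_value = parent[0][2]  # both steps written as one expression
letter = parent[2][1]      # 2nd item of the 3rd inner list

print(inner, value, same_value, letter)
```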

***

### Slicing lists 
Lists can also be *sliced* to access subsets of that list. The notation is [i:j], where *i* is the start index inclusive and *j* is the end index exclusive. Remember that we always count from index 0.

In [None]:
fulllist = [9, 12, 23, 18, 21]
fulllist[0:4] # start at 1st item, end before 5th item

In [None]:
fulllist[-1::-1] # start at the last item and step backwards through the whole list

When slicing with a positive step, the default for i is 0 and the default for j is the end of the list. An optional third value, [i:j:k], sets the step size k.

In [None]:
fulllist[1:] # all items from the 2nd one onwards

In [None]:
fulllist[:4:2] # start at 1st item, end before 5th item, in steps of 2
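The slices above were not run in this notebook, so the following sketch collects them in one cell with the results we would expect. The variable names on the left are assumptions for illustration. It also shows that a slice produces a new list rather than a view of the original.

```python
fulllist = [9, 12, 23, 18, 21]

first_four = fulllist[0:4]    # [9, 12, 23, 18]
backwards = fulllist[-1::-1]  # [21, 18, 23, 12, 9]
from_second = fulllist[1:]    # [12, 23, 18, 21]
every_other = fulllist[:4:2]  # [9, 23]

# A slice is a new list, so changing it leaves the original untouched.
copy = fulllist[:]
copy[0] = -1

print(first_four, backwards, from_second, every_other)
print(fulllist[0], copy[0])
```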

***

### Modifying lists

Values in a list can be changed after the list is created by specifying the index and performing assignment.

In [None]:
values = [34, 9, 12, 34]
values[2] = 5000
print(values)

If we try to assign a value to an index that is beyond the length of the list, we will get an error message.

In [None]:
values[99] = 343

Instead, we can add a value to the end of a list using the *append()* method:

In [None]:
values.append("extra")
values.append(567)
print(values)

We can also concatenate two or more lists together using the plus + operator:

In [None]:
values + [11, 27] + ["true", True]

In [47]:
["A","B"] + ["Y","Z"]

['A', 'B', 'Y', 'Z']
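One difference between the two approaches is worth seeing side by side: *append()* changes the existing list, while + builds a new list and leaves its operands unchanged. A minimal sketch, where `combined` and `returned` are names chosen for illustration:

```python
values = [34, 9, 12, 34]

returned = values.append(567)  # modifies values in place and returns None
combined = values + [11, 27]   # builds a new list; values is not changed

print(values)    # [34, 9, 12, 34, 567]
print(combined)  # [34, 9, 12, 34, 567, 11, 27]
print(returned)  # None
```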

***

### Membership operators

The special 'in' keyword can be used to test if a value is contained in a list.

In [48]:
mylist = [3,6,9,12]
mylist

[3, 6, 9, 12]

In [49]:
3 in mylist

True

In [50]:
27 in mylist

False

The 'not in' operator can be used to test if a value is missing from a list.

In [51]:
27 not in mylist


True

***

### Related functions

A variety of built-in functions can be used with lists.

We can check the length of a list using the built-in *len()* function:

In [52]:
print(values)
len(values)

[34, 9, 12, 35]


4

We can sort the items in a list by calling the *sort()* method on the list. Note that this sorts the list "in place", i.e. the list itself is modified rather than copied.

In [53]:
letters = ["b","d","a","c","az","ax","a"]
letters.sort()
print(letters)

['a', 'a', 'ax', 'az', 'b', 'c', 'd']
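If we want to keep the original order as well, the built-in *sorted()* function returns a new sorted list instead of modifying the list in place. The sketch below contrasts the two; the names `ordered`, `descending` and `result` are assumptions for illustration.

```python
letters = ["b", "d", "a", "c", "az", "ax", "a"]

ordered = sorted(letters)                   # new list; letters is unchanged
descending = sorted(letters, reverse=True)  # new list in reverse order
original_first = letters[0]                 # still "b" at this point

result = letters.sort()                     # sorts in place and returns None

print(ordered)
print(descending)
print(original_first, result, letters)
```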


***

## Data Structures: Tuples

Tuples are like lists but are *immutable*: once they are created, they cannot be modified. Tuples are created using parentheses ( and ).

In [54]:
suits = ("hearts", "diamonds", "spades", "clubs", 26, True)
suits

('hearts', 'diamonds', 'spades', 'clubs', 26, True)

Values in tuples are also accessed using the same square bracket index notation that we saw for lists.

In [55]:
suits[0]

'hearts'

In [56]:
suits[-1:0:-1]

(True, 26, 'clubs', 'spades', 'diamonds')

Like lists, different types of variables can be contained within the same tuple.

In [57]:
t = (123, True, "Teerapong", 123.23)
t

(123, True, 'Teerapong', 123.23)

However, unlike lists, we cannot modify the tuple once it has been defined. If we try to assign a new value to an index in the tuple, we will get an error message.

In [58]:
t[3] = 3435

TypeError: 'tuple' object does not support item assignment
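Since a tuple cannot be changed, one common workaround is to build a new tuple from pieces of the old one. The following is only a sketch of that idea; the names `message` and `updated` are introduced for illustration. Note that a tuple with a single item needs a trailing comma, as in (3435,).

```python
t = (123, True, "Teerapong", 123.23)

# Item assignment raises a TypeError, which we can catch.
try:
    t[3] = 3435
    message = None
except TypeError as err:
    message = str(err)

# Build a new tuple from a slice of the old one plus a one-item tuple.
updated = t[:3] + (3435,)

print(message)
print(t)        # the original tuple is unchanged
print(updated)
```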

***

## Data Structures: Sets

Sets are unordered collections that contain no duplicate values. They can be created from lists, strings or any other iterable value using the built-in *set()* function, for example set(mylist). Because sets do not have an order, we cannot index into them by position. Python sets behave like mathematical sets and support operations such as union and intersection.

In [59]:
mylist = [1,3,1,4,3,6,8,1,4,4]
set(mylist)

{1, 3, 4, 6, 8}

In [60]:
set("abcddabcdaacbcc")

{'a', 'b', 'c', 'd'}

In [61]:
set("1,2,3,4,5,6,1,1,2")

{',', '1', '2', '3', '4', '5', '6'}
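The last result contains a comma because calling *set()* on a string treats every character as a separate value. If the intent was a set of the numbers themselves, one possible approach (a sketch, assuming the values are always separated by single commas) is to split the string first:

```python
text = "1,2,3,4,5,6,1,1,2"

chars = set(text)              # every character, including ','
fields = set(text.split(","))  # the text between the commas, as strings
numbers = set(int(field) for field in text.split(","))  # converted to integers

print("," in chars, "," in fields)
print(sorted(fields))
print(sorted(numbers))
```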




The 'in' membership operator also works on sets:

In [62]:
names = set(['Bill','Lisa','Ted','Bill','lisa'])
names

{'Bill', 'Lisa', 'Ted', 'lisa'}

In [63]:
'Bill' in names

True

In [64]:
'Sharon' in names

False

We can then calculate unions, intersections and differences between pairs of sets.

In [65]:
x = set([1,2,3,4])
y = set([3,4,5])

In [66]:
x.intersection(y)  # what values are in both x and y?

{3, 4}

In [67]:
x.union(y)   # what values are in either x or y, or both?

{1, 2, 3, 4, 5}

In [68]:
x.difference(y)    # what values are in x but not in y?

{1, 2}

In [69]:
y.difference(x)    # what values are in y but not in x?

{5}
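The same operations can also be written with the operators &, | and -. The sketch below shows both forms giving the same results. One difference is that the methods accept any iterable as their argument, whereas the operators need a set on both sides.

```python
x = set([1, 2, 3, 4])
y = set([3, 4, 5])

both = x & y      # same as x.intersection(y)
either = x | y    # same as x.union(y)
only_x = x - y    # same as x.difference(y)
only_y = y - x    # same as y.difference(x)

# Methods accept any iterable, such as a list.
with_list = x.union([5, 6])

print(both, either, only_x, only_y, with_list)
```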

We can convert a *set* back to a *list* by calling the built-in *list()* function:

In [70]:
list(set([1,1,1,2,3,4,5,6,1.123,4,5,6,7]))

[1, 2, 3, 4, 5, 6, 1.123, 7]
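Because sets are unordered, the order of the list we get back should not be relied upon. Here is a sketch of two ways to get a predictable order when removing duplicates; the sample list `items` and the other names are assumptions for illustration.

```python
items = [3, 1, 3, 2, 1]   # hypothetical sample data

# Option 1: sort the unique values.
unique_sorted = sorted(set(items))

# Option 2: keep the order in which values were first seen.
seen = set()
unique_in_order = []
for item in items:
    if item not in seen:
        seen.add(item)              # remember that we have seen this value
        unique_in_order.append(item)

print(unique_sorted)    # [1, 2, 3]
print(unique_in_order)  # [3, 1, 2]
```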

***

## Data Structures: Dictionaries

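A *dictionary* (sometimes called a *map*) is a data structure containing a set of *(key,value)* pairs. Each *key* is linked to a *value*. Keys must be hashable values such as numbers, strings or tuples, while values can be any Python object. Since Python 3.7, dictionaries keep their pairs in insertion order.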

Dictionaries can be created using curly bracket notation { }, and can either be initially empty or populated with one or more pairs.

In [71]:
d0 = {}                                        # create an empty dictionary
d1 = {"Thailand":"Bangkok", "France":"Paris"}    # create a dictionary containing two pairs 
d2 = {"age": 22, "name": "kim", 100 : False} # create a dictionary containing three pairs 

Note that the types of keys and values in a dictionary can be mixed.

In [72]:
mixedmap = {1:"teerapong",0.8:False,"b":10,"c":"d"}

We can access a value in a dictionary by using the square bracket notation and specifying the corresponding key:

In [73]:
d1["Thailand"]

'Bangkok'

In [74]:
d2["name"]

'kim'

If we try to access a value for a non-existent key in a dictionary, we will get an error message:

In [75]:
d1["Singapore"]

KeyError: 'Singapore'

In [76]:
d1["Paris"] # values cannot be used for lookup; only keys can

KeyError: 'Paris'

To avoid this type of error, check for the presence of a key in a dictionary using the **in** operator:

In [None]:
"Paris" in d1["France"] # note: this is a substring test on the value 'Paris', not a key check

In [77]:
"France" in d1

True

In [78]:
"Singapore" in d1

False
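As an alternative to checking with **in** first, dictionaries also provide a *get()* method that returns a default value instead of raising a KeyError. A minimal sketch; the fallback string "unknown" is an assumption chosen for illustration.

```python
d1 = {"Thailand": "Bangkok", "France": "Paris"}

# Check first, then access.
if "Singapore" in d1:
    capital = d1["Singapore"]
else:
    capital = "unknown"    # hypothetical fallback value

# get() does the same in one step; without a default it returns None.
found = d1.get("France")
missing = d1.get("Singapore")
fallback = d1.get("Singapore", "unknown")

print(capital, found, missing, fallback)
```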

We can easily add new values to a dictionary using square bracket notation and assignment. If a value does not already exist for a given key, it will be added.

In [79]:
d1["Germany"] = "Berlin"
d1

{'Thailand': 'Bangkok', 'France': 'Paris', 'Germany': 'Berlin'}

If a value for the key exists, the previous value will be overwritten.

In [80]:
d1["Thailand"] = "Changmai"
d1

{'Thailand': 'Changmai', 'France': 'Paris', 'Germany': 'Berlin'}

Dictionaries have various associated methods to access the keys and/or values.

In [81]:
d1.keys() # get only the keys from a dictionary

dict_keys(['Thailand', 'France', 'Germany'])

In [82]:
d1.values() # get only the values from a dictionary

dict_values(['Changmai', 'Paris', 'Berlin'])

In [83]:
d1.items() # get all (key,value) pairs as tuples

dict_items([('Thailand', 'Changmai'), ('France', 'Paris'), ('Germany', 'Berlin')])
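These methods are often used in a loop or converted to a list. The sketch below assumes Python 3.7 or later, where the pairs come back in insertion order; the names `countries`, `capitals` and `lines` are introduced only for illustration.

```python
d1 = {"Thailand": "Changmai", "France": "Paris", "Germany": "Berlin"}

countries = list(d1.keys())    # convert the keys view to a list
capitals = list(d1.values())   # convert the values view to a list

# Each item is a (key, value) tuple, which we can unpack in the loop.
lines = []
for country, capital in d1.items():
    lines.append(country + ": " + capital)

print(countries)
print(capitals)
print(lines)
```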

We can check the number of key-value pairs in a dictionary using the built-in *len()* function:

In [84]:
len(d1)

3