# Tutorial: Built-in data structures

In this tutorial, you will learn about the capabilities built into the Python language that will be useful throughout the rest of the course. 

Learning objectives:
1. Use Python’s built-in data structures: tuples, lists, dicts, and sets.
2. Use list comprehension techniques to create built-in data structures.

## 1. Python’s built-in data structures

Built-in data structures in Python are objects that act as containers for other objects.

## Tuples

A tuple is a fixed-length, **immutable** sequence of Python objects.

The easiest way to create one is with a comma-separated sequence of values:

In [1]:
a_tuple = 4, 5, 6
a_tuple

(4, 5, 6)

While the objects stored in a tuple may be mutable themselves, once the tuple is created it’s not possible to modify which object is stored in each slot:

In [2]:
a_tuple[0] = 1

TypeError: 'tuple' object does not support item assignment
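
The difference between a tuple's slots and the objects stored in them can be illustrated with a small sketch. The name `nested_tuple` is hypothetical and not part of the cells above:

```python
# A tuple whose second slot holds a mutable list
nested_tuple = ('foo', [1, 2], True)

# The list stored inside the tuple can still be modified in place
nested_tuple[1].append(3)
print(nested_tuple)  # ('foo', [1, 2, 3], True)

# Rebinding the slot itself is rejected
try:
    nested_tuple[1] = [9, 9]
except TypeError as err:
    print("TypeError:", err)
```

The tuple still refers to the same list object throughout; only the contents of that list changed.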

Like everything else in Python, tuples are objects. They are instances of the class tuple:

In [3]:
type(a_tuple)

tuple

You can also use parentheses to create tuples: 

In [4]:
a_tuple = (4, 5, 6)
a_tuple

(4, 5, 6)

Elements can be accessed with square brackets [] as with most other sequence types.

As in many other languages, tuples and all other sequences are 0-indexed in Python:

In [5]:
a_tuple = (4, 5, 6)
a_tuple[0]

4

### Tuple unpacking

Tuple unpacking (also known as multiple assignment), a common practice in Python, allows you to assign multiple variables at the same time in one statement.

Let's first create a tuple:

In [6]:
a_tuple = 4, 5, 6
a_tuple

(4, 5, 6)

Tuple unpacking is done by assigning a tuple-like expression (i.e., an expression that evaluates to a tuple) to a comma-separated list of variable names. Python then unpacks the value on the right-hand side of the equals sign and assigns each element to the corresponding name:

In [7]:
a, b, c = a_tuple

print(a, b, c)

4 5 6


This is a shorter version:

In [8]:
a, b, c = 4, 5, 6

print(a, b, c)

4 5 6
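
Two common uses of unpacking are sketched below: swapping values without a temporary variable, and collecting leftover elements with a starred name. All variable names here are illustrative:

```python
left, right = 1, 2

# Swap the two values in a single statement
left, right = right, left
print(left, right)  # 2 1

# Unpacking also works with nested tuples
x, (y, z) = 4, (5, 6)
print(x, y, z)  # 4 5 6

# A starred name collects the remaining elements into a list
first, *rest = (4, 5, 6, 7)
print(first, rest)  # 4 [5, 6, 7]
```

Without a starred name, the number of variables on the left must match the number of elements on the right, otherwise Python raises a ValueError.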


## Lists

In contrast with tuples, lists are **mutable** objects, which means their length is variable and their contents can be modified in-place.

You can define them using square brackets []:

In [9]:
a_list = [2, 3, 7]
a_list

[2, 3, 7]

We can also use the list function to create lists from other objects, such as tuples:

In [10]:
a_tuple = 2, 3, 7
a_list = list(a_tuple)
a_list

[2, 3, 7]

Lists are objects from the class list:

In [11]:
type(a_list)

list

> Lists and tuples are semantically similar and can be used interchangeably in many functions. The main difference is that tuples cannot be modified.

### Manipulating elements in a list

We can manipulate elements in a list using different methods:

In [12]:
a_list = ['foo', 'bar', 'baz']

Elements can be appended to the end of the list with the append method:

In [13]:
a_list.append('dwarf')
a_list

['foo', 'bar', 'baz', 'dwarf']

Using insert you can insert an element at a specific location in the list:

In [14]:
a_list.insert(1, 'red')
a_list

['foo', 'red', 'bar', 'baz', 'dwarf']

The inverse operation to insert is pop, which removes and returns an element at a particular index:

In [16]:
a_list.pop(2)

'bar'

In [17]:
a_list

['foo', 'red', 'baz', 'dwarf']

Elements can be removed by value with remove, which locates the first such value and removes it from the list:

In [18]:
a_list.remove('foo')
a_list

['red', 'baz', 'dwarf']

Check if a list contains a value using the in keyword:

In [19]:
'dwarf' in a_list

True

The keyword not can be used to negate in:

In [20]:
print('dwarf' not in a_list)

False


> Checking whether a list contains a value is a lot slower than doing so with dicts and sets (to be introduced shortly), as Python makes a linear scan across the values of the list, whereas it can check the others (based on hash tables) in constant time on average.
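
One way to see the difference is to time the same lookup against a list and a set. This is a rough sketch using the standard `timeit` module; the container size and repeat count are arbitrary choices, and the measured times depend on the machine:

```python
import timeit

values_list = list(range(100_000))
values_set = set(values_list)
target = 99_999  # worst case for the list: the last element

# Both containers give the same answer
print(target in values_list, target in values_set)  # True True

# Time 100 lookups in each container
list_time = timeit.timeit(lambda: target in values_list, number=100)
set_time = timeit.timeit(lambda: target in values_set, number=100)
print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```

On typical hardware the set lookup should be far faster, and the gap should grow with the size of the container.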

### Concatenating and combining lists

Adding two lists together with + concatenates them (you can concatenate tuples in the same way):

In [21]:
[4, 5, 'foo'] + [7, 8, 9]

[4, 5, 'foo', 7, 8, 9]

If you have a list already defined, you can append multiple elements to it using the extend method:

In [22]:
a_list = [4, 5, 'foo']
a_list.extend([7, 8])

a_list

[4, 5, 'foo', 7, 8]

> Note that list concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable.
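
The two approaches can be compared side by side. In this sketch, `chunks` is a hypothetical list of lists standing in for data that arrives in pieces:

```python
chunks = [[1, 2], [3, 4], [5, 6]]  # hypothetical input

# Preferred: grow a single list in place
everything = []
for chunk in chunks:
    everything.extend(chunk)

# Concatenation alternative: builds a new list on every iteration
everything_concat = []
for chunk in chunks:
    everything_concat = everything_concat + chunk

print(everything)                       # [1, 2, 3, 4, 5, 6]
print(everything == everything_concat)  # True
```

Both loops produce the same result; the difference is only in how much copying happens along the way.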

### Sorting

You can sort a list in-place (without creating a new object) by calling its sort method:

In [23]:
a_list = [7, 2, 5, 1, 3]
a_list.sort()

a_list

[1, 2, 3, 5, 7]
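
When the original order needs to be kept, the built-in `sorted` function returns a new sorted list instead of modifying the existing one. The sketch below also shows the `key` and `reverse` options; the lists `numbers` and `words` are illustrative:

```python
numbers = [7, 2, 5, 1, 3]

# sorted returns a new list and leaves the original untouched
ordered = sorted(numbers)
print(ordered)  # [1, 2, 3, 5, 7]
print(numbers)  # [7, 2, 5, 1, 3]

# reverse=True sorts in descending order
print(sorted(numbers, reverse=True))  # [7, 5, 3, 2, 1]

# key takes a function that produces the value to sort by,
# here the length of each string
words = ['saw', 'small', 'He', 'foxes', 'six']
words.sort(key=len)
print(words)  # ['He', 'saw', 'six', 'small', 'foxes']
```

Python's sort is stable, so strings of equal length keep their original relative order.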

Python also has a number of built-in functions to work with lists and iterable objects in general:

Return the largest item:

In [24]:
max(a_list)

7

Sums items of a list (or any iterable object) and returns the total:

In [25]:
sum(a_list)

18

### Slicing

You can select sections of most sequence types by using slice notation, which in its basic form consists of the pattern start:stop:step passed to the indexing operator []:

In [26]:
seq=[0, 1, 2, 3, 4, 5, 6, 7]
seq

[0, 1, 2, 3, 4, 5, 6, 7]

s[start:stop] → returns items from start to stop-1

In [30]:
seq[0:4] # items from position 0 to 3 (4 minus 1)

[0, 1, 2, 3]

s[start:] → items from start through the end of the sequence

In [31]:
seq[4:] # items from position 4 to the end

[4, 5, 6, 7]

s[:stop] → items from the beginning through stop-1 

In [32]:
seq[:4] # items from the beginning to position 3 (4 minus 1)

[0, 1, 2, 3]

s[:] → a copy of the whole sequence

In [33]:
seq[:] # all items

[0, 1, 2, 3, 4, 5, 6, 7]

Negative indices slice the sequence relative to the end where the last position corresponds to -1:

In [34]:
seq=[0, 1, 2, 3, 4, 5, 6, 7]
print("sequence: ", seq)

print("last element:", seq[-1])
print("item at position -6(2):", seq[-6])
print("items from position -6(2) to position -2(6):", seq[-6:-2]) 
print("items from position -4(4) to the end:", seq[-4:])

sequence:  [0, 1, 2, 3, 4, 5, 6, 7]
last element: 7
item at position -6(2): 2
items from position -6(2) to position -2(6): [2, 3, 4, 5]
items from position -4(4) to the end: [4, 5, 6, 7]


A **step** can also be used after a second colon to slice the list every certain number of items.

s[::step] → list contents every step items 

In [35]:
seq[::2] # list contents two by two

[0, 2, 4, 6]

A step may be a negative number to indicate reverse sequencing.

All items in the list, reversed:

In [36]:
seq[::-1]

[7, 6, 5, 4, 3, 2, 1, 0]
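
The three parts of a slice can also be combined. The following sketch reuses the same `seq` values and additionally checks that a full slice produces a new list object rather than another name for the same list:

```python
seq = [0, 1, 2, 3, 4, 5, 6, 7]

# start, stop and step together: positions 1, 3 and 5
print(seq[1:7:2])   # [1, 3, 5]

# A negative step walks backwards from start down to (but excluding) stop
print(seq[6:1:-2])  # [6, 4, 2]

# A full slice is a new list: changing the copy leaves seq unchanged
seq_copy = seq[:]
seq_copy[0] = 'changed'
print(seq[0])       # 0
print(seq_copy[0])  # changed
```

Note that the copy is shallow: the new list holds references to the same element objects as the original.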

## Dictionaries

Dictionaries (dicts) are likely the most important built-in Python data structure. In other languages, the same structure is more commonly called a hash map or associative array.

A dictionary is a flexibly sized collection of key-value pairs, where key and value are Python objects. One approach for creating one is to use curly braces {} and colons to separate keys and values:

In [37]:
dictionary = {'a' : 'some value', 'b' : [1, 2, 3, 4]}
dictionary

{'a': 'some value', 'b': [1, 2, 3, 4]}

Dictionaries are objects from the dict class:

In [38]:
type(dictionary)

dict

You can access, insert, or set elements using the same syntax as for accessing elements of a list or tuple:

In [39]:
dictionary[7] = 'an integer'
dictionary

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

You can check if a dict contains a key using the same syntax used for checking whether a list or tuple contains a value:

In [40]:
'b' in dictionary

True

In [41]:
'c' in dictionary

False
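
Checking for a key first matters because indexing with a missing key raises an error. As a sketch, the dict method `get` offers an alternative that returns a default value instead. The name `lookup_demo` is hypothetical, chosen to leave the tutorial's `dictionary` untouched:

```python
lookup_demo = {'a': 'some value', 'b': [1, 2, 3, 4]}

# Indexing with a missing key raises KeyError
try:
    lookup_demo['c']
except KeyError as err:
    print("KeyError:", err)

# get returns None, or a supplied default, when the key is missing
print(lookup_demo.get('c'))             # None
print(lookup_demo.get('c', 'missing'))  # missing
print(lookup_demo.get('a', 'missing'))  # some value
```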

You can delete values either using the *del* keyword or the *pop* method (which simultaneously returns the value and deletes the key):

In [42]:
del dictionary['a']
dictionary

{'b': [1, 2, 3, 4], 7: 'an integer'}

In [43]:
dictionary.pop(7)

'an integer'

In [44]:
dictionary

{'b': [1, 2, 3, 4]}

The keys and values methods return view objects over the dict’s keys and values, respectively.

Since Python 3.7, dicts preserve insertion order, so these methods return the keys and values in that same, matching order:

In [45]:
dictionary = {'a' : 'some value', 'b' : [1, 2, 3, 4]}

dictionary.keys()

dict_keys(['a', 'b'])

In [46]:
dictionary.values()

dict_values(['some value', [1, 2, 3, 4]])
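
The objects returned by keys and values can be converted to lists or iterated over directly, and the `items` method yields key-value pairs that combine well with tuple unpacking. A short sketch, using a hypothetical dict named `inventory`:

```python
inventory = {'a': 'some value', 'b': [1, 2, 3, 4]}

# Convert the views to plain lists when list behaviour is needed
keys = list(inventory.keys())
values = list(inventory.values())
print(keys)    # ['a', 'b']
print(values)  # ['some value', [1, 2, 3, 4]]

# items yields (key, value) tuples, unpacked here in the for statement
for key, value in inventory.items():
    print(key, '->', value)

# Views are live: they reflect later changes to the dict
keys_view = inventory.keys()
inventory['c'] = 3
print(list(keys_view))  # ['a', 'b', 'c']
```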

### Valid types for keys

While the values of a dict can be any Python object, the keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too).

The technical term here is hashability. You can check whether an object is hashable (can be used as a key in a dict) with the hash function:

In [47]:
print( hash('string') )
print( hash((1, 2, (2, 3))) )

3870778937854110384
-9209053662355515447


In [48]:
hash((1, 2, [2, 3])) # fails because lists are mutable

TypeError: unhashable type: 'list'
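
A practical consequence is that a list must be converted to a tuple before it can serve as a key. The sketch below uses hypothetical names (`coords`, `labels`):

```python
coords = [2, 3]
labels = {}

# A list cannot be used as a key
try:
    labels[coords] = 'point'
except TypeError as err:
    print("TypeError:", err)

# Converting the list to a tuple makes it hashable
labels[tuple(coords)] = 'point'
print(labels)  # {(2, 3): 'point'}
```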

## Sets

A set is an unordered collection of **unique** elements.

Sets support mathematical set operations like union, intersection, difference, and symmetric difference. 

A set can be created in two ways:
- Using the *set* function
- Using the set literal with curly braces

In [49]:
set([2, 2, 2, 1, 3, 3])

{1, 2, 3}

In [50]:
{2, 2, 2, 1, 3, 3}

{1, 2, 3}

Sets are objects of the set class:

In [51]:
a_set = {1, 2}
type(a_set)

set
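
The set operations mentioned at the start of this section are available as operators. The sketch below uses two illustrative sets, `a` and `b`, and sorts each result before printing because sets themselves are unordered:

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

print(sorted(a | b))  # union: [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(a & b))  # intersection: [3, 4, 5]
print(sorted(a - b))  # difference: [1, 2]
print(sorted(a ^ b))  # symmetric difference: [1, 2, 6, 7, 8]
```

Each operator has an equivalent method (`union`, `intersection`, `difference`, `symmetric_difference`) that also accepts any iterable as its argument.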

## 2. List Comprehensions

The simplest form of list comprehension is just to replicate a list into another list:

In [52]:
a_list = [1, 2, 3]

In [53]:
[l for l in a_list]

[1, 2, 3]

We can assign the result of a list comprehension to another list:

In [54]:
another_list = [l for l in a_list]
another_list

[1, 2, 3]

We can do much more. For example, given a list of strings, we could filter out strings with length 2 or less and convert the remaining ones to uppercase like this:

In [55]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

list_comp = [s.upper() for s in strings if len(s) > 2]
list_comp

['BAT', 'CAR', 'DOVE', 'PYTHON']

The code above is equivalent to the following code using a for statement:

In [56]:
list_comp = []
for s in strings:
    if len(s) > 2:
        list_comp.append(s.upper())

list_comp

['BAT', 'CAR', 'DOVE', 'PYTHON']
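
The same comprehension syntax extends to the other built-in data structures. As a sketch, curly braces produce a set comprehension, and curly braces with a `key: value` expression produce a dict comprehension. The names `unique_lengths` and `positions` are illustrative:

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

# Set comprehension: the distinct string lengths
unique_lengths = {len(s) for s in strings}
print(sorted(unique_lengths))  # [1, 2, 3, 4, 6]

# Dict comprehension: map each string to its position in the list
positions = {s: i for i, s in enumerate(strings)}
print(positions)
# {'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

# There is no tuple comprehension; pass a generator expression to tuple()
upper_tuple = tuple(s.upper() for s in strings if len(s) > 2)
print(upper_tuple)  # ('BAT', 'CAR', 'DOVE', 'PYTHON')
```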

### Review: mutable and immutable objects

Lists, dicts, sets, and most user-defined types (classes) are mutable.

This means that the object or values that they contain can be modified:

In [57]:
a_list = ['foo', 2, [4, 5]]
a_list[2] = (3, 4)

a_list

['foo', 2, (3, 4)]

Others, like strings and tuples, are immutable:

In [58]:
string = "Hello"
string[1] = 'a'

TypeError: 'str' object does not support item assignment
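
Since an immutable object cannot be changed in place, the usual approach is to build a new object from parts of the old one. A minimal sketch, with illustrative names:

```python
greeting = "Hello"

# Build a new string from slices of the original
changed = greeting[:1] + 'a' + greeting[2:]
print(changed)   # Hallo
print(greeting)  # Hello (the original is unchanged)

# The same idea applies to tuples
point = (4, 5, 6)
moved = (1,) + point[1:]
print(moved)  # (1, 5, 6)
print(point)  # (4, 5, 6)
```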