# Built-in Data Structures II: Tuples and Dictionaries

## Contents

- [Tuples](#section1)
    - [Create tuples](#subsection1.1)
    - [Basic features of tuples](#subsection1.2)
    - [Tuple unpacking](#subsection1.3)
- [Dictionaries](#section2)
    - [Create dictionaries](#subsection2.1)
    - [Data arrangement of dictionaries](#subsection2.2)
    - [Basic features of dictionaries](#subsection2.3)
    - [Loops and iterations with dictionaries](#subsection2.4)
- [Summary](#section3)
    - [Built-in data structures](#subsection3.1)
    - [Parentheses and brackets](#subsection3.2)

## Tuples <a id="section1"></a>

### Create tuples <a id="subsection1.1"></a>
Similar to lists, tuples are sequences of data items. Tuples can be created as data items separated by commas, with or without parentheses, as demonstrated by examples below. 

In [1]:
colors = 'red', 'blue', 'green' # Comma-separated items
mixed = ('Jack', 32.5, [1, 2])  # Comma-separated items within parentheses

print(type(colors))
print(type(mixed))

<class 'tuple'>
<class 'tuple'>


Be careful with two special cases:
- An empty tuple is created by empty parentheses.

In [2]:
feel_empty = ()

print(type(feel_empty))
print(len(feel_empty))

<class 'tuple'>
0


- When creating a tuple with only one item, the trailing comma is required.

In [3]:
tuple_one = 'here',         # The comma creates a tuple type object
item_one = ('there')        # Only the parentheses do not create a tuple

print(type(tuple_one))
print(type(item_one))

<class 'tuple'>
<class 'str'>


### Basic features of tuples <a id="subsection1.2"></a>
- Iterable
- The same `len()` function, indexing and slicing expressions as strings and lists
- The same operations with `+` and `*` as strings and lists
- **Immutable**
- **Few methods**: only `count()` and `index()` are defined for tuple objects

In [4]:
letters = 'A', 'B', 'C'
numbers = 2.0, 3.0

In [5]:
mixed = letters + numbers*3
print(mixed)
print(len(mixed))

('A', 'B', 'C', 2.0, 3.0, 2.0, 3.0, 2.0, 3.0)
9


In [6]:
for item in mixed[6:]:
    print(item)

3.0
2.0
3.0
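
To make the **immutable** feature concrete, here is a minimal sketch (the variable names `mutated` and `new_letters` are introduced only for illustration). It assumes nothing beyond built-in tuple behavior: item assignment raises a `TypeError`, so a "modified" tuple has to be built as a new object.

```python
letters = ('A', 'B', 'C')

# Item assignment is not allowed on a tuple, so a TypeError is raised
try:
    letters[0] = 'Z'
    mutated = True
except TypeError:
    mutated = False

print(mutated)                        # False

# A "changed" tuple is a new tuple built from pieces of the old one
new_letters = ('Z',) + letters[1:]
print(new_letters)                    # ('Z', 'B', 'C')
print(letters)                        # ('A', 'B', 'C'), the original is untouched
```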


### Tuple unpacking <a id="subsection1.3"></a>

Any data sequence (or iterable) can be **unpacked** into variables using just one assignment statement. In **tuple unpacking**, the left-hand side is a tuple of variables, and the right-hand side is an arbitrary iterable data sequence. The only requirement is that the number of variables matches the number of items in the data sequence. A few examples are given below.

In [7]:
location = (1.290, 103.852)         # A tuple of two items

latitude, longitude = location      # Unpack the tuple into two variables

print(latitude)
print(longitude)

1.29
103.852


In [8]:
x, y, z = 'abc'     # Unpack a string of three characters

print(x)
print(y)
print(z)

a
b
c
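
The requirement that the number of variables matches the number of items can be checked with a small sketch. The three-item tuple `point` and the variable `message` are hypothetical names used only for this illustration. Unpacking three items into two variables raises a `ValueError`.

```python
point = (1.290, 103.852, 15.0)       # Hypothetical tuple with three items

try:
    latitude, longitude = point      # Two variables for three items
    message = 'unpacked'
except ValueError as error:
    # The exact wording of the error text may vary between Python versions
    message = 'ValueError: {}'.format(error)

print(message)
```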


<div class="alert alert-block alert-success">
<b>Example 1:</b>  
    There are two variables <span style='font-family:Courier'><b>cage</b></span> and <span style='font-family:Courier'><b>travolta</b></span> and their values are "bad guy" and "good guy", respectively. Swap the values of variables <span style='font-family:Courier'><b>cage</b></span> and <span style='font-family:Courier'><b>travolta</b></span> so that <span style='font-family:Courier'><b>cage</b></span> becomes "good guy" and <span style='font-family:Courier'><b>travolta</b></span> becomes "bad guy".
</div>

By using tuple unpacking, these two variables can be swapped in a neater and more readable way, as shown by the following code segment.

In [9]:
cage = 'bad guy'
travolta = 'good guy'

cage, travolta = travolta, cage

print(cage)
print(travolta)

good guy
bad guy


Besides the **parallel assignment** shown above, tuple unpacking can also be used to iterate over multiple iterable data structures in parallel via `for` loops.

<div class="alert alert-block alert-success">
<b>Example 2:</b>  
For a biased die, each outcome of rolled number and the corresponding probability are given in lists <span style='font-family:Courier'><b>outcomes</b></span> and <span style='font-family:Courier'><b>probs</b></span>, respectively. Calculate the expected value of the rolled numbers.
</div>

In [10]:
outcomes = [1, 2, 3, 4, 5, 6]                   # Rolled numbers
probs = [0.15, 0.24, 0.18, 0.1, 0.21, 0.12]     # Probabilities

Let $X$ denote the random variable for the number rolled with the biased die, which follows a discrete distribution. The expected value can be expressed as $\mathbb{E}(X)=\sum_{i=1}^6p_ix_i$, where $x_i$ is each possible rolled number and $p_i$ is the corresponding probability. Calculating the expected value therefore requires iterating over the items of both lists `outcomes` and `probs` in parallel, which can be done with the `zip()` function.

In [11]:
print(type(zip(outcomes, probs)))

<class 'zip'>


The code cell above shows that the `zip()` function creates a `zip` type object, which is an iterable data sequence. In order to explore the data items in the sequence, we use the following code segment to print them in a `for` loop.

In [12]:
for item in zip(outcomes, probs):
    print(item)

(1, 0.15)
(2, 0.24)
(3, 0.18)
(4, 0.1)
(5, 0.21)
(6, 0.12)


It can be seen that items of the `zip` sequence are tuples with two elements: the first element is from the list `outcomes`, and the second element is from the list `probs`. We can thus unpack the tuple in each iteration, as in the following `for` loop.

In [13]:
for outcome, prob in zip(outcomes, probs):
    print('outcome={}, prob={}'.format(outcome, prob))

outcome=1, prob=0.15
outcome=2, prob=0.24
outcome=3, prob=0.18
outcome=4, prob=0.1
outcome=5, prob=0.21
outcome=6, prob=0.12


The printed message above shows that by using the `zip()` function, variables `outcome` and `prob` respectively iterate items in lists `outcomes` and `probs` in parallel. Such an iteration can be used to calculate the expected value.

In [14]:
exp_value = 0
for outcome, prob in zip(outcomes, probs):
    exp_value += outcome*prob

print(exp_value)

3.34


An alternative approach is to create a list of results for $x_ip_i$, and sum them together by using the `sum()` function.

In [15]:
products = [outcome*prob 
            for outcome, prob in zip(outcomes, probs)]
print(products)

exp_value = sum(products)
print(exp_value)

[0.15, 0.48, 0.54, 0.4, 1.05, 0.72]
3.34
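
As a further variation, offered only as a sketch and not as the approach prescribed by this lecture, the intermediate list can be skipped by passing a generator expression directly to `sum()`. The result is rounded for display because floating-point sums may carry tiny rounding errors.

```python
outcomes = [1, 2, 3, 4, 5, 6]                   # Rolled numbers
probs = [0.15, 0.24, 0.18, 0.1, 0.21, 0.12]     # Probabilities

# Sum the products x_i * p_i without building an intermediate list
exp_value = sum(outcome*prob for outcome, prob in zip(outcomes, probs))

print(round(exp_value, 2))                      # 3.34
```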


In this example, we use the `zip()` function to iterate over items from two lists in parallel. In fact, the `zip()` function is very flexible:
- It can be applied to other iterable types, such as strings, tuples, or `range` objects.
- It can be used to iterate over more than two data sequences.
- The given data sequences are allowed to have different lengths. The iteration stops when the shortest sequence runs out of items.

In [16]:
s1 = 'abcde'                         # A string with five characters
s2 = range(3)                        # A range with three items
s3 = 'one', 'two', 'three', 'four'   # A tuple with four items

for x, y, z in zip(s1, s2, s3):
    print(x, y, z)

a 0 one
b 1 two
c 2 three


Lastly, tuples and tuple unpacking are often used to define function outputs. This topic will be discussed in the next lecture.

## Dictionaries <a id="section2"></a>

### Create dictionaries <a id="subsection2.1"></a>

In Python, **dictionaries** are coded in curly brackets and consist of a series of comma-separated <code><i>key</i>:<i>value</i></code> pairs.

In [17]:
stocks = {'AMZN': 170.40,
          'TSLA': 130.11,
          'TWTR': 32.48,
          'AAPL': 76.60,
          'ORCL': 51.58, 
          'GOOG': 1434.23}
stocks

{'AMZN': 170.4,
 'TSLA': 130.11,
 'TWTR': 32.48,
 'AAPL': 76.6,
 'ORCL': 51.58,
 'GOOG': 1434.23}

In [18]:
print(type(stocks))

<class 'dict'>


As for data types, please note that:
- Dictionary keys must be *hashable*. They are often strings, but can be any of Python’s immutable types: boolean, integer, float, tuple (provided all of its items are hashable), and others. Mutable types such as lists and dictionaries cannot be used as keys.
- Dictionary values can be of any type.
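
The restriction on key types can be illustrated with a short sketch. The dictionary `grid` and the flag `list_key_ok` are hypothetical names used only here. A tuple of numbers works as a key, while a list raises a `TypeError` because lists are not hashable.

```python
# Tuples of numbers are hashable, so they can serve as keys
grid = {(0, 0): 'origin', (1, 2): 'point A'}
print(grid[(1, 2)])                  # point A

# A list is mutable and not hashable, so it is rejected as a key
try:
    bad = {[0, 0]: 'origin'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(list_key_ok)                   # False
```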

### Data arrangement of dictionaries <a id="subsection2.2"></a>

Similar to lists, a dictionary is a mutable collection of indexed data. The main difference is that list elements are accessed by their position in the list via indexing, while dictionary elements are accessed via keys. 

Take the dictionary created above as an example: the keys are `'AMZN'`, `'TSLA'`, `'TWTR'`, `'AAPL'`, `'ORCL'`, and `'GOOG'`. The comparison between lists and dictionaries is illustrated by the following tables, where the first shows the values stored in a list with their positive and negative positional indexes, and the second shows the same values stored in a dictionary with their keys.

170.40 | 130.11 &nbsp; | 32.48 &nbsp;&nbsp;| 76.60&nbsp;&nbsp; | 51.58&nbsp;&nbsp; | 1434.23
:-----|:-----|:-----|:-----|:-----|:-----
0 | 1 | 2 | 3 | 4 | 5
-6 | -5 | -4 | -3 | -2 | -1

170.40 | 130.11 &nbsp; | 32.48 &nbsp;&nbsp;| 76.60&nbsp;&nbsp; | 51.58&nbsp;&nbsp; | 1434.23
:-----|:-----|:-----|:-----|:-----|:-------
'AMZN' | 'TSLA' | 'TWTR' | 'AAPL' | 'ORCL' | 'GOOG'

Values in a dictionary can be accessed by the keys, just like items in a list can be accessed by the positional index. 

In [19]:
name = 'AMZN'
print('{}: {}'.format(name, stocks[name]))

AMZN: 170.4
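
One point worth noting is what happens when a key is absent. The sketch below assumes a hypothetical ticker `'MSFT'` that is not in the dictionary: indexing with it raises a `KeyError`, whereas the `get()` method of `dict` objects returns `None` or a supplied default instead.

```python
stocks = {'AMZN': 170.40, 'TSLA': 130.11}

# Indexing with a missing key raises a KeyError
try:
    price = stocks['MSFT']           # 'MSFT' is a hypothetical missing key
except KeyError:
    price = None

print(price)                         # None

# get() returns None, or the given default, when the key is missing
print(stocks.get('MSFT'))            # None
print(stocks.get('MSFT', 0.0))       # 0.0
print(stocks.get('AMZN'))            # 170.4
```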


The syntax of changing values in a dictionary is similar to changing values in lists.

In [20]:
stocks['TSLA'] = 150.00     # Change the value of 'TSLA' to be 150.00
stocks['AAPL'] += 0.50      # Increase the value of 'AAPL' by 0.50

stocks                      # Display the updated dictionary

{'AMZN': 170.4,
 'TSLA': 150.0,
 'TWTR': 32.48,
 'AAPL': 77.1,
 'ORCL': 51.58,
 'GOOG': 1434.23}

The same syntax can also be used to add a new item, that is, a new key-value pair, to the dictionary.

In [21]:
stocks['ZM'] = 76.30    # Add a new key ZM and new value 76.30

stocks                  # Display the updated dictionary

{'AMZN': 170.4,
 'TSLA': 150.0,
 'TWTR': 32.48,
 'AAPL': 77.1,
 'ORCL': 51.58,
 'GOOG': 1434.23,
 'ZM': 76.3}

<div class="alert alert-block alert-success">
<b>Example 3: </b>  
    The variable <span style='font-family:Courier'><b>words</b></span> is a list containing words of a song. Create a dictionary where the keys are all words appearing in the song, and the values are the numbers of appearances of these words. 
</div>

In [22]:
words = ['hey', 'jude', "don't", 'make', 'it', 'bad', 
         'take', 'a', 'sad', 'song', 'and', 'make', 'it', 'better', 
         'remember', 'to', 'let', 'her', 'into', 'your', 'heart', 
         'then', 'you', 'can', 'start', 'to', 'make', 'it', 'better', 
         'hey', 'jude', "don't", 'be', 'afraid', 
         'you', 'were', 'made', 'to', 'go', 'out', 'and', 'get', 'her', 
         'the', 'minute', 'you', 'let', 'her', 'under', 'your', 'skin', 
         'then', 'you', 'begin', 'to', 'make', 'it', 'better']

Then the list of words can be used to create the dictionary for word counts. 

In [23]:
records = {}                # Initialize records as an empty dictionary
for word in words:          # Iterate all words
    if word in records:
        records[word] += 1  # Increase counts of the word
    else:
        records[word] = 1   # The count of the word is 1

print(records)

{'hey': 2, 'jude': 2, "don't": 2, 'make': 4, 'it': 4, 'bad': 1, 'take': 1, 'a': 1, 'sad': 1, 'song': 1, 'and': 2, 'better': 3, 'remember': 1, 'to': 4, 'let': 2, 'her': 3, 'into': 1, 'your': 2, 'heart': 1, 'then': 2, 'you': 4, 'can': 1, 'start': 1, 'be': 1, 'afraid': 1, 'were': 1, 'made': 1, 'go': 1, 'out': 1, 'get': 1, 'the': 1, 'minute': 1, 'under': 1, 'skin': 1, 'begin': 1}


Here the `in` operator is used to check if the value of `word` is a key of the dictionary `records`. Based on the outcome of this boolean condition, we have:
1. In the case of `True`, the word has been seen in a previous iteration, so the associated value is increased by one for one more appearance.
2. In the case of `False`, the word has not been seen in any previous iteration, so the associated value is set to one, indicating that this is the first appearance of the word.

After iterating all words in the list, we have a dictionary that counts their appearances.  
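
For comparison, and only as an alternative sketch rather than the method used in this lecture, the standard library's `collections.Counter` produces the same counts in one call. A shortened, hypothetical word list is used here to keep the example small.

```python
from collections import Counter

words = ['hey', 'jude', 'make', 'it', 'better',
         'make', 'it', 'better', 'hey']

# Manual counting with the in operator, as in the loop above
records = {}
for word in words:
    if word in records:
        records[word] += 1
    else:
        records[word] = 1

# Counter is a dict subclass that counts hashable items
counts = Counter(words)

print(records)                   # {'hey': 2, 'jude': 1, 'make': 2, 'it': 2, 'better': 2}
print(dict(counts) == records)   # True
```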

### Basic features of dictionaries <a id="subsection2.3"></a>

- Iterable
- The same `len()` function
- Indexing via keys
- **Mutable**

### Loops and iterations with dictionaries  <a id="subsection2.4"></a>

Like strings, lists, and tuples, dictionaries are iterable, in the sense that items in a dictionary can be iterated over sequentially in a `for` loop. Since each item of the dictionary has two components, the key and the value, there are several ways to iterate over these items.

In [24]:
stocks = {'AMZN': 170.40,
          'TSLA': 130.11,
          'TWTR': 32.48,
          'AAPL': 76.60,
          'ORCL': 51.58, 
          'GOOG': 1434.23}

#### Iterating over keys

All keys of the dictionary are returned as an iterable sequence by the `keys()` method of a `dict` type object. Such a sequence can be used in a `for` loop to iterate all keys.

In [25]:
for name in stocks.keys():
    print(name)

AMZN
TSLA
TWTR
AAPL
ORCL
GOOG


In fact, if the dictionary is directly iterated in a `for` loop, the key of the data item is returned in each iteration. This is a more concise way to iterate dictionary keys. 

In [26]:
for name in stocks:
    print(name)

AMZN
TSLA
TWTR
AAPL
ORCL
GOOG


The keys can then be used to access values of dictionary data items, so that item values are iterated at the same time. 

In [27]:
for name in stocks:
    print('{}: {}'.format(name, stocks[name]))

AMZN: 170.4
TSLA: 130.11
TWTR: 32.48
AAPL: 76.6
ORCL: 51.58
GOOG: 1434.23


#### Iterating over values

Values of items in a dictionary can be returned as an iterable sequence by the `values()` method of the `dict` type object. Such a sequence can be used in `for` loops for iterating over values of data items. 

In [28]:
for price in stocks.values():
    print(price)

170.4
130.11
32.48
76.6
51.58
1434.23


#### Iterating over keys and values in parallel

Keys and values of a dictionary can also be iterated as a pair using the `items()` method. 

In [29]:
for item in stocks.items():
    print(item)

('AMZN', 170.4)
('TSLA', 130.11)
('TWTR', 32.48)
('AAPL', 76.6)
('ORCL', 51.58)
('GOOG', 1434.23)


It can be seen that the `items()` method creates an iterable sequence of tuples and each tuple is in the format of <code>(<i>key</i>, <i>value</i>)</code>. Such tuples can be unpacked into the key and value components in each iteration. 

In [30]:
for name, price in stocks.items():
    print('{}: {}'.format(name, price))

AMZN: 170.4
TSLA: 130.11
TWTR: 32.48
AAPL: 76.6
ORCL: 51.58
GOOG: 1434.23
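
As a small application of this pattern, the sketch below finds the stock with the highest price by unpacking each key-value pair from `items()`. The names `top_name` and `top_price` are hypothetical and used only for illustration.

```python
stocks = {'AMZN': 170.40,
          'TSLA': 130.11,
          'TWTR': 32.48,
          'AAPL': 76.60,
          'ORCL': 51.58,
          'GOOG': 1434.23}

top_name, top_price = None, None            # No candidate before the loop
for name, price in stocks.items():          # Unpack key and value
    if top_price is None or price > top_price:
        top_name, top_price = name, price   # Keep the best pair so far

print('{}: {}'.format(top_name, top_price)) # GOOG: 1434.23
```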


<div class="alert alert-block alert-info">
<b>Question 1:</b>  
    The variable <span style='font-family:Courier'><b>works</b></span> is a dictionary containing a few of Wolfgang Amadeus Mozart's works, where the keys are the Köchel catalogue indexes of music works composed by Mozart, and values are the titles of these works. Write a program to group these works into two dictionaries: <span style='font-family:Courier'><b>concertos</b></span> that contains all concertos, and <span style='font-family:Courier'><b>symphonies</b></span> that contains all symphonies. 
</div>

In [31]:
works = {'K. 162': 'Symphony No. 22 in C major',
         'K. 216': 'Violin Concerto No. 3',
         'K. 218': 'Violin Concerto No. 4',
         'K. 219': 'Violin Concerto No. 5', 
         'K. 550': 'Symphony No. 40 in G minor',
         'K. 551': 'Symphony No. 41 in C major, "Jupiter"'}

works

{'K. 162': 'Symphony No. 22 in C major',
 'K. 216': 'Violin Concerto No. 3',
 'K. 218': 'Violin Concerto No. 4',
 'K. 219': 'Violin Concerto No. 5',
 'K. 550': 'Symphony No. 40 in G minor',
 'K. 551': 'Symphony No. 41 in C major, "Jupiter"'}

## Summary <a id="section3"></a>

### Built-in data structures <a id="subsection3.1"></a>

 <b> </b> | String | List | Tuple | Dictionary
:--------:|:------:|:----:|:-----:|:----------:
 **mutable**  | No     | Yes  |  No   |   Yes 
 **indexing and slicing** | integers | integers | integers | keys (indexing only, no slicing)
 **operators** `+` **and** `*` | Yes | Yes | Yes | No
 **iterable** | Yes | Yes | Yes | Yes 
 **methods** | Yes | Yes | Yes (only `count()` and `index()`) | Yes
 
 
### Parentheses and brackets <a id="subsection3.2"></a>

 <b> </b> | `()` | `[]` | `{}` 
:--------|:------|:----|:-----
    **Usage**  | 1. Enclose input arguments of <br> function and method <br> 2. Create tuples  | 1. Create lists <br> 2. Indexing and slicing  |  1. Dictionary and Set <br>2. Used in f-strings or <br> `format()` method
**Examples**| `print('Hello')` <br> `string.upper()` <br> Empty tuple `()` | `[1, 2, 3]` <br> `string[3:]` <br> `dictionary['key']` | `{'key': 'value'}`
**Remarks** | 1. Cannot be omitted even when <br> there is no input argument <br> 2. Can be omitted when creating tuples | - | Set is not covered in this course