< [Files](ZFiles.ipynb) | [PyFinLab Index](ALWAYS-START-HERE.ipynb) | [About this Lab Book](Welcome.ipynb) >

<a id = "ref00"></a>

<a><img src="figures/UUBS.png" width="180" height="180" border="10" /></a>
<hr>

<h2> Notebook A8: Dictionaries</h2>

<div class="alert alert-block alert-info" style="margin-top: 20px">

<li><a href="#ref1">Aims and Objectives</a></li>
<li><a href="#ref2">Introduction</a></li>
<li><a href="#ref3">Dictionary operations</a></li>
<li><a href="#ref4">Dictionary methods</a></li>
<li><a href="#ref5">Aliasing and copying</a></li>
<li><a href="#ref6">Accumulating Multiple Results in a Dictionary</a></li>
<li><a href="#ref7">Accumulating Results from a Dictionary</a></li>
<li><a href="#ref8">Accumulating the Best Key</a></li>
<li><a href="#ref9">When to use a dictionary</a></li>
<li><a href="#ref10">Glossary</a></li>
<li><a href="#ref11">Exercises</a></li>
<br>
<p></p>
Indicative Completion Time: <strong>3 hrs</strong> (not including video viewing)
</div>

<hr>

<a id="ref1"></a>
<h3>Aims</h3>

To introduce and understand the concept and use of: 
* key-value pairs
* unordered sequences
* parallel construction in lists
* performance benefits and simplicity of dictionaries over parallel lists   
* dictionary iteration (iteration over keys)

<h3>Objectives</h3>

On completion of this notebook you should be able to correctly apply the:
* index operator to add a key-value pair to a dictionary
* del operator to remove an entry
* index operator - retrieval by key
* search - contains in / not in
* .items(),.keys(),.values() methods
* .get() - with a default value
* iteration idioms to parse and sort file content


<a id="ref2"></a>
<h2>Introduction </h2>

The compound data types we have studied in detail so far — strings and lists — are sequential collections. This means that the items in the collection are ordered from left to right and they use integers as indices to access the values they contain. This also means that looking for a particular value requires scanning the many items in the list until you find the desired value.

Data can sometimes be organised more usefully by associating a key with the value we are looking for. For example, if you are asked for the page number for the start of chapter 12 in a large textbook, you might flip around the book looking for the chapter 12 heading. If the chapter number appears in the header or footer of each page, you might be able to find the page number fairly quickly but it’s generally easier and faster to go to the index page and see what page chapter 12 starts on.

This sort of direct look-up of a value is done in Python with an object called a dictionary. Dictionaries are a different kind of collection: they are Python’s built-in mapping type. A map is an associative collection in which values are found by key rather than by position. (Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but its items are still not accessed by integer index.) The association, or mapping, is from a key, which can be of any immutable type (e.g., the chapter name and number in the analogy above), to a value (the starting page number), which can be any Python data object. You will learn how to use these collections in this notebook.



Here is the video collection for this notebook, which introduces some important concepts in the creation and use of Python dictionaries. View the videos more than once, as they lay the foundation for what follows. Videos 1 and 2 provide an introduction to the mechanics of key-value pairs, hash functions and an overview of dictionary <i>collection</i> objects. Videos 3-5 provide a more in-depth study of dictionaries and look at an important example of how they can be used in data analytics. Videos 4 and 5 can be left until you are working through the <a href="#ref6">Accumulating Multiple Results in a Dictionary</a> section of this notebook.  

<a href="http://www.youtube.com/watch?feature=player_embedded&v=OlW_7XL9muI
" target="_blank"><img src="https://i.ytimg.com/vi/0h4fQefIYIs/maxresdefault.jpg" 
alt="Key Value Pairs" width="200" height="180" border="10" /></a>

<p></p>
<center><b>Video 1:</b>  Key value pairs and hashing</center>

<a href="http://www.youtube.com/watch?feature=player_embedded&v=BtGO4RAPLJM
" target="_blank"><img src="https://i.ytimg.com/vi/0h4fQefIYIs/maxresdefault.jpg" 
alt="Key Value Pairs" width="200" height="180" border="10" /></a>

<p></p>
<center><b>Video 2:</b>  Python dictionaries - an overview </center>

<a href="http://www.youtube.com/watch?feature=player_embedded&v=oS0PXyLuEAk
" target="_blank"><img src="https://i.ytimg.com/vi/0h4fQefIYIs/maxresdefault.jpg" 
alt="Key Value Pairs" width="200" height="180" border="10" /></a>

<p></p>
<center><b>Video 3:</b>  Dictionaries in more detail</center>

<a href="http://www.youtube.com/watch?feature=player_embedded&v=89yO8EqlIos
" target="_blank"><img src="https://i.ytimg.com/vi/0h4fQefIYIs/maxresdefault.jpg" 
alt="Key Value Pairs" width="200" height="180" border="10" /></a>

<p></p>
<center><b>Video 4:</b>  Counting with dictionaries </center>

<a href="http://www.youtube.com/watch?feature=player_embedded&v=GqHR05Jn7EM
" target="_blank"><img src="https://i.ytimg.com/vi/0h4fQefIYIs/maxresdefault.jpg" 
alt="Key Value Pairs" width="200" height="180" border="10" /></a>

<p></p>
<center><b>Video 5:</b>  Dictionaries and files</center>

This image provides a useful comparison between dictionaries and lists:

<a><img src="figures/dict1.png" width="380" height="180" border="10" /></a>

<center><b>Figure 1:</b> Comparing a dictionary to a list.</center>
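
To make the comparison in Figure 1 concrete, here is a minimal sketch of the same look-up done first with two parallel lists and then with a dictionary. The names `chapters`, `start_pages` and `page_of`, and the page numbers, are made up for illustration; the dictionary syntax itself is introduced step by step in the cells that follow.

```python
# Parallel lists: the position in one list matches the position in the other
# (hypothetical chapter names and page numbers, for illustration only)
chapters = ["Chapter 10", "Chapter 11", "Chapter 12"]
start_pages = [201, 230, 262]

position = chapters.index("Chapter 12")   # scans the list from the left
page_from_lists = start_pages[position]

# Dictionary: the key leads directly to its value
page_of = {"Chapter 10": 201, "Chapter 11": 230, "Chapter 12": 262}
page_from_dict = page_of["Chapter 12"]

print(page_from_lists, page_from_dict)
```

With parallel lists you have to keep both lists in step by hand; the dictionary keeps each key and its value together.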


Creating a dictionary one key at a time

In [None]:
D1 = {}                   # here the use of {} tells Python to assign an empty dictionary to the variable D1
D1['one'] = 'une'         # now we have added the key 'one' to D1 with value 'une'
D1['two'] = 'deux'        # and another key-value pair
D1['three'] = 'trois'     # and another
D1                        # now let's have a look at D1 as a whole


 Or all at once

In [None]:
D2={'key1':1,'key2':'2','key3':[3,3,3],'key4':(4,4,4),'key5':5,(0,1):6}
D2

 Keys can be strings...

In [None]:
D2['key1']  # 'key1' is a key and it is a string. 
            # If you run this cell you will get the 
            # value associated with this key.

...but they can also be any immutable object; a tuple for instance 

In [None]:
D2[(0,1)]

Notice how each key is separated from its value by a colon `:`. Commas `,` separate the items (i.e., the key/value pairs), and the entire dictionary is enclosed in curly brackets. An empty dictionary is created with just two curly brackets, like this `{}`. Consider this dictionary `release_year_dict`

In [None]:
release_year_dict = {"Thriller":"1982", "Back in Black":"1980", 
                    "The Dark Side of the Moon":"1973", "The Bodyguard":"1992", 
                    "Bat Out of Hell":"1977", "Their Greatest Hits (1971-1975)":"1976", 
                    "Saturday Night Fever":"1977", "Rumours":"1977"}
release_year_dict

Like a list, a dictionary holds a collection of elements. Unlike a list, a dictionary is not indexed by position: each element is represented by a key and its corresponding value, and values are retrieved by key. Every key can have only one value; however, several keys can hold the same value, as can be seen in `release_year_dict`. Keys must be immutable (more precisely, hashable) objects such as strings, numbers, or tuples of immutable items, whereas values can be of any data type.
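
A short sketch of what happens when a mutable object is offered as a key. The dictionary `grid` and its contents are hypothetical, and the `try`/`except` is only there so that the cell keeps running after the error.

```python
grid = {}                      # hypothetical example dictionary
grid[(0, 1)] = "tuple key"     # a tuple is immutable, so this works

try:
    grid[[0, 1]] = "list key"  # a list is mutable, so Python refuses
    list_key_allowed = True
except TypeError as err:
    list_key_allowed = False
    print("TypeError:", err)

print(grid)
```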

It is helpful to visualise the dictionary as a table, as in Figure 2. The first column represents the keys, the second column represents the values.

<a><img src="figures/dict2.png" width="360" height="180" border="10" /></a>

<center><b>Figure 2:</b> Tabular representation of a dictionary</center>  


Consider the dictionary below:

In [None]:
soundtrack_dict = {"The Bodyguard":"1992", "Saturday Night Fever":"1977"}
soundtrack_dict

In the dictionary `soundtrack_dict`, what are the keys?

 <div align="right">
<a href="#Dict1" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict1" class="collapse">

```
The keys are "The Bodyguard" and "Saturday Night Fever" 
```
</div>

In the dictionary `soundtrack_dict`, what are the values?

 <div align="right">
<a href="#Dict2" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict2" class="collapse">

```
The values are "1992" and "1977"
```

</div>

We can retrieve values based on the associated key name:

In [None]:
release_year_dict['Thriller'] 

This corresponds to: 


<a><img src="figures/dict3.png" width="350" height="180" border="10" /></a>
  
 <center><b>Figure 3:</b> Tabular representation of accessing the value for "Thriller"</center>   


Similarly for The Bodyguard     


In [None]:
release_year_dict['The Bodyguard'] 

<a><img src="figures/dict4.png" width="350" height="180" border="10" /></a>
  
<center><b>Figure 4:</b> Accessing the value for the "The Bodyguard"</center>
 




We can add an entry to the dictionary

In [None]:
release_year_dict['Graduation']='2007'
release_year_dict

Or we can delete an entry   

In [None]:
del(release_year_dict['Thriller'])
del(release_year_dict['Graduation'])
release_year_dict

 We can determine if an element is in the dictionary 

In [None]:
'The Bodyguard' in release_year_dict

**Have a go:** 
1. Create a dictionary that keeps track of Ireland's all-time Olympic medal count. Each key of the dictionary should represent the type of medal (gold, silver, or bronze) and each key’s value should be the number of medals of that type Ireland has won. As of December 2018 Ireland has 9 gold medals, 10 silver, and 12 bronze. Create a dictionary saved in the variable `medals` that reflects this information.

In [None]:
# Enter your code here

 <div align="right">
<a href="#Dict5" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict5" class="collapse">

```python
medals = {"gold":9,"silver":10,"bronze":12}
```
 
</div>

2. Someone was keeping track of medals for Italy at the 2016 Rio Summer Olympics. At the time they logged the data, Italy had won 7 gold medals, 8 silver medals, and 6 bronze medals in those Games. Create a dictionary called `olympics` where the keys are the numbers of each type of medal won and the values are the type of medal won.

In [None]:
# Enter your code here

 <div align="right">
<a href="#Dict6" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict6" class="collapse">

```python
olympics = {7:"gold",8:"silver",6:"bronze"}
```

</div>

More on these operations in the next section

<a id="ref3"></a>
<h2>Dictionary operations</h2>

<div align="right"><a href="#ref00">back to top</a></div>

The `del` statement removes a key-value pair from a dictionary. For example, the following dictionary contains the names of various albums and the associated release date. We can remove an entry from the dictionary.

In [None]:
release_year_dict = {"Thriller":"1982", "Back in Black":"1980", 
                    "The Dark Side of the Moon":"1973", "The Bodyguard":"1992", 
                    "Bat Out of Hell":"1977", "Their Greatest Hits (1971-1975)":"1976", 
                    "Saturday Night Fever":"1977", "Rumours":"1977"}

print('before delete')
print(release_year_dict)
del(release_year_dict['Back in Black'])
print('after delete')
print(release_year_dict)

Dictionaries are mutable, as the delete operation above indicates. As we’ve seen before with lists, this means that the dictionary can be modified by referencing an association on the left hand side of the assignment statement. In the previous example, instead of deleting the entry for `"Back in Black"`, we could have edited its value to `"1981"`.

In [None]:
release_year_dict['Back in Black'] = "1981"
release_year_dict
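
The following sketch, using a small hypothetical dictionary `stock`, summarises how assignment and `del` behave: assigning to an existing key replaces its value, assigning to a new key adds an entry, and deleting a key that is not present raises a `KeyError`.

```python
stock = {"pens": 10, "pads": 4}   # hypothetical data

stock["pens"] = 12                # key exists: the value is replaced
stock["clips"] = 50               # key is new: an entry is added
del stock["pads"]                 # key exists: the entry is removed

try:
    del stock["staples"]          # key is missing
    missing_key_error = False
except KeyError:
    missing_key_error = True

print(stock)
print(missing_key_error)
```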

Consider the following dictionary containing information on the number of major awards won by these albums

In [None]:
major_awards_dict = {"Thriller":8, "Back in Black":3, 
                    "The Dark Side of the Moon":3, "The Bodyguard":1, 
                    "Bat Out of Hell":4, "Their Greatest Hits (1971-1975)":0, 
                    "Saturday Night Fever":6, "Rumours":0}
major_awards_dict

Suppose `"Bat Out of Hell"` wins another award. This could be handled as follows

In [None]:
major_awards_dict["Bat Out of Hell"] = major_awards_dict["Bat Out of Hell"] +1 
major_awards_dict
# note what happens to the output if you run this cell multiple times; if only winning awards was this easy.

**Have a go:**
1. Modify the `?????` below to reflect the fact that a new album entry `"Run to the Hills"` has 13 awards
    

In [None]:
major_awards_dict["Run to the Hills"] = major_awards_dict["Bat Out of Hell"] +  major_awards_dict["?????"]
major_awards_dict

 <div align="right">
<a href="#Dict7" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict7" class="collapse">

```python
major_awards_dict["Run to the Hills"] = major_awards_dict["Bat Out of Hell"] + major_awards_dict["Thriller"]
```

</div>

2. Update the `major_awards_dict` dictionary to fully account for the fact that one of the awards intended for `"Thriller"` was accidentally allocated to `"Saturday Night Fever"`

In [None]:
# Enter your code here



 <div align="right">
<a href="#Dict8" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict8" class="collapse">

```python
major_awards_dict["Thriller"] = major_awards_dict["Thriller"] + 1

major_awards_dict["Saturday Night Fever"] = major_awards_dict["Saturday Night Fever"] - 1
```

</div>

<a id="ref4"></a>
<h2>Dictionary methods</h2>

<div align="right"><a href="#ref00">back to top</a></div>

Dictionaries have a number of useful built-in methods. The following table provides a summary and more details can be found in the [Python Documentation](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict).


| Method 	| Parameters 	| Description                                             	|
|--------	|------------	|----------------------------------------------------------	|
| keys   	| none       	| Returns a view of the keys in the dictionary            	|
| values 	| none       	| Returns a view of the values in the dictionary          	|
| items  	| none       	| Returns a view of the key-value pairs in the dictionary 	|
| get    	| key        	| Returns the value associated with key; None otherwise   	|
| get    	| key,alt    	| Returns the value associated with key; alt otherwise    	|

<p></p>
<center><b>Table 1</b></center>


As we saw earlier with strings and lists, dictionary methods use dot `.` notation, which specifies the name of the method to the right of the dot and the name of the object on which to apply the method immediately to the left of the dot. The empty parentheses in the case of `keys` indicate that this method takes no parameters. If `x` is a variable whose value is a dictionary, `x.keys` is the method object, and `x.keys()` invokes the method, returning a view of the keys.

The `keys` method returns the keys. In Python 3.7 and later they come back in the order in which they were added to the dictionary; older versions of Python made no promise about the order.
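
A minimal sketch of how a view behaves, using a hypothetical dictionary `prices`. It assumes Python 3.7 or later, where dictionaries keep their keys in insertion order. A view stays linked to the dictionary and reflects later changes, whereas `list(...)` takes a snapshot.

```python
prices = {"tea": 2, "coffee": 3}    # hypothetical data

key_view = prices.keys()            # a view: stays linked to the dictionary
key_list = list(prices.keys())      # a list: a snapshot taken now

prices["cocoa"] = 4                 # add an entry afterwards

print(list(key_view))               # includes 'cocoa'
print(key_list)                     # does not include 'cocoa'
```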



Take some time to evaluate the code below before running it. What do you think it is doing?

In [None]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

for akey in inventory.keys():     # keys arrive in insertion order (Python 3.7+)
    print("Got key", akey, "which maps to value", inventory[akey])

print(list(inventory.keys()))


It is so common to iterate over the keys in a dictionary that you can omit the `keys` method call in the `for` loop; iterating over a dictionary implicitly iterates over its keys.

In [None]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

for k in inventory:
    print("Got key", k, "which maps to value", inventory[k])


The `values` and `items` methods are similar to the `keys` method. They return view objects that can be iterated (looped) over. Note that each object yielded by `items` is a tuple containing a key and its associated value.

In [None]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

print(list(inventory.values()))
print(list(inventory.items()))

for k, v in inventory.items():    # each item is a (key, value) tuple
    print("Got",k,"that maps to",v)


The `in` and `not in` operators can test if a key is in the dictionary:

In [None]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}
print('apples' in inventory)
print('cherries' in inventory)

if 'plums' in inventory:
    print(inventory['plums'])
else:
    print("We have no plums")


The `get` method allows us to access the value associated with a key, similar to the `[ ]` operator. The important difference is that `get` will not cause a runtime error if the key is not present; it returns `None` instead. A variation of `get` takes a second parameter that serves as an alternative return value when the key is not present. This can be seen in the final example below: since “cherries” is not a key, `get` returns 0 (instead of `None`).

In [None]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

print(inventory.get("apples"))
print(inventory.get("cherries"))

print(inventory.get("cherries",0))


**Have a go:** 

1. Edit the code below to replace `?????` with appropriate content to ensure the result of running the code is 3

In [None]:
animals = {"cat":12, "dog":6, "elephant":23, "bear":20}
answer = animals.get("elephant")//animals.get(?????)
print(answer)

<div align="right">
<a href="#Dict1133" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict1133" class="collapse">

```python
answer = animals.get("elephant")//animals.get("dog")

```

</div>

2. Activate one of the two commented lines in the code below to ensure the result of running the code is True.

In [None]:
animals = {"cat":12, "dog":6, "elephant":23, "bear":20}
#print("dog" in animals)
#print(23 in animals)

<div align="right">
<a href="#Dict1133dd" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict1133dd" class="collapse">

```python

animals = {"cat":12, "dog":6, "elephant":23, "bear":20}
print("dog" in animals)

```

</div>

3. Implement indenting in the code below to ensure it runs and returns 43

In [None]:
total = 0
animals = {"cat":12, "dog":6, "elephant":23, "bear":20}
for akey in animals:
if len(akey) > 3:
total = total + animals[akey]
print(total)

<div align="right">
<a href="#Dict33" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict33" class="collapse">

```python
total = 0
animals = {"cat":12, "dog":6, "elephant":23, "bear":20}
for akey in animals:
    if len(akey) > 3:
        total = total + animals[akey]
print(total)

```

</div>

4. Every four years the Summer Olympics are held in a different country. Add a key-value pair to the dictionary `places` to reflect the fact that the 2016 Olympics were held in Brazil. Don't rewrite the entire dictionary to do this.

In [None]:
places = {"Australia":2000, "Greece":2004, "China":2008, "England":2012}
# Enter your code here



<div align="right">
<a href="#Dict3" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict3" class="collapse">

```python
places['Brazil'] = 2016
```

</div>

5. We have a dictionary of the specific events that Italy has won medals in and the number of medals they have won for each event. Assign to the variable `events` a list of the keys from the dictionary `medal_events`. Don't hard code this.

In [None]:
medal_events = {'Shooting': 7, 'Fencing': 4, 'Judo': 2, 'Swimming': 3, 'Diving': 2}
# Enter your code here
events

 <div align="right">
<a href="#Dict4" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict4" class="collapse">

```python
events = list(medal_events.keys())
```

</div>

<a id="ref5"></a>
<h2>Aliasing and copying</h2>

<div align="right"><a href="#ref00">back to top</a></div>

Because dictionaries are mutable, you need to be aware of aliasing (as we saw with lists). Whenever two variables refer to the same dictionary object, changes to one affect the other. For example, `opposites` is a dictionary that contains pairs of opposites.

In [None]:
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
alias = opposites

print(alias is opposites)

alias['right'] = 'left'
print(opposites['right'])

As you can see from the `is` operator, `alias` and `opposites` refer to the same object. If you want to modify a dictionary and keep a copy of the original, use the dictionary `copy` method. Since the variable `acopy` is a copy of the dictionary, changes to it will not affect the original.

In [None]:
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
acopy = opposites.copy()
acopy['right'] = 'left'    # here we have made a change to acopy


print('opposites has not been changed in the process',opposites)
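
One caveat worth knowing: `copy` makes a shallow copy. The sketch below uses a hypothetical dictionary `baskets` whose values are lists, to show that the copy gets its own set of keys but shares the list objects with the original.

```python
baskets = {"fruit": ["apple"], "veg": ["leek"]}   # hypothetical data
shallow = baskets.copy()

shallow["dairy"] = ["milk"]          # new key: only the copy changes
shallow["fruit"].append("pear")      # shared list: both dictionaries see it

print(baskets)
print(shallow)
```

When the values are themselves mutable and must be independent, `copy.deepcopy` from the standard library is one option.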

**Have a go:**
1. Run the cell below but try to predict the output before doing so. Then fix it according to what you think *fix* might mean in this context.

In [None]:
myanimals = {"cat":12, "dog":6, "elephant":23, "bear":20}
youranimals = myanimals
youranimals["elephant"] = 999
print(myanimals["elephant"])

<div align="right">
<a href="#Dict393" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict393" class="collapse">

```python

myanimals = {"cat":12, "dog":6, "elephant":23, "bear":20}
youranimals = myanimals.copy()
youranimals["elephant"] = 999
print(myanimals["elephant"])

```

</div>

<a id="ref6"></a>
<h2>Accumulating Multiple Results in a Dictionary</h2>

<div align="right"><a href="#ref00">back to top</a></div>

You have previously seen the accumulator pattern; it goes through the items in a sequence, updating an accumulator variable each time. Rather than accumulating a single result, it’s possible to accumulate many results. Suppose, for example, we wanted to find out which letters are used most frequently in English.

For convenience we are going to use a paragraph of text to perform this task. The paragraph just written seems as good a choice as any.

In [None]:
txt = "You have previously seen the accumulator pattern; \
it goes through the items in a sequence, updating an \
accumulator variable each time. Rather than accumulating \
a single result, it’s possible to accumulate many results. \
Suppose, for example, we wanted to find out which letters \
are used most frequently in English."

print(txt)

# The "\" symbol is used here so we can display the content of the string 
# variable "txt" within the page width. Without it txt would display 
# as in the cell below because we are using a Code cell to define the 
# variable, as we must.

In [None]:
txt = "You have previously seen the accumulator pattern; it goes through the items in a sequence, updating an accumulator variable each time. Rather than accumulating a single result, it’s possible to accumulate many results. Suppose, for example, we wanted to find out which letters are used most frequently in English."

We can accumulate counts for more than one character as we traverse the text. Suppose, for example, we wanted to compare the counts of `t` and `s` in the text.

In [None]:

t_count = 0 #initialise the t accumulator variable
s_count = 0 #initialise the s accumulator as well
for c in txt:
    if c == 't':
        t_count = t_count + 1   #increment the t counter
    elif c == 's':
        s_count = s_count + 1   #increment the s counter
print("t: " + str(t_count) + " occurrences")
print("s: " + str(s_count) + " occurrences")


That worked but you can hopefully see that this is going to get tedious if we try to accumulate counts for all letters. We will have to initialise a lot of accumulators, and there will be a very long `if..elif..elif` statement. Using a dictionary, we can do much better.

A single dictionary can hold all of the accumulator variables. Each key in the dictionary will be one letter, and the corresponding value will be the count so far of how many times that letter has occurred.

In [None]:
x = {} # start with an empty dictionary
x['t'] = 0  # initialise the t counter
x['s'] = 0  # initialise the s counter
for c in txt:
    if c == 't': # this is line 5
        x['t'] = x['t'] + 1  # increment the t counter
    elif c == 's':
        x['s'] = x['s'] + 1  # increment the s counter

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")


This hasn’t exactly improved things yet, but look closely at lines 5-8 in the code above. Whichever character we’re seeing, `t` or `s`, we’re incrementing the counter for that character. So lines 6 and 8 could really be the same.

In [None]:
x = {} # start with an empty dictionary
x['t'] = 0  # initialise the t counter
x['s'] = 0  # initialise the s counter
for c in txt:
    if c == 't':
        x[c] = x[c] + 1   # increment the t counter
    elif c == 's':        # this is line 7
        x[c] = x[c] + 1   # increment the s counter

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")


Lines 6 and 8 above may seem a little confusing at first. Previously, our assignment statements referred directly to keys, with `x['s']` and `x['t']`. Here we are just using a variable `c` whose value is ‘s’ or ‘t’, or some other character.

Note that, as with all assignment statements, the right side is evaluated first. In this case `x[c]` has to be evaluated. As with all expressions, we first have to substitute values for variable names. `x` is a variable bound to a dictionary. `c` is a variable bound to one letter from the string that `txt` is bound to (that’s what the `for` statement says to do: execute lines 5-8 once for each character in `txt`, with the variable `c` bound to the current character on each iteration.) So, let’s suppose that the current character is the letter s (we are on line 8). Then `x[c]` looks up the value associated with the key `s` in the dictionary `x`. If all is working correctly, that value should be the number of times ‘s’ has previously occurred. For the sake of argument, suppose it’s 14. Then the right side evaluates to 14 + 1, 15. 

Now we have assigned the value 15 to `x[c]`. That is, in dictionary `x`, we set the value associated with the key `s` (the current value of the variable `c`) to be 15. In other words, we have incremented the value associated with the key `s` from 14 to 15.
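
The evaluation described above can be traced with a tiny sketch. It uses a fresh, one-entry `x`, and the starting count of 14 is the supposed value from the text, not a count taken from `txt`.

```python
x = {'s': 14}          # suppose 's' has been seen 14 times so far
c = 's'                # the current character

right_side = x[c] + 1  # the right side is evaluated first: 14 + 1
x[c] = right_side      # then the result is stored under the key 's'

print(x)
```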

We can do better still. One other nice thing about using a dictionary is that we don’t have to prespecify what all the letters will be. In this case, we know in advance what the alphabet for English is, but later in this notebook we will count the occurrences of words, and we do not know in advance all of the words that may be used. Rather than pre-specifying which letters to keep accumulator counts for, we can start with an empty dictionary and add a counter to the dictionary each time we encounter a new thing that we want to start keeping count of.

In [None]:
x = {} # start with an empty dictionary
for c in txt:
    if c not in x:
        # we have not seen this character before, so initialise a counter for it
        x[c] = 0

    #whether we've seen it before or not, increment its counter
    x[c] = x[c] + 1 # this is line 8

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")


Notice that in the `for` loop, we no longer need to explicitly ask whether the current letter is an ‘s’ or a ‘t’. The increment step on line 8 works for the counter associated with whatever the current character is. Our code is now accumulating counts for all letters, not just ‘s’ and ‘t’. Here's what `x` looks like now... 

In [None]:
print(x)
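
As an aside, the `get` method described earlier offers a more compact way to write the same kind of accumulation: `get(c, 0)` returns the current count, or 0 if the character has not been seen yet. This is a sketch of an alternative, using a short hypothetical string `sample` rather than `txt`.

```python
sample = "banana"                 # hypothetical short string
compact_counts = {}

for c in sample:
    # current count (0 if c is new), plus one for this occurrence
    compact_counts[c] = compact_counts.get(c, 0) + 1

print(compact_counts)
```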

**Have a go:**
    
1. Which of the following will print out `True` if there are more occurrences of `e` than `t` in `txt`, and `False` if `t` occurred more frequently?

```python
A. print(txt['e'] > txt['t'])
B. print(x['e'] > x['t'])
C. print(x[e] > x[t])
D. print(x[c] > txt[c])
E. print(e[x] > t[x])

```

In [None]:
x # this will print x out explicitly allowing you to view the results of the accumulating process

# you can experiment here to help you check by removing the #Letter# below one at a time

#A# print(txt['e'] > txt['t'])
#B# print(x['e'] > x['t'])
#C# print(x[e] > x[t])
#D# print(x[c] > txt[c])
#E# print(e[x] > t[x])

 <div align="right">
<a href="#Dict100" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict100" class="collapse">

```python
B. print(x['e'] > x['t'])
```

</div>

Note that the `print` statements on lines 10 and 11 of the original code pick out the specific keys `t` and `s`. We can generalise that too, to print out the occurrence counts for all of the characters, using a `for` loop to iterate through the keys in `x`.

In [None]:
letter_counts = {} # start with an empty dictionary
for c in txt:
    if c not in letter_counts:
        # we have not seen this character before, so initialise a counter for it
        letter_counts[c] = 0

    #whether we've seen it before or not, increment its counter
    letter_counts[c] = letter_counts[c] + 1

for c in letter_counts.keys():
    print(c + ": " + str(letter_counts[c]) + " occurrences")


Note that only those letters (characters) that actually occur in the text are shown. The most frequently occurring character was actually ' ', the space character; this fact is easy to determine by inspection because the dictionary `letter_counts` is relatively short. In everyday data analytics problems this will not be so easy to determine by inspection, so it will need to be answered using code. But that is precisely what we are here for! More on this later.

2. Create a dictionary called `char_d` from the string `stri`, so that the key is a character and the value is how many times it occurs.

In [None]:
stri = "what can I do to solve this problem?"

# Enter your code here

 <div align="right">
<a href="#Dict10" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict10" class="collapse">

```python

char_d = {}
for c in stri:
    if c not in char_d:
        char_d[c] = 0
    char_d[c] = char_d[c] + 1

print(char_d)
```

</div>

3. In the cell below is a string, saved to the variable name `sentence`. Split the string into a list of words, then create a dictionary that contains each word and the number of times it occurs. Save this dictionary to the variable name `word_counts`.

In [None]:
sentence = "The dog chased the rabbit into the forest but the rabbit was too quick."

# Enter your code here

 <div align="right">
<a href="#Dict9" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict9" class="collapse">

```python
split_sentence = sentence.split()
word_counts = {}

for word in split_sentence:
    if word not in word_counts:
        word_counts[word] = 0
    word_counts[word] = word_counts[word] + 1  
    
print(word_counts)

```

</div>

<a id="ref7"></a>
<h2>Accumulating Results from a Dictionary</h2>

<div align="right"><a href="#ref00">back to top</a></div>

Just as we have iterated through the elements of a list to accumulate a result, we can also iterate through the keys in a dictionary, accumulating a result that may depend on the values associated with each of the keys.

For example, suppose that we wanted to compute a Scrabble score for the `txt` string earlier. Each occurrence of the letter ‘e’ earns 1 point, but ‘q’ earns 10 points. We have a second dictionary, stored in the variable `letter_values` below. Now, to compute the total score, we start an accumulator at 0 and go through each of the letters in the dictionary of counts, `x`. For each of those letters that has a letter value (no points for spaces, punctuation, capital letters, etc.), we add to the total score.

In [None]:
x = {} # start with an empty dictionary
for c in txt:
    if c not in x:
        # we have not seen this character before, so initialise a counter for it
        x[c] = 0

    # whether we've seen it before or not, increment its counter
    x[c] = x[c] + 1

letter_values = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f':4, 'g': 2, 'h':4, 'i':1, 'j':8, 'k':5, 'l':1, 'm':3, 'n':1, 'o':1, 'p':3, 'q':10, 'r':1, 's':1, 't':1, 'u':1, 'v':8, 'w':4, 'x':8, 'y':4, 'z':10}

tot = 0
for y in x:
    if y in letter_values:
        tot = tot + letter_values[y] * x[y] # this is line 15

print(tot)


Line 15 is the tricky one. We are updating the variable `tot` to have its old number plus the score for the current letter times the number of occurrences of that letter. Try changing some of the letter values and see how it affects the total. Try changing `txt` to be just a single word that you might play in Scrabble. What about `"supercalifragilisticexpialidocious"`?
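
If you would like to experiment with single words, the same accumulation can be wrapped in a function. The sketch below is only an illustration: the function name `scrabble_score` and the test words are made up for this note, the letter values are copied from the cell above, and `.get()` with a default of 0 stands in for the explicit `in` checks.

```python
# Letter values copied from the cell above
letter_values = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2,
                 'h': 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1,
                 'o': 1, 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1, 'u': 1,
                 'v': 8, 'w': 4, 'x': 8, 'y': 4, 'z': 10}

def scrabble_score(word):
    """Hypothetical helper: total the letter values of the characters in word."""
    # Step 1: count how often each character occurs
    counts = {}
    for c in word:
        counts[c] = counts.get(c, 0) + 1

    # Step 2: accumulate value * count; characters without a value score 0
    tot = 0
    for letter in counts:
        tot = tot + letter_values.get(letter, 0) * counts[letter]
    return tot

print(scrabble_score("quiz"))     # 22
print(scrabble_score("a quiz!"))  # 23 (the space and '!' score nothing)
```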

**Have a go**

1. The dictionary `travel` contains the number of countries within each continent that Claire has travelled to. Find the total number of countries Claire has been to, and save this number to the variable name `total`. Don't hard code this.

In [None]:
travel = {"North America": 2, "Europe": 8, "South America": 3, "Asia": 4, "Africa":1, "Antarctica": 0, "Australia": 1}

#Enter your code here

 <div align="right">
<a href="#Dict11" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict11" class="collapse">

```python
total = 0
for continent in travel:
    total = total + travel[continent]
print(total)

```

</div>

2. `schedule` is a dictionary where a course name is a key and its value is how many credits it is worth. Go through and accumulate the total number of credits that have been earned so far by a student who has successfully completed `SPANISH 103`, `SPANISH 231` and `ANTHRO 101`. Assign the result to the variable `total_credits`.

In [None]:
schedule = {"UARTS 150": 3, "SPANISH 103": 4, "ENGLISH 125": 4, "SI 110": 4, 
            "ENS 356": 2, "WOMENSTD 240": 4, "SI 106": 4, "BIO 118": 3, 
            "SPANISH 231": 7, "PSYCH 111": 4, "LING 111": 3, "SPANISH 232": 4, 
            "STATS 250": 4, "SI 206": 4, "COGSCI 200": 4, "AMCULT 202": 4, 
            "ANTHRO 101": 4}

# Enter your code here

 <div align="right">
<a href="#Dict12" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict12" class="collapse">

```python
total_credits = 0
completed_courses = ["SPANISH 103","SPANISH 231","ANTHRO 101"]
for course in completed_courses:
    total_credits = total_credits + schedule[course]
    
total_credits
```

</div>

<a id="ref8"></a>
<h2>Accumulating the Best Key</h2>

<div align="right"><a href="#ref00">back to top</a></div>

What if you want to find the key associated with the maximum value? It would be nice to just find the maximum value and then look up the key associated with it, but dictionaries don’t work that way. You can look up the value associated with a key, but not the key associated with a value. (The reason is that more than one key may have the same value.)
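
To see why a reverse lookup is ambiguous, consider the small, made-up dictionary `scores` below, in which two keys share the maximum value. This sketch finds the maximum value with the built-in `max` function and then collects every key that maps to it. All of the names are illustrative only.

```python
scores = {'ann': 7, 'bob': 9, 'cat': 9, 'dan': 4}  # invented example data

highest = max(scores.values())   # the largest value in the dictionary

# Collect every key whose value equals the maximum
keys_with_highest = []
for k in scores:
    if scores[k] == highest:
        keys_with_highest.append(k)

print(highest)            # 9
print(keys_with_highest)  # ['bob', 'cat']
```

Because two keys come back, the question "which key has the value 9?" has no single answer.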

The trick is to have the accumulator keep track of the best key so far instead of the best value so far. For simplicity, let’s assume that there is at least one key in the dictionary. Then, similar to our first version of computing the maximum of a list, we can initialise the best-key-so-far to be the first key, and loop through the keys, replacing the best-key-so-far whenever we find a better one.

In the first exercise below, I have provided some skeleton code. See if you can fill it in. An answer is provided, but you’ll learn more if you try to write it yourself first.

**Have a go:**

1. Write a program that finds the key in a dictionary that has the maximum value. If two keys have the same (maximum) value, it’s OK to print out either one. Fill in the missing code.

In [None]:
d = {'a': 194, 'b': 54, 'c':34, 'd': 44, 'e': 312, 'full':31}

ks = d.keys()
best_key_so_far = list(ks)[0]  # this turns ks into a real list before using [] to select an item
for k in ks:
    if d[k] > ?????????:     # This line is where you need to enter the correct code to make this script work properly
        best_key_so_far = k

print("key " + best_key_so_far + " has the highest value, " + str(d[best_key_so_far]))


 <div align="right">
<a href="#Dict13" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict13" class="collapse">

```python
????????? = d[best_key_so_far]
```

</div>

2. Create a dictionary called `d` that keeps track of all the characters in the string `Game_of_Thrones` and notes how many times each character was seen. Then, find the key with the lowest value in this dictionary and assign that key to `min_value`.

In [None]:
Game_of_Thrones = "The Dark Hedges is an avenue of large mature \
beech trees, which were planted by James Stuart to frame an \
avenue to his home. The trees were planted around 1775 when he \
built nearby Gracehill House. The trees are on both sides of \
the road, forming a “tunnel” that is between 6 and 10 meters \
in width."

#Enter your code here


 <div align="right">
<a href="#Dict14" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict14" class="collapse">

```python
d = {}
for c in Game_of_Thrones:
    if c not in d:
        d[c] = 0 
    d[c] = d[c] + 1     # this for loop creates the required dictionary
    
ks = d.keys()
min_value = list(ks)[0]  # this turns ks into a real list before 
                         # using [] to select an item
for k in ks:
    if d[k] < d[min_value]: # replace the best key so far whenever
                            # a key with a smaller count is found
        min_value = k

print("key " + min_value + " has the lowest value, "
      + str(d[min_value]))
```

</div>

3. Create a dictionary called `lett_d` that keeps track of all of the characters in the string `product` and notes how many times each character was seen. Then, find the key with the highest value in this dictionary and assign that key to `max_value`.

In [None]:
product = "iphone or android phone, what's your preference?"

#Enter your code here

 <div align="right">
<a href="#Dict144" class="btn btn-default" data-toggle="collapse">Click here for the answer</a>

</div>
<div id="Dict144" class="collapse">

```
No explicit answer given to this one as you should be able 
to use what has gone before to write code for this task. 
You can check your answer by direct observation since the 
string "iphone or android phone, what's your preference?" 
is not very long.

```

</div>

Checking your answer to this task manually is a good idea, even if you feel you successfully completed the code. Doing so should give you a healthy respect for the convenience that coding brings to tasks like this, because even in a short phrase such as `"iphone or android phone, what's your preference?"` you will probably have found it quite tedious to keep track and come up with an answer manually. Computers are so much better at this than humans!

<a id="ref9"></a>
<h2>When to use a Dictionary</h2>

<div align="right"><a href="#ref00">back to top</a></div>

Now that you have experience using lists and dictionaries, it will often be necessary to decide which one is best to use in a particular situation. The following guidelines will help you recognise when a dictionary will be beneficial:

When a piece of data consists of a set of properties of a single item, a dictionary is often better. For instance, you could try to remember that a customer's postcode is located at index 2 in a list, but your code will be easier to read and you will make fewer mistakes if you can look up `mydiction['postcode']` rather than `mylst[2]`.
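
As a small illustration, the same customer record is stored below first as a list and then as a dictionary. The field names and values are invented for this sketch.

```python
# One customer stored as a list: you must remember what each index means
customer_lst = ["Claire", "Exampletown", "AB1 2CD"]

# The same customer stored as a dictionary: each property has a name
customer_d = {"name": "Claire", "city": "Exampletown", "postcode": "AB1 2CD"}

print(customer_lst[2])         # which field is index 2?
print(customer_d["postcode"])  # the key says what you are asking for
```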

When you have a collection of data pairs, and you will often have to look up one of the pairs based on its first value, it is better to use a dictionary than a list of `(key, value)` tuples. With a dictionary, you can find the value for any `(key, value)` tuple by looking up the key. With a list of tuples you would need to iterate through the list, examining each pair to see if it had the key that you want.
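
The difference between the two lookups can be sketched as follows. The data and the variable names (`pairs`, `stock`, `found`) are made up for this example.

```python
pairs = [("apple", 3), ("banana", 5), ("cherry", 7)]  # invented data

# List of tuples: examine each pair until the first element matches
found = None
for key, value in pairs:
    if key == "cherry":
        found = value
        break
print(found)            # 7

# Dictionary: a single lookup by key
stock = dict(pairs)
print(stock["cherry"])  # 7
```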

On the other hand, if you have a collection of data pairs where multiple pairs share the same first data element, then you can’t use a dictionary, because a dictionary requires all the keys to be distinct from each other.
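
The sketch below, using invented data, shows what happens if you convert such pairs directly: the later pair silently overwrites the earlier one. The second half shows one possible workaround, which is an addition for illustration and not something the guideline above requires: map each key to a list of values.

```python
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "banana")]

# Converting directly keeps only the last value seen for each key
direct = dict(pairs)
print(direct)   # {'fruit': 'banana', 'veg': 'carrot'}

# One possible workaround: accumulate a list of values under each key
grouped = {}
for key, value in pairs:
    if key not in grouped:
        grouped[key] = []
    grouped[key].append(value)
print(grouped)  # {'fruit': ['apple', 'banana'], 'veg': ['carrot']}
```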

<a id="ref10"></a>
<h2>Glossary</h2>

<div align="right"><a href="#ref00">back to top</a></div>

**dictionary:**  a collection of key-value pairs that maps from keys to values. The keys can be any immutable type, and the values can be any type.

**key:**  a data item that is mapped to a value in a dictionary. Keys are used to look up values in a dictionary.

**value:**  the value that is associated with each key in a dictionary.

**key-value pair:**  one of the pairs of items in a dictionary. Values are looked up in a dictionary by key.

**mapping type:**  a mapping type is a data type comprised of a collection of keys and associated values. Python’s only built-in mapping type is the dictionary. Dictionaries implement the [associative array](https://en.wikipedia.org/wiki/Associative_array) abstract data type.
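
The dictionary entry above says that keys can be any immutable type. A minimal sketch of what that means in practice, with made-up names: a tuple is accepted as a key, whereas a list (which is mutable) raises a `TypeError`. The `try`/`except` is used here only so that the cell keeps running after the error.

```python
points = {}
points[(1, 2)] = "A"   # a tuple is immutable, so it can be a key
print(points[(1, 2)])  # A

list_key_failed = False
try:
    points[[1, 2]] = "B"   # a list is mutable, so Python refuses it
except TypeError as err:
    list_key_failed = True
    print("TypeError:", err)
```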

<a id="ref11"></a>
<h2>Exercises</h2>

<div align="right"><a href="#ref00">back to top</a></div>

The albums 'Back in Black', 'The Bodyguard' and 'Thriller' have the following music recording sales in millions 50, 50 and 65, respectively:

1. Create a dictionary `album_sales_dict` where the keys are the album name and the sales in millions are the values. 

In [None]:
# write your code here



<div align="right">
<a href="#q9" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q9" class="collapse">

```python
album_sales_dict= { "The Bodyguard":50, "Back in Black":50,
                    "Thriller":65}
```
</div>

2. Use the dictionary created in (1.) to find the total sales of "Thriller".

In [None]:
# write your code here



<div align="right">
<a href="#q10b" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q10b" class="collapse">

```python
album_sales_dict["Thriller"]
```

</div>

3. Find the names of the albums from the dictionary using the `keys` method.

In [None]:
# write your code here



<div align="right">
<a href="#q10c" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q10c" class="collapse">

```python

album_sales_dict.keys()

```

</div>

4. Find the recording sales from the dictionary using the `values` method.

In [None]:
# write your code here

<div align="right">
<a href="#q10d" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q10d" class="collapse">

```python
album_sales_dict.values()
```
</div>

5. Try to predict what the following lines of code would print.

In [None]:
d = {'spring': 'autumn', 'autumn': 'fall', 'fall': 'spring'}
print(d['autumn'])

<div align="right">
<a href="#q10dff" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q10dff" class="collapse">

```python
fall
```
</div>

6. In order to get the following code to print 'failure', what should the value of `?` be?

In [None]:
d = { 'work': 'success', 'success': 'failure', 'failure': 'money', 'time': 'work', 'industry': 'time'}
d[d[?]]

<div align="right">
<a href="#q1a" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q1a" class="collapse">

```
replace `?` with 'work'
```

</div>

7. Consider the block of code

```python
d =  {'a': 2, 'b': 3, 'c': 1}
e = {}
for c in d:
    e[d[c]] = c
print(e)
```

Without running it decide which of the following options accurately conveys what it does. 

* A. It creates a new copy of `d`.
* B. It creates a new dictionary which swaps the keys and values in `d`.
* C. It throws an error.
* D. It creates a new dictionary which maps each of `d`'s keys to itself.
* E. It creates a new dictionary which maps each of `d`'s values to itself.

<div align="right">
<a href="#q1b" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q1b" class="collapse">

```
The answer is B. d[c] gets the value stored in dictionary d under the key c. In dictionary e, that value d[c] becomes the key and c becomes its value.
```

</div>

8. Consider the following code.

In [None]:
alphabet = 'abcdefghijklmnopqrstuvwxyz'
values = {}
answer = 0
for i in range(len(alphabet)):
    values[?????] = i+1
    answer += values[?????]
print('The value of answer is',answer,' and values =',values)

Replace `?????` in the code so that it produces the following output (wrapped here for readability):

```
The value of answer is 351  and values = {'a': 1, 'b': 2, 'c': 3, 'd': 4,
   'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12,
   'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19,
   't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
```

<div align="right">
<a href="#q1c" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q1c" class="collapse">

```
replace `?????` with `alphabet[i]`
```

</div>

9. The table below contains a basic vocabulary map from English to Pirate. Write a program that asks the user to input a sentence (to include several of the English words in the table) and then translates the sentence to Pirate.


| English    | Pirate        |
|------------|---------------|
| sir        | matey         |
| hotel      | fleabag inn   |
| student    | swabbie       |
| boy        | matey         |
| madam      | proud beauty  |
| professor  | foul blaggart |
| restaurant | galley        |
| your       | yer           |
| excuse     | arr           |
| students   | swabbies      |
| are        | be            |
| lawyer     | foul blaggart |
| the        | th’           |
| restroom   | head          |




In [None]:
# Write your code here



<div align="right">
<a href="#q1d" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q1d" class="collapse">

```python
str_to_translate = input('Enter a sentence to translate from English to Pirate: \nThe English version \n')

# For instance, you could copy and paste:
# "I see you are enjoying your stay at the hotel 
# with the students and the professor and you all 
# ate at the restaurant with madam and boy"

english_to_pirate = {'sir':'matey',
                  'hotel':'fleabag inn',
                  'student': 'swabbie',
                  'boy': 'matey',
                  'madam':'proud beauty',
                  'professor': 'foul blaggart',
                  'restaurant': 'galley',
                  'your':'yer',
                  'excuse':'arr',
                  'students':'swabbies',
                  'are':'be',
                  'lawyer':'foul blaggart',
                  'the':'th’',
                  'restroom':'head'}

split_string = str_to_translate.split() 
pirate_string = split_string.copy()
for i in range(len(split_string)):   # i is the position of each word
    if split_string[i] in english_to_pirate:
        pirate_string[i] = english_to_pirate[split_string[i]]
    
print("\nThe Pirate version\n",' '.join(pirate_string))

```

</div>

10. Write a program that finds the most used seven-letter word in `titanic.txt`. This file should be saved in the same directory/folder that you are using to access this notebook.

In [None]:
# Write your code here



<div align="right">
<a href="#q1e" class="btn btn-default" data-toggle="collapse">Click here for the solution</a>

</div>
<div id="q1e" class="collapse">

```python
# open the file, read all of its text, and close it again automatically
with open('titanic.txt', 'r') as f:
    contents = f.read()
d = {}

for w in contents.split():
    if len(w) == 7:
        if w not in d:
            d[w] = 1
        else:
            d[w] = d[w] + 1

dkeys = list(d.keys())
most_used = dkeys[0]

for k in dkeys:
    if d[k] > d[most_used]:
        most_used = k

print("The most used word is '" + most_used + "', which appears "
      + str(d[most_used]) + " times")
```


</div>

<div align="right"><a href="#ref00">back to top</a></div>