## 1.0  Introduction to Dictionaries

Let's create a variable, temps, containing a list of the temperatures of the following cities:  Amsterdam, Hong Kong, London, San Francisco, Singapore.  The figures were taken from a website on May 9, 2018.

In [1]:
temps = [17, 26, 9, 20, 34]

In [2]:
temps[4]  #use index to extract temperature of Singapore

34

Note that with a list, you must remember the index position of each city to extract its temperature.  The alternative is to use a dictionary, where each temperature is attached to the name of its city.
The data for a dictionary is entered in pairs, called key-value pairs.  In this case, the city is the key and the city's temperature is the value.
A dictionary is defined by enclosing the key-value pairs in curly brackets:  {  }
The illustration below is a dictionary of the temperatures of the cities listed above.

In [13]:
cityTemps = {'Amsterdam':17, 'Hong Kong':26, 'London':9,
             'San Francisco':20, 'Singapore':34}

To extract the temperature of a city from the dictionary, use the city name as a subscript to the variable.
The general syntax for accessing a value is d['key'], where d is the name of the dictionary (cityTemps in this illustration) and 'key' is a specific city name.

In [14]:
cityTemps['Singapore']

34

Below are two alternative methods of defining a dictionary.  The first method uses the function dict and assigns the values to the keys using the "=" sign.  Because the keys are written as keyword names, they cannot contain spaces, which is why 'Hong Kong' becomes HongKong in this form.
The second method takes two lists, one containing the keys (city) and the other the values (temps), and combines them using the zip function.  The combined pairs are then turned into a dictionary by the dict function.

In [14]:
cityTempsEqual = dict(Amsterdam=17, HongKong=26, London=9,
             SanFrancisco=20, Singapore=34)
cityTempsEqual['HongKong']

26

In [30]:
temps = [17, 26, 9, 20, 34]
city = ['Amsterdam', 'Hong Kong', 'London', 'San Francisco', 'Singapore']
cityTempsZip = dict(zip(city,temps))
cityTempsZip

{'Amsterdam': 17,
 'Hong Kong': 26,
 'London': 9,
 'San Francisco': 20,
 'Singapore': 34}
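To see what zip contributes before dict is applied, the short sketch below lists the pairs it produces. It uses a shortened, three-city version of the lists above, and the name pairs is introduced here only for illustration.

```python
temps = [17, 26, 9]
city = ['Amsterdam', 'Hong Kong', 'London']

# zip pairs the first city with the first temperature, and so on.
pairs = list(zip(city, temps))
print(pairs)    # [('Amsterdam', 17), ('Hong Kong', 26), ('London', 9)]

# dict() accepts any sequence of (key, value) pairs.
print(dict(pairs) == dict(zip(city, temps)))    # True
```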

## 2.0 Dictionary Operations

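You can access the data stored in the dictionary by using the .items() method.  The output contains both the keys and the values.  Note that in the output below the pairs are not in the order in which they were entered.  Dictionaries did not keep insertion order in older versions of Python; since Python 3.7 they do.  We will deal with this lack of order a little later.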

In [31]:
cityTemps.items()

dict_items([('London', 9), ('Amsterdam', 17), ('Singapore', 34), ('San Francisco', 20), ('Hong Kong', 26)])

You do not have to access both the keys and the values at the same time.  If you wish to access just the keys, use the .keys() method.  If you wish to access just the values, use the .values() method.  

In [31]:
cityTemps.keys()

dict_keys(['Hong Kong', 'Amsterdam', 'San Francisco', 'Singapore', 'London'])

In [32]:
cityTemps.values()

dict_values([26, 17, 20, 34, 9])
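These methods are most often used in a for loop. The sketch below redefines cityTemps so that it runs on its own and unpacks each key-value pair from .items() into two loop variables. The names city, temp and warmCities, and the threshold of 20, are chosen here purely for illustration.

```python
cityTemps = {'Amsterdam': 17, 'Hong Kong': 26, 'London': 9,
             'San Francisco': 20, 'Singapore': 34}

# Each element of .items() is a (key, value) tuple, so it can be
# unpacked into two loop variables.
warmCities = []
for city, temp in cityTemps.items():
    if temp > 20:
        warmCities.append(city)

# sorted() makes the printed result independent of the dictionary's order.
print(sorted(warmCities))    # ['Hong Kong', 'Singapore']
```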

New key-value pairs are easily added to an existing dictionary by normal assignment.  Define a new key ('Sydney') within the dictionary (cityTemps) and assign its value (23).

In [21]:
cityTemps['Sydney']=23
cityTemps

{'Amsterdam': 17,
 'Hong Kong': 26,
 'London': 9,
 'San Francisco': 20,
 'Singapore': 34,
 'Sydney': 23}
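The same assignment syntax replaces the value when the key already exists. The sketch below shows this on a smaller, two-city dictionary, together with the .update() method, which adds or replaces several pairs in one call. The new temperature values are made up for the example.

```python
cityTemps = {'Amsterdam': 17, 'London': 9}

cityTemps['London'] = 11    # key exists, so its value is replaced

# update() adds 'Sydney' and replaces the value for 'Amsterdam'.
cityTemps.update({'Sydney': 23, 'Amsterdam': 18})

print(cityTemps == {'Amsterdam': 18, 'London': 11, 'Sydney': 23})    # True
```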

Use the del statement to remove a key-value pair from a dictionary.  It is sufficient to identify just the key of the pair to be removed.

In [22]:
del cityTemps['Sydney']
cityTemps

{'Amsterdam': 17,
 'Hong Kong': 26,
 'London': 9,
 'San Francisco': 20,
 'Singapore': 34}
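If you also need the value that is being removed, the .pop() method is an alternative to del. The sketch below uses a small three-city dictionary; the names removed and missing are illustrative.

```python
cityTemps = {'Amsterdam': 17, 'London': 9, 'Sydney': 23}

removed = cityTemps.pop('Sydney')          # removes the pair and returns its value
missing = cityTemps.pop('Berlin', None)    # key absent: the default is returned, no error

print(removed)                  # 23
print(missing)                  # None
print('Sydney' in cityTemps)    # False
```

Without the second argument, .pop() raises a KeyError for a missing key, just as del does.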

As discussed before, it is easy to access the dictionary data by its key.

In [45]:
cityTemps['Singapore']

34

However, if the key is not found in the dictionary, a KeyError is raised.  If a program looks up many different keys, a single missing key will stop the program with this error.  Such an interruption may not be desirable.

In [46]:
cityTemps['Berlin']

KeyError: 'Berlin'
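Two common ways to guard against a missing key are a membership test with the in operator and catching the KeyError. The sketch below shows both; describe is a hypothetical helper written for this example.

```python
cityTemps = {'Amsterdam': 17, 'Hong Kong': 26, 'London': 9,
             'San Francisco': 20, 'Singapore': 34}

def describe(city):
    """Return a short message for city without letting a KeyError escape."""
    try:
        return '{}: {}'.format(city, cityTemps[city])
    except KeyError:
        return '{}: not in the dictionary'.format(city)

print('Berlin' in cityTemps)    # False
print(describe('Singapore'))    # Singapore: 34
print(describe('Berlin'))       # Berlin: not in the dictionary
```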

To prevent an error from stopping the program when a key is not in the dictionary, we use the .get() method with a default value.

The syntax is as follows.
d.get('key', default value) where d is the name of the dictionary.

The first argument in the .get() method is the key that we want to search for.  If it is found, the value attached to that key is returned.  The second argument of the .get() method is the default value returned when the key is not found.

We illustrate the use of the method below.

In [48]:
cityTemps.get('Singapore','Temperature not listed')

34

In [49]:
cityTemps.get('Berlin','Temperature not listed')

'Temperature not listed'
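Two further details of .get() are worth checking: the default is optional, and a failed lookup does not add the key to the dictionary. A minimal sketch on a two-city dictionary (the name result is illustrative):

```python
cityTemps = {'Amsterdam': 17, 'Singapore': 34}

result = cityTemps.get('Berlin')    # no default given

print(result)                   # None
print('Berlin' in cityTemps)    # False: .get() did not add the key
```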

## 3.0 Ordering Dictionary Data

Assume that the data below is the number of clients in each of the 6 months shown.  When we print the data, it does not come out in the order in which it was entered (this is the behaviour of Python versions before 3.7, which produced the output shown).  This is not what we want: the data should be ordered as Jan, Feb, Mar ...

In [1]:
clientNumber = {'Jan': 10, 'Feb': 9, 'Mar': 14, 
                'Apr': 12, 'May': 5, 'Jun': 6}
for month in clientNumber:
    print(month, clientNumber[month])

Mar 14
Jun 6
Jan 10
Apr 12
Feb 9
May 5


An alternative way of looking at the data is to use list().

In [28]:
clientNumber = {'Jan': 10, 'Feb': 9, 'Mar': 14, 
                'Apr': 12, 'May': 5, 'Jun': 6}
list(clientNumber.items())

[('Mar', 14), ('Jun', 6), ('Jan', 10), ('Apr', 12), ('Feb', 9), ('May', 5)]

To solve this problem, we try the sorted function.  The example shows that it does not work, because it sorts the keys in alphabetical order.

In [52]:
for month in sorted(clientNumber):
    print(month, clientNumber[month])

Apr 12
Feb 9
Jan 10
Jun 6
Mar 14
May 5
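One way to get calendar order out of sorted() is to pass a key function that ranks each month. This is only a sketch: monthOrder is a helper list introduced here, not part of the original example, and it has to be maintained by hand.

```python
clientNumber = {'Jan': 10, 'Feb': 9, 'Mar': 14,
                'Apr': 12, 'May': 5, 'Jun': 6}

# Hypothetical helper: the position of a month in this list is its rank.
monthOrder = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']

# key=monthOrder.index sorts each month by its position in monthOrder.
for month in sorted(clientNumber, key=monthOrder.index):
    print(month, clientNumber[month])
```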


In our example, the order of the keys (months) in the key-value pairs is important.  We wish to preserve the order in which they are entered.

This is normally the case for data read in from a database, when we wish to preserve the order in which the records are read.
We can use OrderedDict.

OrderedDict is a class within the collections module and must be imported before it can be used.

Each key-value pair is entered as a tuple (enclosed by ()).  The pairs are passed to OrderedDict as a single argument, so multiple tuples must be combined into one list (enclosed by []).

The order in which the data is entered into the dictionary is preserved.

In [2]:
from collections import OrderedDict
clientNumberOrdered = OrderedDict([('Jan', 10), ('Feb', 9), ('Mar', 14), 
                                   ('Apr', 12), ('May', 5), ('Jun', 6)])
for month in clientNumberOrdered:
    print(month, clientNumberOrdered[month])

Jan 10
Feb 9
Mar 14
Apr 12
May 5
Jun 6
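Since Python 3.7 a plain dictionary also keeps insertion order, so the remaining practical difference is mostly in how the two types compare. The sketch below, using a three-month subset of the data, shows that equality between two OrderedDict objects takes order into account, while equality between plain dictionaries does not. The variable names are illustrative.

```python
from collections import OrderedDict

pairs = [('Jan', 10), ('Feb', 9), ('Mar', 14)]

plain = dict(pairs)
ordered = OrderedDict(pairs)
reversedOrdered = OrderedDict(reversed(pairs))

print(plain == dict(reversed(pairs)))    # True: plain dicts ignore order when compared
print(ordered == reversedOrdered)        # False: OrderedDict comparison is order-sensitive
print(list(ordered))                     # ['Jan', 'Feb', 'Mar']
```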


An alternative way of entering the data in a dictionary (here we use OrderedDict) is given below.  It looks unwieldy here, but this is what actually happens if you read data in line by line from a database.
You must first define an empty dictionary before filling it with data in the manner shown below.

In [58]:
clientNumberOrderedAlt = OrderedDict()    #define an empty OrderedDict dictionary
clientNumberOrderedAlt['Jan'] = 10
clientNumberOrderedAlt['Feb'] = 9
clientNumberOrderedAlt['Mar'] = 14
clientNumberOrderedAlt['Apr'] = 12
clientNumberOrderedAlt['May'] = 5
clientNumberOrderedAlt['Jun'] = 6
clientNumberOrderedAlt


OrderedDict([('Jan', 10),
             ('Feb', 9),
             ('Mar', 14),
             ('Apr', 12),
             ('May', 5),
             ('Jun', 6)])
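In practice the repeated assignments are usually written as a loop over the incoming records. The sketch below assumes the records arrive as (month, count) tuples; the list rows is a stand-in for a real database or file read, and clientNumberFromRows is an illustrative name.

```python
from collections import OrderedDict

# Stand-in for rows read one at a time from a database or file.
rows = [('Jan', 10), ('Feb', 9), ('Mar', 14),
        ('Apr', 12), ('May', 5), ('Jun', 6)]

clientNumberFromRows = OrderedDict()    # start with an empty dictionary
for month, count in rows:
    clientNumberFromRows[month] = count    # one assignment per row

print(list(clientNumberFromRows.keys()))
# ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun']
```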

## 4.0 Neat Things You Can Do With A Dictionary

### 4.1  Sparse Matrices

Suppose you have a 5x5 matrix in which only 3 elements are non-zero.  Matrices of this nature, in which most of the elements are zero, are called sparse matrices.

This matrix is represented as a list of lists below.  However, this is a very inefficient way of representing a sparse matrix with a lot of 0's.

matrix = [[0, 0, 0, 1, 0],
          [0, 0, 0, 0, 0],
          [0, 2, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 3, 0]]

An alternative is to use a dictionary representation.

The keys will be a tuple containing the row and column positions of the non-zero elements.

Since there are only three non-zero elements, we need only three key-value pairs for this sparse matrix.

The non-zero elements are at the following positions:
(0,3) , (2,1) and (4,3) where the tuples represent (row index, column index).  The values of the non-zero elements in those positions are 1, 2 and 3 respectively.

Below is the dictionary representation of the sparse matrix.

In [1]:
matrix = {(0,3):1 , (2,1):2 , (4,3):3}
matrix[(4,3)]

3
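The dictionary does not have to be typed by hand. As a sketch, the comprehension below builds it from the list-of-lists form by keeping only the non-zero cells; the names dense and sparse are introduced here for illustration.

```python
dense = [[0, 0, 0, 1, 0],
         [0, 0, 0, 0, 0],
         [0, 2, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 3, 0]]

# enumerate() supplies the row index r and column index c of every cell;
# only cells with a non-zero value become key-value pairs.
sparse = {(r, c): value
          for r, row in enumerate(dense)
          for c, value in enumerate(row)
          if value != 0}

print(sparse == {(0, 3): 1, (2, 1): 2, (4, 3): 3})    # True
```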

In a sparse matrix, most positions will not be keys in the dictionary.  In the above 5x5 matrix, there are only 3 cells with non-zero values.  The other 22 cells (row index, column index) will not be in the dictionary.

For example, the element in the first row, first column (0, 0) is not in the dictionary.

To prevent an error when we look up a key that is not in the dictionary, we use the .get() method with 0 as the default value.  Looking up a row and column index that is not in the dictionary then returns 0, which is the real value of that cell.

An illustration is provided below.

Calling (0,3) returns its value of 1.  
Calling the non-existent (0,0) returns the default value of 0.

In [20]:
print(matrix.get((0,3),0),'\n',
      matrix.get((0,0),0))

1 
 0
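The same .get() call with a default of 0 is enough to rebuild the full matrix from the dictionary. This is a minimal sketch: the dictionary does not record the shape of the matrix, so nRows and nCols are supplied by hand here, and the name dense is illustrative.

```python
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}
nRows, nCols = 5, 5    # assumed shape; it is not stored in the dictionary

# Every cell is looked up with .get(); missing cells fall back to 0.
dense = [[matrix.get((r, c), 0) for c in range(nCols)]
         for r in range(nRows)]

for row in dense:
    print(row)
```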


### 4.2 Counting 

Dictionaries provide an elegant way to count the number of occurrences of letters or digits in a string.  
This can be used to generate frequency tables.
An illustration is given in the code below.

We first define an empty dictionary named letter_counts.
The string whose letters we want to count is 'Mississippi'.
A for loop takes one letter at a time from the string 'Mississippi', starting with the letter 'M'.

Let us take a look at the expression:

    letter_counts[letter]=letter_counts.get(letter,0)+1

The left-hand side, letter_counts[letter], treats letter as a key of the dictionary letter_counts.  The key held in the variable letter changes with each iteration of the for loop.  In the first iteration the key is 'M', in the second iteration it is 'i', and so on.

Since we take one letter from the word in each iteration, we add 1 to the count of that letter.  This is what the '+1' part of the right-hand side does.

The letter_counts.get(letter,0) part of the right-hand side has two possible results.  If the key held in letter is found, it is already in the dictionary with a value (the number of occurrences up to the current iteration), and 1 is added to that existing value.

If the key is not found in the dictionary, this is the first time the letter occurs.  The expression letter_counts.get(letter,0) returns 0, and assigning 0 + 1 to letter_counts[letter] puts the key into the dictionary with a value of 1.

In [23]:
letter_counts = {}
for letter in 'Mississippi':
    letter_counts[letter]=letter_counts.get(letter,0)+1
letter_counts

{'M': 1, 'i': 4, 'p': 2, 's': 4}
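A variation on the same loop uses defaultdict from the collections module, which supplies the starting value of 0 automatically so that .get() is not needed. This is offered as an alternative sketch, not a replacement for the approach above.

```python
from collections import defaultdict

# int() returns 0, so any missing key starts with a count of 0.
letter_counts = defaultdict(int)
for letter in 'Mississippi':
    letter_counts[letter] += 1

print(dict(letter_counts) == {'M': 1, 'i': 4, 's': 4, 'p': 2})    # True
```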

There is also a ready-made tool for counting.  It is called Counter and can be imported from the collections module.

In [28]:
from collections import Counter
Counter('Mississippi')

Counter({'M': 1, 'i': 4, 'p': 2, 's': 4})
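A Counter behaves like a dictionary with two conveniences, sketched below: looking up a key that was never counted returns 0 instead of raising a KeyError, and .most_common(n) lists the n most frequent items with their counts. The variable name counts is illustrative.

```python
from collections import Counter

counts = Counter('Mississippi')

print(counts['s'])              # 4
print(counts['z'])              # 0: a missing key counts as zero
print(counts.most_common(2))    # the two most frequent letters with their counts
```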

Counter is case sensitive, as the first cell below shows.  To count letters without regard to case, convert them to lower case first, as in the second cell.

In [29]:
from collections import Counter
Counter('Andorra la Vella')     #case sensitive

Counter({' ': 2,
         'A': 1,
         'V': 1,
         'a': 3,
         'd': 1,
         'e': 1,
         'l': 3,
         'n': 1,
         'o': 1,
         'r': 2})

In [32]:
Counter(map(str.lower,'Andorra la Vella'))     #case insensitive

Counter({' ': 2,
         'a': 4,
         'd': 1,
         'e': 1,
         'l': 3,
         'n': 1,
         'o': 1,
         'r': 2,
         'v': 1})