## Data structures continued

## Dictionaries 

Dictionaries store data in key-value pairs. You can think of a key in a dictionary as the key to a vault where its value is stored.

You can access the value stored in the dictionary by using its particular **key**.

**Curly braces** are used to define a dictionary. 

Different **key-value pairs are separated by commas**.

<!-- We store the data behind the scenes in a hash map. This means that we use the key to generate a unique index (called a hash) and store the value in the location marked by that index. This makes retrieval very fast. -->

In [1]:
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123', 'George': '312-555-3333', 'Ringo': '312-555-2222'}

In [2]:
type(contacts)

dict

Here we have defined a dictionary and we saved it in a variable named `contacts`.  

A key-value pair is defined using a colon **`:`** - `key : value`.

Different key value pairs here are: 
- "John" : "312-555-1234"
- "Paul" : "312-555-3123" 
- "George" : "312-555-3333" 
- "Ringo" : "312-555-2222"

All the **keys** in a dictionary **are always unique**, but the *values can be repeated*.

### Accessing values using the keys

In [3]:
contacts['John']  # This is very similar to lists except that here we are using keys as indexes

'312-555-1234'

In [4]:
contacts['Paul']

'312-555-3123'

**Note:** The keys are case sensitive. For example, the following code produces an error:

In [5]:
contacts['PAUL']

KeyError: 'PAUL'
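One way to avoid a `KeyError` is to test for the key with the `in` operator, or to use the `.get()` method, which returns a fallback value when the key is missing. The following is a minimal sketch on a shortened copy of `contacts`; the fallback text `'not found'` is an arbitrary choice for illustration.

```python
# Shortened copy of the contacts dictionary from above
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123'}

# The `in` operator tests whether a key exists (the test is case sensitive)
has_paul = 'Paul' in contacts
has_upper_paul = 'PAUL' in contacts
print(has_paul, has_upper_paul)  # True False

# .get() returns the fallback instead of raising KeyError
missing = contacts.get('PAUL', 'not found')
print(missing)  # not found
```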

Check which keys are in the dictionary using the **`.keys()`** method.

In [7]:
contacts.keys()

dict_keys(['John', 'Paul', 'George', 'Ringo'])

Similarly, we can check the values in the dictionary using the **`.values()`** method.

In [8]:
contacts.values()  # On its own, this does not tell us which key each value belongs to

dict_values(['312-555-1234', '312-555-3123', '312-555-3333', '312-555-2222'])

Although the values in the above result are displayed in square brackets, `contacts.values()` does not return a list. It returns a `dict_values` view object, which you can convert to a list with `list(contacts.values())`.
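If you need to index the values, or to see each value together with its key, `list()` and the `.items()` method can help. This is a small sketch on a shortened copy of `contacts`, not the only way to do it.

```python
# Shortened copy of the contacts dictionary from above
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123'}

# list() turns the result of .values() into a regular list that supports indexing
value_list = list(contacts.values())
print(value_list[0])  # 312-555-1234

# .items() yields (key, value) tuples, so each value stays attached to its key
pairs = list(contacts.items())
print(pairs)  # [('John', '312-555-1234'), ('Paul', '312-555-3123')]
```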

### Add new elements to a dictionary

To add a new key-value pair to a dictionary, we use a very simple syntax.

Give the key in square brackets and then assign a value to it:

`<Name of dictionary>[<key>] = value`

In [9]:
contacts['Himanshu'] = '480-111-2222'  
# The key must be new: assigning to an existing key overwrites its value instead of adding a pair
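The same assignment syntax behaves differently depending on whether the key already exists. The sketch below uses a shortened copy of `contacts`; the replacement number `'312-555-0000'` is made up for illustration.

```python
# Shortened copy of the contacts dictionary from above
contacts = {'John': '312-555-1234', 'Paul': '312-555-3123'}

# New key: a pair is added, so the dictionary grows
contacts['Himanshu'] = '480-111-2222'
size_after_add = len(contacts)

# Existing key: the old value is replaced, so the size stays the same
contacts['John'] = '312-555-0000'  # made-up number for illustration
size_after_update = len(contacts)

print(size_after_add, size_after_update)  # 3 3
print(contacts['John'])  # 312-555-0000
```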

In the previous example, the keys and values stored in the dictionary were only strings. Values can be of any data type, and keys can be of any hashable type, such as strings, numbers, or tuples.

In [10]:
student_names = {1: 'John', 2: 'Smith', 3: 'Matt', 4: 'Jimmy', 5: 'Sue'}
student_names[1]

'John'

In [11]:
student_names[2]

'Smith'
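One caveat on key types: a key has to be hashable, which in practice rules out mutable objects such as lists. The sketch below uses a hypothetical `grid` dictionary with tuple keys and then shows that a list key is rejected.

```python
# Tuples are hashable, so they work as keys (hypothetical grid example)
grid = {(0, 0): 'origin', (1, 2): 'point A'}
print(grid[(1, 2)])  # point A

# Lists are mutable and not hashable, so using one as a key raises TypeError
try:
    bad = {[1, 2]: 'point A'}
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```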

The main idea here is that the values are identified in a dictionary using the keys. 

Python is really powerful in the sense that it gives the user the flexibility to create more complicated data structures.

Values inside a dictionary can themselves be lists or dictionaries (dictionaries inside dictionaries are called *nested* dictionaries). We will look at some of those examples.

In [13]:
student_data = {
    'Name': 'Himanshu', 
    'E-mail': 'himanshuagg@gmail.com', 
    'Age': 28, 
    'subjects': ['math', 'science', 'history', 'geography'] 
}

In [14]:
student_data.keys()

dict_keys(['Name', 'E-mail', 'Age', 'subjects'])

In [15]:
# We will again emphasize here that the keys are case sensitive 

In [16]:
student_data['Name']

'Himanshu'

In [17]:
student_data['E-mail']

'himanshuagg@gmail.com'

In [18]:
student_data['Age']

28

In [19]:
student_data['subjects']

['math', 'science', 'history', 'geography']

**Note**: The value associated with the key 'subjects' is a list. Each key uniquely identifies the value linked to it, in this case a list.

So, to access 'science', we first need to access the `subjects` key in the dictionary. Then, because the value is a list, we need to index into that list to get the element we want.

In [20]:
student_data['subjects'][1]

'science'
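The two-step access can also be written out with an intermediate variable, which makes each step visible. This sketch uses a shortened copy of `student_data`; the extra subject `'art'` is made up to show that the nested list can be changed in place.

```python
# Shortened copy of the student_data dictionary from above
student_data = {
    'Name': 'Himanshu',
    'subjects': ['math', 'science', 'history', 'geography'],
}

# Step 1: the key gives us the list
subjects = student_data['subjects']
# Step 2: the index gives us one element of that list
second_subject = subjects[1]
print(second_subject)  # science

# The list stored in the dictionary can be modified in place
student_data['subjects'].append('art')  # 'art' is a made-up extra subject
print(len(student_data['subjects']))  # 5
```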

### Exercises

1. Use the dictionary given below. It shows the frequency of the words in a text paragraph. The keys are the words, and the values are their frequencies.


In [2]:
word_freq = {'love': 25, 'conversation': 1, 'every': 6, "we're": 1, 'plate': 1, 'sour': 1, 'jukebox': 1, 'now': 11, 'taxi': 1, 'fast': 1, 'bag': 1, 'man': 1, 'push': 3, 'baby': 14, 'going': 1, 'you': 16, "don't": 2, 'one': 1, 'mind': 2, 'backseat': 1, 'friends': 1, 'then': 3, 'know': 2}

Answer the following questions:   

- How many *key-value* pairs are in this dictionary?
- What *keys* are present in this dictionary? 
- What is the frequency of following words in the dictionary: 
    - 'friends'
    - 'taxi' 
    - 'jukebox'
- Is the word `begin` present in the dictionary?
- Add the following words and their frequencies to the dictionary:
    - 'begin': 1
    - 'start': 2
    - 'over': 1
    - 'body': 17
- Use the following code to convert the result from `word_freq.keys()` to a list: `list(word_freq.keys())`.
  - Store the results of the above code in a variable called `word`.
  - What is the first word in the dictionary? What is the frequency of that word?
  - What is the last word in the dictionary? What is the frequency of that word?

In [20]:
#How many key-value pairs are in this dictionary?
len(word_freq) #23

#What keys are present in this dictionary?
word_freq.keys() #dict_keys(['love', 'conversation', 'every', "we're", 'plate', 
#'sour', 'jukebox', 'now', 'taxi', 'fast', 'bag', 'man', 'push', 'baby', 'going', 'you', 
#"don't", 'one', 'mind', 'backseat', 'friends', 'then', 'know'])

#What is the frequency of following words in the dictionary:
word_freq['friends'] #1
word_freq['taxi'] #1
word_freq['jukebox'] #1

#Is the word begin present in the dictionary? No
'begin' in word_freq #False (word_freq['begin'] would raise a KeyError at this point)

#Add the following words and their frequencies to the dictionary:
word_freq['begin'] = 1
word_freq['start'] = 2 
word_freq['over'] = 1
word_freq['body'] = 17

word_freq

{'love': 25,
 'conversation': 1,
 'every': 6,
 "we're": 1,
 'plate': 1,
 'sour': 1,
 'jukebox': 1,
 'now': 11,
 'taxi': 1,
 'fast': 1,
 'bag': 1,
 'man': 1,
 'push': 3,
 'baby': 14,
 'going': 1,
 'you': 16,
 "don't": 2,
 'one': 1,
 'mind': 2,
 'backseat': 1,
 'friends': 1,
 'then': 3,
 'know': 2,
 'begin': 1,
 'start': 2,
 'over': 1,
 'body': 17}

In [41]:
#Use the following code to convert the result from word_freq.keys() to a list: list(word_freq.keys()).
#Store the results of the above code in a variable called word.

word = list(word_freq.keys())
#What is the first word in the dictionary? What is the frequency of that word? 
word[0] #love
word_freq[word[0]] #25
#What is the last word in the dictionary? What is the frequency of that word? 
word_freq[word[-1]] #17
word[-1] #body

'body'

2. Can a dictionary have two key-value pairs with the same key?

In [23]:
# no

3. Can a dictionary have two key-value pairs with the same value but different keys?

In [24]:
# yes
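The answers to questions 2 and 3 can be checked directly. This is a small sketch with made-up dictionaries: a key repeated in a dictionary literal keeps only its last value, while different keys can share a value without any problem.

```python
# Question 2: a repeated key in a literal keeps only the last value
repeated_key = {'a': 1, 'a': 2}
print(repeated_key)  # {'a': 2}

# Question 3: different keys may share the same value
shared_value = {'taxi': 1, 'fast': 1}
print(shared_value)  # {'taxi': 1, 'fast': 1}
```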

# Additional Content: Nested Dictionaries 

Students will not be assessed on this.

In [25]:
contacts = {1: {'Name' : 'John' , 'Phone':'312-555-1234'},
            2: {'Name' : 'Paul' , 'Phone':'312-555-3123'},
            3: {'Name' : 'George' , 'Phone':'312-555-3333'},
            4: {'Name' : 'Ringo' , 'Phone':'312-555-2222'}}

In [26]:
# In this case, the value associated with each key is itself a dictionary.
type(contacts)

dict

In [27]:
contacts

{1: {'Name': 'John', 'Phone': '312-555-1234'},
 2: {'Name': 'Paul', 'Phone': '312-555-3123'},
 3: {'Name': 'George', 'Phone': '312-555-3333'},
 4: {'Name': 'Ringo', 'Phone': '312-555-2222'}}

In [28]:
contacts.keys()

dict_keys([1, 2, 3, 4])

In [29]:
contacts.values()

dict_values([{'Name': 'John', 'Phone': '312-555-1234'}, {'Name': 'Paul', 'Phone': '312-555-3123'}, {'Name': 'George', 'Phone': '312-555-3333'}, {'Name': 'Ringo', 'Phone': '312-555-2222'}])

In [30]:
print(contacts[1])

{'Name': 'John', 'Phone': '312-555-1234'}


As you can see, the `keys` of a dictionary can be either:

- strings 
- numbers

In the above example, the dictionary `keys` are numbers. In this particular case, the `values` are also dictionaries.

In [31]:
contacts[1].keys()

dict_keys(['Name', 'Phone'])

In [32]:
contacts[1].values()

dict_values(['John', '312-555-1234'])

In [33]:
contacts[1]['Name']

'John'

In [34]:
contacts[1]['Phone']

'312-555-1234'
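When every value is a dictionary with the same inner keys, a loop over `.items()` can read all of them in one go. The sketch below assumes a shortened copy of the nested `contacts` dictionary and uses a `for` loop, which may be covered elsewhere in the course.

```python
# Shortened copy of the nested contacts dictionary from above
contacts = {
    1: {'Name': 'John', 'Phone': '312-555-1234'},
    2: {'Name': 'Paul', 'Phone': '312-555-3123'},
}

# Each value is itself a dictionary, so we index it with its own keys
lines = []
for contact_id, details in contacts.items():
    lines.append(f"{contact_id}: {details['Name']} -> {details['Phone']}")

print(lines)  # ['1: John -> 312-555-1234', '2: Paul -> 312-555-3123']
```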

You can also have lists in which each element is a dictionary. The number of possible combinations and levels of nesting is therefore endless.

In [35]:
my_data = [{'Mike': [25,23000],'Jane': [38, 40000],'Bill': [45,35000]},{'Developers': ['Mike','Bill']},{'HR': ['Jane']}]
my_data

[{'Mike': [25, 23000], 'Jane': [38, 40000], 'Bill': [45, 35000]},
 {'Developers': ['Mike', 'Bill']},
 {'HR': ['Jane']}]

This is an example of a nested list with three elements, each of them a dictionary. In such cases, it is very important to know what kind of object you are working with. What type of variable is `my_data`?

In [36]:
type(my_data)

list

Then let's access the first element of the list. As this is a **list**, we need to access it by **index**.

In [37]:
my_data[0]

{'Mike': [25, 23000], 'Jane': [38, 40000], 'Bill': [45, 35000]}

As can be clearly seen, this element is a dictionary. In case of doubt, we can use the function `type()`.

In [38]:
type(my_data[0])

dict

Then let's access the data of 'Jane'. As this element is a **dictionary**, we need to access it by **key**.

In [39]:
my_data[0]['Jane']

[38, 40000]

In return, we get another **list**. To access the last element of this **list**, we need to access it by **index**.

In [40]:
my_data[0]['Jane'][1]

40000
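The chained expression can be unpacked into one step per level, checking the type along the way. This is a sketch only; the intermediate names `people`, `jane`, and `last_value` are introduced here for illustration.

```python
my_data = [
    {'Mike': [25, 23000], 'Jane': [38, 40000], 'Bill': [45, 35000]},
    {'Developers': ['Mike', 'Bill']},
    {'HR': ['Jane']},
]

people = my_data[0]    # list -> access by index, gives a dictionary
jane = people['Jane']  # dictionary -> access by key, gives a list
last_value = jane[-1]  # list -> access by index; -1 is the last element

print(type(people).__name__, type(jane).__name__, last_value)  # dict list 40000
```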

## Summary

It is always crucial to know what kind of variable you have in your hands because this determines how to retrieve the data. 

- lists -> accessed by **index**
- dictionaries -> accessed by **key** (note that a key can be a number, although it is often a string)

Lists are easily identified because they start and end with square brackets: `[ ]`

In contrast, dictionaries start and end with curly brackets: `{ }`.
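When the brackets are not visible, for example because the data comes from a variable, the type can be checked in code. The helper `describe` below is hypothetical and only sketches the idea using the built-in `isinstance()`.

```python
def describe(container):
    """Return a short hint on how to access the data in `container`."""
    if isinstance(container, dict):
        return 'dictionary: access by key'
    if isinstance(container, list):
        return 'list: access by index'
    return 'neither a list nor a dictionary'


print(describe({'Jane': [38, 40000]}))  # dictionary: access by key
print(describe([38, 40000]))            # list: access by index
```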