# Lesson 4.7: Dictionaries
# Activity 7A: What Are Dictionaries?


## Introduction

When you parse files, you need a data type that can store the information you retrieve. For example, if you have a number of usernames, IP addresses, or ports and you want to count them, a dictionary with its key/value pairs is useful:
* The username will be the key
* The value will be the number of logins by this user

Once you have that data organized you can order it by the highest number of logins.

Another reason to use a dictionary is to store information in an easy-to-read way by parsing a log file into key/value form.

We’ve already covered one data type that can store multiple pieces of information: lists. Dictionaries are more flexible when each value needs a meaningful label, because you look items up by key instead of by position.

For example, let’s say we have a list of IP addresses, with each address belonging to a specific user’s machine. 

IP_addresses
* `115.34.225.201 #Marta`
* `166.199.144.36 #TaeHo`
* `97.188.82.46 #Phil`
* `202.44.140.187 #Uzo` 
* `154.236.84.239 #Julio`

You can retrieve a particular user’s IP address via the index. 

`Marta_IP = IP_addresses[0]`

However, you would have to remember which index is associated with each user. It would be better to retrieve Marta’s IP by name. Python has a data structure that allows exactly that: a dictionary lets us associate each IP address with a specific user.

In [None]:
IP_addresses = {'Marta': '115.34.225.201', 'TaeHo': '166.199.144.36', 'Phil': '97.188.82.46', 'Uzo': '202.44.140.187', 'Julio': '154.236.84.239'}

Now if you want to retrieve Marta's IP it can be done by using this key/value connection.

In [None]:
Marta_ip = IP_addresses['Marta']
print(Marta_ip)

You can do the same for each of the IPs.

In [None]:
Phil_ip = IP_addresses['Phil']
print(Phil_ip)

Let's take another example:

In [None]:
dict1={"Name":"Paul", "Telephone":"(909)-(111111)", "Address":"155 Jasper", "State":"California"}

To access any of the fields, put the key you are interested in inside square brackets.

In [None]:
print("The employee's name is: "+dict1["Name"])
print("The employee's telephone is "+dict1["Telephone"])
print("The employee's address is: " + dict1["Address"])
print("The employee's state is: " + dict1["State"])

Accessing a key that does not exist will raise an error.

In [None]:
print(dict1["email"])
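
One way to avoid the error is to check for the key first, or to use `get()`, which returns a default instead of raising `KeyError`. The following is a minimal sketch that reuses `dict1` from above; the fallback strings are only illustrations.

```python
dict1 = {"Name": "Paul", "Telephone": "(909)-(111111)", "Address": "155 Jasper", "State": "California"}

# Check whether the key exists before reading it
if "email" in dict1:
    print(dict1["email"])
else:
    print("No email on record")

# get() returns None (or a default you supply) when the key is missing
email = dict1.get("email")
email_or_default = dict1.get("email", "not provided")
print(email)
print(email_or_default)
```

Use `get()` when a missing key is expected and harmless; use square brackets when a missing key should stop the script.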

## Instructor Demo
`dict_1 = {"Name":"sarah", "age":33, "years of experience":22}`

Write a script that:
* Prints the name
* Prints the age 
* Prints the years of experience 

In [None]:
dict_1={"Name":"sarah", "age":33, "years of experience":22}
print("The name is: "+dict_1["Name"])
print("The age is: "+ str(dict_1["age"]))
print("Years of experience: "+ str(dict_1["years of experience"]))

## Student Exercise
### Problem 1 

`thisdict = {"brand":"Ford", "model": "Mustang", "year": 1964}`

Write a script that:
* Prints the brand of the car
* Prints the model of the car 
* Prints the year of the car 

### Problem 2 
`student_dict = {'Name':'Zara', 'Age':7, 'email':'Zara@email.com'}`

Write a script that:
* Prints the name 
* Prints the age
* Prints the email 

# Activity 7B: Creating Dictionaries
## Introduction
To create a dictionary, place comma-separated items inside curly braces `{}`, with a colon `:` between each key and its value.

`dictionary_name = {"key":value}`

Each item is a pair made up of a key and its corresponding value (`key:value`). A key can be any immutable (hashable) type, such as a string, a number, or a tuple of immutable items. Each key in a dictionary must be unique. Values can be any data type, including lists and dictionaries, and they do not need to be unique.
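
The rules about keys and values can be illustrated with a short sketch. The dictionary `host_info` and everything in it are made up for this example.

```python
# Keys can be strings, numbers, or tuples; values can be any type
host_info = {
    "hostname": "web01",                        # string key, string value
    80: "http",                                 # integer key
    ("10.10.2.1", 22): "ssh",                   # tuple key (IP, port)
    "open_ports": [22, 80, 443],                # list as a value
    "owner": {"name": "Paul", "team": "ops"},   # dictionary as a value
}

print(host_info[("10.10.2.1", 22)])
print(host_info["open_ports"][0])
print(host_info["owner"]["name"])

# A list cannot be used as a key because it is mutable (unhashable)
try:
    bad = {["10.10.2.1", 22]: "ssh"}
except TypeError as err:
    print("TypeError:", err)
```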

Use the following scripts to create a new dictionary.

In [None]:
dict_1 = {}

To add a `key:value` pair:

In [None]:
dict_1["user_name"] = "Paul"

In [None]:
print(dict_1)

We can keep adding information:

In [None]:
dict_1["password"] = "passwd"
print(dict_1)

In [None]:
dict_1["IP"] = "10.10.2.1"
print(dict_1)

## Changing and Adding Dictionary Elements
Dictionaries are mutable. We can add new items or change the value of existing items using an assignment operator. If the key is already present, then the existing value gets updated. If the key is not present, a new `key:value` pair is added to the dictionary.

In [None]:
dict_1['IP'] = "10.0.2.2"
print(dict_1)

To add a new `key:value` pair, assign a value to a key that does not exist yet:

In [None]:
dict_1['Port'] = 80
print(dict_1)

## Instructor Demo
`keys = ['Ten', 'Twenty', 'Thirty']`

`values = [10, 20, 30]`

Write a script that creates a dictionary from these two lists.

In [None]:
keys = ['Ten', 'Twenty', 'Thirty']
values = [10, 20, 30]

# Pair each key with the value at the same position in the other list
dict_1 = {key: value for key, value in zip(keys, values)}
print(dict_1)

`dict2 = {'Thirty': 30, 'Forty': 40, 'Fifty': 50}`

Write a script that:
* Adds a `key:value` pair of `Sixty:60` to the dictionary
* Prints the dictionary

In [None]:
dict2 = {'Thirty': 30, 'Forty': 40, 'Fifty': 50}
dict2["Sixty"]=60
print(dict2)

## Student Exercise
### Problem 1 
`dict_1 = {'Name':'Kelly', 'designation':'Developer', 'salary': 8000}`

Write a script that:
* Adds a `key:value` pair of `'city':'New York'`
* Prints the updated dictionary

### Problem 2
`dict_2 = {'Name':'Bob', 'designation':'accountant', 'salary': 9000}`

Write a script that:
* Updates Bob's salary to 8000
* Prints the updated dictionary

# Activity 7C: Dictionary Methods
* `clear()`: Removes all items from the dictionary.
* `get(key[,d])`: Returns the value of the key. If the key does not exist, returns d (defaults to None).
* `items()`: Returns a view object of the dictionary's items as `(key, value)` pairs.
* `keys()`: Returns a view object of the dictionary's keys.
* `pop(key[,d])`: Removes the item with the key and returns its value, or d if the key is not found. If d is not provided and the key is not found, it raises `KeyError`.
* `popitem()`: Removes and returns the most recently inserted `(key, value)` pair (Python 3.7 and later; older versions remove an arbitrary pair). Raises `KeyError` if the dictionary is empty.
* `update([other])`: Updates the dictionary with the key/value pairs from other, overwriting existing keys.
* `values()`: Returns a view object of the dictionary's values.
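
The methods listed above can be tried out on a small dictionary. This is only a sketch; `ports` and its contents are made-up sample data.

```python
ports = {"ssh": 22, "http": 80}

# get() with and without a default
print(ports.get("http"))          # value for an existing key
print(ports.get("ftp", "n/a"))    # default for a missing key

# update() adds new keys and overwrites existing ones
ports.update({"https": 443, "http": 8080})

# keys(), values(), and items() return views of the dictionary
print(list(ports.keys()))
print(list(ports.values()))
print(list(ports.items()))

# pop() removes one key and returns its value
removed = ports.pop("ssh")
missing = ports.pop("ftp", None)   # no KeyError because a default is given

# popitem() removes and returns the last inserted pair (Python 3.7+)
last_pair = ports.popitem()

# Keep a copy, then clear() empties the original dictionary
remaining = ports.copy()
ports.clear()
print(removed, missing, last_pair, remaining, ports)
```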

## Instructor Demo
`sample_dict = {'a': 100, 'b': 200, 'c': 300}`

Write a script that:
* Iterates over the dictionary 
* Prints only the keys of this dictionary 

In [None]:
sample_dict = {'a': 100, 'b': 200, 'c': 300}
for k in sample_dict.keys():
    print(k)

`sample_dict = {'a': 100, 'b': 200, 'c': 300}`

Write a script that:
* Checks if this dictionary has the value 200
* If yes, prints "200 is present in this dict"

In [None]:
sample_dict = {'a': 100, 'b': 200, 'c': 300}
if 200 in sample_dict.values():
    print('200 is present in this dict')

Write a script that:
* Iterates over the dictionary
* Checks for the highest grade 
* Prints the highest grade 

In [None]:
sample_dict = {
    'Physics': 82,
    'Math': 65,
    'history': 75
}
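# Use a name other than max so Python's built-in max() stays available
highest = 0

for grade in sample_dict.values():
    if grade > highest:
        highest = grade
print(highest)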
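
Python's built-in `max()` can do the same comparison as the loop above in one step. The following is a brief sketch using the same grades; passing `key=grades.get` makes `max()` compare the keys by their values, so you also learn which subject scored highest. The variable names are illustrative.

```python
grades = {"Physics": 82, "Math": 65, "History": 75}

# Highest value only
highest_grade = max(grades.values())

# Key whose value is highest
best_subject = max(grades, key=grades.get)

print(best_subject, highest_grade)
```

If an earlier cell assigned a value to the name `max`, the built-in is hidden until you restart the kernel or run `del max`.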

Write a script that deletes the key/value pair for 'Mani'.

In [None]:
test_dict = {"Arushi": 22, "Mani": 21, "Haritha": 21}
 
# Print the dictionary before removal
print("The dictionary before removal is:", test_dict)

# Use del to remove the 'Mani' key and its value
del test_dict['Mani']

# Print the dictionary after removal
print("The dictionary after removal is:", test_dict)

## Student Exercise
### Problem 1
`test_dict = {'sarah' : 3, 'sam' : 7, 'bob' : 10, 'hans' : 6, 'laura' : 30}`

Write a script that:
* Iterates over this dictionary
* Prints all the key/value pairs whose value is less than 7

### Problem 2 
`sample_dict = {"name":"Kelly", "age":25, "salary":8000, "city":"New York"}`

Write a script that: 
* Deletes the `city` and `salary` key/value pairs
* Prints the new dictionary

### Problem 3
`sample_dict = {"laura":5000, "sarah":2000, "kelly":8000, "bob":9000}`

Write a script that:
* Iterates over the salaries dictionary
* Prints the highest salary 

# Activity 7D: Storing Results Into a Dictionary
## Instructor Demo

Let’s use this new data type to continue parsing the file from our last lesson:
`.voc/public/passwd`

In [None]:
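with open(".voc/public/passwd") as f:
    for line in f:
        # Split each passwd entry into its colon-separated fields
        # (avoid naming variables dict or list, which hides the built-ins)
        fields = line.strip().split(":")

        # Store the fields we care about in a new dictionary
        user_info = {}
        user_info["user"] = fields[0]
        user_info["description"] = fields[4]
        user_info["home_dir"] = fields[5]
        print(user_info)
        print("***************")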
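
To see what one record looks like without access to the file, here is a small sketch that parses a single made-up line in the standard seven-field `passwd` layout. The sample line and the names `sample_line`, `fields`, and `entry` are illustrative and are not taken from the lesson file.

```python
# A made-up entry in the usual passwd layout:
# name:password:UID:GID:description:home_dir:shell
sample_line = "paul:x:1001:1001:Paul Example:/home/paul:/bin/bash\n"

# strip() drops the trailing newline before splitting on ":"
fields = sample_line.strip().split(":")

entry = {
    "user": fields[0],
    "description": fields[4],
    "home_dir": fields[5],
}
print(entry)
```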

Parse the Apache access log (`access1.log`) and add the results to a dictionary.

In [None]:
with open(".voc/public/access1.log") as f:
    for line in f.readlines():
        # Parse the line into whitespace-separated fields
        list1 = line.split()
        ip_src = list1[0]
        date = list1[3]
        Http_method = list1[5]
        response_code = list1[8]
        user_agent = list1[11]
        # Create a dictionary from the parsed fields
        Dict_1 = {"ip_src": ip_src, "date": date, "http_method": Http_method, "response_code": response_code, "user_agent": user_agent}
        print(Dict_1)

## Student Exercise
### Problem 1 
Write a script that:
* Opens the `iptables` log 
* Iterates over the file line by line
* Parses out the ip_src, ip_dst, port_src and port_dst
* Creates a dictionary from each data point
* Prints the dictionaries 

### Problem 2
Write a script that:
* Opens the `network` log
* Iterates over the file 
* Parses out the ip_src, ip_dst, port_src and port_dst
* Creates a dictionary from each data point
* Prints the dictionaries

# Activity 7E: Counting and Ordering 
Another use of dictionaries is to count, in an organized way, how many times a value appears in a log. Here the value you are counting becomes the key, and the count becomes the value. For example, to count how many times each `source_ip` appears in a log, you can use a dictionary.

Let's use the same `apache` log we used before.

In [None]:
with open(".voc/public/access1.log") as f:
    # Initialize the dictionary once, before the loop
    Dict_1 = {}
    for line in f.readlines():
        # Parse the line
        list1 = line.split()
        ip_src = list1[0]
        # The first sighting starts the count at 1; later sightings add 1
        if ip_src not in Dict_1:
            Dict_1[ip_src] = 1
        else:
            Dict_1[ip_src] += 1
    print(Dict_1)

You can do the same thing for `user_agent`.

In [None]:
with open(".voc/public/access1.log") as f:
    # Initialize the dictionary once, before the loop
    Dict_2 = {}
    for line in f.readlines():
        # Parse the line
        list1 = line.split()
        user_agent = list1[11]

        # Count the user agent in Dict_2
        if user_agent not in Dict_2:
            Dict_2[user_agent] = 1
        else:
            Dict_2[user_agent] += 1

    # Print each user agent with its count
    for k, v in Dict_2.items():
        print(k + ":" + str(v))
        print("******************")
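
The if/else counting pattern can be shortened. As a sketch on a small made-up list of source IPs (not read from the lesson's log), `dict.get()` with a default of 0 folds both branches into one line, and `collections.Counter` from the standard library does the counting for you.

```python
from collections import Counter

# Made-up sample data standing in for the first field of each log line
ips = ["10.0.2.2", "10.0.2.5", "10.0.2.2", "10.0.2.9", "10.0.2.2", "10.0.2.5"]

# Option 1: get() returns 0 the first time an IP is seen
counts = {}
for ip in ips:
    counts[ip] = counts.get(ip, 0) + 1
print(counts)

# Option 2: Counter builds the same mapping directly
counter = Counter(ips)
print(counter)
print(counter.most_common(1))
```

Writing the if/else version by hand first is still worthwhile, since it shows exactly what these shortcuts do.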

## Student Exercise
### Problem 1 
Write a script that:
* Opens the `iptables` log
* Iterates over the file
* Counts each destination IP
* Puts the results in a dictionary 

### Problem 2
Write a script that:
* Opens the `network` log 
* Iterates over the file 
* Counts each destination IP 
* Puts the results in a dictionary 

# Activity 7F: Ordering a List
Now that you have a count of every value you're interested in, sort the counts to find the highest. One way to do that is with the built-in `sorted()` function.

The `sorted()` function accepts three parameters: the iterable, `key`, and `reverse`:
`sorted(iterable, key=None, reverse=False)`

* The first argument is the iterable to sort. In our case it is `Dict_1.items()`, which gives us the keys and the values as `(key, value)` tuples.
* The `key` argument is a function that returns the field to sort by. In our case that is the second field of each tuple, the count at index `[1]`, so we pass `lambda x: x[1]`.
* The `reverse` argument controls the direction of the sort. Passing `reverse = True` sorts from highest to lowest.

`sorted()` returns a new list and leaves the dictionary unchanged.

In [None]:
# Sort the (key, value) pairs by value (x[1]), highest count first
ordered_list = sorted(Dict_1.items(), key = lambda x: x[1],reverse = True)
print(ordered_list)

In [None]:
ordered_list = sorted(Dict_2.items(), key = lambda x: x[1],reverse = True)
print(ordered_list)
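
Because `Dict_1` and `Dict_2` depend on the log file, here is a self-contained sketch of the same call on a small made-up dictionary of counts. It shows that the result is a list of `(key, value)` tuples, so the top entry sits at index 0, and that the list can be turned back into a dictionary that keeps the sorted order (Python 3.7 and later). The name `login_counts` and its values are hypothetical.

```python
# Made-up counts standing in for the result of the counting step
login_counts = {"marta": 4, "phil": 9, "uzo": 1, "taeho": 6}

# Sort the (key, value) pairs by value, highest first
ordered_list = sorted(login_counts.items(), key=lambda pair: pair[1], reverse=True)
print(ordered_list)

# The first tuple holds the most frequent key and its count
top_user, top_count = ordered_list[0]
print(top_user, top_count)

# Rebuild a dictionary from the sorted pairs; insertion order is preserved
ordered_dict = {user: count for user, count in ordered_list}
print(ordered_dict)
```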


## Instructor Demo
Write a script that:
* Opens the `dns.log` file
* Iterates over the file
* Counts each domain that appears in the log
* Creates a dictionary that stores the count
* Orders the dictionary by count
* Prints the dictionary

In [None]:
def count_query(f):
    Dict_1 = {}
    for line in f.readlines():
        list_1 = line.split()
        # Only process lines where one of the fields is exactly "A"
        if "A" in list_1:
            domain = list_1[9]
            if domain in Dict_1:
                Dict_1[domain] += 1
            else:
                Dict_1[domain] = 1

    # Sort the (domain, count) pairs by count, highest first
    ordered_list = sorted(Dict_1.items(), key=lambda x: x[1], reverse=True)
    return ordered_list


def main():
    with open(".voc/public/dns.log") as f:
        print(count_query(f))


main()

## Student Exercise
### Problem 1 
Write a script that:
* Opens the `http.log` file
* Iterates over the file
* Counts each domain that appears in the log
* Creates a dictionary that stores the count
* Orders the dictionary by count
* Prints the dictionary