## Dictionaries

A dictionary is like a list, but more general. In a list, the positions (a.k.a. indices) have to be integers; in a dictionary the indices can be (almost) any type.

You can think of a dictionary as a mapping between a set of indices (called keys) and a set of values. Each key maps to a value. The association of a key and a value is called a key-value pair, or sometimes an item.

### An Example of a Dictionary

Let us build a dictionary that maps English words to Spanish words. The keys will be English words and the values will be the corresponding Spanish words.

In [1]:
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

In [2]:
eng2sp

{'one': 'uno', 'two': 'dos', 'three': 'tres'}

In [3]:
type(eng2sp)

dict

In [6]:
dic = {1.01:'one', 'two':2, 3:[1,2,3]}

In [7]:
dic

{1.01: 'one', 'two': 2, 3: [1, 2, 3]}
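
The "almost" in "almost any type" comes down to hashability: a key must be an object that cannot change in place, such as a number, a string, or a tuple. The sketch below (the names `ok` and `bad` are illustrative and not part of the original notebook) shows a list being rejected as a key.

```python
# Floats, strings and tuples are hashable, so they work as keys
ok = {1.01: 'one', 'two': 2, (3, 4): 'tuple key'}
print(ok[(3, 4)])

# Lists are mutable and therefore unhashable: using one as a key fails
try:
    bad = {[1, 2, 3]: 'list key'}
except TypeError as err:
    print("TypeError:", err)
```

Lists are still fine as values, as in the cell above; the restriction applies to keys only.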

In [9]:
marks = {"andy":{"ML":44, "ST":76, "CP":88},
         "bill":{"ML":48, "ST":72, "CP":78}}

In [10]:
marks["andy"]

{'ML': 44, 'ST': 76, 'CP': 88}

In [11]:
marks["andy"]["ST"]

76

In [12]:
marks["bill"]

{'ML': 48, 'ST': 72, 'CP': 78}

In [13]:
marks["bill"]["CP"]

78

In [14]:
marks2 = [{"ML":44, "ST":76, "CP":88},
          {"ML":48, "ST":72, "CP":78}]

In [15]:
marks2[0]

{'ML': 44, 'ST': 76, 'CP': 88}
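
Nested dictionaries such as marks can also be traversed in a loop. As a small sketch (the variable names `student`, `subjects` and `totals` are illustrative assumptions), the items method yields each key together with its value, which here is itself a dictionary:

```python
marks = {"andy": {"ML": 44, "ST": 76, "CP": 88},
         "bill": {"ML": 48, "ST": 72, "CP": 78}}

totals = {}
for student, subjects in marks.items():      # subjects is the inner dictionary
    totals[student] = sum(subjects.values()) # add up that student's marks

print(totals)   # {'andy': 208, 'bill': 198}
```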

**Note:** 

* A dictionary is created using curly brackets.

* The values in a dictionary are indexed by keys. Each value must be given together with its key when creating a dictionary.

* As an example, 'one': 'uno' represents a key-value pair.

*What if we enclose some values (separated by commas) in curly brackets, without any keys? What type of object do we get?*

* *A dictionary with only keys.*
* *A dictionary with elements indexed by integers.*
* *This object is not a dictionary.*

In [101]:
p = {1,2,3,4}

In [102]:
type(p)

set
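
A related detail that is easy to trip over: empty curly brackets give an empty dictionary, not an empty set. A minimal sketch (the variable names are illustrative):

```python
empty = {}                  # empty curly brackets create a dictionary
values_only = {1, 2, 3, 4}  # comma-separated values without keys create a set
empty_set = set()           # an empty set needs the set() constructor

print(type(empty).__name__)        # dict
print(type(values_only).__name__)  # set
print(type(empty_set).__name__)    # set
```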

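### Slicing a Dictionary

A dictionary cannot be sliced with the [start:stop] notation used for lists. To pick out several values, loop over a list of keys (or over a slice of that list).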

In [16]:
key = ["one", "three"]

for k in key:
    print(eng2sp[k])

uno
tres


In [17]:
key = ["one", "two", "three"]

for k in key[0:2]:
    print(eng2sp[k])

uno
dos
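
To collect the selected pairs into a new, smaller dictionary instead of printing them, one option is a dictionary comprehension. This is only a sketch; the names `wanted` and `subset` are illustrative.

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

wanted = ["one", "three", "four"]   # 'four' is not a key of eng2sp

# Keep only the requested keys that actually exist in the dictionary
subset = {k: eng2sp[k] for k in wanted if k in eng2sp}

print(subset)   # {'one': 'uno', 'three': 'tres'}
```

The `if k in eng2sp` filter avoids a KeyError for keys that are missing.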


### Adding a Key-Value Pair to a Dictionary

In [36]:
dic = {1.01:'one', 'two':2, 3:[1,2,3]}

In [37]:
dic

{1.01: 'one', 'two': 2, 3: [1, 2, 3]}

In [38]:
dic["new"]=5

In [39]:
dic["new"]=8

In [40]:
dic

{1.01: 'one', 'two': 2, 3: [1, 2, 3], 'new': 8}

### The in Operator for Dictionaries

The in operator works on dictionaries; it tells you whether something appears as a key in the dictionary.

In [41]:
'one' in eng2sp

True

In [42]:
'uno' in eng2sp

False
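
Because in only looks at keys, 'uno' is not found even though it is stored as a value. To test for a value, one approach is to search the collection returned by the values method, as in this short sketch:

```python
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

print('uno' in eng2sp)            # False: 'uno' is a value, not a key
print('uno' in eng2sp.values())   # True: the values are searched explicitly
```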

### Counting the Frequency Distribution of Letters in a Word

In [None]:
#Frequency distribution of the characters in 'I love pizza'

s = 'I love pizza'


In [None]:
count = {}

for letter in s:
    
    if letter not in count:
        count[letter] = 1
    else:
        count[letter] += 1
        
print(count)



**Exercise:** Solve this using the concepts you have learnt so far: (1) lists and (2) variables.


Create a function that takes a string as input and returns the frequency distribution of the letters in the string.

def freq_distn(string):
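
One possible solution sketch for the function is shown here. It simply wraps the counting loop from the earlier cell; note that, as written, it counts every character, including spaces, and treats upper and lower case as different.

```python
def freq_distn(string):
    """Return a dictionary mapping each character of `string` to its count."""
    count = {}
    for letter in string:
        if letter not in count:
            count[letter] = 1       # first time this character is seen
        else:
            count[letter] += 1      # seen before: increase its counter
    return count

print(freq_distn('banana'))   # {'b': 1, 'a': 3, 'n': 2}
```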


**The 'get' method for dictionaries**

Dictionaries have a method called get that takes a key and an optional default value. If the key appears in the dictionary, get returns the corresponding value; otherwise it returns the default value.

In [29]:
count.get("a")

1

In [30]:
count.get("j", -1)

-1
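
It may help to contrast get with square-bracket lookup. In the sketch below (the small `count` dictionary is made up for illustration), get quietly returns None or the supplied default for a missing key, while square brackets raise a KeyError.

```python
count = {'a': 1, 'z': 2}

print(count.get('j'))       # None: key missing and no default given
print(count.get('j', 0))    # 0: key missing, so the default is returned
print(count.get('z', 0))    # 2: key present, so the default is ignored

try:
    count['j']              # square brackets raise an error for a missing key
except KeyError as err:
    print("KeyError:", err)
```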

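**Simplifying the count using the 'get' method**

The first cell below repeats the if/else version for comparison; the second cell does the same job in a single line with get.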

In [None]:
count = {}

for letter in s:
    
    if letter not in count:
        count[letter] = 1
    else:
        count[letter] += 1
        
print(count)

In [None]:
count = {}

for letter in s:
    count[letter] = count.get(letter, 0) + 1
    
print(count)    
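
As an aside, the standard library offers a ready-made alternative to this loop. The sketch below uses collections.Counter, which is not used in the original notebook and is shown only for comparison.

```python
from collections import Counter

s = 'I love pizza'
count = Counter(s)        # counts every character, including spaces

print(count['z'])         # 2
print(count['q'])         # 0: a Counter reports zero for missing keys
print(dict(count))        # the same mapping as the loop above produces
```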

### A Common Use of Dictionaries

One of the common uses of a dictionary is to count the occurrences of words in a file containing written text.

In [None]:
#Exercise: Read the Ashop1.txt file
ashop1 = open('datasets/Ashop1.txt')


In [66]:
ashop1 = open('datasets/Ashop1.txt')
freq = {}

for line in ashop1:                            # loop over each line
    line = line.rstrip()                       # remove the trailing '\n' and any trailing spaces
    word_list = line.split()                   # split the line into words on whitespace

    for word in word_list:
        freq[word] = freq.get(word, 0) + 1     # start from 0 for a new word, then add 1

ashop1.close()                                 # release the file handle
print(freq)

{'The': 1, 'Cock': 2, 'and': 3, 'the': 6, 'Pearl': 2, 'A': 1, 'cock': 1, 'was': 1, 'once': 1, 'strutting': 1, 'up': 1, 'down': 1, 'farmyard': 1, 'among': 1, 'hens': 1, 'when': 1, 'suddenly': 1, 'he': 2, 'espied': 1, 'something': 1, 'shinning': 1, 'amid': 1, 'straw': 2, 'Ho': 1, 'ho': 1, 'quoth': 2, 'thats': 1, 'for': 3, 'me': 2, 'soon': 1, 'rooted': 1, 'it': 2, 'out': 2, 'from': 1, 'beneath': 1, 'What': 1, 'did': 1, 'turn': 1, 'to': 2, 'be': 2, 'but': 2, 'a': 4, 'that': 3, 'by': 1, 'some': 1, 'chance': 1, 'had': 1, 'been': 1, 'lost': 1, 'in': 1, 'yard': 1, 'You': 1, 'may': 1, 'treasure': 1, 'Master': 1, 'men': 1, 'prize': 2, 'you,': 1, 'I': 1, 'would': 1, 'rather': 1, 'have': 1, 'single': 1, 'barley-corn': 1, 'than': 1, 'peck': 1, 'of': 1, 'pearls': 1, 'Precious': 1, 'things': 1, 'are': 1, 'those': 1, 'can': 1, 'them': 1}


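**Exercise:** Find the frequency distribution of the words present in this file.


Note: The words are not printed in alphabetical or frequency order. Since Python 3.7, a dictionary keeps its keys in insertion order, so the words appear in the order in which they first occur in the file.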

In [70]:
# Printing words with a count greater than or equal to 2
ashop1 = open('datasets/Ashop1.txt')
freq = {}

for line in ashop1:                            # loop over each line
    line = line.rstrip()                       # remove the trailing '\n' and any trailing spaces
    word_list = line.split()                   # split the line into words on whitespace

    for word in word_list:
        freq[word] = freq.get(word, 0) + 1     # start from 0 for a new word, then add 1

ashop1.close()                                 # release the file handle

for key in freq:
    if freq[key] >= 2:
        print(freq[key], key)

2 Cock
3 and
6 the
2 Pearl
2 he
2 straw
2 quoth
3 for
2 me
2 it
2 out
2 to
2 be
2 but
4 a
3 that
2 prize


In [18]:
# Printing keys in alphabetical order

# But before that...
# The keys method returns a view of the dictionary's keys; list() turns it into a list

ashop1 = open('datasets/Ashop1.txt')
freq = {}

for line in ashop1:                            # loop over each line
    line = line.rstrip()                       # remove the trailing '\n' and any trailing spaces
    word_list = line.split()                   # split the line into words on whitespace

    for word in word_list:
        freq[word] = freq.get(word, 0) + 1     # start from 0 for a new word, then add 1

ashop1.close()                                 # release the file handle

key_list = list(freq.keys())                   # copy the keys of freq into a list
print(key_list)

for i in range(len(key_list)-1):               # sort the list in place with a simple exchange sort
    for j in range(i+1, len(key_list)):
        if key_list[i] > key_list[j]:          # out of order: swap the two words
            key_list[i], key_list[j] = key_list[j], key_list[i]

print(key_list)

['The', 'Cock', 'and', 'the', 'Pearl', 'A', 'cock', 'was', 'once', 'strutting', 'up', 'down', 'farmyard', 'among', 'hens', 'when', 'suddenly', 'he', 'espied', 'something', 'shinning', 'amid', 'straw', 'Ho', 'ho', 'quoth', 'thats', 'for', 'me', 'soon', 'rooted', 'it', 'out', 'from', 'beneath', 'What', 'did', 'turn', 'to', 'be', 'but', 'a', 'that', 'by', 'some', 'chance', 'had', 'been', 'lost', 'in', 'yard', 'You', 'may', 'treasure', 'Master', 'men', 'prize', 'you,', 'I', 'would', 'rather', 'have', 'single', 'barley-corn', 'than', 'peck', 'of', 'pearls', 'Precious', 'things', 'are', 'those', 'can', 'them']
['A', 'Cock', 'Ho', 'I', 'Master', 'Pearl', 'Precious', 'The', 'What', 'You', 'a', 'amid', 'among', 'and', 'are', 'barley-corn', 'be', 'been', 'beneath', 'but', 'by', 'can', 'chance', 'cock', 'did', 'down', 'espied', 'farmyard', 'for', 'from', 'had', 'have', 'he', 'hens', 'ho', 'in', 'it', 'lost', 'may', 'me', 'men', 'of', 'once', 'out', 'pearls', 'peck', 'prize', 'quoth', 'rather', 'r
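
The hand-written sort above can also be replaced by the built-in sorted function. The sketch below runs on a small sample of the counts printed earlier (so that it does not need the file); `by_count` is an illustrative name.

```python
# A few of the word counts from the output above
freq = {'The': 1, 'Cock': 2, 'and': 3, 'the': 6, 'Pearl': 2}

# Alphabetical order of keys; uppercase letters sort before lowercase ones
for word in sorted(freq):
    print(word, freq[word])

# Most frequent words first: sort the (word, count) pairs by the count
by_count = sorted(freq.items(), key=lambda item: item[1], reverse=True)
print(by_count[:3])   # [('the', 6), ('and', 3), ('Cock', 2)]
```

Both loops give the same alphabetical ordering as the exchange sort, since sorted compares strings the same way the > operator does.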

### Advanced Text Parsing

The actual text of this Aesop's fable is given in the file Ashop.txt. The actual file contains a lot of punctuation, which we need to strip out. We should also take care of case sensitivity, so that 'The' and 'the' are counted as the same word.

In [90]:
# Before we do so, let's look at a couple of other things.

# 1. Punctuation
import string
p = string.punctuation
p

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [None]:
# 2. The translate method for strings
# str.maketrans()

# The 3-argument version, str.maketrans(x, y, z), builds a translation table:
# 'x' and 'y' must be strings of equal length, and each character in 'x' is replaced by the
# character at the same position in 'y'. 'z' is a string (string.punctuation later on)
# whose characters are each mapped to None, i.e. deleted.



In [79]:
trans = str.maketrans("w", "W", "!")

In [73]:
trans

{119: 87, 33: None}

In [77]:
text = "wwoooowww!!!"    # not named 'string', which would shadow the string module

In [78]:
text.translate(trans)

'WWooooWWW'

In [84]:
t = str.maketrans("abc", "xyz", "Go")
s = 'asdbbolcacGbkoek'

In [85]:
s.translate(t)

'xsdyylzxzykek'

In [88]:
t1 = str.maketrans("abc", "xyz", "ao")
s1 = 'asdbbolcacGbkoek'

In [89]:
s1.translate(t1)

'sdyylzzGykek'

In [100]:
ashop1 = open('datasets/Ashop1.txt')
t = str.maketrans("", "", string.punctuation)

for line in ashop1:                            # loop over each line
    line = line.lower()                        # convert to lowercase
    #line = line.rstrip()                      # left disabled, so each line keeps its '\n'
    line = line.translate(t)                   # remove the punctuation

    print(line)                                # print adds a second newline, hence the blank lines below

the cock and the pearl



a cock was once strutting up and down the farmyard among the

hens when suddenly he espied something shinning amid the straw

ho ho quoth he thats for me and soon rooted it out from

beneath the straw  what did it turn out to be but a pearl that by

some chance had been lost in the yard  you may be a treasure

quoth master cock to men that prize you but for me i would

rather have a single barleycorn than a peck of pearls



precious things are for those that can prize them


In [97]:
st = 'ad-gd  fakldj, dsfj$sfd'
st.translate(t)

'adgd  fakldj dsfjsfd'
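
Putting the pieces together, the lowercasing and punctuation removal can be combined with the counting loop from the previous section. The function below is a sketch under stated assumptions: the name `word_freq` and the two sample lines are made up for illustration (they are not the contents of the fable file), and the function accepts any iterable of lines, so an open file object could be passed in the same way.

```python
import string

def word_freq(lines):
    """Count words in an iterable of lines, ignoring case and punctuation."""
    table = str.maketrans("", "", string.punctuation)   # delete all punctuation
    freq = {}
    for line in lines:
        words = line.lower().translate(table).split()   # clean, then split on whitespace
        for word in words:
            freq[word] = freq.get(word, 0) + 1
    return freq

# Hypothetical sample lines, standing in for the lines of a text file
sample = ["The Cock, and the Pearl!\n", "the cock; THE pearl?\n"]
print(word_freq(sample))   # {'the': 4, 'cock': 2, 'and': 1, 'pearl': 2}
```

Calling split after translate also takes care of the trailing newline, so no separate rstrip is needed here.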