#### Dictionaries: A dictionary is like a list, but more general. In a list, the index positions have to be integers;
#### in a dictionary, the indices can be (almost) any type.

#### You can think of a dictionary as a mapping between a set of indices (which are called keys) and a set of values. 
#### Each key maps to a value. The association of a key and a value is called a key-value pair or sometimes an item.
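
As a rough illustration of "(almost) any type": a key has to be hashable, which in practice covers immutable values such as strings, numbers, and tuples, but not lists. The names `mixed` and `error_name` are made up for this sketch.

```python
# Keys of several hashable types can live in the same dictionary.
mixed = {
    'one': 'uno',          # string key
    2: 'dos',              # integer key
    (3, 'three'): 'tres',  # tuple key (a tuple of hashable items is hashable)
}

# A list is mutable and therefore unhashable, so it cannot be used as a key.
try:
    mixed[[4, 'four']] = 'cuatro'
    error_name = None
except TypeError as err:
    error_name = type(err).__name__

print(mixed[(3, 'three')])
print(error_name)
```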

In [4]:
eng2sp = dict()
print(eng2sp)  # The curly brackets, {}, represent an empty dictionary. To add items to the dictionary, use square brackets.

{}


In [5]:
eng2sp['one'] = 'uno'

In [6]:
print(eng2sp)

{'one': 'uno'}


In [11]:
eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
print(eng2sp)  # Since Python 3.7, a dictionary keeps its items in insertion order; older versions made no such guarantee.


{'one': 'uno', 'two': 'dos', 'three': 'tres'}
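
On Python 3.7 and later the language guarantees that a dictionary keeps insertion order, which is why the output above matches the order of the literal. A small check, assuming such a version; the name `ordered` exists only in this sketch.

```python
ordered = dict()
ordered['three'] = 'tres'
ordered['one'] = 'uno'
ordered['two'] = 'dos'

# Iteration follows the order in which the keys were first inserted.
first_pass = list(ordered)
print(first_pass)

# Reassigning an existing key changes its value but not its position.
ordered['three'] = 'TRES'
second_pass = list(ordered)
print(second_pass)
```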


In [None]:
#Elements of a dictionary are not looked up by position. Instead, you use the keys to look up the corresponding values.

In [12]:
print(eng2sp['two'])

dos


In [13]:
len(eng2sp)

3

In [14]:
'one' in eng2sp

True

In [15]:
'uno' in eng2sp

False

In [20]:
keys = list(eng2sp.keys())
print(keys)

['one', 'two', 'three']


In [21]:
vals = list(eng2sp.values())
print(vals)

['uno', 'dos', 'tres']


In [22]:
'uno' in vals

True

In [1]:
# The in operator uses different algorithms for lists and dictionaries. For lists, it
# uses a linear search. As the list gets longer, the search time gets longer
# in direct proportion to the length of the list. Python dictionaries are built on a
# data structure called a hash table, which has a remarkable property: the in operator
# takes about the same amount of time no matter how many items there are in the
# dictionary. I won't explain why hash tables are so magical, but you can read
# more about them at wikipedia.org/wiki/Hash_table.
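
One way to see the difference without timing anything is to count equality comparisons. The sketch below assumes CPython; `CountingKey` and the other names are hypothetical helpers invented for this illustration.

```python
class CountingKey:
    """A hashable key that counts how often == is evaluated on any instance."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return isinstance(other, CountingKey) and self.value == other.value


n = 1000
keys = [CountingKey(i) for i in range(n)]
as_list = keys
as_dict = {k: True for k in keys}

# An equal but distinct object matching the last element: the worst case for a list.
probe = CountingKey(n - 1)

CountingKey.comparisons = 0
found_in_list = probe in as_list
list_comparisons = CountingKey.comparisons

CountingKey.comparisons = 0
found_in_dict = probe in as_dict
dict_comparisons = CountingKey.comparisons

print('list comparisons:', list_comparisons)
print('dict comparisons:', dict_comparisons)
```

The list has to compare the probe against every element before it reaches the match, so its count grows with `n`. The dictionary uses the hash to jump close to the right slot and should need only a handful of comparisons however large `n` is.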

In [13]:
dictionary = dict()
key = 0
with open('mbox-short.txt') as fhand:  # the with block closes the file when done
    for line in fhand:
        words = line.split()
        for word in words:
            dictionary[key] = word  # number each word in reading order
            key = key + 1
print(dictionary)




In [9]:
print(dictionary[202])

source@collab.sakaiproject.org
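
The loop above numbers each word as it is read. A minimal sketch of the same idea with the built-in `enumerate`, using two made-up lines in place of `mbox-short.txt` so that it runs without the file; `sample_lines` and `numbered` are names chosen for this example.

```python
# Stand-in for the lines of a file; the original cell reads mbox-short.txt.
sample_lines = ['the quick brown fox', 'jumps over the lazy dog']

# Flatten the lines into one list of words, then number them from 0.
words = [word for line in sample_lines for word in line.split()]
numbered = dict(enumerate(words))

print(numbered[0])
print(numbered[4])
print(len(numbered))
```

A dictionary keyed 0, 1, 2, ... behaves much like a list, so a plain list is usually the simpler choice when the keys are just positions.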


In [14]:
print(dictionary.keys())

dict_keys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219,

In [15]:
print(dictionary.values())



In [18]:
#Dictionary as a set of counters
word = 'brontosaurus'
d = dict()
for c in word:
    if c not in d:
        d[c] = 1
    else:
        d[c] = d[c] + 1
print(d)

{'b': 1, 'r': 2, 'o': 2, 'n': 1, 't': 1, 's': 2, 'a': 1, 'u': 2}


In [19]:
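#The get method takes a key and a default value. It returns the value stored under the key, or the default if the key is absent.
counts = {'chuck': 1, 'annie': 42, 'jan': 100}
print(counts.get('jan', 0))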

100


In [20]:
print(counts.get('tim', 0))

0
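
For contrast, a short sketch of what happens without `get`: square brackets raise `KeyError` for a missing key, while `get` returns the default and leaves the dictionary unchanged. The variables `outcome` and `default_value` are introduced only for this example.

```python
counts = {'annie': 42, 'jan': 100}

# Square brackets raise KeyError when the key is absent.
try:
    counts['tim']
    outcome = 'found'
except KeyError:
    outcome = 'KeyError'

# get returns the supplied default instead and does not add the key.
default_value = counts.get('tim', 0)
print(outcome, default_value, 'tim' in counts)

# Without a second argument, get falls back to None.
print(counts.get('tim'))
```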


In [21]:
word = 'brontosaurus'
d = dict()
for c in word:
    d[c] = d.get(c, 0) + 1
print(d)

{'b': 1, 'r': 2, 'o': 2, 'n': 1, 't': 1, 's': 2, 'a': 1, 'u': 2}
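
The standard library also offers `collections.Counter`, a dictionary subclass built for this kind of tallying. The sketch below is an alternative to the loop above, not a replacement for understanding it; `letter_counts` is a name chosen here.

```python
from collections import Counter

word = 'brontosaurus'

# The hand-written counter from the cell above.
d = dict()
for c in word:
    d[c] = d.get(c, 0) + 1

# Counter builds the same tally in one call.
letter_counts = Counter(word)
print(dict(letter_counts) == d)

# A Counter reports 0 for a key it has never seen instead of raising KeyError.
print(letter_counts['z'])
```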


In [16]:
#Dictionaries and files
fname = 'romeo.txt'  # or: fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except OSError:
    print('File cannot be opened:', fname)
    exit()
counts = dict()
for line in fhand:
    words = line.split()
    for word in words:
        if word not in counts:
            counts[word] = 1
        else:
            counts[word] += 1
print(counts)

{'But': 1, 'soft': 1, 'what': 1, 'light': 1, 'through': 1, 'yonder': 1, 'window': 1, 'breaks': 1, 'It': 1, 'is': 3, 'the': 3, 'east': 1, 'and': 3, 'Juliet': 1, 'sun': 2, 'Arise': 1, 'fair': 1, 'kill': 1, 'envious': 1, 'moon': 1, 'Who': 1, 'already': 1, 'sick': 1, 'pale': 1, 'with': 1, 'grief': 1}
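
The counting loop can be wrapped in a function so that it works on any iterable of lines. This is a sketch under assumptions: `count_words` is a hypothetical helper, and `io.StringIO` with invented text stands in for an open file so the example runs without `romeo.txt`.

```python
import io


def count_words(lines):
    """Return a dictionary mapping each word to the number of times it occurs."""
    counts = dict()
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


# io.StringIO behaves like a file opened for reading.
fake_file = io.StringIO('the sun and the moon\nthe east\n')
result = count_words(fake_file)
print(result)
```

Because the function only iterates over its argument, the same call accepts a real file handle, for example `count_words(open('romeo.txt'))`.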


In [3]:
#Looping and dictionaries
counts = { 'chuck' : 1 , 'annie' : 42, 'jan': 100}
for key in counts:
    print(key, counts[key])


chuck 1
annie 42
jan 100


In [4]:
counts = { 'chuck' : 1 , 'annie' : 42, 'jan': 100}
for key in counts:
    if counts[key] > 10:
        print(key, counts[key])

annie 42
jan 100
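
A possible variation on the loop above: the `items` method yields key-value pairs directly, which avoids the second lookup `counts[key]`. The name `large` is introduced here for the filtered result.

```python
counts = {'chuck': 1, 'annie': 42, 'jan': 100}

# items() yields (key, value) pairs, so no extra lookup is needed.
for key, value in counts.items():
    if value > 10:
        print(key, value)

# The same filter written as a dictionary comprehension.
large = {key: value for key, value in counts.items() if value > 10}
print(large)
```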


In [7]:
#A dictionary has no sort method of its own. Put the dictionary keys into a list and sort the list.
counts = { 'chuck' : 1 , 'annie' : 42, 'jan': 100}
lst = list(counts.keys())
print(lst)
lst.sort()
for key in lst:
    print(key,counts[key])

['chuck', 'annie', 'jan']
annie 42
chuck 1
jan 100
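
The built-in `sorted` can shorten this, and it can also order the entries by value. A brief sketch; `by_key` and `by_value` are names chosen for this example.

```python
counts = {'chuck': 1, 'annie': 42, 'jan': 100}

# sorted() accepts any iterable and returns a new list; iterating a dict yields its keys.
by_key = sorted(counts)
print(by_key)

# Sort the (key, value) pairs by value, largest first.
by_value = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(by_value)
```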


In [17]:
#Advanced text parsing
import string
string.punctuation


'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
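
To show how these characters are used with `translate`, here is a small sketch on a single string. The sample sentence is invented for the illustration (a punctuated variant of a line from the play), and `remove_punctuation` is a name chosen here.

```python
import string

# With two empty strings, the third argument of maketrans lists characters to delete.
remove_punctuation = str.maketrans('', '', string.punctuation)

line = 'But, soft! What light through yonder window breaks?'
cleaned = line.translate(remove_punctuation).lower()
print(cleaned)
print(cleaned.split())
```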

In [27]:
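#We use translate to remove all punctuation and lower to force the line to lowercase.
#Otherwise the program is unchanged.
import string
fname = input('Enter a file name: ')
try:
    fhand = open(fname)
except OSError:
    print('File cannot be opened:', fname)
    exit()

counts = dict()
for line in fhand:
    line = line.rstrip()
    # Delete every punctuation character, then lowercase the line.
    line = line.translate(str.maketrans('', '', string.punctuation))
    line = line.lower()
    words = line.split()
    # Counting stays inside the loop so that every line is counted, not only the last.
    for word in words:
        if word not in counts:
            counts[word] = 1
        else:
            counts[word] += 1
print(counts)

Enter a file name: romeo.txt
{'but': 1, 'soft': 1, 'what': 1, 'light': 1, 'through': 1, 'yonder': 1, 'window': 1, 'breaks': 1, 'it': 1, 'is': 3, 'the': 3, 'east': 1, 'and': 3, 'juliet': 1, 'sun': 2, 'arise': 1, 'fair': 1, 'kill': 1, 'envious': 1, 'moon': 1, 'who': 1, 'already': 1, 'sick': 1, 'pale': 1, 'with': 1, 'grief': 1}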
