# Dictionaries

Dictionaries are used everywhere in Python. The language relies on them, or on the same hash-table machinery, to implement:
<li><b>classes</b></li>
<li><b>modules</b></li>
<li><b>functions</b></li>
<li><b>namespaces</b></li>
<li><b>sets</b></li>

In [None]:
import webbrowser

In [None]:
website = 'https://docs.python.org/3/library/stdtypes.html#dict'
webbrowser.open(website)

#### What objects can be keys?
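The object must be hashable. Built-in immutable types such as numbers, strings, and tuples of hashable items are hashable; built-in mutable containers such as lists, dictionaries, and sets are not.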

In [51]:
hash(10)

10

In [52]:
hash('abx')

9196328632032270519

In [53]:
x = (10.1)  # parentheses alone do not make a tuple: this is the float 10.1
hash(x)

230584300921368586

In [54]:
lst = [1,2,3]
hash(lst)

TypeError: unhashable type: 'list'
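A tuple is hashable only when every element it contains is hashable. The following is a minimal sketch of that rule (the variable names are illustrative); specific hash values are not shown because they can differ between runs and Python builds.

```python
# A tuple of hashable elements is itself hashable.
point = (10, 1)
point_hash = hash(point)  # an int; the exact value is not important

# A tuple that contains a list is not hashable, because the list is not.
mixed = (10, [1, 2])
try:
    hash(mixed)
    mixed_is_hashable = True
except TypeError:
    mixed_is_hashable = False

print(isinstance(point_hash, int), mixed_is_hashable)
```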

### Creating Dictionaries

##### a. Using Literals

In [1]:
a = {'name': 'John', 2:20, 'key2':'London'}
a

{'name': 'John', 2: 20, 'key2': 'London'}

##### b. Using the constructor

We can also use the class constructor `dict()` in different ways. The first is to pass the keys as keyword arguments.
Note: The restriction here is that the key names must be valid Python identifiers, since they are being used as argument names.

In [2]:
d = dict(a=100, b=200)
d

{'a': 100, 'b': 200}

We can also build a dictionary by passing it an iterable containing the keys and the values.
The restriction here is that the elements of the iterable must themselves be iterables with exactly two elements.

In [3]:
d = dict([ ('a', 100), ('b', 200) ] )
d

{'a': 100, 'b': 200}

In [4]:
d = dict(( ('a', 100), ('b', 200) ) )
d

{'a': 100, 'b': 200}

In [5]:
d = dict( ( ['a', 100], ['b', 200]) )
d

{'a': 100, 'b': 200}
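The pair-based form is also a way to build a dictionary whose keys are not valid identifiers, which the keyword form cannot express. The sketch below uses made-up sample data to show that, and what happens when an element does not have exactly two items.

```python
# Keys such as 'first name' or 2 cannot be keyword arguments,
# but they work as the first item of a pair.
pairs = [('first name', 'John'), (2, 20)]
d_pairs = dict(pairs)

# Pairs and keyword arguments can be combined in one call.
d_mixed = dict(pairs, city='London')

# An element with three items violates the "exactly two" rule.
try:
    dict([('a', 100, 'extra')])
    bad_pairs_error = None
except ValueError as exc:
    bad_pairs_error = exc

print(d_pairs)
print(d_mixed)
print(type(bad_pairs_error).__name__)
```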

In [6]:
## Using zip
keys = ['a', 'b', 'c','d']
values = (1, 2, 3, 4)

In [7]:
for k, v in zip(keys,values):
    print(k,v)

a 1
b 2
c 3
d 4


In [None]:
dzip = dict(zip('abc', range(1, 4)))
dzip

##### c. Using Comprehensions

In [8]:
dcom = { k:v for k,v in zip(keys, values) }
dcom

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

Dictionary comprehensions support the same syntax as list comprehensions - you can have nested loops, `if` statements, etc.

In [None]:
deven = {k:v for k,v in zip(keys, values) if v%2 == 0}
deven
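As a small illustration of nested loops inside a dictionary comprehension, the sketch below builds a multiplication table keyed by `(row, col)` tuples and filters out one column. The name `table` and the ranges are chosen only for this example.

```python
# Two nested loops plus a filter; keys are (row, col) tuples.
table = {
    (row, col): row * col
    for row in range(1, 3)
    for col in range(1, 4)
    if col != 2
}
print(table)  # {(1, 1): 1, (1, 3): 3, (2, 1): 2, (2, 3): 6}
```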

##### d. Using fromkeys

This class method creates a dictionary from an iterable containing the keys, and a **single** value that is assigned to every key (`None` if no value is given).

In [9]:
dfk = dict.fromkeys((1,2,3))
dfk

{1: None, 2: None, 3: None}

In [None]:
dfk = dict.fromkeys('abc')
dfk

In [10]:
dfk = dict.fromkeys([1,2,3], 100)
dfk

{1: 100, 2: 100, 3: 100}
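Because `fromkeys` assigns the same single object to every key, a mutable value ends up shared between all keys. The sketch below (names are illustrative) shows the effect and one possible alternative using a comprehension, which creates a fresh list per key.

```python
# Every key refers to the very same list object.
shared = dict.fromkeys('ab', [])
shared['a'].append(1)
print(shared)    # {'a': [1], 'b': [1]}

# A comprehension evaluates [] once per key, so the lists are independent.
separate = {key: [] for key in 'ab'}
separate['a'].append(1)
print(separate)  # {'a': [1], 'b': []}
```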

## Accessing Elements

In [11]:
keys = ('Kerala','Karnataka','Tamilnadu','Telangana')
values = ('Trivandrum','Bangalore','Chennai','Hyderabad')

In [12]:
dstates = { k:v for k,v in zip(keys,values)}
dstates

{'Kerala': 'Trivandrum',
 'Karnataka': 'Bangalore',
 'Tamilnadu': 'Chennai',
 'Telangana': 'Hyderabad'}

In [13]:
dstates['Kerala']

'Trivandrum'

In [14]:
dstates['AP']

KeyError: 'AP'

In [16]:
## Avoid exception using get()
dstates.get('AP')

In [17]:
if dstates.get('AP') is None:
    print('No such state')

No such state


In [18]:
dstates.get('Kerala')

'Trivandrum'

In [19]:
dstates.get('Kerala','No such state'), dstates.get('AP','No such state')

('Trivandrum', 'No such state')

##### Example : Count the number of each character in a text

In [None]:
text = 'The minister said it would take two to three weeks to vaccinate healthcare workers and was hopeful that the new consignment would arrive before its utilisation. The vaccine was developed by the Beijing Institute of Biological Products, a subsidiary of state-owned conglomerate Sinopharm. The company announced last month that preliminary data from last-stage trials had shown it to be 79.3 per cent effective.'

In [None]:
counts = dict()
for c in text:
    key = c.lower().strip()
    if key:
        counts[key] = counts.get(key, 0) + 1
print(counts)
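The same counting pattern is available in the standard library as `collections.Counter`. The sketch below runs both approaches on a short sample string (chosen here for brevity, not taken from the text above) and checks that they agree.

```python
from collections import Counter

sample = 'To be or not to be'

# Manual counting with dict.get, skipping whitespace
counts_manual = {}
for c in sample:
    key = c.lower().strip()
    if key:
        counts_manual[key] = counts_manual.get(key, 0) + 1

# Counter does the same bookkeeping in one call
counts_counter = Counter(c.lower() for c in sample if c.strip())

print(counts_manual)
print(counts_counter.most_common(2))  # [('o', 4), ('t', 3)]
```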

## Common Operations

#### len()
Dictionaries support the `len` function - this simply returns the number of key/value pairs in the dictionary:

In [20]:
dfk

{1: 100, 2: 100, 3: 100}

In [23]:
len(dstates)

4

##### Membership test
We can use the `in` and `not in` operators to test the presence of a **key** in a dictionary:

In [None]:
'Karnataka' in dstates, 'AP' in dstates

In [None]:
'AP' not in dstates
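Since `in` looks only at keys, testing for a value needs the `values()` view. A minimal sketch, using a small stand-in dictionary named `capitals`:

```python
capitals = {'Kerala': 'Trivandrum', 'Karnataka': 'Bangalore'}

key_found = 'Kerala' in capitals                  # keys are searched
value_as_key = 'Bangalore' in capitals            # values are not
value_found = 'Bangalore' in capitals.values()    # search the values explicitly

print(key_found, value_as_key, value_found)  # True False True
```

Note that the key test is a hash lookup, while the test against `values()` has to scan the values one by one.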

##### Removing elements 
We can use the `del` statement to remove a key from a dictionary:

In [24]:
d

{'a': 100, 'b': 200}

In [25]:
del d['a']
d

{'b': 200}

In [26]:
## Throws error
del d['z']

KeyError: 'z'

In [27]:
## pop() and popitem() remove entries and return what was removed
d = dict.fromkeys('abcd', 10)
d

{'a': 10, 'b': 10, 'c': 10, 'd': 10}

In [28]:
result = d.pop('b')
result

10

In [29]:
d

{'a': 10, 'c': 10, 'd': 10}

In [30]:
d.pop('z')

KeyError: 'z'

In [31]:
d

{'a': 10, 'c': 10, 'd': 10}

In [32]:
## Still gives an error. Use a default value to avoid this
result = d.pop('z', 'Not found!')
result

'Not found!'

In [35]:
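## popitem() removes and returns the last inserted pair (LIFO order).
## It raises a KeyError if the dictionary is empty.
d.popitem()  # first call removes ('d', 10)
d.popitem()  # second call removes ('c', 10)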

('c', 10)

In [36]:
d

{'a': 10}
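The LIFO behaviour of `popitem` can be seen by draining a dictionary in a loop. This is a small sketch with an illustrative dictionary named `stack`; it relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onwards.

```python
stack = {'a': 1, 'b': 2, 'c': 3}

removed = []
while stack:                      # an empty dict is falsy
    removed.append(stack.popitem())

print(removed)  # [('c', 3), ('b', 2), ('a', 1)]

# Calling popitem() on the now-empty dict raises KeyError
try:
    stack.popitem()
    empty_raises = False
except KeyError:
    empty_raises = True

print(empty_raises)  # True
```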

##### Remove all keys
If we want to remove all the keys in a dictionary, we can use the `clear` method:

In [37]:
d = dict.fromkeys('abcd', 10)
d

{'a': 10, 'b': 10, 'c': 10, 'd': 10}

In [38]:
d.clear()

In [39]:
d

{}

# Sets

#### Similar to keys in a dictionary

#### Properties of a set
<li><b>Unordered</b></li>
<li><b>Distinct Elements</b></li>
<li><b>Hashable Elements</b></li>
<li><b>Mutable</b></li>
<li><b>Iterable</b></li>
<li><b>Heterogeneous</b></li>


In [41]:
x =[]
type(x)

list

In [42]:
t = ()
type(t)

tuple

In [43]:
s = {}
type(s)

dict

In [44]:
s = set()
type(s)

set

In [45]:
s1 = {9,16,3,2}
s1

{2, 3, 9, 16}

In [46]:
## No positional ordering
s1[2]

TypeError: 'set' object is not subscriptable

In [47]:
## Distinct elements - duplicates not allowed
s2 = {1,1,2,3}
s2

{1, 2, 3}

In [48]:
## Iterable
for i in s1:
    print(i)

16
9
2
3


### Creating Sets

In [None]:
## Creating an empty set

In [None]:
s = {}  # careful: {} creates an empty dict, not an empty set
type(s)

In [None]:
## Use the set() function
s = set()
type(s)

##### a. Using literals

In [None]:
## Elements must be hashable: a tuple of hashable values is fine
s = {'a', 100, (1,2)}
s

In [None]:
type(s)

In [50]:
## A list is mutable and unhashable, so this fails
s = {'a', 100, [1,2]}
s

TypeError: unhashable type: 'list'
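The same restriction means a set cannot directly contain another set, since sets are mutable. One way around this, sketched below with illustrative names, is the built-in `frozenset`, an immutable and hashable variant of `set`.

```python
# A set of sets fails: the inner sets are unhashable.
try:
    nested = {{1, 2}, {3, 4}}
    nested_error = None
except TypeError as exc:
    nested_error = exc

# frozenset is hashable, so it can be an element of a set.
groups = {frozenset({1, 2}), frozenset({3, 4})}

print(type(nested_error).__name__)   # TypeError
print(frozenset({2, 1}) in groups)   # True
```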

#### b. Using set() function
Pass any iterable whose elements are hashable

In [55]:
s1 = set(range(10))
s1

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [56]:
s2 = set([11,22,33])
s2

{11, 22, 33}

In [57]:
s3 = set('python')
s3

{'h', 'n', 'o', 'p', 't', 'y'}

In [58]:
s3 = set('baabaa')
s3


{'a', 'b'}

In [59]:
## When you iterate over a dictionary you only get the keys
d = {'a': 1, 'b': 2}
s4 = set(d)
s4

{'a', 'b'}
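To build a set from the values or the key/value pairs instead of the keys, the `values()` and `items()` views can be passed to `set()`. A short sketch with a sample dictionary (the name `scores` is made up for this example):

```python
scores = {'a': 1, 'b': 2, 'c': 1}

key_set = set(scores)             # keys only
value_set = set(scores.values())  # duplicate values collapse into one
item_set = set(scores.items())    # (key, value) tuples

print(sorted(key_set))    # ['a', 'b', 'c']
print(sorted(value_set))  # [1, 2]
print(sorted(item_set))   # [('a', 1), ('b', 2), ('c', 1)]
```

The `items()` form only works when the values are hashable too, because each pair becomes a tuple element of the set.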

#### c. Set Comprehension

In [60]:
s5 = { c for c in [50,23,45,66] }
s5

{23, 45, 50, 66}

In [61]:
s = {c for c in 'moomoo'}
s

{'m', 'o'}

##### Adding Elements

In [62]:
s = {89,45,23,78}
s

{23, 45, 78, 89}

In [63]:
s.add('abc')
s

{23, 45, 78, 89, 'abc'}

In [64]:
## If you add an existing element again, Python quietly ignores it
s.add('abc')
s

{23, 45, 78, 89, 'abc'}

#### Removing Elements

In [65]:
s.remove(45)
s

{23, 78, 89, 'abc'}

In [66]:
## This throws an error
s.remove(100)

KeyError: 100

In [67]:
## Remove without errors
s.discard(100)

In [68]:
s.discard(23)
s

{78, 89, 'abc'}

In [70]:
## pop() does not take an argument.
## As there is no ordering, it removes an arbitrary element.
s.pop()
s.pop()  # called twice here, so only one element is left
s

{89}

In [71]:
## remove all elements
s = {89,45,23,78}
s.clear()
s

set()

## Common Operations

##### a. len()

In [72]:
s = {1,2,3,4}
len(s)

4

##### b. Membership test

In [73]:
s5 = {c for c in [50,23,45,66]}
23 in s5 , 40 in s5

(True, False)

In [74]:
23 not in s5, 40 not in s5

(False, True)

##### Efficiency of set compared to other collections

In [75]:
n = 1000000
s = set(range(n))
l = list(range(n))
t = tuple(range(n))

In [76]:
def test_set(s, value):
    return value in s

In [77]:
def test_list(l, value):
    return value in l

In [78]:
def test_tup(t, value):
    return value in t

In [79]:
%timeit test_tup(t,100)

1.55 µs ± 56.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [80]:
%timeit test_tup(t,900000)

13.6 ms ± 744 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [81]:
%timeit test_list(l,100)

1.56 µs ± 45.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [82]:
%timeit test_list(l,900000)

13.4 ms ± 1.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [83]:
%timeit test_set(s,100)

133 ns ± 5.59 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [84]:
%timeit test_set(s,900000)

153 ns ± 3.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [85]:
### Efficiency comes at the cost of memory
print(l.__sizeof__())
print(t.__sizeof__())
print(s.__sizeof__())

8000040
8000024
33554632
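Outside IPython, where the `%timeit` magic is not available, a similar comparison can be sketched with the standard `timeit` module. The collection size and repeat count below are arbitrary choices for a quick run, and the measured times will vary by machine, so no figures are quoted.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Total time, in seconds, for 100 membership tests on each collection
list_seconds = timeit.timeit(lambda: target in as_list, number=100)
set_seconds = timeit.timeit(lambda: target in as_set, number=100)

print(f'list: {list_seconds:.6f} s for 100 lookups')
print(f'set:  {set_seconds:.6f} s for 100 lookups')
```

The list lookup scans elements one at a time, while the set lookup hashes the value and jumps to its slot, which is why the set time should stay roughly flat as `n` grows.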


### Set Operations

##### a. Intersection

In [86]:
s1 = {1, 2, 3}
s2 = {2, 3, 4}

In [87]:
s1.intersection(s2)

{2, 3}

In [88]:
s1 & s2

{2, 3}

In [89]:
## Intersection of more than 2 sets
s1 = {1, 2, 3}
s2 = {2, 3, 4}
s3 = {3, 4, 5}

In [90]:
s1.intersection(s2, s3)

{3}

In [91]:
s1 & s2 & s3

{3}

##### b. Union

In [92]:
s1 = {1, 2, 3}
s2 = {3, 4, 5}

In [93]:
s1.union(s2)

{1, 2, 3, 4, 5}

In [94]:
s1 | s2

{1, 2, 3, 4, 5}

In [95]:
### We can compute the union of more than two sets:
s3 = {5, 6, 7}

In [96]:
s1.union(s2, s3)

{1, 2, 3, 4, 5, 6, 7}

In [97]:
s1 | s2 | s3

{1, 2, 3, 4, 5, 6, 7}

##### c. Disjointedness

Two sets are disjoint if their intersection is empty:

In [99]:
s1 = {1, 2, 3}
s2 = {2, 3, 4}
s3 = {30, 40, 50}

In [100]:
print(s1.isdisjoint(s2))
print(s2.isdisjoint(s3))

False
True


In [None]:
### Another way to test disjointedness
### Check the cardinality of the intersection
len(s1 & s2) , len(s2&s3)

##### d. Differences

Note that the difference operator is not commutative, i.e. in general

s1 - s2 ≠ s2 - s1


In [101]:
s1 = {1, 2, 3, 4, 5}
s2 = {4, 5}

In [102]:
s1 - s2

{1, 2, 3}

In [103]:
s1.difference(s2)

{1, 2, 3}
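To see that the order of the operands matters for set difference, the sketch below computes the difference both ways on two sample sets (named `a` and `b` here to avoid clashing with the cells above).

```python
a = {1, 2, 3, 4, 5}
b = {4, 5, 6}

a_minus_b = a - b   # elements of a that are not in b
b_minus_a = b - a   # elements of b that are not in a

print(sorted(a_minus_b))        # [1, 2, 3]
print(sorted(b_minus_a))        # [6]
print(a_minus_b == b_minus_a)   # False
```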

##### e. Symmetric Difference

The symmetric difference of two sets is their union minus their intersection, i.e. the elements that belong to exactly one of the two sets:

In [104]:
s1 = {1, 2, 3, 4, 5}
s2 = {4, 5, 6, 7, 8}

In [105]:
s1.symmetric_difference(s2)

{1, 2, 3, 6, 7, 8}

In [106]:
(s1 | s2) - (s1 & s2)

{1, 2, 3, 6, 7, 8}
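Like the other set operations, symmetric difference also has an operator form, `^`. The sketch below checks that the operator, the method, and the "elements in exactly one set" reading all give the same result on two sample sets.

```python
a = {1, 2, 3, 4, 5}
b = {4, 5, 6, 7, 8}

sym = a ^ b                     # operator form
either_only = (a - b) | (b - a) # elements in exactly one of the sets

print(sorted(sym))                         # [1, 2, 3, 6, 7, 8]
print(sym == a.symmetric_difference(b))    # True
print(sym == either_only)                  # True
```

Unlike plain difference, symmetric difference is commutative: `a ^ b` and `b ^ a` are equal.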

##### f. Subsets and Supersets

In [107]:
s1 = {1, 2, 3}
s2 = {1, 2, 3}
s3 = {1, 2, 3, 4}
s4 = {10, 20, 30}

In [108]:
s1.issubset(s2)

True

In [109]:
s1 <= s2

True

In [110]:
## Proper subset
s1 < s2

False

In [111]:
## s1 is a proper subset of s3
s1 < s3

True

In [112]:
### Similar tests for superset
s2.issuperset(s1)

True

In [113]:
s2 >= s1

True

In [114]:
s2 > s1

False
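Two sets do not have to be related by containment at all. The sketch below redefines `s1` and `s4` with the same values as above so that it runs on its own, and shows a pair where neither set is a subset or a superset of the other.

```python
s1 = {1, 2, 3}
s4 = {10, 20, 30}

# Neither set contains the other
neither = (s1 <= s4, s1 >= s4)
print(neither)              # (False, False)

# In this case they share no elements at all
print(s1.isdisjoint(s4))    # True

# The empty set is a subset of every set
print(set() <= s1)          # True
```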