## Unpacking a Sequence into Separate Variables
Unpack an N-element tuple or sequence into a collection of N variables.

In [5]:
p = (4, 5)
x, y = p
print(x)
print(y)

4
5


In [6]:
data = ['ACME', 50, 91.1, (2012, 12, 21)]

In [8]:
name, shares, price, date = data

In [10]:
print(name)
print(shares)
print(price)
print(date)

ACME
50
91.1
(2012, 12, 21)


In [11]:
name, shares, price, (year, mon, day) = data

In [12]:
print(name)
print(shares)
print(price)
print(year)
print(mon)
print(day)

ACME
50
91.1
2012
12
21


In [16]:
s = 'Hello'
a, b, c, d, e = s
print(a, b, c, d, e)

H e l l o


In [18]:
# discard certain values using throwaway variables
data = ['ACME', 50, 91.1, (2012, 12, 21)]
_, shares, price, _ = data
print(shares, price)

50 91.1


## Unpacking Elements from Iterables of Arbitrary Length
You need to unpack N elements from an iterable, but the iterable may be longer than N elements, which causes a "too many values to unpack" exception.  
We can use **star expressions** in Python.
* Extended iterable unpacking is tailor-made for unpacking iterables of unknown or arbitrary length
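
As a minimal sketch of the failure that star expressions avoid, the snippet below unpacks a three-element tuple into two names, which raises `ValueError`, and then repeats the assignment with a starred name. The names `values`, `first`, `second`, and `rest` are illustrative only.

```python
values = (1, 2, 3)

# Plain unpacking requires exactly as many names as there are elements.
try:
    first, second = values
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# A starred name collects whatever is left over into a list.
first, *rest = values
print(first, rest)  # 1 [2, 3]
```

The starred name always receives a list, even when nothing is left over for it.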

In [286]:
def func(*books):
    print(books)
    x, *y = books
    print(x, y)
    print(type(books))

In [287]:
func('hello', 'world', 'babpp')

('hello', 'world', 'babpp')
hello ['world', 'babpp']
<class 'tuple'>


In [30]:
record = ('Dave', 'dave@example.com', '999-444-3221', '123-222-9876')
name, email, *phone_numbers = record

In [31]:
print(name, email)
phone_numbers

Dave dave@example.com


['999-444-3221', '123-222-9876']

In [32]:
*trailing, current = [10, 8, 7, 1, 9, 5, 10, 3]
print(trailing, current)

[10, 8, 7, 1, 9, 5, 10] 3


In [33]:
records = [
    ('foo', x, y),
    ('bar', 'hello'),
    ('foo', 3, 4),
]

In [34]:
def do_foo(x, y):
    print('foo', x, y)
def do_bar(s):
    print('bar', s)        

In [35]:
for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)

foo 4 5
bar hello
foo 3 4


In [36]:
# star unpacking while doing string operation
line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
uname, *fields, homedir, sh = line.split(':')
print(uname, homedir, sh)

nobody /var/empty /usr/bin/false


In [37]:
fields

['*', '-2', '-2', 'Unprivileged User']

In [38]:
#unpack values and throw them away
uname, *_, homedir, sh = line.split(':')

In [39]:
record = ('ACME', 50, 123.45, (12, 18, 2012))
name, *_, (*_, year) = record

In [40]:
name

'ACME'

In [41]:
year

2012

In [42]:
# split a list into head and tails
items =[1, 10, 5, 3, 4, 5]

In [43]:
head, *tails = items
head, tails

(1, [10, 5, 3, 4, 5])

In [44]:
def sum(items):
    head, *tail = items
    return head + sum(tail) if tail else head

sum(items)

28

## Keeping the Last N items
Use `collections.deque` with a `maxlen` to keep a limited history of the last few items.

In [47]:
from collections import deque

In [None]:
def search(lines, pattern, history=5):
    # deque(maxlen=history) keeps only the most recent `history` lines
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            yield line, previous_lines
        previous_lines.append(line)

# example use on a file
if __name__ == '__main__':
    with open('somefile.txt') as f:
        for line, prevlines in search(f, 'python', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('-' * 20)


In [45]:
# when new items are added and queue is full, oldest items are automatically removed

In [48]:
q = deque(maxlen=3)
q.append(1)
q.append(2)
q.append(3)
q

deque([1, 2, 3])

In [49]:
q.append(4)
q

deque([2, 3, 4])

In [50]:
q.append(5)
q

deque([3, 4, 5])

In [51]:
q.appendleft(4)
q

deque([4, 3, 4])

In [52]:
q.pop()
q

deque([4, 3])

In [53]:
q.popleft()

4

In [54]:
q

deque([3])

Adding or popping items from either end of a deque has **O(1)** complexity. This is unlike a list, where inserting or removing items at the front is **O(N)**.
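
The following sketch, under the assumption that only the final items of a stream are needed, shows the bounded deque doing the trimming by itself, and contrasts the front-insertion calls on a deque and a list. The helper name `keep_last` is hypothetical and not part of `collections`.

```python
from collections import deque

def keep_last(iterable, n):
    """Return the last n items of iterable as a list (hypothetical helper)."""
    return list(deque(iterable, maxlen=n))

window = keep_last(range(10), 3)
print(window)  # [7, 8, 9]

# Front insertion: deque has a dedicated O(1) method,
# while list.insert(0, ...) must shift every existing element.
d = deque([1, 2, 3])
d.appendleft(0)
lst = [1, 2, 3]
lst.insert(0, 0)
print(list(d) == lst)  # True
```

Both containers end up with the same contents; the difference is only in how much work each front insertion costs as the container grows.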

## Finding the Largest or Smallest N items
Use the `heapq` module

Make a list of the largest or smallest N items in a collection

In [56]:
import heapq

In [57]:
nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

In [59]:
print(heapq.nlargest(3, nums))

[42, 37, 23]


In [60]:
print(heapq.nsmallest(3, nums))

[-4, 1, 2]


Both functions accept a `key` parameter, which allows them to be used with more complicated data structures.

In [61]:
portfolio = [
{'name': 'IBM', 'shares': 100, 'price': 91.1},
{'name': 'AAPL', 'shares': 50, 'price': 543.22},
{'name': 'FB', 'shares': 200, 'price': 21.09},
{'name': 'HPQ', 'shares': 35, 'price': 31.75},
{'name': 'YHOO', 'shares': 45, 'price': 16.35},
{'name': 'ACME', 'shares': 75, 'price': 115.65}
]

In [62]:
cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

In [63]:
cheap

[{'name': 'YHOO', 'shares': 45, 'price': 16.35},
 {'name': 'FB', 'shares': 200, 'price': 21.09},
 {'name': 'HPQ', 'shares': 35, 'price': 31.75}]

In [64]:
expensive

[{'name': 'AAPL', 'shares': 50, 'price': 543.22},
 {'name': 'ACME', 'shares': 75, 'price': 115.65},
 {'name': 'IBM', 'shares': 100, 'price': 91.1}]

A heap works by first converting the data into a list where the items are ordered as a heap. The most important feature of a heap is that `heap[0]` is always the smallest item. Subsequent items can be found using `heapq.heappop()`, which pops off the first item and replaces it with the next smallest item (an operation that takes `O(log N)` time, where N is the size of the heap).

In [65]:
nums 

[1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

In [66]:
import heapq

In [67]:
heap = list(nums)

In [68]:
heapq.heapify(heap)

In [69]:
heap

[-4, 2, 1, 23, 7, 2, 18, 23, 42, 37, 8]

In [70]:
heapq.heappop(heap)

-4

In [71]:
heapq.heappop(heap)

1

In [84]:
heapq??

Note
* If you are simply trying to find the single smallest or largest item, it is faster to use the `min()` or `max()` function
* If N is about the same size as the collection itself, it is usually faster to sort it first and take a slice, i.e. `sorted(items)[:N]` or `sorted(items)[-N:]`
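
To make the two notes above concrete, here is a small sketch that reuses the `nums` list from this section and checks that the alternatives return the same items as `heapq`. The value `n = 9` is an arbitrary choice standing in for "N close to the size of the collection".

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

# A single extreme value: min() and max() are the direct tools.
smallest = min(nums)
largest = max(nums)

# N close to len(nums): sort once and slice.
n = 9
smallest_n = sorted(nums)[:n]
largest_n = sorted(nums)[-n:]

# The results agree with heapq; only the cost differs.
print(smallest == heapq.nsmallest(1, nums)[0])     # True
print(largest == heapq.nlargest(1, nums)[0])       # True
print(smallest_n == heapq.nsmallest(n, nums))      # True
print(largest_n == heapq.nlargest(n, nums)[::-1])  # True
```

Note that `sorted(items)[-N:]` is in ascending order, whereas `heapq.nlargest()` returns its result in descending order, hence the reversal in the last comparison.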


## Implementing a Priority Queue
Implement a queue that sorts items by a given priority and always returns the item with the highest priority on each pop operation

In [108]:
import heapq

In [123]:
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
    def push(self, item, priority):
        heapq.heappush(self._queue, (-priority, self._index, item))
        print(self._queue)
        self._index += 1
    def pop(self):
        # the item is the last element of the (-priority, index, item) tuple
        return heapq.heappop(self._queue)[-1]

In [124]:
class Item:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f'Item({self.name})'

In [132]:
heapq.heappush??

In [125]:
q = PriorityQueue()

In [126]:
q.push(Item('foo'), 1)
q.push(Item('bar'), 5)
q.push(Item('spam'), 4)
q.push(Item('grok'), 1)

[(-1, 0, Item(foo))]
[(-5, 1, Item(bar)), (-1, 0, Item(foo))]
[(-5, 1, Item(bar)), (-1, 0, Item(foo)), (-4, 2, Item(spam))]
[(-5, 1, Item(bar)), (-1, 0, Item(foo)), (-4, 2, Item(spam)), (-1, 3, Item(grok))]


In [127]:
heapq.heappop(q._queue)

(-5, 1, Item(bar))

In [128]:
q.pop()

Item(spam)

In [129]:
q.pop()

Item(foo)

In [130]:
q.pop()

Item(grok)

In [131]:
q.pop()

IndexError: index out of range

The first pop returned the item with the highest priority (`bar`, removed above by calling `heapq.heappop()` directly on `q._queue`). The two items with the same priority (`foo` and `grok`) were returned in the same order in which they were inserted into the queue.

The functions `heapq.heappush()` and `heapq.heappop()` insert and remove items from the list `_queue` in such a way that the first item in the list is always the smallest tuple. Because the priorities are stored negated, the smallest tuple belongs to the item with the highest priority. `heappop()` always returns this "smallest" item, which is the key to making the queue pop the correct items.

The push and pop operations have `O(log N)` complexity, where N is the number of items in the heap, so they are efficient even for fairly large values of N.
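
The effect of negating the priority can be sketched without the class, using plain tuples on a heap. The task names and priorities below are made up for illustration.

```python
import heapq

heap = []
# Each entry is (negated priority, insertion index, payload).
for index, (task, priority) in enumerate([("low", 1), ("high", 5), ("mid", 4)]):
    heapq.heappush(heap, (-priority, index, task))

# The smallest tuple sits at heap[0]; because priorities are negated,
# that is the entry with the highest priority.
order = [heapq.heappop(heap)[-1] for _ in range(len(heap))]
print(order)  # ['high', 'mid', 'low']
```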

In [133]:
a = Item('foo')
b = Item('bar')
a < b

TypeError: '<' not supported between instances of 'Item' and 'Item'

Tuples are compared position by position: the first item of the first tuple is compared to the first item of the second tuple; if they are not equal (i.e. the first is greater or smaller than the second) then that's the result of the comparison, else the second item is considered, then the third and so on.

In [134]:
a = (1, Item('foo'))
b = (5, Item('bar'))
a < b

True

In [135]:
c = (1, Item('grok'))
a < c

TypeError: '<' not supported between instances of 'Item' and 'Item'

By introducing the extra index and making `(priority, index, item)` tuples, you avoid this problem entirely, since no two tuples will ever have the same value for index (and Python never bothers to compare the remaining tuple values once the result of the comparison can be determined).

In [138]:
a = (1, 0, Item('foo'))
b = (5, 1, Item('bar'))
c = (1, 2, Item('grok'))
a < b

True

In [139]:
a < c

True

## Mapping Keys to Multiple Values in a Dictionary
Make a dictionary that maps keys to more than one value (multidict)

In [152]:
d = {
    'a': [1, 2, 3],
    'b': [4, 5]
}

e = {
    'a': {1, 2, 3},
    'b': {4, 5}
}

In [154]:
for key, value in d.items():
    print(key, value)

a [1, 2, 3]
b [4, 5]


We use a **list** if we want to preserve the insertion order of the items.  
We use a **set** if we want to eliminate duplicates (and don't care about the order).

To easily construct such dictionaries, we use the `defaultdict` class in the `collections` module.

In [141]:
from collections import defaultdict

In [147]:
d = defaultdict(list)
d['a'].append(1)
d['a'].append(2)
d['b'].append(4)
d

defaultdict(list, {'a': [1, 2], 'b': [4]})

In [143]:
d = defaultdict(set)
d['a'].add(1)
d['a'].add(2)
d['b'].add(3)

In [144]:
d

defaultdict(set, {'a': {1, 2}, 'b': {3}})

In [151]:
# without defaultdict, the first value for each key must be initialized by hand
d = {}
for key, value in {'x': 5}.items():
    if key not in d:
        d[key] = []
    d[key].append(value)

In [None]:
# using a defaultdict simply leads to much cleaner code.
pairs = [('a', 1), ('a', 2), ('b', 4)]  # sample (key, value) pairs for illustration
d = defaultdict(list)
for key, value in pairs:
    d[key].append(value)

## Keeping Dictionaries in Order
Create a dictionary and control the order of its items.

To control the order of items in a dictionary, you can use an `OrderedDict` from the `collections` module. It exactly preserves the original insertion order of the data when iterating.

In [155]:
from collections import OrderedDict

In [156]:
d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3
d['grok'] = 4

In [157]:
for key in d:
    print(key, d[key])

foo 1
bar 2
spam 3
grok 4


An **OrderedDict** is useful when you want to build a mapping that you may later serialize or encode into a different format. For example, if you want to precisely control the order of fields appearing in a JSON encoding, first building the data in an OrderedDict will do the trick.

In [158]:
import json
json.dumps(d)


'{"foo": 1, "bar": 2, "spam": 3, "grok": 4}'

An OrderedDict internally maintains a **doubly linked list** that orders the keys according to the insertion order. When a new item is first inserted, it is placed at the end of this list. Subsequent reassignment of an existing key doesn't change the order.
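
A short sketch of that behavior: reassigning an existing key leaves its position alone, while `move_to_end()` is the explicit way to reposition a key. The keys and values are the same illustrative ones used earlier in this section.

```python
from collections import OrderedDict

d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3

# Reassigning an existing key keeps its original position.
d['foo'] = 100
order_after_reassign = list(d)
print(order_after_reassign)  # ['foo', 'bar', 'spam']

# move_to_end() explicitly moves a key to the end of the order.
d.move_to_end('foo')
order_after_move = list(d)
print(order_after_move)  # ['bar', 'spam', 'foo']
```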

An OrderedDict is roughly twice as large as a normal dictionary because of the extra linked list. Thus, if you are going to build a data structure involving a large number of OrderedDict instances, consider whether the ordering is worth the extra memory.

## Calculating with dictionaries
Perform various calculations (minimum, maximum, sorting) on dictionary data.

In [159]:
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75,
}

In [160]:
prices

{'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.2, 'FB': 10.75}

To perform calculations on the dictionary contents, it is useful to invert the keys and values of the dictionary using `zip()`.

In [162]:
min_price = min(zip(prices.values(), prices.keys()))
min_price

(10.75, 'FB')

In [168]:
zip(prices.values()).__next__()

(45.23,)

In [171]:
max_price = max(zip(prices.values(), prices.keys()))
max_price

(612.78, 'AAPL')

To rank the data, use `zip()` with `sorted()`.

In [173]:
prices_sorted = sorted(zip(prices.values(), prices.keys()))
prices_sorted

[(10.75, 'FB'),
 (37.2, 'HPQ'),
 (45.23, 'ACME'),
 (205.55, 'IBM'),
 (612.78, 'AAPL')]

In [178]:
prices.values()

dict_values([45.23, 612.78, 205.55, 37.2, 10.75])

In [180]:
tuple(zip(prices.values(), prices.keys()))

((45.23, 'ACME'),
 (612.78, 'AAPL'),
 (205.55, 'IBM'),
 (37.2, 'HPQ'),
 (10.75, 'FB'))

`zip()` creates an iterator that can only be consumed once. 

In [181]:
prices_and_names = zip(prices.values(), prices.keys())
print(min(prices_and_names))
print(max(prices_and_names))

(10.75, 'FB')


ValueError: max() arg is an empty sequence

In [182]:
min(prices)

'AAPL'

In [183]:
min(prices.values())

10.75

In [184]:
min(prices, key=lambda k: prices[k])

'FB'

In [185]:
max(prices, key=lambda k: prices[k])

'AAPL'

In [187]:
min_value = prices[min(prices, key=lambda k: prices[k])]
min_value

10.75

In the above case we need to perform an extra lookup step. To avoid it, use `zip()` to invert the dictionary into a sequence of (value, key) pairs. When performing comparisons on such tuples, the value element is compared first, followed by the key. This gives you exactly the behavior that you want and allows reductions and sorting to be easily performed on the contents in a single statement.

When multiple entries have the same value, the key is used to determine the result:

In [188]:
prices = {'AAA': 45.23, 'ZZZ': 45.23}

In [190]:
min(zip(prices.values(), prices.keys()))

(45.23, 'AAA')

In [191]:
max(zip(prices.values(), prices.keys()))

(45.23, 'ZZZ')

## Finding Commonalities in Two Dictionaries
You have two dictionaries and want to find out what they might have in common.

In [193]:
a = {
    'x': 1,
    'y': 2,
    'z': 3
}

b = {
'w' : 10,
'x' : 11,
'y' : 2
}

In [194]:
# to find out what two dictionaries have in common, simply perform common set
# operations using the keys() or items() methods

In [195]:
# Find keys in common
a.keys() & b.keys()

{'x', 'y'}

In [196]:
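# Note: `and` is a boolean operator, not set intersection. Since a.items()
# is non-empty (truthy), the expression simply evaluates to b.items()
a.items() and b.items()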

dict_items([('w', 10), ('x', 11), ('y', 2)])

In [197]:
# Find keys in a that are not in b
a.keys() - b.keys()

{'z'}

In [198]:
# Find (key, value) pairs in common
a.items() & b.items()

{('y', 2)}

In [199]:
# make a new dictionary with certain keys removed
c = {key:a[key] for key in a.keys() - {'z', 'w'}}
c

{'x': 1, 'y': 2}

In [206]:
type(a.keys())

dict_keys

## Removing Duplicates from a Sequence While Maintaining Order
Eliminate the duplicate values in a sequence, but preserve the order of the remaining items.

In [219]:
def dedupe(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)
            print(seen)

In [220]:
a = [1, 5, 2, 1, 9, 1, 5, 10]
list(dedupe(a))

{1}
{1, 5}
{1, 2, 5}
{1, 2, 5, 9}
{1, 2, 5, 9, 10}


[1, 5, 2, 9, 10]

The above method works only if the items in the sequence are hashable. If you are trying to eliminate duplicates in a sequence of unhashable types (such as dicts), you can make a slight change to it as follows:

In [215]:
def dedupe1(items, key=None):
    seen = set()
    for item in items:
        # key converts each item into a hashable value used to detect duplicates
        val = item if key is None else key(item)
        print(val)
        if val not in seen:
            yield item
            seen.add(val)

In [224]:
a = [{'x':1, 'y':2}, {'x':1, 'y':3}, {'x':1, 'y':2}, {'x':2, 'y':4}]

In [217]:
list(dedupe1(a, key=lambda d: (d['x'], d['y'])))

(1, 2)
(1, 3)
(1, 2)
(2, 4)


[{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

In [218]:
for i in a:
    print(i)

{'x': 1, 'y': 2}
{'x': 1, 'y': 3}
{'x': 1, 'y': 2}
{'x': 2, 'y': 4}


In [225]:
# remove duplicates based on the value of a single field or attribute or a larger data structure
list(dedupe1(a, key=lambda d: d['x']))

1
1
1
2


[{'x': 1, 'y': 2}, {'x': 2, 'y': 4}]

In [223]:
a

[1, 5, 2, 1, 9, 1, 5, 10]

## Iterators

In [234]:
# define a list
my_list = [4, 5, 0, 2]

In [235]:
#get an iterator using iter()
my_iter = iter(my_list)

In [236]:
# iterate through it using next()
next(my_iter)

4

In [240]:
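# this call was made after the remaining items had already been consumed,
# so it raises StopIteration
my_iter.__next__()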

StopIteration: 

In [241]:
# for loop implementation
iter_obj = iter(my_list)

In [242]:
# infinite loop
while True:
    try:
        # get the next item
        element = next(iter_obj)
        print(element)
        # do something with element
    except StopIteration:
        break
        

4
5
0
2


## Building Custom Iterators

Building an iterator from scratch is easy in Python. We have to implement the `__iter__()` and `__next__()` methods.

The `__iter__()` method returns the iterator object itself. If required, some initialization can be performed.

The `__next__()` method must return the next item in the sequence. On reaching the end, and in subsequent calls, it must raise `StopIteration`.


In [275]:
class PowTwo:
    """Class to implement an iterator of powers of two"""
    
    def __init__(self, max=0):
        self.max = max
    
    def __iter__(self):
        self.n = 0
        print("inside the iterator")
        return self
    def __next__(self):
        if self.n <= self.max:
            result = 2 ** self.n
            self.n += 1
            return result
        else:
            raise StopIteration

In [278]:
# create an object
numbers = PowTwo(3)

In [279]:
# create an iterator from the object
i = iter(numbers)

inside the iterator


In [280]:
# Using next to get to the next iterator element
print(next(i))
print(next(i))
print(next(i))
print(next(i))
print(next(i))

1
2
4
8


StopIteration: 

In [270]:
# Use a for loop to iterate over our iterator class
for i in PowTwo(5):
    print(i)

1
2
4
8
16
32


## Python Infinite Iterators
An iterator does not have to be exhaustible. There can be infinite iterators, which never end. We must be careful when handling such iterators.

A simple example demonstrates an infinite iterator.

The built-in function `iter()` can be called with two arguments, where the first argument must be a callable object (function) and the second is the sentinel. The iterator calls this function until the returned value is equal to the sentinel.

In [271]:
int()

0

In [272]:
inf = iter(int, 1)

In [273]:
next(inf)

0

In [274]:
next(inf)

0

The `int()` function always returns 0. So `iter(int, 1)` returns an iterator that calls `int()` until the returned value equals 1. This never happens, so we get an infinite iterator.
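
The two-argument form of `iter()` is also handy when the sentinel is eventually reached. The sketch below reads fixed-size chunks from a stream until `read()` returns an empty string. The in-memory `io.StringIO` object and the chunk size of 4 are stand-ins chosen for this example.

```python
import io
from functools import partial

stream = io.StringIO("abcdefghij")  # stand-in for a real file object

# stream.read(4) returns '' at end of input, so '' serves as the sentinel.
chunks = list(iter(partial(stream.read, 4), ''))
print(chunks)  # ['abcd', 'efgh', 'ij']
```

Here `partial(stream.read, 4)` supplies the zero-argument callable that `iter()` requires in this form.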

We can also build our own infinite iterator. The following iterator will, theoretically, return all the odd numbers.

In [298]:
class Infilter:
    """Infinite iterator to return all
    odd numbers"""
    
    def __iter__(self):
        self.num = 1
        print(self.__repr__())
        return self
    
    def __next__(self):
        num = self.num
        self.num += 2
        return num

In [299]:
a = iter(Infilter())

<__main__.Infilter object at 0x00000293C7F63070>


In [291]:
next(a)

1

In [292]:
next(a)

3

In [293]:
next(a)

5

In [294]:
next(a)

7

In [295]:
next(a)

9

The advantage of using iterators is that they save resources. As shown above, we could get all the odd numbers without storing the entire number system in memory. We can have infinite items (theoretically) in finite memory.
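
One way to consume an infinite iterator safely is `itertools.islice`, which stops after a fixed number of items. The sketch below uses a self-contained class named `OddNumbers`, a hypothetical stand-in that behaves like the odd-number iterator above.

```python
from itertools import islice

class OddNumbers:
    """Infinite iterator over odd numbers (hypothetical name for this sketch)."""

    def __iter__(self):
        self.num = 1
        return self

    def __next__(self):
        num = self.num
        self.num += 2
        return num

# islice stops after five items, so the infinite iterator is never exhausted
# and no list of "all odd numbers" is ever built.
first_five = list(islice(OddNumbers(), 5))
print(first_five)  # [1, 3, 5, 7, 9]
```

Only the five requested values are ever produced; the iterator itself stores a single integer of state.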

## Returning `self` in Python

In [300]:
class Counter():
    def __init__(self, start=1):
        self.val = start
    def increment(self):
        self.val += 1
        return self
    def decrement(self):
        self.val -= 1
        return self

In [301]:
c = Counter()

In [302]:
c.val

1

In [303]:
c.increment().val

2

In [304]:
c.increment().increment().decrement()

<__main__.Counter at 0x293c800f1c0>

In [305]:
a 

<__main__.Infilter at 0x293c7f63070>

In [306]:
test ={'x': 1, 'y': 2}

In [307]:
yada = test['x']
yada = 3
yada

3

In [308]:
test

{'x': 1, 'y': 2}