
# Data Structures

List Comprehension

In [None]:
symbols = '$¢£¥€¤'
codes = [ord(i) for i in symbols]
codes

Variables assigned within the expression are local, but variables in the surrounding scope can still be referenced. Even better, the local variables do not mask the variables from the surrounding scope.

In [None]:
x = 'ABC' # list comprehensions no longer leak their variables
dummy = [ord(x) for x in x]
print(x, dummy)

ABC [65, 66, 67]
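
The scoping rule above can be contrasted with a plain `for` loop, which does leave its loop variable bound afterwards. A minimal sketch; the names `codes` and `y` are only illustrative:

```python
x = 'outer'

# The comprehension's loop variable x is local to the comprehension.
codes = [ord(x) for x in 'ABC']
print(x)        # outer
print(codes)    # [65, 66, 67]

# A plain for loop, by contrast, leaves its loop variable bound afterwards.
for y in 'ABC':
    pass
print(y)        # C
```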


A list comprehension does everything a map and filter combination does, is usually easier to read, and is typically at least as fast

In [None]:
symbols = '$¢£¥€¤'
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
beyond_ascii

[162, 163, 165, 8364, 164]

In [None]:
beyond_ascii = list(filter(lambda c: c > 127, map(ord, symbols))) # same result with filter + map, but harder to read
beyond_ascii

[162, 163, 165, 8364, 164]
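
One way to check the speed comparison yourself is `timeit`. This is only a rough sketch: the helper names `with_listcomp` and `with_map_filter` are made up for the example, and the timings depend on the machine and Python version, so no numbers are claimed here.

```python
import timeit

symbols = '$¢£¥€¤'

def with_listcomp():
    return [ord(s) for s in symbols if ord(s) > 127]

def with_map_filter():
    return list(filter(lambda c: c > 127, map(ord, symbols)))

# Both approaches must produce the same list.
assert with_listcomp() == with_map_filter()

# Time 100,000 calls of each; compare the two figures on your own machine.
listcomp_time = timeit.timeit(with_listcomp, number=100_000)
map_filter_time = timeit.timeit(with_map_filter, number=100_000)
print(f'listcomp:     {listcomp_time:.3f} s')
print(f'filter + map: {map_filter_time:.3f} s')
```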

Cartesian product using a list comprehension

In [None]:
colors = ['black', 'sizes']
sizes = ['S', 'M', 'L']
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 'S'),
 ('black', 'M'),
 ('black', 'L'),
 ('sizes', 'S'),
 ('sizes', 'M'),
 ('sizes', 'L')]

Generator expressions

1.   Saves memory when building other sequence types, because no intermediate list is created
2.   Enclosed in parentheses rather than brackets



In [None]:
symbols = '$¢£¥€¤'
tuple(ord(s) for s in symbols)

(36, 162, 163, 165, 8364, 164)

In [None]:
import array
a = array.array('I', (ord(s) for s in symbols))

*A list comprehension builds the whole list before the for loop starts, whereas a generator expression produces one item at a time on demand, which saves memory and the time spent building the list.*

In [None]:
for tshirt in ('%s %s' % (color, size)  # format each pair as a string
               # The generator expression yields items one by one; a list with
               # all six T-shirt variations is never produced in this example.
               for color in colors for size in sizes):
  print(tshirt)

black S
black M
black L
sizes S
sizes M
sizes L
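
The memory difference can be made visible with `sys.getsizeof`. A minimal sketch, assuming an arbitrary size `n`; note that `getsizeof` measures only the container object, not the items it refers to, which is still enough to show that the list grows with `n` while the generator does not.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # all items exist in memory at once
squares_gen = (i * i for i in range(n))    # items are produced on demand

# Size of the container objects themselves, in bytes (platform dependent).
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# Both feed the same values to a consumer such as sum().
assert sum(squares_list) == sum(squares_gen)
```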


Tuples are not just immutable lists

In [None]:
#tuples as records
lax_coordinates = (33, 97)
city, year, pop, chg, area = ('Tokyo', 2003, 97, 0.66, 8017)
traveler_ids = [('USA', 9987), ('BRA', 9982), ('ESP', 3325)]
for passport in sorted(traveler_ids):
  print('%s/%s' % passport)

BRA/9982
ESP/3325
USA/9987


In [None]:
for country, _ in traveler_ids:
  print(country)

USA
BRA
ESP


Tuple unpacking

In [None]:
lax_coordinates = (33, 97)
latitude, longitude = lax_coordinates #unpacking
print(latitude, longitude)

33 97


In [None]:
#swapping
latitude, longitude = longitude, latitude
print(latitude, longitude)

97 33


In [None]:
# prefixing an argument with a star unpacks it
print(divmod(20, 8))
t = (20, 8)
print(divmod(*t))
q, r = divmod(*t)
print(q, r)

(2, 4)
(2, 4)
2 4


In [None]:
import os
_, filename = os.path.split('/home/go/idrsa.pub') #splits last part
_, filename

('/home/go', 'idrsa.pub')

Using * to grab excess items

In [None]:
a, b, *c = range(5)
print(a, b, c)
*a, b, c = range(2) # * can only be applied to a single variable, but at any position
print(a, b, c)

0 1 [2, 3, 4]
[] 0 1
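
The starred target can also sit in the middle, and a second starred target in the same assignment is rejected. A short sketch; `compile` is used here only so that the `SyntaxError` can be caught at run time, and the exact error message varies between Python versions.

```python
first, *middle, last = range(5)
print(first, middle, last)      # 0 [1, 2, 3] 4

head, *tail = 'abc'
print(head, tail)               # a ['b', 'c']

# Two starred targets in one assignment are rejected at compile time.
two_stars_error = None
try:
    compile('*a, *b = range(5)', '<demo>', 'exec')
except SyntaxError as exc:
    two_stars_error = exc.msg
    print('SyntaxError:', two_stars_error)
```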


Nested tuple unpacking

In [None]:
metro_areas = [
    ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),
    ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
    ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
    ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
    ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

print('{:15} | {:^9} | {:^9}'.format('','lat.','long.'))
fmt='{:15} | {:9.4f} | {:9.4f}'
for name, cc, pop, (lat, lon) in metro_areas:
  if lon <= 0:
    print(fmt.format(name, lat, lon))

                |   lat.    |   long.  
Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
Sao Paulo       |  -23.5478 |  -46.6358


Named Tuple

In [None]:
from collections import namedtuple
City = namedtuple('City', 'name country population coordinates')
tokyo = City('Tokyo', 'JP', 36.933, (35, 18))
print(tokyo, tokyo.population, tokyo.coordinates, tokyo[1]) #access the elements by field names or position

City(name='Tokyo', country='JP', population=36.933, coordinates=(35, 18)) 36.933 (35, 18) JP


Three useful namedtuple attributes and methods

1.   _fields
2.   _make(iterable)
3.   _asdict()



In [None]:
City._fields

('name', 'country', 'population', 'coordinates')

In [None]:
Latlong = namedtuple('Latlong', 'lat long')
delhi_data = ('Delhi NCR', 'IN', 21.935, Latlong(28, 77))
delhi = City._make(delhi_data)
delhi._asdict()

OrderedDict([('name', 'Delhi NCR'),
             ('country', 'IN'),
             ('population', 21.935),
             ('coordinates', Latlong(lat=28, long=77))])

In [None]:
for key, value in delhi._asdict().items():
  print(key + ':', value)

name: Delhi NCR
country: IN
population: 21.935
coordinates: Latlong(lat=28, long=77)


slicing

In [None]:
s = 'bicycle'
print(s[::3], s[::-1]) #s[a:b:c] , a = start, b = stop, c = stride

bye elcycib
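
The `s[a:b:c]` notation is shorthand for indexing with a `slice(a, b, c)` object. The sketch below shows that equivalence and uses `slice.indices` to reveal the concrete start, stop and stride that the omitted values resolve to; the variable names are illustrative.

```python
s = 'bicycle'

every_third = slice(None, None, 3)    # same as s[::3]
backwards = slice(None, None, -1)     # same as s[::-1]
assert s[::3] == s[every_third] == 'bye'
assert s[::-1] == s[backwards] == 'elcycib'

# indices(len) fills in the omitted start/stop for a sequence of that length.
print(every_third.indices(len(s)))    # (0, 7, 3)
print(backwards.indices(len(s)))      # (6, -1, -1)
```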


In [None]:
invoice="""
... 0.....6.................................40........52...55........... 
1909  Pimoroni PiBrella                     $17.50    3    $52.50... 
1489  6mm Tactile Switch x20                 $4.95    2     $9.90... 
1510  Panavise Jr. - PV-201                 $28.00    1    $28.00... 
1601  PiTFT Mini Kit 320x240                $34.95    1    $34.95... 
"""


SKU = slice(0, 6)
DESCRIPTION = slice(6, 40)
UNIT_PRICE = slice(40, 52)
QUANTITY = slice(52, 55)
ITEM_TOTAL = slice(55, None)
line_items = invoice.split('\n')[2:]

for item in line_items:
  print(item[UNIT_PRICE], item[DESCRIPTION])

    $17.50   Pimoroni PiBrella                 
     $4.95   6mm Tactile Switch x20            
    $28.00   Panavise Jr. - PV-201             
    $34.95   PiTFT Mini Kit 320x240            
 


assigning to slices

In [None]:
l = list(range(10))
print(l)
l[2:5] = [20, 30]
print(l)
del l[5:7]
print(l)
l[3::2] = [11, 22]
print(l)
l[2:5] = [100]
print(l)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 20, 30, 5, 6, 7, 8, 9]
[0, 1, 20, 30, 5, 8, 9]
[0, 1, 20, 11, 5, 22, 9]
[0, 1, 100, 22, 9]


using + and * with sequences

In [None]:
l = [1, 2, 3]
print(l * 5, l + l)

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3] [1, 2, 3, 1, 2, 3]


Building lists of lists

In [None]:
board = [['_'] * 3 for i in range(3)] #correct way
print(board)
board[1][2] = 'X'
print(board)

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]
[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]


In [None]:
weird_board = [['_'] * 3] * 3 #incorrect way
print(weird_board)
weird_board[1][2] = 'X'
print(weird_board)

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]
[['_', '_', 'X'], ['_', '_', 'X'], ['_', '_', 'X']]
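
The reason for the difference is that `[['_'] * 3] * 3` repeats a reference to one and the same inner list. A small check with `is` and `id` makes this visible:

```python
board = [['_'] * 3 for _ in range(3)]   # three distinct inner lists
weird_board = [['_'] * 3] * 3           # one inner list referenced three times

print(board[0] is board[1])                     # False
print(weird_board[0] is weird_board[1])         # True
print(len({id(row) for row in board}))          # 3
print(len({id(row) for row in weird_board}))    # 1
```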


Assigning augmented operators

In [None]:
# Mutable sequence types implement __iadd__ and __imul__, so += and *= change the object in place (like a.extend())

l = [1, 2, 3]
print(l, id(l))
l *= 2
print(l, id(l)) #same object as above 

# Immutable sequence types do not implement __iadd__ or __imul__, so a += b falls back to a = a + b and creates a new object.
# str is a special case: CPython optimizes += on str so the whole string is not copied every time.

t = (1, 2, 3)
print(t, id(t))
t *= 2
print(t, id(t)) #new object

[1, 2, 3] 139957158082888
[1, 2, 3, 1, 2, 3] 139957158082888
(1, 2, 3) 139957158615224
(1, 2, 3, 1, 2, 3) 139957159222152


In [None]:
t = (1, 2, [30, 40])
print(id(t))
t[2] += [50, 60]

139957158616952


TypeError: 'tuple' object does not support item assignment

In [None]:
print(t, id(t)) #same object, element inside tuple changed after throwing an error

(1, 2, [30, 40, 50, 60]) 139957158616952


Remember


1.   Putting mutable items in tuples is not a good idea.
2.   Augmented assignment is not an atomic operation; we just saw it throwing an exception after doing part of its job.
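
The two points above can be reproduced in a single cell by catching the exception. The sketch also shows one way to avoid the problem, calling `extend` on the list, which never assigns to the tuple:

```python
t = (1, 2, [30, 40])
try:
    t[2] += [50, 60]          # extends the list, then fails to store it back
except TypeError as exc:
    print('TypeError:', exc)  # 'tuple' object does not support item assignment
print(t)                      # (1, 2, [30, 40, 50, 60])

# Mutating the list through a method never assigns to the tuple, so no error.
t2 = (1, 2, [30, 40])
t2[2].extend([50, 60])
print(t2)                     # (1, 2, [30, 40, 50, 60])
```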



list.sort and sorted

In [None]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
print(sorted(fruits))
print(fruits)
print(sorted(fruits, reverse = True))
print(sorted(fruits, key = len))
print(fruits)
print(fruits.sort())
print(fruits)

['apple', 'banana', 'grape', 'raspberry']
['grape', 'raspberry', 'apple', 'banana']
['raspberry', 'grape', 'banana', 'apple']
['grape', 'apple', 'banana', 'raspberry']
['grape', 'raspberry', 'apple', 'banana']
None
['apple', 'banana', 'grape', 'raspberry']
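
A few more details worth trying: `key` and `reverse` can be combined, the sort is stable (items with equal keys keep their original relative order), and `sorted` accepts any iterable. A short sketch with illustrative variable names:

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

# key and reverse combine; 'grape' stays ahead of 'apple' (same length, stable sort).
by_length_desc = sorted(fruits, key=len, reverse=True)
print(by_length_desc)   # ['raspberry', 'banana', 'grape', 'apple']

# sorted() accepts any iterable and always returns a new list.
print(sorted('bca'))    # ['a', 'b', 'c']

# list.sort() sorts in place and returns None, so do not reassign its result.
result = fruits.sort()
print(result, fruits)   # None ['apple', 'banana', 'grape', 'raspberry']
```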


Managing ordered sequences with bisect

In [None]:
import bisect
import sys

HAYSTACK=[1,4,5,6,8,12,15,20,21,23,23,26,29,30]
NEEDLES=[0,1,2,5,8,10,22,23,29,30,31]
ROW_FMT='{0:2d} @ {1:2d}    {2}{0:<2d}'

def demo(bisect_fn):
  for needle in reversed(NEEDLES):
    position = bisect_fn(HAYSTACK, needle)
    offset = position * '  |'
    print(ROW_FMT.format(needle, position, offset))

bisect_fn = bisect.bisect_left

print('DEMO:',bisect_fn.__name__)
print('haystack ->',' '.join('%2d' % n for n in HAYSTACK))
demo(bisect_fn)

DEMO: bisect_left
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 12      |  |  |  |  |  |  |  |  |  |  |  |29
23 @  9      |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  4      |  |  |  |8 
 5 @  2      |  |5 
 2 @  1      |2 
 1 @  0    1 
 0 @  0    0 


In [None]:
def grade(score, breakpoints = [60, 70, 80, 90], grades = 'FDCBA'):
  i = bisect.bisect(breakpoints, score)
  return grades[i]

[grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]

['F', 'A', 'C', 'C', 'B', 'A', 'A']
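
`bisect.bisect` behaves like `bisect.bisect_right`, which is why a score of exactly 70 lands in the 'C' band. The sketch below compares it with `bisect_left` on a score that equals a breakpoint:

```python
import bisect

breakpoints = [60, 70, 80, 90]
grades = 'FDCBA'

right = bisect.bisect_right(breakpoints, 70)  # insertion point after the existing 70
left = bisect.bisect_left(breakpoints, 70)    # insertion point before the existing 70
print(right, grades[right])   # 2 C
print(left, grades[left])     # 1 D

# bisect.bisect gives the same position as bisect_right.
print(bisect.bisect(breakpoints, 70) == right)   # True
```

With `bisect_left`, a score on the boundary would fall into the lower grade, so the choice of function decides which side of each breakpoint is inclusive.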

Inserting with bisect.insort

In [None]:
import random
SIZE = 7
random.seed(178)
my_list = []
for i in range(SIZE):
  new_item = random.randrange(SIZE * 2)
  bisect.insort(my_list, new_item)
  print('%2d ->' % new_item, my_list)

 1 -> [1]
 8 -> [1, 8]
 7 -> [1, 7, 8]
10 -> [1, 7, 8, 10]
 3 -> [1, 3, 7, 8, 10]
12 -> [1, 3, 7, 8, 10, 12]
 5 -> [1, 3, 5, 7, 8, 10, 12]


When a list is not the answer

Array

In [15]:
from array import array
from random import random

floats = array('d', (random() for i in range(10 ** 7)))
print(floats[-1])

0.4464896172235179


In [16]:
with open('floats.bin', 'wb') as fp:  # the file is closed automatically
  floats.tofile(fp)

In [19]:
floats2 = array('d')
with open('floats.bin', 'rb') as fp:  # the file is closed automatically
  floats2.fromfile(fp, 10 ** 7)
floats2[-1]
floats == floats2

True

For sorting arrays

In [28]:
a = array('d', [9, 8, 7, 6, 5, 4, 3, 2, 1])
a = array(a.typecode, sorted(a))
print(a)

array('d', [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
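
`array` has no `sort` method, which is why the array is rebuilt from `sorted(a)` above. Once it is sorted, `bisect.insort` can keep it sorted while adding items. A minimal sketch with made-up values:

```python
import bisect
from array import array

a = array('d', [9, 8, 7])
a = array(a.typecode, sorted(a))   # rebuild the array from the sorted items
bisect.insort(a, 7.5)              # insert while keeping the order
print(a)                           # array('d', [7.0, 7.5, 8.0, 9.0])
```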


Memory View Class

In [29]:
numbers = array('h', [-2, -1, 0, 1, 2])
memv = memoryview(numbers)
print(len(memv))

5


In [30]:
memv[0]

-2

In [31]:
memv_oct = memv.cast('B')
memv_oct.tolist()

[254, 255, 255, 255, 0, 0, 1, 0, 2, 0]

In [32]:
memv_oct[5] = 4
numbers

array('h', [-2, -1, 1024, 1, 2])
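
Why 1024? In the byte list shown earlier, each `'h'` item occupies two bytes with the low byte first (a little-endian machine), so byte 5 is the high byte of the third item and 4 * 256 = 1024. The sketch below rebuilds that item from its raw bytes; it uses `sys.byteorder` and `numbers.itemsize` so the comparison holds on other platforms too, where the resulting value may differ from 1024.

```python
import sys
from array import array

numbers = array('h', [-2, -1, 0, 1, 2])
memv_oct = memoryview(numbers).cast('B')   # view the same memory as unsigned bytes

size = numbers.itemsize                    # bytes per 'h' item (2 on common platforms)
memv_oct[5] = 4                            # overwrite one byte of the buffer

# Re-assemble the third item (index 2) from its raw bytes.
raw = bytes(memv_oct[2 * size:3 * size])
decoded = int.from_bytes(raw, sys.byteorder, signed=True)
print(raw, decoded, numbers[2])
```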

Deque and other queues

In [33]:
from collections import deque
dq = deque(range(10), maxlen = 10)
print(dq)
dq.rotate(3)
print(dq)
dq.rotate(-4)
print(dq)
dq.appendleft(-1)
print(dq)
dq.extend([11, 12, 13])
print(dq)
dq.extendleft([10, 20, 30, 40])
print(dq)

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)
deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6], maxlen=10)
deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 0], maxlen=10)
deque([-1, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)
deque([3, 4, 5, 6, 7, 8, 9, 11, 12, 13], maxlen=10)
deque([40, 30, 20, 10, 3, 4, 5, 6, 7, 8], maxlen=10)
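
A common use of `maxlen` is keeping only the last few items seen: when a bounded deque is full, adding at one end discards items from the other end. A small sketch; the name `recent` is illustrative:

```python
from collections import deque

recent = deque(maxlen=3)
for n in range(6):
    recent.append(n)       # once full, each append drops the leftmost item
print(recent)              # deque([3, 4, 5], maxlen=3)

recent.appendleft(99)      # appending on the left drops the rightmost item
print(recent)              # deque([99, 3, 4], maxlen=3)
```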


In [None]:
#explore heapq 