### Big idea:

Adding type hints to code helps clarify your thoughts, improves documentation, and may allow a static analysis tool to detect some kinds of errors.

#### Tools:

- mypy (can be a plugin in PyCharm): a static type checker that detects errors in type hints
- pyflakes: scans the code and reports anything it thinks is an anomaly
- hypothesis: generates test cases from properties you describe, to detect errors in your code (unlike doctest, which checks fixed examples)
- unittest -> nose/ py.test

#### some messy notes:

- Sequence: iterable and indexable (e.g. tuple, list, string) -> supports len(x), x[1]
- In type hints, a tuple lists the type of every field (e.g. Tuple[str, int]); a list is homogeneous, with one element type (e.g. List[int])
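A minimal sketch of how these two notes look as type hints. The function and variable names (second_item, point, scores) are hypothetical, made up only for illustration.

```python
from typing import List, Sequence, Tuple


def second_item(xs: Sequence[str]) -> str:
    """Works for any sequence: anything that supports len(x) and x[i]."""
    if len(xs) < 2:
        raise ValueError("need at least two items")
    return xs[1]


# A tuple hint lists the type of every field, in order.
point: Tuple[str, int, float] = ("origin", 0, 0.0)

# A list hint names one element type shared by all items.
scores: List[int] = [90, 85, 77]

print(second_item(["a", "b", "c"]))  # a list is a sequence
print(second_item(("x", "y")))       # so is a tuple
print(second_item("hello"))          # and so is a string
```

A checker such as mypy should complain about a call like second_item(42), since an int is not a Sequence[str].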

- Deque (double-ended queue) lives in the "collections" module. Prefer a deque over a list when you need fast appends and pops at both ends of the container: a deque does both in O(1) time, whereas a list is O(1) only at the right end and O(n) at the left end (e.g. insert(0, x) or pop(0)).
- namedtuple
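A short sketch of both containers, using only the standard collections module. The Point type and the sample values are hypothetical.

```python
from collections import deque, namedtuple

# deque: cheap appends and pops at both ends.
queue = deque(["b", "c"])
queue.append("d")        # add on the right
queue.appendleft("a")    # add on the left
first = queue.popleft()  # remove from the left
last = queue.pop()       # remove from the right
print(first, last, queue)

# namedtuple: a tuple whose fields can also be read by name.
Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(x=3, y=4)
print(p.x, p[1], p)
```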

- secrets: generates secure random numbers for managing secrets
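A minimal sketch of a few calls from the secrets module. The variable names and the sample colour list are made up for illustration.

```python
import secrets

token = secrets.token_hex(16)     # 16 random bytes, shown as 32 hex characters
pin = secrets.randbelow(10_000)   # random integer in the range [0, 10000)
colour = secrets.choice(["red", "green", "blue"])  # random element of a sequence

print(token, pin, colour)
```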

#### fsum( )
- fast, and more accurate than the regular sum( ) for floats: it tracks intermediate partial sums to avoid loss of precision
- important for summing up lots of data points

In [5]:
from math import fsum
print(fsum([0.1]*10) == 1.0)
print(sum([0.1]*10) == 1.0)

True
False
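Another sketch of the same effect, with values chosen so that the small term is easily lost to rounding. The list is made up for illustration. As far as I know, the built-in sum( ) became more accurate for floats in Python 3.12, so its result may match fsum( ) on newer versions.

```python
from math import fsum

values = [1e16, 1.0, -1e16]  # the exact sum is 1.0

exact = fsum(values)    # fsum keeps track of the lost low-order bits
builtin = sum(values)   # result may depend on the Python version

print(exact)
print(builtin)
```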


#### true division & floor division

In [6]:
print(38 / 5)
print(38 // 5)

7.6
7
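A few more cases, as a sketch of how the two operators behave with remainders, negative numbers, and floats:

```python
quotient, remainder = divmod(38, 5)  # same as (38 // 5, 38 % 5)
print(quotient, remainder)

whole = 40 / 5           # true division returns a float even for whole results
negative = -38 // 5      # floor division rounds down, toward negative infinity
float_floor = 38.0 // 5  # with a float operand the result is a float

print(whole, negative, float_floor)
```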


#### defaultdict is used for grouping
- defaultdict compared to a regular dict: you supply a factory function that produces the default value for missing keys

In [8]:
from collections import defaultdict
e = defaultdict(lambda: 'black')
e['apple'] = 'red'
e['banana']

'black'
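One detail worth knowing: looking up a missing key with square brackets stores the default in the dictionary, while get( ) does not call the factory. A small sketch (the 'cherry' key is made up):

```python
from collections import defaultdict

e = defaultdict(lambda: 'black')
e['apple'] = 'red'

before = 'banana' in e      # not stored yet
colour = e['banana']        # the factory runs and the key is stored
after = 'banana' in e       # now it is there
fallback = e.get('cherry')  # get() bypasses the factory and returns None

print(before, colour, after, fallback)
print(dict(e))
```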

- defaultdict + key functions to build containers
- defaultdict creates a new container for each key, to store elements with a common feature

In [12]:
d = defaultdict(set)
d['t'].add('tom')
d['m'].add('mary')
d['t'].add('tim')
d['t'].add('tom')
d['m'].add('martin')
d

defaultdict(set, {'t': {'tim', 'tom'}, 'm': {'martin', 'mary'}})

In [14]:
from pprint import pprint
d = defaultdict(list)
d['t'].append('tom')
d['m'].append('mary')
d['t'].append('tim')
d['t'].append('tom')
d['m'].append('martin')
d

defaultdict(list, {'t': ['tom', 'tim', 'tom'], 'm': ['mary', 'martin']})

- defaultdict for grouping

In [16]:
names = """ david betty susan mary darlene sandy davin shelly becky beatrice michael wallace""".split()
names

['david',
 'betty',
 'susan',
 'mary',
 'darlene',
 'sandy',
 'davin',
 'shelly',
 'becky',
 'beatrice',
 'michael',
 'wallace']

In [17]:
d = defaultdict(list)
for name in names:
    feature = name[0]      # change the feature to get a different grouping, e.g. len(name), name[-1]
    d[feature].append(name)
pprint(d)

defaultdict(<class 'list'>,
            {'b': ['betty', 'becky', 'beatrice'],
             'd': ['david', 'darlene', 'davin'],
             'm': ['mary', 'michael'],
             's': ['susan', 'sandy', 'shelly'],
             'w': ['wallace']})
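The same loop can be wrapped in a function that takes the feature as a key function. This is a sketch; group_by is a hypothetical helper name, not something from the standard library.

```python
from collections import defaultdict


def group_by(items, key):
    """Group items into lists, keyed by key(item). Hypothetical helper."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


names = ('david betty susan mary darlene sandy davin shelly '
         'becky beatrice michael wallace').split()

by_length = group_by(names, key=len)
by_last_letter = group_by(names, key=lambda name: name[-1])

print(by_length)
print(by_last_letter)
```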


#### key functions
- a key function takes one argument and transforms it into a key
- works with min( ), max( ), sorted( ), nsmallest( ), nlargest( ), groupby( ), and merge( )

In [19]:
pprint(sorted(names, key=len))

['mary',
 'david',
 'betty',
 'susan',
 'sandy',
 'davin',
 'becky',
 'shelly',
 'darlene',
 'michael',
 'wallace',
 'beatrice']
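The same key function plugs into the other tools on the list. A sketch using len as the key; nsmallest and nlargest come from heapq, and groupby from itertools.

```python
from heapq import nlargest, nsmallest
from itertools import groupby

names = ('david betty susan mary darlene sandy davin shelly '
         'becky beatrice michael wallace').split()

shortest = min(names, key=len)
longest = max(names, key=len)
two_shortest = nsmallest(2, names, key=len)
two_longest = nlargest(2, names, key=len)
print(shortest, longest, two_shortest, two_longest)

# groupby only groups adjacent items, so sort by the same key first.
by_len = {}
for length, group in groupby(sorted(names, key=len), key=len):
    by_len[length] = list(group)
    print(length, by_len[length])
```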


#### Transpose 2D data with zip(*data)
- zip brings multiple sequences together pair-wise.

In [20]:
list(zip('abcde', 'ghijklm'))   #the unpaired elements are left out 

[('a', 'g'), ('b', 'h'), ('c', 'i'), ('d', 'j'), ('e', 'k')]

In [21]:
from itertools import zip_longest
list(zip_longest('abcde', 'ghijklm', fillvalue='x')) 

[('a', 'g'),
 ('b', 'h'),
 ('c', 'i'),
 ('d', 'j'),
 ('e', 'k'),
 ('x', 'l'),
 ('x', 'm')]

- \* (star): unpacks one sequence into separate arguments
- zip(*data) is useful for transposing matrices or any other 2D data structure: it swaps rows and columns, so you can loop over the columns as if they were rows

In [22]:
m = [
  [10, 20],
  [30, 40],  
  [50, 60],
]  # 3 rows by 2 columns

In [26]:
# transforms to 2 rows by 3 columns
pprint(list(zip(*m)), width=15)
#same as
pprint(list(zip( [10, 20], [30, 40], [50, 60])), width=15)

[(10, 30, 50),
 (20, 40, 60)]
[(10, 30, 50),
 (20, 40, 60)]
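A sketch of what the transpose is handy for: looping over columns. The variable names are made up for illustration.

```python
m = [
    [10, 20],
    [30, 40],
    [50, 60],
]  # 3 rows by 2 columns

# Looping over zip(*m) visits the columns of m one at a time.
column_sums = [sum(column) for column in zip(*m)]
print(column_sums)

# zip yields tuples; convert each one to get a list of lists.
transposed = [list(column) for column in zip(*m)]
print(transposed)

# Transposing twice gives back the original rows.
restored = [list(row) for row in zip(*transposed)]
print(restored == m)
```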


#### Flattening 2D data with a nested list comprehension

In [27]:
[x for row in m for x in row]

[10, 20, 30, 40, 50, 60]
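The for-clauses in the comprehension read left to right, in the same order as the equivalent nested loops. A sketch comparing the two, plus itertools.chain.from_iterable as a standard-library alternative (not the technique in the heading):

```python
from itertools import chain

m = [[10, 20], [30, 40], [50, 60]]

# Equivalent nested loops: outer loop first, inner loop second.
flat_loop = []
for row in m:
    for x in row:
        flat_loop.append(x)

flat_comprehension = [x for row in m for x in row]
flat_chain = list(chain.from_iterable(m))  # alternative from itertools

print(flat_loop == flat_comprehension == flat_chain)
```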

#### Convert an iterator into a list with list(iter)
- Sometimes we need to turn an iterator into a list so that we can index it or loop over it multiple times

In [28]:
it = iter('abcd')
list(it)

['a', 'b', 'c', 'd']

In [30]:
('a b c d').split()

['a', 'b', 'c', 'd']
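A sketch of why the conversion matters: an iterator can only be consumed once, while the list can be indexed and reused.

```python
it = iter('abcd')

first_pass = list(it)   # consumes the iterator
second_pass = list(it)  # nothing is left the second time
print(first_pass, second_pass)

letters = list(iter('abcd'))  # materialize once, then reuse
third = letters[2]            # indexing works on the list
print(third, letters, letters)
```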