## Bite-Sized Python Recipes
_Disclaimer:_ This is a collection of small, useful functions I've found around the web, mainly on Stack Overflow or in the Python documentation. I intend to keep it up to date.

**Create a Dictionary From Two Lists:**

In [1]:
prod_id = [1, 2, 3]
prod_name = ['foo', 'bar', 'baz']
prod_dict = dict(zip(prod_id, prod_name))

prod_dict

{1: 'foo', 2: 'bar', 3: 'baz'}
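One caveat with this recipe: `zip` stops at the shorter input, so lists of unequal length silently lose entries. The sketch below uses made-up lists (`ids`, `names`) to show the truncation and one possible alternative, `itertools.zip_longest`, which pads instead.

```python
from itertools import zip_longest

ids = [1, 2, 3]
names = ['foo', 'bar']  # illustrative data: one name is missing

# zip() stops at the shorter input, so key 3 is silently dropped
truncated = dict(zip(ids, names))

# zip_longest() pads the shorter input instead (with None by default)
padded = dict(zip_longest(ids, names))

# On Python 3.10+, zip(ids, names, strict=True) raises ValueError
# on a length mismatch instead of truncating.
print(truncated)
print(padded)
```

Which behavior is right depends on whether a missing value is an error or an expected gap in your data.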

**Remove Duplicates From a List and Keep the Order:**

In [2]:
from collections import OrderedDict

nums = [1, 2, 4, 3, 0, 4, 1, 2, 5]
list(OrderedDict.fromkeys(nums))

# As of Python 3.6 (for the CPython implementation) and
# as of 3.7 (across all implementations) dictionaries remember
# the order of items inserted. So, a better one is:
list(dict.fromkeys(nums))

[1, 2, 4, 3, 0, 5]
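`dict.fromkeys` needs hashable items, so it fails on a list of lists, for example. A minimal sketch of one workaround, assuming a hashable stand-in can be derived for each item; `unique_by` and its `key` parameter are hypothetical names, not part of the standard library.

```python
def unique_by(iterable, key=None):
    """Yield items in first-seen order, skipping repeats (hypothetical helper)."""
    seen = set()
    for item in iterable:
        # key() maps an unhashable item to something hashable
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item

rows = [[1, 2], [3], [1, 2], [3], [4]]
deduped = list(unique_by(rows, key=tuple))  # lists are unhashable, tuples are not
print(deduped)
```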

**Create a Multi-Level Nested Dictionary:**

Create a dictionary whose values are themselves dictionaries, as many levels deep as needed.

In [3]:
from collections import defaultdict

def multi_level_dict():
    """ Constructor for creating multi-level nested dictionary. """

    return defaultdict(multi_level_dict)

_Example 1:_

In [4]:
d = multi_level_dict()
d['a']['a']['y'] = 2
d['b']['c']['a'] = 5
d['x']['a'] = 6

d

defaultdict(<function __main__.multi_level_dict()>,
            {'a': defaultdict(<function __main__.multi_level_dict()>,
                         {'a': defaultdict(<function __main__.multi_level_dict()>,
                                      {'y': 2})}),
             'b': defaultdict(<function __main__.multi_level_dict()>,
                         {'c': defaultdict(<function __main__.multi_level_dict()>,
                                      {'a': 5})}),
             'x': defaultdict(<function __main__.multi_level_dict()>,
                         {'a': 6})})
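The `defaultdict` repr above is noisy, and a nested `defaultdict` also creates a key as soon as you read one that does not exist. A minimal sketch of one way to get a plain `dict` snapshot; `to_plain_dict` is a hypothetical helper name, and `multi_level_dict` is repeated here only to keep the block self-contained.

```python
from collections import defaultdict

def multi_level_dict():
    return defaultdict(multi_level_dict)

def to_plain_dict(d):
    """Recursively convert nested defaultdicts to plain dicts (hypothetical helper)."""
    if isinstance(d, defaultdict):
        return {k: to_plain_dict(v) for k, v in d.items()}
    return d

d = multi_level_dict()
d['a']['a']['y'] = 2
d['x']['a'] = 6
plain = to_plain_dict(d)
print(plain)

# Caveat: merely reading a missing key on the defaultdict creates it
_ = d['typo']
print('typo' in d)
```

Converting once the structure is built gives a readable repr and restores the usual `KeyError` on missing keys.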

_Example 2:_

A list of products is given, where each product needs to be delivered from its origin to its distribution center (DC), and then to its destination. Given this list, create a dictionary of the products shipped through each DC, grouped by origin and then by destination.

In [5]:
import random
random.seed(20)

# Just creating arbitrary attributes for each Product instance
class Product:
    def __init__(self, id):
        self.id = id
        self.materials = random.sample('ABCD', 3)  # comprising materials
        self.origin = random.choice(('o1', 'o2'))
        self.destination = random.choice(('d1', 'd2', 'd3'))
        self.dc = random.choice(('dc1', 'dc2'))
        
    def __repr__(self):
        return f'P{self.id}'


products = [Product(i) for i in range(20)]

# create the multi-level dictionary
def get_dc_origin_destination_products_dict(products):
    dc_od_products_dict = multi_level_dict()
    for p in products:
        dc_od_products_dict[p.dc][p.origin].setdefault(p.destination, []).append(p)
    return dc_od_products_dict


dc_od_orders_dict = get_dc_origin_destination_products_dict(products)
dc_od_orders_dict

defaultdict(<function __main__.multi_level_dict()>,
            {'dc1': defaultdict(<function __main__.multi_level_dict()>,
                         {'o2': defaultdict(<function __main__.multi_level_dict()>,
                                      {'d3': [P0, P15],
                                       'd1': [P2, P9, P14, P18],
                                       'd2': [P3, P13]}),
                          'o1': defaultdict(<function __main__.multi_level_dict()>,
                                      {'d1': [P1, P16],
                                       'd3': [P4, P6, P7, P11],
                                       'd2': [P17, P19]})}),
             'dc2': defaultdict(<function __main__.multi_level_dict()>,
                         {'o1': defaultdict(<function __main__.multi_level_dict()>,
                                      {'d1': [P5, P12], 'd3': [P10]}),
                          'o2': defaultdict(<function __main__.multi_level_dict()>,
                                      {'d1': [P8]})})})

**Return the Keys and Values From the Innermost Layer of a Nested Dict:**

In [6]:
from collections import abc

def nested_dict_iter(nested):
    """ Return the keys and values from the innermost layer of a nested dict. """

    for key, value in nested.items():
        # Check if value is a dictionary. abc.Mapping is used for generality
        if isinstance(value, abc.Mapping):
            yield from nested_dict_iter(value)
        else:
            yield key, value

_Example 1:_

In [7]:
d = {'a':{'a':{'y':2}},'b':{'c':{'a':5}},'x':{'a':6}}
list(nested_dict_iter(d))

[('y', 2), ('a', 5), ('a', 6)]

_Example 2:_ let's retrieve keys and values from our `dc_od_orders_dict` above.

In [8]:
list(nested_dict_iter(dc_od_orders_dict))

[('d3', [P0, P15]),
 ('d1', [P2, P9, P14, P18]),
 ('d2', [P3, P13]),
 ('d1', [P1, P16]),
 ('d3', [P4, P6, P7, P11]),
 ('d2', [P17, P19]),
 ('d1', [P5, P12]),
 ('d3', [P10]),
 ('d1', [P8])]
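The innermost keys repeat in both examples above (`'a'` in Example 1, `'d1'` and `'d3'` in Example 2), so the output does not say which branch a value came from. A possible variant that yields the full key path instead; `nested_dict_paths` and its `prefix` parameter are hypothetical names.

```python
from collections import abc

def nested_dict_paths(nested, prefix=()):
    """Yield (key_path, value) for each innermost value (hypothetical variant)."""
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, abc.Mapping):
            yield from nested_dict_paths(value, path)
        else:
            yield path, value

d = {'a': {'a': {'y': 2}}, 'b': {'c': {'a': 5}}, 'x': {'a': 6}}
paths = list(nested_dict_paths(d))
print(paths)
```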

**The Intersection of Multiple Sets:**

In [9]:
def get_common_attr(attr, *args):
    """ Return the values of `attr` shared by all args (set.intersection requires set objects). """
    
    return set.intersection(*[set(getattr(a, attr)) for a in args])

_Example:_ Find the common comprising materials, if any, among our first 5 `products`.

In [10]:
get_common_attr('materials', *products[:5])

{'B'}
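Calling `set.intersection` with no arguments raises a `TypeError`, so the recipe above fails when no objects are passed. A minimal sketch of a guarded version that works on plain iterables; `common_items` is a hypothetical name, and returning an empty set for no input is an assumption about the behavior you want.

```python
def common_items(*iterables):
    """Intersection of any number of iterables; empty set when none are given."""
    sets = [set(items) for items in iterables]
    if not sets:
        # Assumed convention: nothing in common when there is nothing to compare
        return set()
    return set.intersection(*sets)

shared = common_items('ABC', 'BCD', 'CAB')
nothing = common_items()
print(shared, nothing)
```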

**First Match:**

Find the first element, if any, from an iterable that matches a condition.

In [11]:
def first_match(iterable, check_condition, default_value=None):
    """ check_condition is a function. """
    
    return next((i for i in iterable if check_condition(i)), default_value)

_Example:_

In [12]:
nums = [1, 2, 4, 0, 5]
f1 = first_match(nums, lambda x: x > 3)
f2 = first_match(nums, lambda x: x > 9)
f3 = first_match(nums, lambda x: x > 9, 'no_match')
f1, f2, f3

(4, None, 'no_match')
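Because `first_match` wraps a generator expression in `next`, it stops consuming the iterable at the first hit. The sketch below illustrates this with an infinite iterator; `first_match` is repeated only to keep the block self-contained.

```python
from itertools import count

def first_match(iterable, check_condition, default_value=None):
    return next((i for i in iterable if check_condition(i)), default_value)

# count(1) never ends; next() returns as soon as one item passes the check
first_square_over_50 = first_match(count(1), lambda n: n * n > 50)
print(first_square_over_50)
```

Note that with an infinite iterable and a condition that is never met, the call would not return, so the default value only helps for finite inputs.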

**Powerset:**

The powerset of a set S is the set of all the subsets of S.

In [13]:
import itertools as it

def powerset(iterable):
    s = list(iterable)
    return it.chain.from_iterable(it.combinations(s, r)
                                  for r in range(len(s) + 1))

_Example:_

In [14]:
list(powerset([1,2,3]))

[(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
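The recipe returns tuples, and a set of n elements has 2**n subsets. A small sketch, using an arbitrary four-item list, that converts the tuples to `frozenset` objects and checks the count; `powerset` is repeated to keep the block self-contained.

```python
import itertools as it

def powerset(iterable):
    s = list(iterable)
    return it.chain.from_iterable(it.combinations(s, r)
                                  for r in range(len(s) + 1))

items = ['a', 'b', 'c', 'd']
# frozenset is hashable, so the subsets can themselves live in a set
subsets = {frozenset(c) for c in powerset(items)}
print(len(subsets))
```

The exponential growth is worth keeping in mind: 20 items already give over a million subsets.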

**Timer Decorator:**

Prints the runtime of the decorated function or method each time it is called.

In [15]:
from time import time
from functools import wraps

def timeit(func):
    """
    :param func: Function to decorate
    :return: Wrapper that prints the execution time and returns the result of func
    """

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time()
        result = func(*args, **kwargs)
        end = time()
        print(f'{func.__name__} executed in {end - start:.4f} seconds')
        # In case you use logging module:
        # logging.info(f'{func.__name__} executed in {end - start:.4f} seconds')
        return result

    return wrapper

_Example:_

In [16]:
import random

# An arbitrary function
@timeit
def sort_rnd_num():
    numbers = [random.randint(100, 200) for _ in range(100000)]
    numbers.sort()
    return numbers
    
numbers = sort_rnd_num()

sort_rnd_num executed in 0.2194 seconds
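`time.time()` reports wall-clock time and can jump if the system clock is adjusted. A possible variant of the decorator, sketched here under the hypothetical name `timeit_precise`, uses `time.perf_counter()` and a `try`/`finally` so the timing is printed even when the function raises.

```python
from functools import wraps
from time import perf_counter

def timeit_precise(func):
    """Variant of the timer decorator based on perf_counter (hypothetical name)."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned normally or raised
            elapsed = perf_counter() - start
            print(f'{func.__name__} executed in {elapsed:.4f} seconds')
    return wrapper

@timeit_precise
def total(n):
    return sum(range(n))

result = total(1000)
```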


**Calculate the Total Number of Lines in a File:**

In [17]:
def file_len(file_name, encoding='utf8'):
    with open(file_name, encoding=encoding) as f:
        i = -1
        for i, line in enumerate(f):
            pass
    return i + 1

_Example:_ How many lines of code are there in the Python files of your current directory?

_Using `os` and `glob`:_

In [18]:
import os
import glob

path = os.path.abspath('')
files_list = glob.glob(path + '/*.ipynb')  # '/*.py' or '/*.ipynb' depending on what you have
print(sum(file_len(f) for f in files_list))

1011


_Using `pathlib`:_
Find out more about `pathlib` and its correspondence to `os` [here](https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module).

In [19]:
from pathlib import Path

p = Path()
path = p.resolve()  # similar to os.path.abspath()
print(sum(file_len(f) for f in path.glob('*.ipynb')))  # '*.py' or '*.ipynb' depending on what you have

1011
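The `file_len` function above can also be written with a generator expression, which avoids the `i = -1` sentinel. A minimal sketch under the hypothetical name `count_lines`, exercised on a temporary file so it does not depend on what is in your directory.

```python
import os
import tempfile

def count_lines(file_name, encoding='utf8'):
    """Count lines by iterating the file lazily (hypothetical alternative to file_len)."""
    with open(file_name, encoding=encoding) as f:
        return sum(1 for _ in f)

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'sample.txt')
    with open(sample, 'w', encoding='utf8') as f:
        f.write('one\ntwo\nthree')  # last line has no trailing newline
    n_lines = count_lines(sample)

print(n_lines)
```

Both versions read the file line by line, so memory use stays flat even for large files.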


**Just For Fun! Creating Long Hashtags:**

In [20]:
s = "#this is how I create very long hashtags"
"".join(s.title().split())

'#ThisIsHowICreateVeryLongHashtags'

### Some mistakes to avoid:

Be careful not to mix up mutable and immutable objects!

**Initialize a dictionary with empty lists as values:**

In [21]:
nums = [1, 2, 3, 4]
# Create a dictionary with keys from the list.
# Let's implement the dictionary in two ways
d1 = {n: [] for n in nums}
d2 = dict.fromkeys(nums, [])
# d1 and d2 may look similar, but dict.fromkeys reuses the same (mutable) list for every key.
d1[1].append(5)
d2[1].append(5)
# Let's see if d1 and d2 are similar
print(f'd1 = {d1} \nd2 = {d2}')

d1 = {1: [5], 2: [], 3: [], 4: []} 
d2 = {1: [5], 2: [5], 3: [5], 4: [5]}
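The difference comes down to object identity: `dict.fromkeys` stores one list under every key, while the comprehension builds a new list per key. A short sketch with illustrative names (`shared`, `separate`) that checks this with `is`:

```python
keys = [1, 2, 3]
shared = dict.fromkeys(keys, [])      # one list object, referenced three times
separate = {k: [] for k in keys}      # a fresh list for each key

same_object = shared[1] is shared[2]
distinct = separate[1] is separate[2]
print(same_object, distinct)
```

`dict.fromkeys` remains safe with immutable defaults such as `None`, `0`, or a tuple.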


**Don't modify a list while iterating over it:**

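This should be avoided with any collection. Deleting an element shifts the later elements one position to the left while the loop index still advances, so the element right after each deletion is skipped.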

_Example:_ Remove all numbers less than 5 from a list.

- Wrong Implementation: Remove the elements while iterating!

In [22]:
nums = [1, 2, 3, 5, 6, 7, 0, 1]

for ind, n in enumerate(nums):
    if n < 5:
        del nums[ind]

# expected: nums = [5, 6, 7]
nums

[2, 5, 6, 7, 1]

- Correct Implementation: Use a list comprehension to create a new list containing only the elements you want!

In [23]:
nums = [1, 2, 3, 5, 6, 7, 0, 1]
id(nums)  # before modification

2347411645384

In [24]:
nums = [n for n in nums if n >= 5]
id(nums)  # after modification

2347411752648

`id(nums)` is checked before and after to show that the two are in fact different lists. So, if the list is referenced elsewhere and you need to mutate the existing list rather than bind the name to a new one, assign to a slice of it:

In [25]:
nums = [1, 2, 3, 5, 6, 7, 0, 1]
id(nums)  # before modification

2347411753992

In [26]:
nums[:] = [n for n in nums if n >= 5]
id(nums)  # after modification

2347411753992
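The practical consequence shows up when a second name refers to the same list. A minimal sketch, with `alias` as an illustrative second reference, comparing plain reassignment with slice assignment:

```python
nums = [1, 2, 3, 5, 6, 7, 0, 1]
alias = nums                            # a second name for the same list

nums = [n for n in nums if n >= 5]      # rebinding: alias still holds the old contents
after_rebind = list(alias)

nums = alias                            # point back at the original list
nums[:] = [n for n in nums if n >= 5]   # in-place: alias sees the change
after_slice = list(alias)

print(after_rebind)
print(after_slice)
```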