# Python Best Practices

In [1]:
import datetime
print("Latest run time: [{}]".format(datetime.datetime.now()))

Latest run time: [2021-12-18 17:08:17.122384]


## Random Choice

### Summary Up Front

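1. To draw multiple samples from a list, do not pick them one at a time in a `for` loop. Use `numpy.random.choice()` with the `size` argument, or `random.choices()` with the `k` argument.
    1. Both are much faster than sampling one at a time in a `for` loop, with speedups ranging from roughly 5 to 500 times.
    2. The former is the faster of the two, by a factor of about 20 (about 8 in the run recorded below).
2. Sometimes you really do need a `for` or `while` loop that draws one sample per iteration, repeated many times. In that case, use `random.choice()` rather than `numpy.random.choice()`.
    1. The latter is roughly 5 to 10 times slower than the former (about 11 times in the run recorded below).
    2. One example is dynamically replacing elements of a list X, where each element xi has some probability of being replaced, the replacement candidates come from a list Y, and on each replacement every element yi of Y has some probability of being chosen. This kind of dynamic replacement is hard to express with the built-in batch methods of `numpy` or `random`.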
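
The dynamic replacement scenario in the last item above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `replace_elements`, its parameters, and the sample lists are hypothetical, and it assumes one replacement probability per element of X and one selection weight per element of Y.

```python
import random

def replace_elements(xs, replace_probs, candidates, candidate_weights, rng=random):
    """Return a copy of xs in which xs[i] is replaced with probability replace_probs[i].

    Each replacement is drawn from `candidates` according to `candidate_weights`.
    All names here are hypothetical and chosen for illustration only.
    """
    result = list(xs)
    for i, prob in enumerate(replace_probs):
        # rng.random() is in [0.0, 1.0), so prob=0.0 never replaces and prob=1.0 always does.
        if rng.random() < prob:
            # One draw per iteration: the per-step decision is why a single batch call does not fit.
            result[i] = rng.choices(candidates, weights=candidate_weights, k=1)[0]
    return result

rng = random.Random(0)  # seeded generator so the sketch is repeatable
X = ["a", "b", "c", "d"]
Y = ["x", "y", "z"]
new_X = replace_elements(X, [0.0, 1.0, 0.5, 1.0], Y, [0.2, 0.3, 0.5], rng=rng)
print(new_X)
```

Because whether each draw happens at all depends on a per-element decision, the loop draws one sample at a time, which is the case where the standard-library `random` functions are the better fit.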

Define a Python decorator that prints how long a function call takes.

In [2]:
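import functools
import time

def time_elapsed(fn):
    """Print the wall-clock time taken by each call to `fn`."""
    @functools.wraps(fn)  # keep fn's name and docstring on the wrapper
    def wrapped_fn(*args, **kwargs):
        start = time.perf_counter()  # monotonic, high-resolution timer
        output = fn(*args, **kwargs)
        end = time.perf_counter()
        print("Function [{}] time elapsed: {} seconds".format(fn.__name__, end - start))
        return output
    return wrapped_fn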

Define four functions that each draw one sample per loop iteration, using different implementations.

In [3]:
import numpy
import random

@time_elapsed
def random_choice_use_numpy():
    # One numpy.random.choice() call per sample.
    candidates = list(range(100))
    for _ in range(10**5):
        choice = numpy.random.choice(candidates)

@time_elapsed
def random_choice_use_random_v1():
    # One random.choice() call per sample.
    candidates = list(range(100))
    for _ in range(10**5):
        choice = random.choice(candidates)

@time_elapsed
def random_choice_use_random_v2():
    # Pick a random index with random.randrange(), then index the list.
    candidates = list(range(100))
    for _ in range(10**5):
        choice = candidates[random.randrange(len(candidates))]

@time_elapsed
def random_choice_use_random_v3():
    # Scale random.random() to an index; can be very slightly non-uniform.
    candidates = list(range(100))
    for _ in range(10**5):
        choice = candidates[int(random.random() * len(candidates))]

random_choice_use_numpy()
random_choice_use_random_v1()
random_choice_use_random_v2()
random_choice_use_random_v3()

Function [random_choice_use_numpy] time elapsed: 1.3469736576080322 seconds
Function [random_choice_use_random_v1] time elapsed: 0.11700439453125 seconds
Function [random_choice_use_random_v2] time elapsed: 0.17999649047851562 seconds
Function [random_choice_use_random_v3] time elapsed: 0.054015159606933594 seconds
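
A single wall-clock measurement like the ones above can be noisy. As a hedged alternative, the sketch below uses the standard-library `timeit` module to repeat each measurement and keep the fastest run. The helper name `best_time`, the dictionary `single_draws`, and the reduced call count are assumptions made for illustration; absolute numbers will differ by machine and Python version.

```python
import random
import timeit

candidates = list(range(100))

# Hypothetical labels; each callable draws exactly one sample from `candidates`.
single_draws = {
    "random.choice": lambda: random.choice(candidates),
    "random.randrange": lambda: candidates[random.randrange(len(candidates))],
    "random.random": lambda: candidates[int(random.random() * len(candidates))],
}

def best_time(fn, number=10_000, repeat=5):
    """Return the fastest total time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(fn, number=number, repeat=repeat))

timings = {name: best_time(fn) for name, fn in single_draws.items()}

# Print the variants from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("{:<18} {:.6f} seconds".format(name, seconds))
```

Taking the minimum of several repeats reduces the influence of background load, which makes the relative ordering of the variants easier to trust than a single run.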


### Choose Multiple Samples

In [4]:
@time_elapsed
def random_multiple_choice_use_numpy():
    # Draw all 10**5 samples (with replacement) in one vectorized call.
    candidates = list(range(100))
    choice = numpy.random.choice(candidates, size=10**5)

@time_elapsed
def random_multiple_choice_use_random():
    # Draw all 10**5 samples (with replacement) in one standard-library call.
    candidates = list(range(100))
    choice = random.choices(candidates, k=10**5)

random_multiple_choice_use_numpy()
random_multiple_choice_use_random()

Function [random_multiple_choice_use_numpy] time elapsed: 0.003997087478637695 seconds
Function [random_multiple_choice_use_random] time elapsed: 0.032004594802856445 seconds
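
The speedups implied by the recorded timings can be checked with a little arithmetic. The sketch below hard-codes the values printed in the two output blocks above, rounded to six decimals, so the ratios describe this one run on one machine only. The variable names and dictionary labels are illustrative.

```python
# Timings in seconds, copied from the recorded outputs (rounded to six decimals).
loop_numpy = 1.346974    # numpy.random.choice() called 10**5 times in a loop
loop_random = 0.117004   # random.choice() called 10**5 times in a loop
batch_numpy = 0.003997   # numpy.random.choice(..., size=10**5)
batch_random = 0.032005  # random.choices(..., k=10**5)

speedups = {
    "numpy: batch vs loop": loop_numpy / batch_numpy,      # roughly 337x
    "random: batch vs loop": loop_random / batch_random,   # roughly 3.7x
    "batch: numpy vs random": batch_random / batch_numpy,  # roughly 8x
    "loop: random vs numpy": loop_numpy / loop_random,     # roughly 11.5x
}

for label, ratio in speedups.items():
    print("{:<24} {:.1f}x".format(label, ratio))
```

In this run, batching helps `numpy` far more than it helps `random`, while inside a loop the standard-library call is clearly the cheaper one.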
