## Advanced Python Programming - Lecture-01

Note: Advanced Python Programming

The main goal is to walk through a few of the more important programming methods. These methods are used in every language.

## Profiler and Decorator

In [46]:
def is_primer(n):
    """Check whether n is a prime number."""
    if n < 2: return False
    
    for i in range(2, n):
        if n % i == 0: 
            #print('{} could be divided by #{}#'.format(n, i))
            return False
    
    return True

assert not is_primer(0)
assert not is_primer(1)
assert is_primer(2)
assert is_primer(5)
assert not is_primer(10)
assert not is_primer(33)
assert is_primer(101)

print('test cases passed')

def get_primers(n):
    """Return all primes from 1 to n."""
    
    results = []
    
    for i in range(2, n+1):
        if is_primer(i):
            results.append(i)
    
    return results


test cases passed


In [28]:
%time r = get_primers(40000)

CPU times: user 5.39 s, sys: 34.3 ms, total: 5.42 s
Wall time: 5.48 s


## Instructions 

$$ O(n^2), \quad O(n \log n) $$

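## Why is the program so slow?

## Don Knuth, 1971, "Software: Practice and Experience"

For a program that is not I/O-bound, roughly 3% of the lines of code account for more than 80% of the running time.

Q: When we notice that a program is slow, what is the first thing to consider? Look for the bottleneck.

So where does the bottleneck show up?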

## Monitor

## The first step in losing weight: buy a scale

In [31]:
%prun r = get_primers(40000)

 

Every language has one: a profiler.

+ Performance observation at the function/method level
+ Performance observation line by line
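
Outside a notebook, the function-level view that `%prun` gives can be obtained with the standard-library `cProfile` and `pstats` modules. The following is a minimal sketch, assuming the same naive trial-division functions as above (redefined here so the block is self-contained); `profile_report` is a hypothetical helper name, not part of any library.

```python
import cProfile
import io
import pstats


def is_primer(n):
    """Naive trial division, same idea as the version above."""
    if n < 2:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True


def get_primers(n):
    """Collect every prime in [2, n]."""
    return [i for i in range(2, n + 1) if is_primer(i)]


def profile_report(func, *args, top=5):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


primes, report = profile_report(get_primers, 2000)
print(report)  # one row per function; compare the call counts and times
```

The report lists each function with its call count and cumulative time, which is usually enough to see which function deserves a line-by-line look.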

In [33]:
%load_ext line_profiler

In [47]:
%lprun -f is_primer get_primers(10000)

In [None]:
9 -> 2, 3, 4, 5, 6, 7, 8
100 -> 2, 3, 4, 5, 6, 7, 8, 9, 10

In [39]:
N --- 
n**2 == N --- 

9 could be divided by #3#


False

In [None]:
16 -> 2 * 8; 4 * 4; 8 * 2

In [None]:
30 -> 2 * 15; 3 * 10; 5 * 6; (square root) 6 * 5; 10 * 3; 15 * 2
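
The factor lists above suggest why trial division can stop at the square root: divisors come in pairs `(d, n // d)`, and the smaller member of each pair is never larger than `sqrt(n)`. A small sketch that lists those pairs (the helper name `divisor_pairs` is made up for illustration):

```python
import math


def divisor_pairs(n):
    """Return (d, n // d) pairs with 2 <= d <= sqrt(n) and d dividing n."""
    pairs = []
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            pairs.append((d, n // d))
    return pairs


print(divisor_pairs(16))   # [(2, 8), (4, 4)]
print(divisor_pairs(30))   # [(2, 15), (3, 10), (5, 6)]
print(divisor_pairs(101))  # [] -> no divisor up to sqrt(101), so 101 is prime
```

If no divisor is found up to the square root, the mirrored half of each pair cannot exist either, so the remaining candidates never need to be tested.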

## Version-2

In [52]:
def is_primer_2(n):
    """Check whether n is a prime number."""
    if n < 2: return False
    
    root = int(n ** 0.5) + 1
    
    for i in range(2, root):
        if n % i == 0: 
            #print('{} could be divided by #{}#'.format(n, i))
            return False
    
    return True

assert not is_primer_2(0)
assert not is_primer_2(1)
assert is_primer_2(2)
assert is_primer_2(5)
assert not is_primer_2(10)
assert not is_primer_2(33)
assert is_primer_2(101)

print('test cases passed')

def get_primers_2(n):
    """Return all primes from 1 to n."""
    
    results = []
    
    for i in range(2, n+1):
        if is_primer_2(i):
            results.append(i)
    
    return results


test cases passed


In [75]:
%time r = get_primers_2(100000)

CPU times: user 206 ms, sys: 2.78 ms, total: 209 ms
Wall time: 209 ms


In [60]:
%lprun -f is_primer_2 get_primers_2(10000)

## Version-3

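1. Of all the integers in the world, 1/2 are divisible by 2
2. Of all the integers in the world, 1/3 are divisible by 3
3. Of all the integers in the world, 1/5 are divisible by 5

1/2 + 1/3 + 1/5 == (15 + 10 + 6) / 30 == 31 / 30, which double-counts the multiples of 6, 10, and 15. After removing the overlaps, 22 / 30 of all integers are divisible by 2, 3, or 5.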
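
Adding the three fractions directly counts numbers such as 6, 10, 15, and 30 more than once. As a sanity check, a quick count over one full period of 30 consecutive integers shows the share that is divisible by 2, 3, or 5:

```python
from fractions import Fraction

# 2 * 3 * 5 = 30, so divisibility by 2, 3, or 5 repeats every 30 integers.
period = 30
filtered = [n for n in range(1, period + 1)
            if n % 2 == 0 or n % 3 == 0 or n % 5 == 0]

share = Fraction(len(filtered), period)
print(len(filtered), share)  # 22 11/15

# Inclusion-exclusion gives the same value without enumerating.
by_formula = (Fraction(1, 2) + Fraction(1, 3) + Fraction(1, 5)
              - Fraction(1, 6) - Fraction(1, 10) - Fraction(1, 15)
              + Fraction(1, 30))
print(by_formula)  # 11/15
```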


In [77]:
def could_be_divide(n, m): return n != m and n % m == 0 

def is_primer_3(n):
    """Check whether n is a prime number."""
    if n < 2: return False
    
    if any(could_be_divide(n, i) for i in [2, 3, 5]): return False
    
#    if n != 2 and n % 2 == 0: return False
#    if n != 3 and n % 3 == 0: return False
#    if n != 5 and n % 5 == 0: return False
    # These three checks filter out 22/30 of all integers (those divisible by 2, 3, or 5) before the loop
    
    root = int(n ** 0.5) + 1
    
    for i in range(7, root):
        if n % i == 0: 
            #print('{} could be divided by #{}#'.format(n, i))
            return False
    
    return True

assert not is_primer_3(0)
assert not is_primer_3(1)
assert is_primer_3(2)
assert is_primer_3(5)
assert not is_primer_3(10)
assert not is_primer_3(33)
assert is_primer_3(101)

print('test cases passed')

def get_primers_3(n):
    """Return all primes from 1 to n."""
    
    results = []
    
    for i in range(2, n+1):
        if is_primer_3(i):
            results.append(i)
    
    return results


test cases passed


In [76]:
import time

s = time.time()

get_primers_3(100000)

print('using time = {}'.format(time.time() - s))

CPU times: user 173 ms, sys: 3.54 ms, total: 176 ms
Wall time: 175 ms


In [None]:
%time get_primers_3(100000)

In [72]:
%lprun -f is_primer_3 get_primers_3(10000)

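## The point is not how to find primes. When your program gets slow, what should you do? Or rather, how do you find out where its running time goes?

## Chatbot -> short-text similarity matching + text retrieval

When I joined, the project was already halfway done, but after typing in a piece of text you had to wait 5 seconds for a reply.

After just 4 days of optimization, one question-answer round went from 5 s to 250 ms.

## Algorithm tuning: model accuracy
## -> time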

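## decorator -> function-oriented

## Building a car

+ One way of thinking: the workflow (Procedure) --> procedures
+ Another way of thinking: Object --> objects
+ Oriented
+ Or think about which capabilities are needed to get the car built: Function --> functions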

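## In computer science, what the word "oriented" means:
+ XX-oriented

+ XX can be passed as a function argument
+ XX can be used as a return value
+ XX can be assigned to a variable

## Python -> function-oriented?

+ Functions can be assigned to variables
+ Functions can be passed as arguments
+ Functions can be used as return values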
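
The three properties above can each be shown in a few lines. This is only an illustrative sketch; `square`, `apply_twice`, and `make_adder` are made-up names.

```python
def square(x):
    return x * x


def apply_twice(func, x):
    """A function received as an argument."""
    return func(func(x))


def make_adder(k):
    """A function produced as a return value."""
    def add(x):
        return x + k
    return add


f = square                      # a function assigned to a variable
print(f(3))                     # 9
print(apply_twice(square, 3))   # 81
add_five = make_adder(5)
print(add_five(10))             # 15
```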

In [80]:
def get_primers_f(n, primer_func):
    """Return all primes from 1 to n, using primer_func as the primality test."""
    
    results = []
    
    for i in range(2, n+1):
        if primer_func(i):
            results.append(i)
            
    return results

In [82]:
r = get_primers_f(100, is_primer)

In [83]:
r = get_primers_f(100, is_primer_2)

In [85]:
%lprun -f is_primer_3 get_primers_f(10000, is_primer_3)

In [86]:
from functools import partial

In [89]:
get_p_2 = partial(get_primers_f, primer_func=is_primer_2)
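
`partial` returns a new callable with some arguments already filled in, so `get_p_2(n)` behaves like `get_primers_f(n, primer_func=is_primer_2)`. A self-contained sketch, with compact re-definitions of the two functions so that it runs on its own:

```python
from functools import partial


def is_primer_2(n):
    """Trial division up to the square root."""
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True


def get_primers_f(n, primer_func):
    """Collect every i in [2, n] accepted by primer_func."""
    return [i for i in range(2, n + 1) if primer_func(i)]


# Fix the primer_func argument once; only n is left to supply.
get_p_2 = partial(get_primers_f, primer_func=is_primer_2)

print(get_p_2(20))  # [2, 3, 5, 7, 11, 13, 17, 19]
print(get_p_2(20) == get_primers_f(20, is_primer_2))  # True
```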

In [93]:
%time r = get_primers_f(100, is_primer_2)

CPU times: user 82 µs, sys: 1 µs, total: 83 µs
Wall time: 86.1 µs


In [96]:
import time

In [126]:
def get_primers_f(n, primer_func):
    """Return all primes from 1 to n, using primer_func as the primality test."""
    #s = time.time()
    
    results = []
    
    for i in range(2, n+1):
        if primer_func(i):
            results.append(i)
    
    #e = time.time() 
    
    #print('used time: {}'.format(e - s))
    return results

In [117]:
get_primers_f(100, is_primer_2)

[2,
 3,
 5,
 7,
 11,
 13,
 17,
 19,
 23,
 29,
 31,
 37,
 41,
 43,
 47,
 53,
 59,
 61,
 67,
 71,
 73,
 79,
 83,
 89,
 97]

In [99]:
r = get_primers_f(100, is_primer)

used time: 0.00011301040649414062


In [100]:
r = get_primers_f(100, is_primer_2)

used time: 7.390975952148438e-05


In [101]:
r = get_primers_f(100, is_primer_3)

used time: 0.00010824203491210938


In [137]:
from functools import wraps

In [170]:
called_time = 0

def get_time_with_cached(func):
    cached = {}
    
    @wraps(func)
    def _time(arg1, arg2):
        """My name is _time function"""
        s = time.time()
        
        global called_time 

        called_time += 1
        
        if (arg1, arg2) in cached: return cached[(arg1, arg2)]
        else:
            result = func(arg1, arg2)
            cached[(arg1, arg2)] = result
        
        print('used time:{} '.format(time.time() - s))
        print('function called time is : {}'.format(called_time))
        return result
    
    return _time
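
The decorator above is tied to functions that take exactly two positional arguments, and it mixes timing with caching. One possible generalisation, offered as a sketch rather than the lecture's own code, splits the two concerns and accepts any arguments. The names `timed`, `memoized`, and `slow_square` are hypothetical, and `time.perf_counter` is swapped in for `time.time` because it is intended for measuring short intervals.

```python
import time
from functools import wraps


def timed(func):
    """Report how long each call to func takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print('{} used time: {:.6f}s'.format(func.__name__, elapsed))
    return wrapper


def memoized(func):
    """Cache results keyed by the positional arguments (must be hashable)."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed only so the example can inspect it
    return wrapper


@timed
@memoized
def slow_square(x):
    time.sleep(0.01)  # stand-in for expensive work
    return x * x


print(slow_square(4))  # first call does the work
print(slow_square(4))  # second call is served from the cache
```

Stacking the decorators keeps each one small, and the timing line is printed for cache hits as well as misses.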

In [127]:
get_primers_f = get_time(get_primers_f)  # doing this by hand every time is tedious; since Python 2.4 there is a dedicated notation for it: @

In [128]:
get_primers_f

<function __main__.get_time.<locals>._time(arg1, arg2)>

In [125]:
get_primers_f(100, is_primer)  # accepts the same arguments and returns the same value, with just one extra operation

used time:9.703636169433594e-05 
used time:0.0001590251922607422 


[2,
 3,
 5,
 7,
 11,
 13,
 17,
 19,
 23,
 29,
 31,
 37,
 41,
 43,
 47,
 53,
 59,
 61,
 67,
 71,
 73,
 79,
 83,
 89,
 97]
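
The repr printed earlier for the wrapped `get_primers_f` names the inner `_time` function, which is what a wrapper looks like when it does not copy the original function's metadata. `functools.wraps` addresses that. A small sketch of the difference (both decorator names are hypothetical):

```python
from functools import wraps


def plain_decorator(func):
    def _inner(*args, **kwargs):
        return func(*args, **kwargs)
    return _inner


def wrapping_decorator(func):
    @wraps(func)  # copies __name__, __doc__, etc. from func onto _inner
    def _inner(*args, **kwargs):
        return func(*args, **kwargs)
    return _inner


@plain_decorator
def f():
    """Docstring of f."""


@wrapping_decorator
def g():
    """Docstring of g."""


print(f.__name__, f.__doc__)  # _inner None
print(g.__name__, g.__doc__)  # g Docstring of g.
```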

In [187]:
@get_time_with_cached # exactly ==> get_primers_with_decorator = get_time_with_cached(get_primers_with_decorator)
def get_primers_with_decorator(n, primer_func):
    """Return all primes from 1 to n, using primer_func as the primality test."""
    #s = time.time()
    
    results = []
    
    for i in range(2, n+1):
        if primer_func(i):
            results.append(i)
    
    #e = time.time() 
    
    #print('used time: {}'.format(e - s))
    return results

In [188]:
r1 = get_primers_with_decorator(12351, is_primer)

used time:0.5714240074157715 
function called time is : 14


## Python: function-oriented and object-oriented

+ C: procedure-oriented (Oriented Procedure, OP)
+ Java: object-oriented (Oriented Object, OO)
+ Scala: function-oriented (Oriented Function, OF)

+ Python: OF, OO, OP
+ C++: OO, OP

In [189]:
func_mapper = {
    cond1: func1,
    cond2: func2,
    cond3: func3
}

NameError: name 'cond1' is not defined

In [None]:
for cond, func in func_mapper.items():  # iterate over (condition, function) pairs
    if cond(x): func(x)

In [None]:
if xxx:
    func1()
if XXX2():
    func2()
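
Because functions are ordinary hashable objects, the `func_mapper` idea above can replace a chain of `if` statements with a dictionary from condition functions to handler functions. A runnable sketch, with made-up conditions and handlers standing in for `cond1`/`func1` and the rest:

```python
def is_negative(x):
    return x < 0


def is_zero(x):
    return x == 0


def is_positive(x):
    return x > 0


def describe_negative(x):
    return '{} is negative'.format(x)


def describe_zero(x):
    return '{} is zero'.format(x)


def describe_positive(x):
    return '{} is positive'.format(x)


# Condition function -> handler function.
func_mapper = {
    is_negative: describe_negative,
    is_zero: describe_zero,
    is_positive: describe_positive,
}


def dispatch(x):
    """Run every handler whose condition accepts x and collect the results."""
    return [func(x) for cond, func in func_mapper.items() if cond(x)]


print(dispatch(-3))  # ['-3 is negative']
print(dispatch(0))   # ['0 is zero']
print(dispatch(7))   # ['7 is positive']
```

Adding a new case then means adding one dictionary entry rather than another `if` branch.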
