# Chapter 6: Good Programming Habits

## 1. Don't Call a Class's Private Methods Directly

In [1]:
class Kls():
    def public(self):
        print('Hello public world!')
    
    def __private(self):
        print('Hello private world!')
    
    def call_private(self):
        self.__private()

In [2]:
ins = Kls()

In [3]:
ins.public()

Hello public world!


In [4]:
ins.__private()

AttributeError: 'Kls' object has no attribute '__private'

In [5]:
ins.call_private()

Hello private world!


How to call a private method anyway (for reference only; don't do this)

In [6]:
ins._Kls__private()

Hello private world!


## 2. Avoid Mutable Objects as Default Arguments

In [8]:
def func(item, item_list=[]):
    item_list.append(item)
    print(item_list)

In [9]:
func('iphone')

['iphone']


In [10]:
func('xiaomi', item_list=['oppo', 'vivo'])

['oppo', 'vivo', 'xiaomi']


In [11]:
func('huawei')

['iphone', 'huawei']


![Explanation](http://image.iswbm.com/20190511165650.png)
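The default list is created once, when the `def` statement runs, so every call that omits `item_list` shares the same object. A common way around this, sketched below, is to default to `None` and build a fresh list inside the function. `func_safe` is a hypothetical name used only for illustration.

```python
def func_safe(item, item_list=None):
    # Build a new list on every call that omits item_list
    if item_list is None:
        item_list = []
    item_list.append(item)
    return item_list


first = func_safe('iphone')
second = func_safe('huawei')
print(first)   # ['iphone']
print(second)  # ['huawei']
```

Each call now gets its own list, so earlier calls no longer leak into later ones.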

## 3. Augmented Assignment Performs Better

In [12]:
a = 1

In [13]:
a += 1

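`+=` is backed by the magic method `__iadd__`; if that method is not implemented, Python falls back to `__add__`.

Take lists as an example. With `__iadd__`, `a += b` behaves like `a.extend(b)`; with `__add__`, it is `a = a + b`. The former extends the original list in place, whereas the latter reads the values out of the original list, builds the extended result in a new list, and then binds that new list object to the variable, so the latter clearly costs more.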
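The in-place behaviour can be observed directly by keeping a second name for the same list and checking identity afterwards. This is a small sketch; the variable names are chosen only for illustration.

```python
a = [1, 2]
alias = a            # second name for the same list object
a += [3]             # list.__iadd__ extends the list in place
same_after_iadd = a is alias

b = [1, 2]
alias_b = b
b = b + [3]          # list.__add__ builds a new list, then rebinds b
same_after_add = b is alias_b

print(same_after_iadd, alias)    # True [1, 2, 3]
print(same_after_add, alias_b)   # False [1, 2]
```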

## 4. Stop Printing with pprint

In [15]:
from pprint import pprint

In [18]:
info = [{"id":1580615,"name":"皮的嘛","packageName":"com.renren.mobile.android","iconUrl":"app/com.renren.mobile.android/icon.jpg","stars":2,"size":21803987,"downloadUrl":"app/com.renren.mobile.android/com.renren.mobile.android.apk","des":"2011-2017 你的铁头娃一直在这儿。中国最大的实名制SNS网络平台，嫩头青"},{"id":1540629,"name":"不存在的","packageName":"com.ct.client","iconUrl":"app/com.ct.client/icon.jpg","stars":2,"size":4794202,"downloadUrl":"app/com.ct.client/com.ct.client.apk","des":"斗鱼271934 走过路过不要错过，这里有最好的鸡儿"}]

In [19]:
pprint(info)

[{'des': '2011-2017 你的铁头娃一直在这儿。中国最大的实名制SNS网络平台，嫩头青',
  'downloadUrl': 'app/com.renren.mobile.android/com.renren.mobile.android.apk',
  'iconUrl': 'app/com.renren.mobile.android/icon.jpg',
  'id': 1580615,
  'name': '皮的嘛',
  'packageName': 'com.renren.mobile.android',
  'size': 21803987,
  'stars': 2},
 {'des': '斗鱼271934 走过路过不要错过，这里有最好的鸡儿',
  'downloadUrl': 'app/com.ct.client/com.ct.client.apk',
  'iconUrl': 'app/com.ct.client/icon.jpg',
  'id': 1540629,
  'name': '不存在的',
  'packageName': 'com.ct.client',
  'size': 4794202,
  'stars': 2}]


In [20]:
from pprint import PrettyPrinter

In [30]:
class MyPrettyPrinter(PrettyPrinter):
    # Python 2 workaround: the `unicode` type does not exist in Python 3
    def format(self, object, context, maxlevels, level):
        if isinstance(object, unicode):
            return (object.encode('utf8'), True, False)
        return PrettyPrinter.format(self, object, context, maxlevels, level)

In [31]:
class MyStream():
    def write(self, text):
        print(text.replace('\'', '"'))

In [None]:
MyPrettyPrinter(stream=MyStream()).pprint(info)

In [33]:
import json

In [35]:
print(json.dumps(info, indent=4, ensure_ascii=False))

[
    {
        "id": 1580615,
        "name": "皮的嘛",
        "packageName": "com.renren.mobile.android",
        "iconUrl": "app/com.renren.mobile.android/icon.jpg",
        "stars": 2,
        "size": 21803987,
        "downloadUrl": "app/com.renren.mobile.android/com.renren.mobile.android.apk",
        "des": "2011-2017 你的铁头娃一直在这儿。中国最大的实名制SNS网络平台，嫩头青"
    },
    {
        "id": 1540629,
        "name": "不存在的",
        "packageName": "com.ct.client",
        "iconUrl": "app/com.ct.client/icon.jpg",
        "stars": 2,
        "size": 4794202,
        "downloadUrl": "app/com.ct.client/com.ct.client.apk",
        "des": "斗鱼271934 走过路过不要错过，这里有最好的鸡儿"
    }
]


## 5. What to Do When a Variable Name Collides with a Reserved Keyword

In [36]:
import keyword

In [38]:
print('\n'.join(keyword.kwlist))

False
None
True
and
as
assert
async
await
break
class
continue
def
del
elif
else
except
finally
for
from
global
if
import
in
is
lambda
nonlocal
not
or
pass
raise
return
try
while
with
yield


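For this situation, PEP 8 suggests that when the variable name you want is taken by a keyword, you append a single trailing underscore, as in `name_`. This trailing-underscore form is preferred over an abbreviation or a deliberate misspelling.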

In [39]:
try_ = True
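The same rule can be applied programmatically with `keyword.iskeyword` from the standard library. In this sketch `safe_name` is a hypothetical helper, not part of any library.

```python
import keyword


def safe_name(name):
    # Append a trailing underscore only when the name is a reserved keyword
    return name + '_' if keyword.iskeyword(name) else name


print(safe_name('class'))  # class_
print(safe_name('count'))  # count
```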

## 6. How to Name Attributes You Don't Want Subclasses to Inherit

In [49]:
class Parent:
    def __init__(self):
        self.name = 'MING'
        self.__wife = 'Julia'

class Son(Parent):
    def __init__(self):
        self.name = "Xiao Ming"

In [50]:
son = Son()

In [51]:
son.name

'Xiao Ming'

In [52]:
p = Parent()

In [53]:
p.name

'MING'

In [54]:
p._Parent__wife

'Julia'

In [55]:
son._Son__wife

AttributeError: 'Son' object has no attribute '_Son__wife'

In [58]:
son._Son__name

AttributeError: 'Son' object has no attribute '_Son__name'
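Note that in the example above `Son.__init__` never calls `Parent.__init__`, so a `Son` instance has no wife attribute under any name. The sketch below calls `super().__init__()` to show where the mangled attribute ends up when the parent initializer does run. `Base` and `Child` are hypothetical class names used only for illustration.

```python
class Base:
    def __init__(self):
        self.name = 'MING'
        self.__wife = 'Julia'      # stored on the instance as _Base__wife


class Child(Base):
    def __init__(self):
        super().__init__()         # runs Base.__init__, creating _Base__wife
        self.name = 'Xiao Ming'

    def try_wife(self):
        # Inside Child, self.__wife would be mangled to _Child__wife,
        # which was never set, so look that name up explicitly
        return getattr(self, '_Child__wife', None)


child = Child()
print(child.try_wife())              # None
print('_Base__wife' in vars(child))  # True
```

The value is still stored on the instance. Name mangling only keeps the subclass from reaching it, or accidentally overwriting it, through the short `__wife` name.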

# Chapter 7: Magical Modules

## 1. The Best Tools for Logging In to Remote Servers

1. Use subprocess
2. Use sh.ssh
3. Use paramiko

## 2. A Tool That Makes Tracebacks Look Cool
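As a rough sketch of the first option, the standard library can drive the system `ssh` client. This assumes an `ssh` executable on the PATH and key-based authentication already set up. `build_ssh_command`, `run_remote`, `example-host` and `admin` are hypothetical names, and `run_remote` is defined but not called here.

```python
import subprocess


def build_ssh_command(host, remote_command, user=None):
    # Assemble the argument list for the system `ssh` client
    target = f'{user}@{host}' if user else host
    return ['ssh', target, remote_command]


def run_remote(host, remote_command, user=None, timeout=30):
    # Run the command on the remote host and return (exit code, stdout)
    completed = subprocess.run(
        build_ssh_command(host, remote_command, user),
        capture_output=True, text=True, timeout=timeout,
    )
    return completed.returncode, completed.stdout


print(build_ssh_command('example-host', 'uptime', user='admin'))
# ['ssh', 'admin@example-host', 'uptime']
```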

In [59]:
1/0

ZeroDivisionError: division by zero

In [60]:
import pretty_errors

1/0

ZeroDivisionError: division by zero

## 3. A Little-Known Retry Mechanism for Python

The most basic retry

In [61]:
from tenacity import retry

In [62]:
@retry
def test_retry():
    print('Waiting to retry; retries run with no delay...')
    raise Exception

In [None]:
test_retry()

Setting basic stop conditions

In [1]:
from tenacity import retry, stop_after_attempt

In [2]:
@retry(stop=stop_after_attempt(7))
def test_retry():
    print('Waiting to retry...')
    raise Exception

In [3]:
test_retry()

Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...


RetryError: RetryError[<Future at 0x29008da7388 state=finished raised Exception>]
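To make the "stop after N attempts" idea concrete, here is a hand-rolled sketch using only the standard library. It is not how tenacity is implemented; `retry_up_to` and `flaky` are hypothetical names, and the sketch raises a plain `RuntimeError` where tenacity raises `RetryError`.

```python
import functools


def retry_up_to(max_attempts):
    # Decorator factory: call the function again after each failure,
    # giving up once max_attempts calls have failed
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as error:
                    last_error = error
            raise RuntimeError(f'gave up after {max_attempts} attempts') from last_error
        return wrapper
    return decorate


calls = []


@retry_up_to(3)
def flaky():
    # Fails on the first two calls, succeeds on the third
    calls.append(1)
    if len(calls) < 3:
        raise ValueError('not yet')
    return 'ok'


print(flaky(), len(calls))  # ok 3
```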

In [5]:
from tenacity import retry, stop_after_delay

@retry(stop=stop_after_delay(3))
def test_retry():
    print('Waiting to retry...')
    raise Exception

In [6]:
test_retry()

Waiting to retry...
Waiting to retry...
Waiting to retry...
... (the same line is printed 125 times in total before the 3-second limit is reached)


RetryError: RetryError[<Future at 0x29009582448 state=finished raised Exception>]

In [7]:
@retry(stop=(stop_after_attempt(7) | stop_after_delay(3)))
def test_retry():
    print('Waiting to retry...')
    raise Exception

In [8]:
test_retry()

Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...


RetryError: RetryError[<Future at 0x29009371e48 state=finished raised Exception>]

Specifying when to retry

In [10]:
from requests import exceptions
from tenacity import retry, retry_if_exception_type

@retry(retry=retry_if_exception_type(exception_types=exceptions.Timeout))
def test_retry():
    print('Waiting to retry...')
    raise exceptions.Timeout

In [None]:
test_retry()

Retrying only when a custom condition is met

In [1]:
from tenacity import retry, stop_after_attempt, retry_if_result

def is_false(value):
    return value is False

@retry(stop=stop_after_attempt(3), retry=retry_if_result(is_false))
def test_retry():
    return False

In [2]:
test_retry()

RetryError: RetryError[<Future at 0x27d67699f48 state=finished returned bool>]

With multiple conditions, mind the order

Re-raising the original error after the retries

In [4]:
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(4), reraise=True)
def test_retry():
    print("Waiting to retry...")
    raise Exception

In [5]:
test_retry()

Waiting to retry...
Waiting to retry...
Waiting to retry...
Waiting to retry...


Exception: 

Setting a callback function

In [6]:
from tenacity import *

In [7]:
def return_last_value(retry_state):
    print("Running the callback")
    return retry_state.outcome.result()

In [8]:
def is_false(value):
    return value is False

In [9]:
@retry(stop=stop_after_attempt(3), retry_error_callback=return_last_value, retry=retry_if_result(is_false))
def test_retry():
    print("Waiting to retry...")
    return False

In [10]:
print(test_retry())

Waiting to retry...
Waiting to retry...
Waiting to retry...
Running the callback
False


## 4. A Great Tool for Extracting Data from Structured Strings

1. parse

In [14]:
from parse import parse

In [80]:
flow = 'cookie=0x9816da8e872d717d, duration=298506.364s, table=0, n_packets=480, n_bytes=20160, priority=10,ip,in_port="tapbbdf080b-c2" actions=NORMAL'

In [87]:
result = parse('cookie={cookie}, duration={duration}, table={table}, n_packets={n_packets}, n_bytes={n_bytes}, priority={priority},ip,in_port="{in_port}" actions={actions}', flow)

In [88]:
print(result)

<Result () {'cookie': '0x9816da8e872d717d', 'duration': '298506.364s', 'table': '0', 'n_packets': '480', 'n_bytes': '20160', 'priority': '10', 'in_port': 'tapbbdf080b-c2', 'actions': 'NORMAL'}>


In [89]:
result['cookie']

'0x9816da8e872d717d'

2. The result of parse

In [91]:
rr = parse("I am {}, {} years old, {}", "I am Jack, 27 years old, male")

In [92]:
rr

<Result ('Jack', '27', 'male') {}>

In [93]:
rr[0]

'Jack'

3. Reusing a pattern

In [94]:
from parse import compile

In [95]:
pattern = compile("I am {}, {} years old, {}")

In [96]:
pattern.parse("I am Kack, 27 years old, male")

<Result ('Kack', '27', 'male') {}>

In [97]:
pattern.parse("I am gouzei, 18 years old, female")

<Result ('gouzei', '18', 'female') {}>

4. Type conversion

In [98]:
profile = parse("I am {name}, {age:d} years old, {gender}", "I am gouzei, 18 years old, female")

In [99]:
profile

<Result () {'name': 'gouzei', 'age': 18, 'gender': 'female'}>

In [100]:
type(profile['age'])

int

In [101]:
parse('Meet at {:tg}', 'Meet at 1/2/2011 11:00 PM')

<Result (datetime.datetime(2011, 2, 1, 23, 0),) {}>

![Image](https://img-blog.csdnimg.cn/20210507111117261.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3FxXzIxNTc5MDQ1,size_16,color_FFFFFF,t_70)

5. Stripping whitespace during extraction

In [102]:
parse('hello {}, hello python', 'hello    world    , hello python')

<Result ('   world    ',) {}>

In [104]:
parse('hello {:^}, hello python', 'hello    world    , hello python')

<Result ('world',) {}>

In [105]:
parse('hello {:>}, hello python', 'hello    world    , hello python')

<Result ('world    ',) {}>

In [106]:
parse('hello {:<}, hello python', 'hello    world    , hello python')

<Result ('   world',) {}>

6. Case-sensitivity switch

In [107]:
parse('SPAM', 'spam')

<Result () {}>

In [109]:
parse('SPAM', 'spam') is None

False

In [110]:
parse('SPAM', 'spam', case_sensitive=True) is None

True

In [111]:
parse('SPAM', 'spam', case_sensitive=True)

7. Matching a number of characters

Exact match: specify the maximum number of characters

In [112]:
parse('{:.2}{:.2}', 'hello')

In [113]:
parse('{:.2}{:.2}', 'hell')

<Result ('he', 'll') {}>

Fuzzy match: specify the minimum number of characters

In [114]:
parse('{:.2}{:2}', 'hello')

<Result ('h', 'ello') {}>

In [115]:
parse('{:2}{:2}', 'hello')

<Result ('he', 'llo') {}>

To also convert types while in exact or fuzzy matching mode

In [118]:
parse('{:2}{:2}', '1024')

<Result ('10', '24') {}>

In [119]:
parse('{:2d}{:2d}', '1024')

<Result (10, 24) {}>

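8. Three important attributes

A parse Result has three very important attributes:

- fixed: a tuple of the anonymous fields extracted by position
- named: a dict holding the named fields
- spans: the positions of the matched fields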

In [120]:
profile = parse("I am {name}, {age:d} years old, {}", "I am Jack, 27 years old, male")

In [121]:
profile

<Result ('male',) {'name': 'Jack', 'age': 27}>

In [123]:
profile.fixed

('male',)

In [124]:
profile.named

{'name': 'Jack', 'age': 27}

In [125]:
profile.spans

{'name': (5, 9), 'age': (11, 13), 0: (25, 29)}

9. Custom type conversion

In [126]:
parse("I am {:d}", "I am 27")

<Result (27,) {}>

In [127]:
type(_[0])

int

This is equivalent to

In [128]:
def myint(string):
    return int(string)

In [129]:
parse("I am {:myint}", "I am 27", dict(myint=myint))

<Result (27,) {}>

In [130]:
type(_[0])

int

In [131]:
def shouty(string):
    return string.upper()

In [132]:
parse('{:shouty} world', 'hello world', dict(shouty=shouty))

<Result ('HELLO',) {}>

In simple scenarios, parse is far more productive than writing regular expressions with re. The resulting code is clean, readable, and painless to maintain later, so I recommend it.
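For comparison, here is roughly what a shortened version of the flow example looks like with the standard `re` module. It is only a sketch: `flow_line` is a truncated copy of the earlier string, and the pattern is one of several ways to write it.

```python
import re

flow_line = 'cookie=0x9816da8e872d717d, duration=298506.364s, table=0, n_packets=480'

# Named groups play the role of parse's {cookie}, {duration}, ... fields
flow_pattern = re.compile(
    r'cookie=(?P<cookie>[^,]+), duration=(?P<duration>[^,]+), '
    r'table=(?P<table>[^,]+), n_packets=(?P<n_packets>[^,]+)'
)

match = flow_pattern.match(flow_line)
fields = match.groupdict()
print(fields['cookie'])     # 0x9816da8e872d717d
print(fields['n_packets'])  # 480
```

The regular expression has to spell out what each field may contain, whereas the parse template reads almost like the string itself.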

## 5. One Line of Code That Makes Your Code 100x Faster

In [136]:
import time
def foo(x,y):
    tt = time.time()
    s = 0
    for i in range(x, y):
        s += i
    print('Time used: {} sec'.format(time.time() - tt))
    return s

print(foo(1, 100000000))

Time used: 4.702789068222046 sec
4999999950000000


In [138]:
from numba import jit

In [139]:
import time

In [140]:
@jit
def foo_jit(x,y):
    tt = time.time()
    s = 0
    for i in range(x, y):
        s += i
    print('Time used: {} sec'.format(time.time() - tt))
    return s

print(foo_jit(1, 100000000))

Compilation is falling back to object mode WITH looplifting enabled because Function "foo_jit" failed type inference due to: Unknown attribute 'time' of type Module(<module 'time' (built-in)>)

File "<ipython-input-140-6d183d8794bf>", line 3:
def foo_jit(x,y):
    tt = time.time()
    ^

During: typing of get attribute at <ipython-input-140-6d183d8794bf> (3)

File "<ipython-input-140-6d183d8794bf>", line 3:
def foo_jit(x,y):
    tt = time.time()
    ^

  @jit
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "foo_jit" failed type inference due to: Cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "<ipython-input-140-6d183d8794bf>", line 5:
def foo_jit(x,y):
    <source elided>
    s = 0
    for i in range(x, y):
    ^

  @jit

File "<ipython-input-140-6d183d8794bf>", line 3

Time used: 0.10870504379272461 sec
4999999950000000


In [141]:
import numba as nb
from numba import jit

In [142]:
@jit('f8(f8[:])')
def sum1d(array):
    s = 0.0
    n = array.shape[0]
    for i in range(n):
        s += array[i]
    return s

In [143]:
import numpy as np

In [158]:
array = np.random.random(10000)

In [159]:
array

array([0.24113423, 0.01703803, 0.59646421, ..., 0.12706251, 0.01535441,
       0.8258294 ])

In [160]:
%timeit sum1d(array)

10.6 µs ± 54.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [147]:
%timeit np.sum(array)

7.41 µs ± 66.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [149]:
%timeit sum(array)

1.36 ms ± 36.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [155]:
import numpy as np

In [165]:
temp = np.ones(10, dtype=np.int32)

In [166]:
temp

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [161]:
sum1d(np.ones(10, dtype=np.int32))

TypeError: No matching definition for argument type(s) array(int32, 1d, C)

In [153]:
print(sum1d(np.ones(10, dtype=np.float32)))

TypeError: No matching definition for argument type(s) array(float32, 1d, C)

In [173]:
from numba import autojit

ImportError: cannot import name 'autojit' from 'numba.extending' (d:\python3.78\lib\site-packages\numba\extending.py)

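How it works: numba uses the meta module to parse a Python function's AST and annotates each variable with type information. It then calls llvmpy to generate machine code, and finally generates a Python calling interface for that machine code.

**The meta module**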

In [174]:
def add2(a, b):
    return a+b

In [176]:
import meta

In [177]:
from meta.decompiler import decompile_func

In [178]:
from meta.asttools import str_ast

In [182]:
from meta.asttools import python_source

In [183]:
decompile_func(add2)

IndexError: tuple index out of range

In [184]:
import py_compile

In [186]:
with open('tmp.py', 'w') as f:
    f.write("""
def square_sum(n):
    s = 0
    for i in range(n):
        s += i**2
    return s
    """)

In [187]:
py_compile.compile('tmp.py')

'__pycache__\\tmp.cpython-37.pyc'

In [191]:
with open('__pycache__\\tmp.cpython-37.pyc', "rb"):
    decompile_func(f)

AttributeError: '_io.TextIOWrapper' object has no attribute '__code__'

**The llvmpy module**

In [192]:
import llvm

ModuleNotFoundError: No module named 'llvm'

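It is only available for Python 2, so never mind.

In short, what numba does is: parse the AST of a Python function and transform it by adding type information; dynamically convert the typed AST into a machine-code function through llvmpy; then create a wrapper for the machine-code function, using a technique similar to ctypes, so that Python can call it.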

## 6. A Next-Generation Debugging Tool: PySnooper

In [194]:
import pysnooper

In [195]:
@pysnooper.snoop()
def demo_func():
    profile = {}
    profile['name'] = "李英俊"
    profile['age'] = 27
    profile['gender'] = "male"
    
    return profile

In [196]:
def main():
    profile = demo_func()

In [198]:
main()

Source path:... <ipython-input-195-ff93daf3c53e>
14:34:57.118372 call         2 def demo_func():
14:34:57.118560 line         3     profile = {}
New var:....... profile = {}
14:34:57.118560 line         4     profile['name'] = "李英俊"
Modified var:.. profile = {'name': '李英俊'}
14:34:57.118598 line         5     profile['age'] = 27
Modified var:.. profile = {'name': '李英俊', 'age': 27}
14:34:57.118678 line         6     profile['gender'] = "male"
Modified var:.. profile = {'name': '李英俊', 'age': 27, 'gender': 'male'}
14:34:57.118720 line         8     return profile
14:34:57.118760 return       8     return profile
Return value:.. {'name': '李英俊', 'age': 27, 'gender': 'male'}
Elapsed time: 00:00:00.000454


1. Redirecting output to a log file

In [200]:
@pysnooper.snoop(output='debug.log')
def demo_func():
    profile = {}
    profile['name'] = "李英俊"
    profile['age'] = 27
    profile['gender'] = "male"
    
    return profile

In [202]:
main()

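2. Tracking non-local variable values

PySnooper debugs one function at a time and by default only tracks the local variables inside the function body. To track a global variable as well, add the `watch` parameter to `@pysnooper.snoop()`.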

In [203]:
out = {"foo": "bar"}

In [205]:
@pysnooper.snoop(watch=('out["foo"]'))
def demo_func():
    profile = {}
    profile['name'] = "李英俊"
    profile['age'] = 27
    profile['gender'] = "male"
    
    out['foo'] = 'asd'
    
    return profile

In [206]:
main()

Source path:... <ipython-input-205-16e7c36b28dd>
Starting var:.. out["foo"] = 'bar'
14:52:56.832354 call         2 def demo_func():
14:52:56.832354 line         3     profile = {}
New var:....... profile = {}
14:52:56.832354 line         4     profile['name'] = "李英俊"
Modified var:.. profile = {'name': '李英俊'}
14:52:56.832354 line         5     profile['age'] = 27
Modified var:.. profile = {'name': '李英俊', 'age': 27}
14:52:56.832354 line         6     profile['gender'] = "male"
Modified var:.. profile = {'name': '李英俊', 'age': 27, 'gender': 'male'}
14:52:56.832354 line         8     out['foo'] = 'asd'
Modified var:.. out["foo"] = 'asd'
14:52:56.832354 line        10     return profile
14:52:56.832354 return      10     return profile
Return value:.. {'name': '李英俊', 'age': 27, 'gender': 'male'}
Elapsed time: 00:00:00.000000


The watch parameter accepts an iterable (a list or a tuple).

In [None]:
@pysnooper.snoop(watch=('out["foo"]', 'foo.bar', 'self.foo["bar"]'))
def demo_func():
        ...

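3. Setting the tracing depth

When you debug a function with PySnooper and that function calls other functions, PySnooper does not blindly trace into them.

To keep tracing into the functions it calls, set the tracing depth with the depth parameter (the default is 1).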

In [None]:
@pysnooper.snoop(depth=2)
def demo_func():
        ...

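4. Setting a prefix for debug logs

When you trace several functions with PySnooper, the debug log gets cluttered and hard to read.

For this case PySnooper provides a parameter that lets you give each function its own marker, so the entries are easy to tell apart in the log.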

In [None]:
@pysnooper.snoop(output="/var/log/debug.log", prefix="demo_func: ")
def demo_func():
    ...

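5. Setting the maximum output length

By default, variables and exception messages that PySnooper outputs are truncated to 100 characters if they are longer than that.

In [None]:
@pysnooper.snoop(max_variable_length=200)
def demo_func():
    ...

You can also use max_variable_length=None so that they are never truncated.

In [None]:
@pysnooper.snoop(max_variable_length=None)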
def demo_func():
    ...

6. Multithreaded debugging mode

In [None]:
@pysnooper.snoop(thread_info=True)
def demo_func():
    ...

7. Custom formatted output for objects

In [208]:
class Person: pass

In [209]:
def print_person_obj(obj):
    return f"<Person {obj.name} {obj.age} {obj.gender}>"

In [210]:
@pysnooper.snoop(custom_repr=(Person, print_person_obj))
def test_func():
    person = Person()
    person.name = '你好'
    person.age = 27
    person.gender = "male"
    
    return person

In [211]:
def main():
    profile = test_func()

In [212]:
main()

Source path:... <ipython-input-210-c64cbcd85b5b>
15:12:07.013187 call         2 def test_func():
15:12:07.013187 line         3     person = Person()
New var:....... person = REPR FAILED
15:12:07.013187 line         4     person.name = '你好'
15:12:07.013187 line         5     person.age = 27
15:12:07.014189 line         6     person.gender = "male"
Modified var:.. person = <Person 你好 27 male>
15:12:07.014220 line         8     return person
15:12:07.014220 return       8     return person
Return value:.. <Person 你好 27 male>
Elapsed time: 00:00:00.001033


If you have several types to format, the custom_repr argument can be written like this:

In [None]:
@pysnooper.snoop(custom_repr=((Person, print_person_obj), (numpy.ndarray, print_ndarray)))
def demo_func():
    ...

## 7. A Handier, More Elegant Way to Read Files Than open

1. Reading from standard input

2. Opening a single file

In [1]:
import fileinput

In [8]:
with fileinput.input(files=('a.txt', )) as file:
    for line in file:
        print(f'{fileinput.filename()} 第{fileinput.lineno()}行：{line}', end='')

a.txt 第1行：hello
a.txt 第2行：world


One thing to note: fileinput.input() reads files with mode='r' by default; if your file is binary, use mode='rb'. fileinput supports these two read modes and no others.

3. Opening multiple files in a batch

In [9]:
with fileinput.input(files=('a.txt', 'b.txt')) as file:
    for line in file:
        print(f'{fileinput.filename()} 第{fileinput.lineno()}行：{line}', end='')

a.txt 第1行：hello
a.txt 第2行：world
b.txt 第3行：hello
b.txt 第4行：python


To get each line's real line number within its own file when reading multiple files, use fileinput.filelineno().

In [10]:
with fileinput.input(files=('a.txt', 'b.txt')) as file:
    for line in file:
        print(f'{fileinput.filename()} 第{fileinput.filelineno()}行：{line}', end='')

a.txt 第1行：hello
a.txt 第2行：world
b.txt 第1行：hello
b.txt 第2行：python


This usage pairs perfectly with the glob module.

In [11]:
import glob

In [12]:
for line in fileinput.input(glob.glob('*.txt')):
    if fileinput.isfirstline():
        print('-'*20, f'Reading {fileinput.filename()}...', '-'*20)
    # Use the module-level lineno(); the stale `file` object from an earlier cell always reports 4
    print(str(fileinput.lineno()) + ': ' + line.upper(), end="")

-------------------- Reading a.txt... --------------------
1: HELLO
2: WORLD
-------------------- Reading b.txt... --------------------
3: HELLO
4: PYTHON
-------------------- Reading test1.txt... --------------------
5: 你是猪
6: 注视你
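The difference between the cumulative `lineno()` and the per-file `filelineno()` can be checked with a self-contained sketch that creates its own temporary files. The file names `first.txt` and `second.txt` are made up for this example.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    first_path = os.path.join(workdir, 'first.txt')
    second_path = os.path.join(workdir, 'second.txt')
    with open(first_path, 'w') as handle:
        handle.write('hello\nworld\n')
    with open(second_path, 'w') as handle:
        handle.write('hello\npython\n')

    rows = []
    with fileinput.input(files=(first_path, second_path)) as stream:
        for line in stream:
            # lineno() keeps counting across files; filelineno() restarts per file
            rows.append((os.path.basename(stream.filename()),
                         stream.lineno(), stream.filelineno()))

print(rows)
# [('first.txt', 1, 1), ('first.txt', 2, 2), ('second.txt', 3, 1), ('second.txt', 4, 2)]
```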

4. Backing up the file while reading

In [14]:
with fileinput.input(files=('a.txt', ), backup=".bak") as file:
    for line in file:
        print(f'{fileinput.filename()} 第{fileinput.lineno()}行：{line}', end='')

a.txt 第1行：hello
a.txt 第2行：world


5. Redirecting standard output to replace file contents

In [15]:
import fileinput

In [16]:
with fileinput.input(files=('a.txt', ), inplace=True) as file:
    print("[INFO] task is started...")
    for line in file:
        print(f'{fileinput.filename()} 第{fileinput.lineno()}行：{line}', end='')
    print("[INFO] task is closed...")

[INFO] task is started...
[INFO] task is closed...


With this mechanism, text replacement is easy to implement.

In [17]:
import sys
import fileinput

In [18]:
for line in fileinput.input(files=('a.txt', ), inplace=True):
    # Convert a Windows/DOS-format text file to Linux format
    if line[-2:] == '\r\n':
        line = line[:-2] + "\n"
    sys.stdout.write(line)

In [27]:
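with fileinput.input(files=('a.txt', ), inplace=True) as file:
    # Name of the file currently being read; None before the first line has been read
    print(file.filename())
    # Integer file descriptor of the current file; -1 when no file is open (before the first line and between files)
    print(file.fileno())
    # Cumulative line number of the line just read; 0 before the first line; after the last line of the last file, that line's number
    print(file.lineno())
    # Line number within the current file; 0 before the first line; after the last line of the last file, that line's number within its file
    print(file.filelineno())
    # True if the line just read is the first line of its file, otherwise False
    print(file.isfirstline())
    # True if the last line was read from sys.stdin, otherwise False
    print(file.isstdin())
    # Close the current file so that the next iteration reads the first line of the next file (if any);
    # lines not read from the file do not count toward the cumulative line count. The filename does not
    # change until the first line of the next file has been read. Before the first line has been read,
    # this function has no effect; it cannot be used to skip the first file. After the last line of the
    # last file has been read, this function has no effect.
    print(file.nextfile())
    # Close the sequence
    file.close()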

None
-1
0
0
False
False
None


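7. More advanced usage

fileinput.input() has an openhook parameter that lets you pass in your own function for opening files.

If you pass no hook, fileinput uses the open function by default.

fileinput comes with two built-in hooks:

fileinput.hook_compressed(*filename*, *mode*)

Transparently opens files compressed with gzip and bzip2 (recognized by the extensions '.gz' and '.bz2') using the gzip and bz2 modules. If the extension is neither '.gz' nor '.bz2', the file is opened normally (that is, with [`open()`](https://docs.python.org/zh-cn/3/library/functions.html#open) and without any decompression). Usage example: fi = fileinput.FileInput(openhook=fileinput.hook_compressed)

fileinput.hook_encoded(*encoding*, *errors=None*)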

In [None]:
fi = fileinput.FileInput(openhook=fileinput.hook_encoded("utf-8", "surrogateescape"))
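A small end-to-end sketch of `hook_encoded` follows: it writes a Latin-1 encoded file into a temporary directory and reads it back through fileinput. The file name `latin1.txt` and its content are made up for this example.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'latin1.txt')
    with open(path, 'wb') as handle:
        handle.write('café\n'.encode('latin-1'))   # bytes that are not valid UTF-8

    # hook_encoded returns an opener that decodes every file with this encoding
    with fileinput.input(files=(path,), openhook=fileinput.hook_encoded('latin-1')) as stream:
        decoded = [line.rstrip('\n') for line in stream]

print(decoded)  # ['café']
```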

Suppose I want to use fileinput to read a file from the network; the hook can be defined like this.

1. First download the file locally with requests
2. Then read it with open

In [29]:
def online_open(url, mode):
    import requests
    r = requests.get(url)
    filename = url.split("/")[-1]
    with open(filename, "w") as f1:
        f1.write(r.content.decode("utf-8"))
    f2 = open(filename, 'r')
    return f2

In [30]:
file_url = 'https://www.csdn.net/robots.txt'

In [31]:
with fileinput.input(files=(file_url, ), openhook=online_open) as file:
    for line in file:
        print(line, end='')

User-agent: * 
Disallow: /scripts 
Disallow: /public 
Disallow: /css/ 
Disallow: /images/ 
Disallow: /content/ 
Disallow: /ui/ 
Disallow: /js/ 
Disallow: /scripts/ 
Disallow: /article_preview.html* 
Disallow: /tag/
Disallow: /*?*
Disallow: /link/

Sitemap: https://www.csdn.net/sitemap-aggpage-index.xml
Sitemap: https://www.csdn.net/article/sitemap.txt 


8. Some usage examples

Read all lines of a file

In [33]:
for line in fileinput.input('a.txt'):
    print(line, end="")

a.txt 第1行：hello
a.txt 第2行：world


Read all lines of multiple files

In [34]:
for line in fileinput.input(glob.glob("*.txt")):
    if fileinput.isfirstline():
        print('-'*20, f'Reading {fileinput.filename()}...', '-'*20)
    print(str(fileinput.lineno()) + ": " + line.upper(), end="")

-------------------- Reading a.txt... --------------------
1: A.TXT 第1行：HELLO
2: A.TXT 第2行：WORLD
-------------------- Reading b.txt... --------------------
3: HELLO
4: PYTHON
-------------------- Reading robots.txt... --------------------
5: USER-AGENT: * 
6: DISALLOW: /SCRIPTS 
7: DISALLOW: /PUBLIC 
8: DISALLOW: /CSS/ 
9: DISALLOW: /IMAGES/ 
10: DISALLOW: /CONTENT/ 
11: DISALLOW: /UI/ 
12: DISALLOW: /JS/ 
13: DISALLOW: /SCRIPTS/ 
14: DISALLOW: /ARTICLE_PREVIEW.HTML* 
15: DISALLOW: /TAG/
16: DISALLOW: /*?*
17: DISALLOW: /LINK/
18: 
19: SITEMAP: HTTPS://WWW.CSDN.NET/SITEMAP-AGGPAGE-INDEX.XML
20: SITEMAP: HTTPS://WWW.CSDN.NET/ARTICLE/SITEMAP.TXT 
-------------------- Reading test1.txt... --------------------
21: 你是猪
22: 注视你

Convert a CRLF file to LF with fileinput

In [36]:
for line in fileinput.input(files=('a.txt', ), inplace=True):
    if line[-2:] == '\r\n':
        line = line[:-2] + "\n"
    sys.stdout.write(line)

Log analysis with re: pick out all lines that contain a date

In [2]:
import re, fileinput, sys

In [2]:
pattern = r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'

In [3]:
for line in fileinput.input('test.log', backup='.bak', inplace=1):
    if re.search(pattern, line):
        sys.stdout.write("=> ")
        sys.stdout.write(line)

Implement grep-like functionality with fileinput

In [3]:
pattern = re.compile(sys.argv[1])
for line in fileinput.input(sys.argv[2]):
    if pattern.match(line):
        print(fileinput.filename(), fileinput.filelineno(), line)

## 8. Working with Nested Dictionaries as If They Were Paths

In [1]:
import dpath.util

In [2]:
data = {
    "foo": {
        "bar": {
        "a": 10,
        "b": 20,
        "c": [],
        "d": ['red', 'buggy', 'bumpers'],
        }
    }
}

In [3]:
print(dpath.util.get(data, '/foo/bar/d'))

['red', 'buggy', 'bumpers']


In [4]:
print(dpath.util.search(data, "/foo/bar/[ab]"))

{'foo': {'bar': {'a': 10, 'b': 20}}}


In [8]:
print(dpath.util.values(data, "/foo/bar/*"))

[10, 20, [], ['red', 'buggy', 'bumpers']]
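To show what a path lookup boils down to, here is a minimal hand-written sketch for plain nested dicts. It is not dpath's implementation and supports no globbing; `get_by_path` and `sample` are hypothetical names.

```python
def get_by_path(nested, path, separator='/'):
    # Walk the nested dict one key at a time, e.g. '/foo/bar/d'
    current = nested
    for key in path.strip(separator).split(separator):
        current = current[key]
    return current


sample = {'foo': {'bar': {'a': 10, 'd': ['red', 'buggy', 'bumpers']}}}
print(get_by_path(sample, '/foo/bar/a'))  # 10
print(get_by_path(sample, '/foo/bar/d'))  # ['red', 'buggy', 'bumpers']
```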


## 9. Reading Any Line of a File

In [9]:
import linecache

In [17]:
linecache.getline('a.txt', 4)

SyntaxError: invalid or missing encoding declaration for 'a.txt' (<string>)

In [13]:
linecache.getline('a.txt', 100000)

SyntaxError: invalid or missing encoding declaration for 'a.txt' (<string>)
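The errors above most likely mean that a.txt was not saved as UTF-8, which linecache expects unless the file declares another encoding. Under that assumption, the sketch below writes a UTF-8 file into a temporary directory and reads individual lines from it; the file name `sample.txt` is made up.

```python
import linecache
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'sample.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('first\nsecond\nthird\n')

    second_line = linecache.getline(path, 2)     # line numbers start at 1
    missing_line = linecache.getline(path, 100)  # out of range: empty string
    linecache.clearcache()                       # drop the cached file contents

print(repr(second_line))   # 'second\n'
print(repr(missing_line))  # ''
```

Asking for a line that does not exist returns an empty string instead of raising an error.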

## 10. A Library That Makes Writing Decorators Easier

1. A conventional decorator

In [18]:
def deco(func):
    def wrapper(*args, **kw):
        print("Ready to run task")
        func(*args, **kw)
        print("Successful to run task")
    return wrapper

In [19]:
@deco
def myfunc():
    print("Running the task")

In [20]:
myfunc()

Ready to run task
Running the task
Successful to run task


2. Using the library

In [21]:
from decorator import decorator

In [23]:
@decorator
def deco2(func, *args, **kw):
    print("Ready to run task")
    func(*args, **kw)
    print("Successful to run task")

In [24]:
@deco2
def myfunc2():
    print("Running the task")

In [25]:
myfunc2()

Ready to run task
Running the task
Successful to run task


3. Does it work for decorators with arguments?

In [26]:
import logging
import time

In [27]:
@decorator
def warn_slow(func, timelimit=60, *args, **kw):
    t0 = time.time()
    result = func(*args, **kw)
    dt = time.time() - t0
    # Warn only when the call took longer than the allowed time limit
    if dt > timelimit:
        logging.warning('%s took %d seconds', func.__name__, dt)
    else:
        logging.info('%s took %d seconds', func.__name__, dt)
    return result

In [28]:
@warn_slow(timelimit=60)
def run_calculation(temdir, outdir):
    pass
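For comparison, a decorator with arguments written with the standard library alone needs three nested functions. The sketch below mirrors the timing logic; `warn_slow_plain` and `add` are hypothetical names.

```python
import functools
import logging
import time


def warn_slow_plain(timelimit=60):
    # Outer layer: receives the decorator arguments
    def decorate(func):
        # Middle layer: receives the function being decorated
        @functools.wraps(func)
        def wrapper(*args, **kw):
            # Inner layer: runs on every call
            t0 = time.time()
            result = func(*args, **kw)
            dt = time.time() - t0
            level = logging.WARNING if dt > timelimit else logging.INFO
            logging.log(level, '%s took %d seconds', func.__name__, dt)
            return result
        return wrapper
    return decorate


@warn_slow_plain(timelimit=60)
def add(x, y):
    return x + y


print(add(1, 2), add.__name__)  # 3 add
```

The decorator library folds these three layers into a single function whose first parameter is the decorated function.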

4. Is the signature problem solved?

In [29]:
def wrapper(func):
    def inner_function():
        ...
    return inner_function

In [30]:
@wrapper
def wrapped():
    ...

In [31]:
print(wrapped.__name__)

inner_function


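So far we can see that once a function has been wrapped by a decorator, its signature information changes (for example, the function name shown above).

How can this be avoided?

functools.wraps copies some attribute values of the decorated function (wrapped) onto the wrapper function (wrapper), so that the attributes end up looking the way we would expect.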

In [32]:
from functools import wraps

In [33]:
def wrapper(func):
    @wraps(func)
    def inner_function():
        ...
    return inner_function

In [34]:
@wrapper
def wrapped():
    ...

In [35]:
print(wrapped.__name__)

wrapped


This raises the question: after switching to decorator, does the signature problem still exist?

In [37]:
@decorator
def deco(func, *args, **kw):
    print("Ready to run task")
    func(*args, **kw)
    print("Successful to run task")

In [38]:
@deco
def myfunc3():
    print("Running the task")

In [39]:
print(myfunc3.__name__)

myfunc3


This shows that decorator already handles every foreseeable problem for us by default.