## Preface

- importlib.reload
- \_\_name\_\_
- sys.path
- os.path
- dir, \_\_all\_\_, help, \_\_doc\_\_, \_\_file\_\_
- heapq: heappush, heappop, heapreplace, heapify, nlargest, nsmallest
- deque: append, appendleft, pop, popleft, rotate
- time: ctime, mktime, localtime, asctime
- random: random, uniform, randrange, choice, sample, shuffle
- re: compile, match, search, findall, split, sub, escape

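## Modules
- A module is not meant to perform actions (such as printing text) when imported; it is meant to define variables, functions, classes, and so on
- A module is stored in a file with the .py extension
- Purpose: code reuse
- Package: a kind of module that can contain other modules. A package is a directory that contains an \_\_init\_\_.py file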
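
The package layout described above can be tried out with a throwaway directory. This is only a sketch: the names demo_pkg, greet, and hello are hypothetical and exist just for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical package: a directory that contains an __init__.py file.
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    # Hypothetical module stored inside the package.
    (pkg_dir / "greet.py").write_text("def hello():\n    return 'hello world'\n")

    sys.path.insert(0, tmp)  # make the temporary directory importable
    try:
        importlib.invalidate_caches()
        greet = importlib.import_module("demo_pkg.greet")
        message = greet.hello()
    finally:
        sys.path.remove(tmp)

print(message)
```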

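## Making modules available

### Search path
- The Python interpreter looks for modules in the directories listed in sys.path
- Your own modules are best placed in the site-packages directory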

In [1]:
import sys, pprint
pprint.pprint(sys.path)

['C:\\Users\\SHIDAWEI558\\jupyter notebook workspace',
 'D:\\app\\anaconda\\python37.zip',
 'D:\\app\\anaconda\\DLLs',
 'D:\\app\\anaconda\\lib',
 'D:\\app\\anaconda',
 '',
 'D:\\app\\anaconda\\lib\\site-packages',
 'D:\\app\\anaconda\\lib\\site-packages\\win32',
 'D:\\app\\anaconda\\lib\\site-packages\\win32\\lib',
 'D:\\app\\anaconda\\lib\\site-packages\\Pythonwin',
 'D:\\app\\anaconda\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\SHIDAWEI558\\.ipython']


### Adding a path

In [6]:
import sys
sys.path.append("D:\\app")

In [7]:
sys.path

['C:\\Users\\SHIDAWEI558\\jupyter notebook workspace',
 'D:\\app\\anaconda\\python37.zip',
 'D:\\app\\anaconda\\DLLs',
 'D:\\app\\anaconda\\lib',
 'D:\\app\\anaconda',
 '',
 'D:\\app\\anaconda\\lib\\site-packages',
 'D:\\app\\anaconda\\lib\\site-packages\\win32',
 'D:\\app\\anaconda\\lib\\site-packages\\win32\\lib',
 'D:\\app\\anaconda\\lib\\site-packages\\Pythonwin',
 'D:\\app\\anaconda\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\SHIDAWEI558\\.ipython',
 'D:\\app']

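## import
- A module only needs to be imported once; later import statements reuse the already loaded module
- If you really need to load it again, use importlib.reload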

In [10]:
import numpy
import importlib
importlib.reload(numpy)

<module 'numpy' from 'D:\\app\\anaconda\\lib\\site-packages\\numpy\\__init__.py'>
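
To see the difference between a repeated import and a reload, the sketch below writes a small module to a temporary directory, changes its source, and compares the results. The module name demo_settings and its attribute value are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "demo_settings.py"  # hypothetical module file
    source.write_text("value = 1\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        import demo_settings
        first = demo_settings.value

        # Change the source on disk, then import again.
        source.write_text("value = 200\n")
        import demo_settings  # no effect: the module is already loaded
        second = demo_settings.value

        importlib.reload(demo_settings)  # re-executes the module's source
        third = demo_settings.value
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_settings", None)

print(first, second, third)
```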

## \_\_name\_\_  
Determines whether a module is being run as a program or imported into another module

\# hello.py
<code>
    def hello():
        print("hello world")
    if \_\_name\_\_ == "\_\_main__":
        hello()
</code>
Running hello.py prints "hello world"; in this case \_\_name\_\_ equals "\_\_main\_\_"

\# hello2.py
<code>
    import hello
</code>
Running hello2.py does not print "hello world"; inside the imported module, \_\_name\_\_ equals "hello"

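## Exploring modules: copy  
- Deep copy:
    - Call copy.deepcopy(x)
    - Creates copies of x's attributes as well
- Shallow copy:
    - Call copy.copy(x)
    - The copy's attributes refer to the same objects as x's attributes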
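
The difference only shows up with nested mutable objects. A minimal sketch, using a made-up dictionary as the compound object:

```python
import copy

# Hypothetical compound object: a dict that holds a list.
original = {"names": ["Alfred", "Bertrand"]}

shallow = copy.copy(original)    # new dict, same inner list
deep = copy.deepcopy(original)   # new dict, new inner list

original["names"].append("Clive")

print(shallow["names"])  # the shallow copy sees the change
print(deep["names"])     # the deep copy does not
```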

### The dir function  
Lists all attributes of an object; for a module these are its variables, functions, classes, and so on

In [8]:
import copy

In [11]:
[n for n in dir(copy) if not n.startswith("__")]  # filter out names that start with a double underscore

['Error',
 '_copy_dispatch',
 '_copy_immutable',
 '_deepcopy_atomic',
 '_deepcopy_dict',
 '_deepcopy_dispatch',
 '_deepcopy_list',
 '_deepcopy_method',
 '_deepcopy_tuple',
 '_keep_alive',
 '_reconstruct',
 'copy',
 'deepcopy',
 'dispatch_table',
 'error']

### \_\_all__  
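- Defines the module's public interface: the names imported by from module import *
- Any other name must be imported explicitly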

In [12]:
copy.__all__

['Error', 'copy', 'deepcopy']
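
As a rough illustration of how \_\_all\_\_ limits a star import, the sketch below builds a throwaway module in memory. The names demo_mod, public, and helper are hypothetical and exist only for this example.

```python
import sys
import types

# Hypothetical module created in memory instead of from a .py file.
mod = types.ModuleType("demo_mod")
exec(
    "__all__ = ['public']\n"
    "def public():\n    return 'ok'\n"
    "def helper():\n    return 'hidden'\n",
    mod.__dict__,
)
sys.modules["demo_mod"] = mod

# Run the star import in a separate namespace to see which names arrive.
ns = {}
exec("from demo_mod import *", ns)
star_names = sorted(n for n in ns if not n.startswith("__"))
print(star_names)

# Names outside __all__ are still reachable through an explicit import.
from demo_mod import helper
print(helper())

del sys.modules["demo_mod"]  # clean up the temporary module
```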

### The help function  
Shows the full documentation of a module

In [13]:
help(copy)

Help on module copy:

NAME
    copy - Generic (shallow and deep) copying operations.

DESCRIPTION
    Interface summary:
    
            import copy
    
            x = copy.copy(y)        # make a shallow copy of y
            x = copy.deepcopy(y)    # make a deep copy of y
    
    For module specific errors, copy.Error is raised.
    
    The difference between shallow and deep copying is only relevant for
    compound objects (objects that contain other objects, like lists or
    class instances).
    
    - A shallow copy constructs a new compound object and then (to the
      extent possible) inserts *the same objects* into it that the
      original contains.
    
    - A deep copy constructs a new compound object and then, recursively,
      inserts *copies* into it of the objects found in the original.
    
    Two problems often exist with deep copy operations that don't exist
    with shallow copy operations:
    
     a) recursive objects (compound objects that, directly 

### \_\_doc__  
The module's docstring

In [15]:
print(copy.__doc__)

Generic (shallow and deep) copying operations.

Interface summary:

        import copy

        x = copy.copy(y)        # make a shallow copy of y
        x = copy.deepcopy(y)    # make a deep copy of y

For module specific errors, copy.Error is raised.

The difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances).

- A shallow copy constructs a new compound object and then (to the
  extent possible) inserts *the same objects* into it that the
  original contains.

- A deep copy constructs a new compound object and then, recursively,
  inserts *copies* into it of the objects found in the original.

Two problems often exist with deep copy operations that don't exist
with shallow copy operations:

 a) recursive objects (compound objects that, directly or indirectly,
    contain a reference to themselves) may cause a recursive loop

 b) because deep copy copies *everything* it may copy too much, 

### \_\_file__  
Path of the module's source file

In [16]:
copy.__file__

'D:\\app\\anaconda\\lib\\copy.py'

## Commonly used standard modules

### sys

In [25]:
import sys
print(''.join(reversed(sys.argv[1:])))

C:\Users\SHIDAWEI558\AppData\Roaming\jupyter\runtime\kernel-75b0a980-4385-429f-bc1b-21263db0ce79.json-f


### os

In [31]:
import os
os.environ['path'].split(';')

['D:\\app\\anaconda',
 'D:\\app\\anaconda\\Library\\mingw-w64\\bin',
 'D:\\app\\anaconda\\Library\\usr\\bin',
 'D:\\app\\anaconda\\Library\\bin',
 'D:\\app\\anaconda\\Scripts',
 'D:\\app\\anaconda\\bin',
 'D:\\app\\anaconda\\condabin',
 'D:\\app\\anaconda\\condabin\\Library\\mingw-w64\\bin',
 'D:\\app\\anaconda\\condabin\\Library\\usr\\bin',
 'D:\\app\\anaconda\\condabin\\Library\\bin',
 'D:\\app\\anaconda\\condabin\\Scripts',
 'D:\\app\\anaconda\\condabin\\bin',
 'D:\\app\\anaconda',
 'D:\\app\\anaconda\\Library\\mingw-w64\\bin',
 'D:\\app\\anaconda\\Library\\usr\\bin',
 'D:\\app\\anaconda\\Library\\bin',
 'D:\\app\\anaconda\\Scripts',
 'C:\\Program Files (x86)\\Common Files\\Oracle\\Java\\javapath',
 'C:\\Windows\\system32',
 'C:\\Windows',
 'C:\\Windows\\System32\\Wbem',
 'C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\',
 'C:\\Program Files\\TortoiseGit\\bin',
 'C:\\Program Files\\TortoiseSVN\\bin',
 'D:\\app\\python37\\Scripts',
 'D:\\app\\python37',
 'C:\\Users\\SHIDAWEI558\\App

### fileinput

### Sets, heaps, and deques

#### Sets

In [2]:
a = {1, 2, 3}
b = {2, 3, 4}

In [4]:
a & b

{2, 3}

In [8]:
a | b

{1, 2, 3, 4}

In [4]:
a - b

{1}

In [5]:
a ^ b

{1, 4}

In [11]:
a.add(frozenset(b))

In [12]:
a

{1, 2, 3, frozenset({2, 3, 4})}

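#### Heaps
- Python has no separate heap type
- A heap is represented by an ordinary list
- The heap functions should only be applied to lists built with them (heappush or heapify); an arbitrary list is not guaranteed to satisfy the heap property
- Heap property: the element at index i is never greater than the elements at indices 2i + 1 and 2i + 2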
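
The heap property can be checked directly. The helper is_heap below is a hypothetical function written for this sketch; it is not part of heapq.

```python
from heapq import heapify
from random import shuffle


def is_heap(items):
    """Return True if every element is <= both of its children."""
    n = len(items)
    return all(
        items[i] <= items[child]
        for i in range(n)
        for child in (2 * i + 1, 2 * i + 2)
        if child < n
    )


data = list(range(10))
shuffle(data)
heapify(data)          # rearranges the list in place into a heap

print(is_heap(data))       # a heapified list satisfies the property
print(is_heap([3, 1, 2]))  # an arbitrary list may not
```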

In [8]:
from heapq import *
from random import shuffle

In [15]:
data = list(range(10))
heap = []
shuffle(data)
for n in data:
    heappush(heap, n)

In [16]:
heap

[0, 2, 1, 4, 3, 9, 7, 8, 5, 6]

In [17]:
heappop(heap)

0

In [20]:
heappush(heap, 0.5)

In [21]:
heappop(heap)

0.5

In [22]:
heap

[1, 2, 6, 4, 3, 9, 7, 8, 5]

In [23]:
heapreplace(heap, 0.5)

1

In [99]:
heap = [0, 2, 1, 4, 3, 9, 7, 8, 5, 6]
heapify(heap)

In [25]:
heap

[0, 2, 1, 4, 3, 9, 7, 8, 5, 6]

In [26]:
itera = [0, 2, 1, 4, 3, 9, 7, 8, 5, 6]
nlargest(5, itera)

[9, 8, 7, 6, 5]

In [27]:
nsmallest(5, itera)

[0, 1, 2, 3, 4]

#### Deques
- Created from an iterable
- Provide methods similar to those of lists

In [89]:
from collections import deque

In [92]:
deq = deque(range(5))

In [93]:
deq.append(5)
deq

deque([0, 1, 2, 3, 4, 5])

In [94]:
deq.appendleft(-1)
deq

deque([-1, 0, 1, 2, 3, 4, 5])

In [95]:
deq.pop()
deq

deque([-1, 0, 1, 2, 3, 4])

In [96]:
deq.popleft()
deq

deque([0, 1, 2, 3, 4])

In [97]:
deq.rotate(3)
deq

deque([2, 3, 4, 0, 1])

In [98]:
deq.rotate(-2)
deq

deque([4, 0, 1, 2, 3])

### time  
- Time tuples and strings
- timeit measures the running time of small code snippets

In [104]:
import time

In [119]:
time.time()

1559110375.018305

In [121]:
time.ctime()

'Wed May 29 14:13:53 2019'

In [123]:
time.asctime((2008, 1, 21, 12, 2, 56, 0, 21, 0))  # convert a time tuple to a string

'Mon Jan 21 12:02:56 2008'

In [109]:
time.gmtime()  # convert seconds since the epoch to a time tuple in UTC (current time if no argument is given)

time.struct_time(tm_year=2019, tm_mon=5, tm_mday=29, tm_hour=6, tm_min=3, tm_sec=3, tm_wday=2, tm_yday=149, tm_isdst=0)

In [112]:
time.localtime()  # convert seconds since the epoch to a time tuple in local time

time.struct_time(tm_year=2019, tm_mon=5, tm_mday=29, tm_hour=14, tm_min=5, tm_sec=44, tm_wday=2, tm_yday=149, tm_isdst=0)

In [37]:
import timeit

In [63]:
timeit.timeit('map(lambda x:pow(x,2), range(1000))')

0.4801158270001906

In [74]:
timeit.timeit('(pow(x, 2) for x in range(1000))')

0.7141235710000728
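
Note that in Python 3 both map and a generator expression are lazy, so the two statements timed above mostly measure the cost of creating the iterator object, not of calling pow 1000 times. A sketch of a comparison that forces evaluation follows; the number argument is lowered from its default of 1,000,000 so that it finishes quickly, and the timings will differ from machine to machine.

```python
import timeit

# Creating the lazy map object only; pow is never called.
lazy = timeit.timeit('map(lambda x: pow(x, 2), range(1000))', number=1000)

# Forcing evaluation so that all 1000 pow calls actually run.
eager_map = timeit.timeit('list(map(lambda x: pow(x, 2), range(1000)))', number=1000)
eager_comp = timeit.timeit('[pow(x, 2) for x in range(1000)]', number=1000)

print(lazy, eager_map, eager_comp)
```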

### random

#### Generating pseudo-random numbers

In [134]:
import random
random.random()

0.7624777499810539

In [135]:
random.uniform(-1, 1)

-0.8126162690893883

In [136]:
random.randrange(0, 10, 2)

4

In [139]:
random.choice(list(range(10)))

8

In [141]:
lst = list(range(10))
random.shuffle(lst)
lst

[2, 0, 7, 6, 8, 9, 5, 4, 3, 1]

In [142]:
random.sample(lst, 5)

[5, 8, 2, 1, 6]

#### A random time

In [143]:
from random import uniform
from time import mktime, localtime, asctime
time1 = mktime((2019, 1, 1, 0, 0, 0, -1, -1, -1))
time2 = mktime((2020, 1, 1, 0, 0, 0, -1, -1, -1))
random_time = uniform(time1, time2)
asctime(localtime(random_time))

'Fri Feb  1 22:00:31 2019'

#### Rolling dice

In [1]:
from random import randrange
dices = int(input("How many dice: "))
sides = int(input("How many sides per die: "))
record = []
for i in range(dices):
    record.append(randrange(sides) + 1)
print(record)
print("The total is {}.".format(sum(record)))

How many dice: 3
How many sides per die: 6
[1, 6, 6]
The total is 13.


#### Dealing cards

In [23]:
from random import shuffle
from pprint import pprint

In [24]:
values = ['A'] + list(range(2, 11)) + ['J', 'Q', 'K']
suits = ['hearts', 'spades', 'clubs', 'diamonds']
poker = ['{} of {}'.format(v, s) for v in values[0:14] for s in suits] + ['Black Joker', 'Red Joker']

In [25]:
player1, player2, player3, player4 = [], [], [], []
shuffle(poker)
poker = iter(poker)
for n in range(round(54/4)+1):
    try:
        player1.append(next(poker))
        player2.append(next(poker))
        player3.append(next(poker))
        player4.append(next(poker))
    except StopIteration:
        break

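### shelve and json  
- The open function of the shelve module takes a file name and returns a shelf object for storing data. The shelf can be used like a dictionary whose keys must be strings; call its close method when you are done
- An element is stored only when it is assigned to a key
- To modify data stored in a shelf object, either
    - modify a copy and then store it back under the same key, or
    - set the writeback parameter of open to True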

In [1]:
import shelve
s = shelve.open('test.dat')
s['x'] = ['a', 'b', 'c']
s['x'].append('d')
s['x']

['a', 'b', 'c']

In [3]:
temp = s['x']
temp.append('d')
s['x'] = temp
s['x']

['a', 'b', 'c', 'd']

In [4]:
s1 = shelve.open('test1.dat', writeback=True)
s1['x'] = ['a', 'b', 'c']
s1['x'].append('d')
s1['x']

['a', 'b', 'c', 'd']
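
The cells above never call close. One way to make sure the shelf is always closed is to use it as a context manager, which shelve has supported since Python 3.4. A minimal sketch, using a temporary directory and a hypothetical file name demo_shelf:

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")  # hypothetical file name

    # The shelf is closed automatically at the end of the with block.
    with shelve.open(path) as db:
        db["x"] = ["a", "b", "c"]
        items = db["x"]      # work on a copy...
        items.append("d")
        db["x"] = items      # ...and assign it back so it is stored

    # Reopen the shelf to confirm that the data was persisted.
    with shelve.open(path) as db:
        stored = db["x"]

print(stored)
```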

#### Storing data

In [5]:
import shelve
def store_person(db):
    """
    Ask the user for data and store it in the shelf object
    """
    try:
        pid = input("Enter id: ")
        person = {}
        person['name'] = input("Enter name: ")
        person['age'] = input("Enter age: ")
        person['phone'] = input("Enter phone number: ")
        db[pid] = person
    except Exception:
        print("Failed to store the record")
    else:
        print("Record stored")
    
def lookup_person(db):
    """
    Ask the user for an id and a field, and fetch the matching data from the shelf object
    """
    pid = input("Enter id: ")
    field = input("Enter the field to look up (name, age, phone): ")
    field = field.strip().lower()
    try:
        print(field.capitalize() + ':', db[pid][field])
    except KeyError:
        print("No such information")

def print_help():
    """
    Show all available commands
    """
    print("store : store a record")
    print("lookup : look up a record")
    print("q : save and quit")
    print("? : show all commands")

def enter_command():
    """
    Read a command from the user
    """
    cmd = input("Enter command: ")
    cmd = cmd.strip().lower()  # strings are immutable, so keep the normalized result
    return cmd

In [9]:
if __name__ == "__main__":
    database = shelve.open("database.dat")
    while True:
        cmd = enter_command()
        if cmd == "store":
            store_person(database)
        elif cmd == "lookup":
            lookup_person(database)
        elif cmd == "?":
            print_help()
        elif cmd == "q":
            database.close()
            print("Thanks for using this program")
            break
        else:
            print("Unknown command")
            continue

Enter command: lookup
Enter id: 001
Enter the field to look up (name, age, phone): name
Name: Apache
Enter command: lookup
Enter id: 001
Enter the field to look up (name, age, phone): age
Age: 18
Enter command: lookup
Enter id: 001
Enter the field to look up (name, age, phone): phone
Phone: 010-67729999
Enter command: ?
store : store a record
lookup : look up a record
q : save and quit
? : show all commands
Enter command: q
Thanks for using this program


### re

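#### Basics

##### Special characters
***
Symbol|Description
:-:|:-
.|Wildcard; matches any character except a newline
^|Caret; matches the start of the string, and negates the set when it comes first inside []
$|Dollar sign; matches the end of the string
*|Matches zero or more repetitions
+|Matches one or more repetitions
?|Matches zero or one repetition
{m,n}|Matches between m and n repetitions
{m,}|Matches at least m repetitions
[abc]|Matches any one of a, b, c
[^abc]|Matches any character other than a, b, c
(abc)|Groups abc into a single unit
http&#124;https|Alternation; matches either alternative
\\ |Escape character
-|Hyphen; denotes a range inside [], as in [a-z]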
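
A few of the symbols in the table can be tried out as follows. The sample strings are made up for this sketch.

```python
import re

# Alternation: the longer alternative is listed first so it is tried first.
urls = re.findall(r'https|http', 'http://a https://b')

# Negated character set: every character that is not a, b, or c.
not_abc = re.findall(r'[^abc]', 'abcde')

# Bounded repetition: runs of two or three "a" characters.
repeats = re.findall(r'a{2,3}', 'a aa aaaa')

# Anchors and the wildcard: the whole string must be "py", any character, "hon".
anchored = re.search(r'^py.hon$', 'python') is not None

print(urls, not_abc, repeats, anchored)
```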

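##### Regular expression functions
***
Function|Description
-|-
compile(pattern[, flags]) |Creates a pattern object from a string containing a regular expression
search(pattern, string[, flags]) |Searches the string for the pattern
match(pattern, string[, flags]) |Matches the pattern at the beginning of the string
split(pattern, string[, maxsplit=0]) |Splits the string at occurrences of the pattern
findall(pattern, string) |Returns a list of all substrings of the string that match the pattern
sub(pat, repl, string[, count=0]) |Replaces the substrings that match pat with repl
escape(string) |Escapes all regular expression special characters in the string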

In [27]:
import re

In [26]:
re.split('o(o)', 'foobar')

['f', 'o', 'bar']

In [56]:
some_text = 'alpha, beta,,,,gamma delta'
re.split('[, ]+', some_text)

['alpha', 'beta', 'gamma', 'delta']

In [57]:
re.split('[, ]+', some_text, maxsplit=2)

['alpha', 'beta', 'gamma delta']

In [36]:
text = '"Hm... Err -- are you sure?" he said, sounding insecure.'
re.findall('[a-zA-Z]+', text)

['Hm', 'Err', 'are', 'you', 'sure', 'he', 'said', 'sounding', 'insecure']

In [52]:
re.findall(r'[".\-?,]+', text)

['"', '...', '--', '?"', ',', '.']

In [58]:
re.sub('{name}', 'Mr. Gumby', 'Dear {name}...')

'Dear Mr. Gumby...'

In [59]:
re.escape('https://www.python.org')

'https://www\\.python\\.org'

In [74]:
re.match('python', 'hello') is None

True

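#### Match objects and groups
If match or search finds a substring that matches the pattern, it returns a match object; otherwise it returns None
- Group: the part of the pattern enclosed in parentheses; group 0 is the entire match
- group([n]): the substring matched by group n
- start([n]): the start index of group n
- end([n]): the end index of group n (one past its last character)
- span([n]): the tuple (start, end) for group n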

In [2]:
import re

In [3]:
pat = r'www\.(.*)\.(.*)\.(.*)'
url = 'www.suibe.edu.cn'

In [5]:
m = re.match(pat, url)

In [6]:
m.group(0)

'www.suibe.edu.cn'

In [7]:
m.group(1)

'suibe'

In [8]:
m.group(2)

'edu'

In [9]:
m.group(3)

'cn'

In [11]:
m.start(0)

0

In [12]:
m.start(1)

4

In [13]:
m.end(1)

9

In [14]:
m.span(1)

(4, 9)

In [16]:
s = re.search(pat, url)

In [21]:
s.group(0)

'www.suibe.edu.cn'

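#### Groups in substitutions
In the replacement string, \1 refers to the text captured by group 1, so the group's content is kept while the rest of the match is replaced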

In [29]:
pat = r'\*([^\*]+)\*'
text = 'hello *python* world!'

In [30]:
re.sub(pat, r'<em>\1</em>', text)

'hello <em>python</em> world!'

#### Greedy and non-greedy matching

In [39]:
pat1 = r'\*(.+)\*'  # greedy
pat2 = r'\*(.+?)\*'  # non-greedy
text1 = '*This* is *it*'

In [42]:
re.sub(pat1, r'<em>\1</em>', text1)

'<em>This* is *it</em>'

In [43]:
re.sub(pat2, r'<em>\1</em>', text1)

'<em>This</em> is <em>it</em>'

In [46]:
pat3 = r'\*\*(.+?)\*\*'
text2 = '**This** is **it**'

In [47]:
re.sub(pat3, r'<em>\1</em>', text2)

'<em>This</em> is <em>it</em>'

#### Adding comments to a regular expression

In [49]:
emphasis_pattern = re.compile(r'''
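    \*        # opening emphasis marker: an asterisk
    (         # start of the group that captures the emphasized text
    [^\*]+    # one or more characters other than an asterisk
    )         # end of the group
    \*        # closing emphasis marker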
    ''', re.VERBOSE)

In [50]:
text1 = '*This* is *it*.'
emphasis_pattern.sub(r'<em>\1</em>', text1)

'<em>This</em> is <em>it</em>.'

#### Example: finding the sender

In [57]:
email = '''From foo@bar.baz Thu Dec 20 01:22:50 2008
Return-Path: <foo@bar.baz>
Received: from xyzzy42.bar.com (xyzzy.bar.baz [123.456.789.42])
by frozz.bozz.floop (8.9.3/8.9.3) with ESMTP id BAA25436
for <magnus@bozz.floop>; Thu, 20 Dec 2004 01:22:50 +0100 (MET)
Received: from [43.253.124.23] by bar.baz
(InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP
id <20041220002242.ADASD123.bar.baz@[43.253.124.23]>; Thu, 20 Dec 2004 00:22:42 +0000
User-Agent: Microsoft-Outlook-Express-Macintosh-Edition/5.02.2022
Date: Wed, 19 Dec 2008 17:22:42 -0700
Subject: Re: Spam
From: Foo Fie <foo@bar.baz>
To: Magnus Lie Hetland <magnus@bozz.floop>
CC: <Mr.Gumby@bar.baz>
Message-ID: <B8467D62.84F foo@baz.com> %
In-Reply-To: <20041219013308.A2655@bozz.floop> Mime- version: 1.0
Content-type: text/plain; charset="US-ASCII" Content-transfer-encoding: 7bit
Status: RO
Content-Length: 55
Lines: 6
So long, and thanks for all the spam!
Yours,
Foo Fie'''

In [61]:
pat = re.compile('From: (.+) <.+?>')
e = pat.search(email)

In [63]:
e.group(1)

'Foo Fie'

#### Example: finding all email addresses

In [65]:
pat = re.compile(r'[a-zA-Z0-9\-\.]+@[a-zA-Z0-9\-\.]+')
e = pat.findall(email)
set(e)

{'20041219013308.A2655@bozz.floop',
 'Mr.Gumby@bar.baz',
 'foo@bar.baz',
 'foo@baz.com',
 'magnus@bozz.floop'}