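# What you will learn

- List comprehensions are the simplest and fastest way to filter a list
- Dict comprehensions exist too
- The built-in zip function pairs items up into tuples
- How to sort a dictionary by value
- How to use functools.reduce and the built-in map function
- How to use namedtuple, OrderedDict, Counter, and deque from collections
- How to use pickle to save Python objects to a file and load them back from a file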

# How to filter data in lists, dictionaries, and sets by a condition

## Filtering the negative numbers out of a list

In [9]:
from random import randint
data = [randint(-10, 10) for _ in range(10)]
data

[-4, 7, 2, -8, -10, 7, 0, -5, -10, -8]

### Method 1

In [10]:
filter(lambda x: x >= 0, data)

[7, 2, 7, 0]

### Method 2 (recommended)

In [11]:
# Method 2: list comprehension
[x for x in data if x >= 0]

[7, 2, 7, 0]

### Timing the two methods

In [12]:
%timeit filter(lambda x: x >= 0, data)

1000000 loops, best of 3: 1.11 µs per loop


In [13]:
%timeit [x for x in data if x >= 0]

1000000 loops, best of 3: 362 ns per loop


**Conclusion: the list comprehension is faster, roughly three times in this run (362 ns versus 1.11 µs per loop).**
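
The timings above come from a Python 2 session, where filter returns a list. The following is a minimal sketch of the same comparison, assuming Python 3, where `filter` returns a lazy iterator that has to be wrapped in `list` before any filtering work is done. The sample data is fixed here so the result is reproducible; the measured times depend on the machine and are not shown.

```python
import timeit

data = [-4, 7, 2, -8, -10, 7, 0, -5, -10, -8]

# In Python 3, filter() is lazy, so materialize it with list().
via_filter = list(filter(lambda x: x >= 0, data))
via_comprehension = [x for x in data if x >= 0]
print(via_filter)         # [7, 2, 7, 0]
print(via_comprehension)  # [7, 2, 7, 0]

# Time both variants on the same predicate.
t_filter = timeit.timeit(
    lambda: list(filter(lambda x: x >= 0, data)), number=100000)
t_comp = timeit.timeit(
    lambda: [x for x in data if x >= 0], number=100000)
print(t_filter, t_comp)
```

Both variants use the same predicate, so the comparison measures only the filtering mechanism.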

## Filtering items out of a dictionary

In [14]:
stu = {x: randint(60, 100) for x in range(1, 15)}  # dict comprehension; randint includes both endpoints
stu

{1: 96,
 2: 84,
 3: 96,
 4: 76,
 5: 73,
 6: 96,
 7: 81,
 8: 70,
 9: 87,
 10: 96,
 11: 99,
 12: 60,
 13: 83,
 14: 71}

In [15]:
{k: v for k, v in stu.items() if v >= 90}

{1: 96, 3: 96, 6: 96, 10: 96, 11: 99}
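
A dict comprehension can filter on the key as well as on the value. A small sketch with fixed data, assuming Python 3; the names `scores`, `high`, and `odd_ids` are illustrative only:

```python
scores = {1: 96, 2: 84, 3: 96, 4: 76, 5: 73}

# Keep the entries whose value is at least 90.
high = {k: v for k, v in scores.items() if v >= 90}
print(high)     # {1: 96, 3: 96}

# The condition can test the key instead of the value.
odd_ids = {k: v for k, v in scores.items() if k % 2 == 1}
print(odd_ids)  # {1: 96, 3: 96, 5: 73}
```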

## Filtering elements out of a set

In [19]:
s = set(data)
print s
{x for x in s if x % 2 == 0}

set([0, 2, 7, -10, -8, -5, -4])


{-10, -8, -4, 0, 2}

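# How to name the elements of a tuple to improve readability

Scenario: to read the i-th field of the tuple stu we write stu[i], so the program fills up with bare indexes such as 0, 1, 2, and 3, which makes it hard to read.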

In [21]:
stu = ('Jim', 16, 'male', 'jim@gmail.com')
stu[0]

'Jim'

## Method 1

In [22]:
name, age, sex, mail = range(4)
# The first field can now be read with the more readable stu[name]
stu[name]

'Jim'

## Method 2

In [29]:
from collections import namedtuple
# The first argument is the class name, usually the same as the name the result is assigned to
student = namedtuple("student", ['name', 'age', 'sex', 'mail'])
s = student('Jim', 16, 'male', 'jim@gmail.com')
s.name

'Jim'
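
A namedtuple is still a tuple, so indexing and unpacking keep working alongside attribute access. A short sketch of a few standard helpers, assuming Python 3; the CamelCase class name `Student` is a stylistic choice made here, not something the original uses:

```python
from collections import namedtuple

Student = namedtuple("Student", ["name", "age", "sex", "mail"])
s = Student("Jim", 16, "male", "jim@gmail.com")

# Index access and attribute access return the same field.
print(s[0], s.name)         # Jim Jim

# Unpacking works as with any tuple.
name, age, sex, mail = s

# Instances are immutable; _replace returns a modified copy.
older = s._replace(age=17)
print(older.age, s.age)     # 17 16

# _asdict converts the record to a mapping keyed by field name.
print(s._asdict()["mail"])  # jim@gmail.com
```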

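# How to count how often each element occurs in a sequence

Use Counter from the collections module.

## Example 1: the 3 most frequent numbers in a random sequence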

In [35]:
data = [randint(0, 10) for _ in range(20)]
data

[4, 8, 0, 0, 2, 4, 0, 0, 8, 9, 4, 9, 6, 5, 6, 10, 3, 9, 8, 1]

In [36]:
from collections import Counter
c = Counter(data)
c.most_common(3)

[(0, 4), (4, 3), (8, 3)]

## Example 2: count word frequencies in an English text and find the 10 most frequent words

In [44]:
import re
txt = open('data/words-en.txt').read()
txt

"\nPython\nPython logo and wordmark.svg\nParadigm\tMulti-paradigm: functional, imperative, object-oriented, reflective\nDesigned by\tGuido van Rossum\nDeveloper\tPython Software Foundation\nFirst appeared\t1990; 29 years ago[1]\nStable release\t\n3.7.2 / 24 December 2018; 23 days ago[2]\n2.7.15 / 1 May 2018; 8 months ago[3]\nTyping discipline\tDuck, dynamic, gradual (since 3.5),[4] strong\nLicense\tPython Software Foundation License\nFilename extensions\t.py, .pyc, .pyd, .pyo (prior to 3.5),[5] .pyw, .pyz (since 3.5)[6]\nWebsite\twww.python.org\nMajor implementations\nCPython, IronPython, Jython, MicroPython, Numba, PyPy, Stackless Python, CircuitPython\nDialects\nCython, RPython\nInfluenced by\nABC,[7] ALGOL 68,[8] APL[9] C,[10] C++,[11] CLU,[12] Dylan,[13] Haskell,[14] Icon,[15] Java,[16] Lisp,[17] Modula-3,[11] Perl, Standard ML[9]\nInfluenced\nBoo, Cobra, CoffeeScript,[18] D, F#, Genie,[19] Go, Apache Groovy, JavaScript,[20][21] Julia,[22] Nim, Ring,[23] Ruby,[24] Swift[25]\n Pytho

In [45]:
c = Counter(re.split(r'\W+', txt))  # split on runs of non-word characters
c.most_common(10)

[('Python', 14),
 ('and', 9),
 ('3', 6),
 ('a', 4),
 ('by', 4),
 ('5', 4),
 ('Foundation', 3),
 ('has', 3),
 ('2018', 3),
 ('7', 3)]
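
Splitting on `\W+` also breaks version strings such as 3.7.2 into separate digits, which is why '3', '5', and '7' show up in the result above. A minimal variant on an inline sample text, assuming Python 3: it uses `re.findall` so that no empty strings are produced at the ends of the text, and lowercases the input so that words differing only in case are counted together. The sample sentence is made up for illustration.

```python
import re
from collections import Counter

text = "Python is simple. Python is readable, and Python rocks."

# findall returns only the matched words, never empty strings.
words = re.findall(r"\w+", text.lower())
counts = Counter(words)

print(counts.most_common(2))  # [('python', 3), ('is', 2)]
print(counts["java"])         # 0, missing keys count as zero
```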

# How to sort the items of a dictionary by value

In [46]:
d = {x: randint(60, 100) for x in 'xyzabc'}
d

{'a': 84, 'b': 83, 'c': 65, 'x': 68, 'y': 85, 'z': 74}

## Plain sorted only sorts the keys

In [47]:
sorted(d)

['a', 'b', 'c', 'x', 'y', 'z']

## Method 1: use zip to build (value, key) tuples and sort those

In [52]:
s = sorted(zip(d.values(), d.keys()))
s

[(65, 'c'), (68, 'x'), (74, 'z'), (83, 'b'), (84, 'a'), (85, 'y')]

In [53]:
dict(s)  # convert back to a dict; note that it now maps value -> key

{65: 'c', 68: 'x', 74: 'z', 83: 'b', 84: 'a', 85: 'y'}

## Method 2: pass the key argument to sorted (recommended)

In [54]:
sorted(d.items(), key=lambda x: x[1])

[('c', 65), ('x', 68), ('z', 74), ('b', 83), ('a', 84), ('y', 85)]
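
The key argument combines naturally with `reverse=True` and with `operator.itemgetter`, which is a common replacement for the lambda. A short sketch that reuses the dictionary shown above, assuming Python 3:

```python
from operator import itemgetter

d = {'a': 84, 'b': 83, 'c': 65, 'x': 68, 'y': 85, 'z': 74}

# itemgetter(1) picks the value out of each (key, value) pair.
ascending = sorted(d.items(), key=itemgetter(1))
descending = sorted(d.items(), key=itemgetter(1), reverse=True)
print(descending[:3])        # [('y', 85), ('a', 84), ('b', 83)]

# To get only the keys, ordered by their values:
print(sorted(d, key=d.get))  # ['c', 'x', 'z', 'b', 'a', 'y']
```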

# How to quickly find the keys common to several dictionaries

In [81]:
from random import randint, sample
s1 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}
s2 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}
s3 = {x: randint(1, 4) for x in sample('abcdefg', randint(3, 6))}
print s1.viewkeys(), s2.viewkeys(), s3.viewkeys()
print s1.viewkeys() & s2.viewkeys() & s3.viewkeys()

dict_keys(['c', 'd', 'f']) dict_keys(['c', 'b', 'e', 'd', 'g', 'f']) dict_keys(['c', 'b', 'g', 'f'])
set(['c', 'f'])


Note: viewkeys() returns a set-like view object, so it supports the & operator.

In [86]:
map(dict.keys, [s1, s2, s3])

[['c', 'd', 'f'], ['c', 'b', 'e', 'd', 'g', 'f'], ['c', 'b', 'g', 'f']]

In [85]:
from functools import reduce
reduce(lambda a, b: a & b, map(dict.viewkeys, [s1, s2, s3]))

{'c', 'f'}
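
In Python 3 the `viewkeys` method no longer exists; `dict.keys()` itself returns the set-like view. A sketch of the same reduction under that assumption, with the three dictionaries fixed instead of random so the result is reproducible:

```python
from functools import reduce

s1 = {'c': 1, 'd': 2, 'f': 3}
s2 = {'c': 4, 'b': 1, 'e': 2, 'd': 3, 'g': 1, 'f': 2}
s3 = {'c': 2, 'b': 3, 'g': 4, 'f': 1}

# Intersect the key views pairwise, left to right.
common = reduce(lambda a, b: a & b, map(dict.keys, [s1, s2, s3]))
print(sorted(common))  # ['c', 'f']
```

The reduce form works for any number of dictionaries, which is its advantage over writing the & chain by hand.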

# How to keep a dictionary ordered

In [55]:
d = {}
d['Jim'] = (1, 35)
d['Tom'] = (2, 37)
d['Bob'] = (3, 42)
for k in d:
    print(k)
# Prints Bob, Jim, Tom: not the order in which the keys were inserted

Bob
Jim
Tom


In [56]:
from collections import OrderedDict
d = OrderedDict()
d['Jim'] = (1, 35)
d['Tom'] = (2, 37)
d['Bob'] = (3, 42)
for k in d:
    print(k)
    # Prints Jim, Tom, Bob: the insertion order

Jim
Tom
Bob
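
Beyond remembering insertion order, OrderedDict offers methods for reordering entries. A small sketch, assuming Python 3.7 or later, where a plain dict also preserves insertion order (the unordered output shown earlier is what an older interpreter produces):

```python
from collections import OrderedDict

d = OrderedDict()
d['Jim'] = (1, 35)
d['Tom'] = (2, 37)
d['Bob'] = (3, 42)

# move_to_end moves an existing key to the back.
d.move_to_end('Jim')
print(list(d))      # ['Tom', 'Bob', 'Jim']

# popitem(last=False) removes the oldest entry first (FIFO).
first = d.popitem(last=False)
print(first)        # ('Tom', (2, 37))

# A plain dict keeps insertion order on Python 3.7+.
plain = {'Jim': 1, 'Tom': 2, 'Bob': 3}
print(list(plain))  # ['Jim', 'Tom', 'Bob']
```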


# Implementing a user history feature

In [91]:
from collections import deque
q = deque([], 2)
q.append(1)
q.append(2)
q.append(3)
q

deque([2, 3])
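
A deque with a maximum length drops its oldest element whenever a new one is appended to a full queue, which is the behavior a bounded history needs. A minimal sketch, assuming Python 3; the command strings and the name `history` are hypothetical:

```python
from collections import deque

# Keep only the 3 most recent entries.
history = deque(maxlen=3)
for command in ["ls", "cd src", "git status", "make"]:
    history.append(command)

# "ls" was pushed out when "make" arrived.
print(list(history))  # ['cd src', 'git status', 'make']
```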

Saving a Python object to a file:

In [92]:
import pickle
pickle.dump?

Loading a Python object from a file:

In [93]:
pickle.load?
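
To make the history survive between runs, the deque can be written to a file and read back. A sketch, assuming Python 3 and a temporary directory; the file name `history.pkl` is made up for illustration. `pickle.dump` and `pickle.load` work on open binary files, while `dumps` and `loads` work on in-memory bytes.

```python
import os
import pickle
import tempfile
from collections import deque

history = deque(["cd src", "git status", "make"], maxlen=3)
path = os.path.join(tempfile.mkdtemp(), "history.pkl")

# Pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(history, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# The contents and the maxlen setting both survive the round trip.
print(restored == history, restored.maxlen)  # True 3
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.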