# Summary
**Examples of comprehension syntax**
- List comprehension
- Dictionary/set comprehension
- Generator comprehension (generator expression)

Recall the four basic data structures: list, tuple, set, and dictionary.

Purpose: comprehensions can replace the `map` and `filter` functions that are so common in JavaScript.
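
As a small illustration of that claim, the sketch below writes the same computation twice, once with `map`/`filter` and once as a comprehension. The list `nums` and the variable names are made up for this example.

```python
nums = [1, 2, 3, 4, 5, 6]

# functional style: keep the even numbers, then square them
squares_functional = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, nums)))

# comprehension style: the same filter and transform in one expression
squares_comprehension = [x * x for x in nums if x % 2 == 0]

print(squares_functional)     # [4, 16, 36]
print(squares_comprehension)  # [4, 16, 36]
```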

**Functional programming in Python, compared with comprehensions**
- Map function
- Filter function
- Reduce function

### Example 1. Building a new list
Construct a new list from an old one, item by item.
Given a list ``` lst_num = [0,1,2,...,9]```, convert each number to its string form and store the results in a new list.

(e.g., 1 becomes '1')

In [1]:
# create list
lst_num = list(range(10))
print(lst_num)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Solution 1: Use a for loop

In [2]:
lst_str = []
# iterate every item in the list
for i in lst_num:
    lst_str.append(str(i))
print(lst_str)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


Solution 2: List comprehension

In [3]:
# list comprehension
lst_str = [ str(i) for i in lst_num ]
print(lst_str)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


### Example 2.
Extract the **odd numbers** from 0-9 and store the result in a new list.

In [4]:
# collect only the odd numbers into a new sub-list, then print it
lst_odd = []
for item in lst_num:
    if item % 2 == 1:
        lst_odd.append(item)
print(lst_odd)

[1, 3, 5, 7, 9]


In [5]:
lst_odd = [ i for i in lst_num if i % 2 == 1 ]
print(lst_odd)

[1, 3, 5, 7, 9]


### Example 3
Three languages are given in a list: ```['Python', 'Java', 'JavaScript']```

The numbers stand for versions, and there are three of them: ```[1, 2, 3]```

List every combination (e.g., "Python_v1").

In [6]:
languages = ['Python', 'Java', 'JavaScript']
versions = [1, 2, 3]

In [7]:
releases = []
for lang in languages:
    for v in versions:
        releases.append(lang+'_v'+str(v))
print(releases)

['Python_v1', 'Python_v2', 'Python_v3', 'Java_v1', 'Java_v2', 'Java_v3', 'JavaScript_v1', 'JavaScript_v2', 'JavaScript_v3']


In [8]:
# use list comprehension
releases = [ '{}_v{}'.format(l, v) for l in languages for v in versions ]
print(releases)

['Python_v1', 'Python_v2', 'Python_v3', 'Java_v1', 'Java_v2', 'Java_v3', 'JavaScript_v1', 'JavaScript_v2', 'JavaScript_v3']


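### Example 4. Inverting a dictionary
Given a dictionary whose keys and values are in **one-to-one** correspondence.

Goal: a new dictionary in which the original keys become values and the original values become keys.

(e.g., given ```dict = {0:'a', 1:'b', 2:'c'}```, find ``` dict_reversed = {'a':0, 'b':1, 'c':2} ```)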

In [9]:
char_dict = {0:'a', 1:'b', 2:'c'}
char_dict_reversed = {}

In [10]:
char_dict_reversed = {}
# iterate dictionary
for key in char_dict:
    value = char_dict[key]
    char_dict_reversed[value] = key
print(char_dict_reversed)

{'a': 0, 'b': 1, 'c': 2}


In [11]:
char_dict_reversed = {}
for key, value in char_dict.items():
    char_dict_reversed[value] = key
print(char_dict_reversed)

{'a': 0, 'b': 1, 'c': 2}


In [12]:
# use dict comprehension
char_dict_reversed = { val:key for key, val in char_dict.items()}
print(char_dict_reversed)

{'a': 0, 'b': 1, 'c': 2}
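
The one-to-one requirement matters: if two keys share a value, the later key silently overwrites the earlier one during inversion. A minimal sketch, using a made-up dictionary `dup_dict`:

```python
# two keys (1 and 3) map to the same value 'b'
dup_dict = {0: 'a', 1: 'b', 3: 'b'}

inverted = {val: key for key, val in dup_dict.items()}
print(inverted)  # {'a': 0, 'b': 3}

# one possible check before inverting: all values must be distinct
is_one_to_one = len(set(dup_dict.values())) == len(dup_dict)
print(is_one_to_one)  # False
```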


### Example 4.2 Building a set
Store all the characters from the dictionary above in a set.

In [13]:
char_dict = {0:'a', 1:'b', 2:'c', 3:'c', 4:'b'}

In [14]:
char_set = set()
for key, value in char_dict.items():
    char_set.add(value)
print(char_set)

{'b', 'c', 'a'}


In [15]:
# use comprehensions
char_set = { char for key, char in char_dict.items()}
print(char_set)

{'b', 'c', 'a'}


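### **Generator comprehensions (generator expressions)**

There is no comprehension for tuples, since tuples are immutable (they cannot be changed once created).

However, the same syntax written in parentheses produces a generator.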
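
A short sketch of what this means in practice: parentheses around a comprehension give a generator object, and a tuple can still be built by passing a generator expression to `tuple()`. The variable names are illustrative.

```python
# parentheses create a generator, not a tuple
gen = (i * i for i in range(4))
print(type(gen).__name__)  # generator

# items are produced one at a time, on demand
first = next(gen)   # 0
second = next(gen)  # 1

# to get an actual tuple, hand a generator expression to tuple()
squares_tuple = tuple(i * i for i in range(4))
print(squares_tuple)  # (0, 1, 4, 9)
```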

In [None]:
# A generator (or any lazy iterable such as range) computes each item only
# when it is requested via next(), so it never holds all items in memory.
# example: comparing a lazy range with fully materialized lists
import time

t0 = time.time()
for i in range(int(1e7)):
    pass
print('It takes {} seconds to loop over a lazy range!'.format(time.time()-t0))


t0 = time.time()
for i in list(range(int(1e7))):
    pass
print('It takes {} seconds to build and loop over a list!'.format(time.time()-t0))


big_lst = list(range(int(1e7)))
t0 = time.time()  # start timing only after the list has been created
for i in big_lst:
    pass
print('It takes {} seconds to loop over an already created list!'.format(time.time()-t0))


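### Example 5. Generator comprehension
Go through all the software releases from earlier, but this time as a generator that returns the items one at a time, **simulating online streaming**.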

In [None]:
print(releases)

In [None]:
# there is no tuple comprehension; parentheses give a generator instead
release_gen = ( r for r in releases )
print(type(release_gen))
for i in release_gen:
    time.sleep(0.2)
    print(i)

In [None]:
# next(reversed_char_gen)

In [None]:
# use a for loop (it must be contained in a function)
def dict_reverse_gen(my_dict):
    for key, value in my_dict.items():
        yield {value:key}

my_reverse_gen = dict_reverse_gen(char_dict)

for i in my_reverse_gen:
    time.sleep(0.2)
    print(i)

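#### Generator summary
Benefits:
1. Saves memory: the results do not all have to sit in memory at once, unused for a long time.
2. Offers a way to step through a function one stage at a time.
3. Embodies the idea of a stream.
4. Other packages often return generators as results.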
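
The memory point can be checked roughly with `sys.getsizeof`, which reports the size of the container object itself (not of the items it refers to). A sketch with illustrative names; exact sizes are not shown because they vary by Python version and platform.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # all results stored at once
squares_gen = (i * i for i in range(n))    # results computed on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True: the generator object stays small

# a generator can be consumed only once
total = sum(squares_gen)
leftover = list(squares_gen)  # [] because the generator is exhausted
```

The trade-off is that a generator cannot be indexed or replayed; if the results are needed more than once, a list is the simpler choice.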

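## Functional programming

Python offers three functions that JavaScript programmers will recognize: ```map(), filter(), reduce()```
- Comprehensions are more convenient and can stand in for `map()` and `filter()`
- `reduce()` must be imported in Python 3 and has few use cases
- Other packages often prefer to expose map/filter-style methods on their objects for processing collections such as lists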

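### Example 7. map()
Syntax: **map(function, iterable(s))**

- function: the function applied to each item; it must return exactly one result per item
- iterables: objects that can be looped over in a for loop, e.g. list, dictionary, string, etc.

Property of map(): the number of output items equals the number of input items.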
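
The "iterable(s)" in the syntax means `map()` accepts more than one iterable; the function then receives one item from each. A brief sketch with made-up lists and a hypothetical helper `add_pair`; note that with inputs of unequal length, `map()` stops at the shortest one.

```python
def add_pair(a, b):
    return a + b

# one item is taken from each iterable per call
sums = list(map(add_pair, [1, 2, 3], [10, 20, 30]))
print(sums)  # [11, 22, 33]

# unequal lengths: the result is as long as the shortest input
short = list(map(add_pair, [1, 2, 3], [10, 20]))
print(short)  # [11, 22]
```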

In [None]:
# bonus point!
# vs. Javascript style functions (functional programming)
# map, filter, reduce
def square_func(i):
    return i*i

# map syntax: map(function, iterable(s))
squared_lst = map(square_func, [1,2,3])
print(type(squared_lst))
print(list(squared_lst))
squared_lst = map(square_func, range(1,4))
print(list(squared_lst))

# recall the usage of list comprehension
squared_lst = [ i*i for i in range(1,4)]
print(squared_lst)

In [None]:
char_dict = {1:'a', 2:'b', 3:'c'}
char_dict_reversed = {'a':1, 'b':2, 'c':3} # mapping over this one would yield 'a', 'b', 'c', not the numbers
def square_func_dictVals(k):
    return k*k
squared_lst = map(square_func_dictVals, char_dict)
print(list(squared_lst))
# note: iterating over a dictionary yields its keys, so map() squares
# the keys here; the values are reached only via .values() or .items()

# The syntax is useful. It provides opportunities for package developers
# to borrow the map syntax. For example, map tasks to multiple cpu
# cores for multiprocessing as shown below
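
For comparison, dictionary values can be reached by passing `.values()` or `.items()` explicitly, and a dict comprehension expresses the same thing more directly. A sketch with a hypothetical dictionary `price_dict`:

```python
price_dict = {'a': 1, 'b': 2, 'c': 3}

# map over the values only
squared_vals = list(map(lambda v: v * v, price_dict.values()))
print(squared_vals)  # [1, 4, 9]

# map over (key, value) pairs and rebuild a dictionary
squared_map = dict(map(lambda kv: (kv[0], kv[1] * kv[1]), price_dict.items()))

# the equivalent dict comprehension
squared_comp = {k: v * v for k, v in price_dict.items()}
print(squared_map == squared_comp)  # True
```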

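The map() syntax is often mirrored by other packages, so it is worth remembering. Example: executor.map()
Split a large DataFrame into several chunks, define the task (function) to run on each chunk, and dispatch the chunks to separate CPU cores for parallel computation.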
```Python
import numpy as np
import pandas as pd
from multiprocessing import cpu_count
import concurrent.futures

def apply_parallel_df(df, func):
    df_lst = np.array_split(df, cpu_count())
    with concurrent.futures.ProcessPoolExecutor(max_workers=cpu_count()) as executor:
        ret_list = executor.map( func, df_lst )
    return pd.concat(ret_list)

```
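
A smaller, self-contained sketch of the same `executor.map()` pattern. It swaps in `ThreadPoolExecutor` and a plain list in place of the process pool and DataFrame, purely to keep the example short; `square` and `chunks` are illustrative names.

```python
import concurrent.futures

def square(x):
    return x * x

chunks = [1, 2, 3, 4]

# executor.map mirrors the built-in map(function, iterable),
# but runs the calls on a pool of workers
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(square, chunks))

print(results)  # [1, 4, 9, 16] (results come back in input order)
```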

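### Example 8. filter()
Syntax: **filter(function, iterable)**

The arguments mean the same as in map(), except that filter() accepts only a single iterable.

Property of filter(): the function returns True/False for each input item, and only the items for which it returns True are kept.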

In [None]:
# filter function
# let us use the filter function to get the odd numbers (plus 8)
lst_num = list(range(10))
# recall the task of finding the odd-number sub-list
def get_odds(i):
    # keep odd numbers, and additionally keep 8
    return i % 2 == 1 or i == 8

lst_odd = list(filter(get_odds, lst_num))
print(lst_odd)
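
The same selection can also be written as a list comprehension with an `if` clause. A sketch comparing the two forms; `keep` is a hypothetical helper with the same rule as above.

```python
nums = list(range(10))

def keep(i):
    # odd numbers, plus the special case 8
    return i % 2 == 1 or i == 8

with_filter = list(filter(keep, nums))
with_comprehension = [i for i in nums if keep(i)]

print(with_filter)         # [1, 3, 5, 7, 8, 9]
print(with_comprehension)  # [1, 3, 5, 7, 8, 9]
```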

### Practical takeaways
Credit: *Effective Python* by Brett Slatkin

#### - Use list comprehensions instead of map() and filter()

<img src="pics/remember1.png" width="400">

#### - Avoid more than two expressions in list comprehensions

<img src="pics/remember2.png" width="400">

Bad examples:

Bad practice 1. (two loops plus condition)
<img src="pics/bad_example1.png" width="400">

Bad practice 2. (three loops)
<img src="pics/bad_example1.png" width="400">
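
As an illustration of the idea (made-up data, not taken from the images above): a comprehension with two loops and two conditions is hard to scan, while the equivalent nested loops spell out each step.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# hard to read: two loops plus two conditions in one expression
dense = [x for row in matrix if sum(row) >= 10 for x in row if x % 3 == 0]

# easier to follow: the same logic as explicit loops
clear = []
for row in matrix:
    if sum(row) >= 10:
        for x in row:
            if x % 3 == 0:
                clear.append(x)

print(dense == clear)  # True
print(clear)           # [6, 9]
```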


Congrats! You are now a master of Python comprehensions!

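### Example 9. reduce()
Syntax: **reduce(function, sequence\[, initial\])**

The first two arguments play the same roles as in map(), except that the function must take two arguments: the running result and the next item.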
<br>
<br>

**Walkthrough:** how ```reduce(add, range(1,7))``` proceeds, where ```add(a, b)``` returns ```a + b``` (the built-in ```sum()``` expects an iterable, so it cannot be passed to reduce directly):

```[1,2,3,4,5,6]``` ==>```[add(1,2),3,4,5,6]```==> ```[3,3,4,5,6]```

```[3,3,4,5,6]``` ==>```[add(3,3),4,5,6]```==> ```[6,4,5,6]```

```[6,4,5,6]```==>```[add(6,4),5,6]```==> ```[10,5,6]```

...

**Returns:** ```21``` (a number, not a list anymore)
<br>
<br>
<br>

**Walkthrough:** how ```reduce(add, range(1,7), 666)``` proceeds:

```[1,2,3,4,5,6]``` ==>(prepend 666 at the front) ```[666,1,2,3,4,5,6]```

Then execute ```reduce(add, [666,1,2,3,4,5,6])```

**Returns:** ```687``` (a number, not a list anymore)
<br>
<br>
<br>

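Properties of reduce():
- It folds the sequence step by step and returns a single item at the end
- Common in JavaScript, uncommon in Python; a passing familiarity is enough
- In Python 3 it must be imported before use (it is considered not Pythonic and is rarely needed)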
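
The intermediate steps of the walkthrough can be made visible with `itertools.accumulate`, which yields every running result instead of only the last one. A sketch (the `initial` keyword of `accumulate` requires Python 3.8 or later):

```python
from functools import reduce
from itertools import accumulate
from operator import add

nums = range(1, 7)

# running totals: each element is the result of one reduce step
steps = list(accumulate(nums, add))
print(steps)  # [1, 3, 6, 10, 15, 21]

# reduce returns only the final running total
final = reduce(add, nums)
print(final)  # 21

# with an initial value, the fold starts from 666
steps_init = list(accumulate(nums, add, initial=666))
print(steps_init[-1])  # 687
```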


In [None]:
# reduce
# syntax: reduce(function, sequence[, initial])
# reduce is not a built-in in Python 3; it needs to be imported!
from functools import reduce

lst_num = list(range(1,7))
def reduce_sum(a, b):
    return a+b
total = reduce(reduce_sum, lst_num)
print(total)
total_pad = reduce(reduce_sum, lst_num, 666)
print(total_pad)