# Assignment 4
List comprehensions and parallel processing

In [1]:
#!pip install joblib

In [4]:
from joblib import Parallel, delayed
import numpy as np
import pprint
pp = pprint.PrettyPrinter(indent=4)

## List comprehensions
Implementing list comprehensions, one of the speed-up techniques in Python (fewer explicit for loops)
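As a minimal sketch of the idea (the names `squares_loop` and `squares_comp` are illustrative and not part of the assignment), the explicit loop and the comprehension below build the same list:

```python
# Build a list with an explicit for loop.
squares_loop = []
for i in range(5):
    squares_loop.append(i ** 2)

# The same list as a single list comprehension.
squares_comp = [i ** 2 for i in range(5)]

print(squares_loop == squares_comp)  # True
```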

### Exercise 3

Rewrite the following for loop as a list comprehension.
```python
data = []
for i in range(5):
    for j in range(4):
        data.append(i*j)
data
```
Output:
[0, 0, 0, 0, 0, 1, 2, 3, 0, 2, 4, 6, 0, 3, 6, 9, 0, 4, 8, 12]

In [2]:
[i*j for i in range(5) for j in range(4)]

[0, 0, 0, 0, 0, 1, 2, 3, 0, 2, 4, 6, 0, 3, 6, 9, 0, 4, 8, 12]
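The for clauses in a comprehension are written in the same order as the nested loops they replace, outermost first. The sketch below checks this against the original loop. The `itertools.product` variant is an alternative added here for illustration and is not required by the exercise.

```python
from itertools import product

# Nested loops from the exercise.
data = []
for i in range(5):
    for j in range(4):
        data.append(i * j)

# The for clauses appear in the same order as the nested loops.
comp = [i * j for i in range(5) for j in range(4)]

# Alternative (an assumption, not part of the exercise): itertools.product
# yields the (i, j) pairs in the same order as the nested loops.
prod = [i * j for i, j in product(range(5), range(4))]

print(data == comp == prod)  # True
```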

### Exercise 2

Rewrite the following for loop as a list comprehension.
```python
data = []
for i in range(5):
    inner = []
    for j in range(i, 6):
        inner.append(i)
    data.append(inner)
data
```
Output:
[[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3], [4, 4]]

In [3]:
[[i for j in range(i,6)] for i in range(5)]

[[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3], [4, 4]]
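In this answer the inner variable j is never used; it only controls how many times i is repeated. One possible alternative, shown here as a sketch rather than the expected solution, repeats i directly with list multiplication. This is safe because the repeated elements are immutable integers.

```python
# Answer from the exercise: j only controls how many times i is repeated.
nested = [[i for j in range(i, 6)] for i in range(5)]

# Alternative sketch: repeat i directly, (6 - i) times per row.
repeated = [[i] * (6 - i) for i in range(5)]

print(nested == repeated)  # True
```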

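## Parallel processing
Implementing parallelization, one of the speed-up techniques in Python (using multiple cores or CPUs)

joblib is not the only option; the standard library's multiprocessing module is another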

### Example: squaring

In [10]:
%%timeit
# Time the code below; timeit chooses the number of runs and loops automatically
r = []
for i in range(10000):
    r.append(i ** 2)
r[:3] + r[-3:]

3.08 ms ± 33.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
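`%%timeit` is an IPython magic and only works inside a notebook or IPython session. As a rough sketch of the same measurement in a plain script, the standard library `timeit` module can be used. The function names and the `repeat=3`, `number=20` settings below are arbitrary choices for illustration, and the timings will differ from machine to machine.

```python
import timeit

def squares_loop(n=10000):
    """Square 0..n-1 with an explicit loop."""
    r = []
    for i in range(n):
        r.append(i ** 2)
    return r

def squares_comp(n=10000):
    """Same result with a list comprehension."""
    return [i ** 2 for i in range(n)]

# repeat=3 and number=20 are arbitrary choices for this sketch.
loop_times = timeit.repeat(squares_loop, repeat=3, number=20)
comp_times = timeit.repeat(squares_comp, repeat=3, number=20)

# Report the best time per call; absolute values depend on the machine.
print(f"loop:          {min(loop_times) / 20 * 1e3:.3f} ms per call")
print(f"comprehension: {min(comp_times) / 20 * 1e3:.3f} ms per call")
```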


In [11]:
?Parallel

In [9]:
%%timeit
power = lambda x: x**2
r = Parallel(n_jobs=-1, verbose=0)([delayed(power)(i) for i in range(10000)])
r[:3] + r[-3:]

277 ms ± 10.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [6]:
power = lambda x: x**2
r = Parallel(n_jobs=-1, verbose=3)( [delayed(power)(i) for i in range(10000)] )
r[:3] + r[-3:]

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:    0.4s
[Parallel(n_jobs=-1)]: Done 488 tasks      | elapsed:    0.5s
[Parallel(n_jobs=-1)]: Done 10000 out of 10000 | elapsed:    0.7s finished


[0, 1, 4, 99940009, 99960004, 99980001]

In [7]:
%%timeit
# NumPy is often the fastest option
np.arange(10000) ** 2

10.4 µs ± 99.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [8]:
%%timeit
np.fromiter(range(10000), dtype=int) ** 2

261 µs ± 4.64 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


### Exercise 1

Given x = [(i, i+1) for i in range(100000)], compute the product of the two elements at each index.

Output: [0, 2, 6, (omitted), 9999500006, 9999700002, 9999900000]

In [1]:
x=[(i,i+1) for i in range(100000)]

In [17]:
multi = lambda x: x[0]*x[1]
r = Parallel(n_jobs=-1, verbose=3)( [delayed(multi)(i) for i in x] )
r[:3] + r[-3:]

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 904 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 99200 tasks      | elapsed:    1.4s
[Parallel(n_jobs=-1)]: Done 100000 out of 100000 | elapsed:    1.5s finished


[0, 2, 6, 9999500006, 9999700002, 9999900000]
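For comparison, the same result can be obtained without any worker processes by using a list comprehension with tuple unpacking. This is a sketch added for reference, not a replacement for the parallel answer the exercise asks for.

```python
x = [(i, i + 1) for i in range(100000)]

# Unpack each pair and multiply; no worker processes are involved.
r = [a * b for a, b in x]

print(r[:3] + r[-3:])  # [0, 2, 6, 9999500006, 9999700002, 9999900000]
```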

### Advanced exercise
Compute the differences between consecutive terms of C = np.cumsum(np.arange(1000))

Output: [1, 2, 3, (omitted), 997, 998, 999] = np.arange(1, 1000)

In [20]:
C=np.cumsum(np.arange(1000))

In [27]:
len(C)

1000

In [28]:
def sub(x,y):
    return y-x
r = Parallel(n_jobs=-1, verbose=3)( [delayed(sub)(C[i],C[i+1]) for i in range(len(C)-1)] )
r[:3] + r[-3:]

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 824 tasks      | elapsed:    0.0s
[Parallel(n_jobs=-1)]: Done 999 out of 999 | elapsed:    0.1s finished


[1, 2, 3, 997, 998, 999]
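Since the squaring example suggests that NumPy is often the fastest option, a vectorized version of this exercise is sketched below. It assumes NumPy is installed and uses `np.diff`, with an equivalent slicing form shown alongside it. Neither form is part of the parallel answer above.

```python
import numpy as np

C = np.cumsum(np.arange(1000))

# np.diff returns C[i+1] - C[i] for every i.
d = np.diff(C)

# Equivalent slicing form: shift the array by one and subtract.
d_slices = C[1:] - C[:-1]

print(np.array_equal(d, np.arange(1, 1000)))  # True
print(np.array_equal(d, d_slices))            # True
```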