# The map function
## Agenda

- Introduction to the `map` function
- Introduction to `pandas.Series.map`


In [1]:
import pandas as pd
import numpy as np
import itertools

## The `map` function

Returns an iterator that applies function to every item of iterable, yielding the results.

### syntax
```
map(function, iterable, ...)
```

In [2]:
list(map(int,  ("1", "2", "3", "4", "5")))

[1, 2, 3, 4, 5]
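
Because `map` returns an iterator rather than a list, the conversion above only happens when `list()` consumes it. The following minimal sketch illustrates that lazy behavior; the variable names are chosen for illustration only.

```python
# map returns a lazy iterator: nothing is converted until items are requested.
it = map(int, ("1", "2", "3"))

first = next(it)   # consumes "1"
rest = list(it)    # consumes the remaining items
again = list(it)   # the iterator is now exhausted, so this is empty

print(first, rest, again)  # 1 [2, 3] []
```

An exhausted `map` object cannot be rewound, so wrap it in `list()` once if the results are needed more than one time.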

### map vs loop

In [3]:
%%time
def addition(n): 
    return n + n 
  
# We double all numbers using map() 
numbers = np.arange(0, 100000)
result = list(map(addition, numbers))
#list(result)

CPU times: user 29.9 ms, sys: 4.78 ms, total: 34.7 ms
Wall time: 42.1 ms


In [4]:
%%time
def addition(n): 
    return n + n 
  
# We double all numbers using a for loop
numbers = np.arange(0, 100000)
res = []
for i in numbers:
    res.append(addition(i))

CPU times: user 45.4 ms, sys: 4.21 ms, total: 49.6 ms
Wall time: 58.4 ms
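
The two `%%time` cells above each measure a single run, so the gap between them may partly be noise. As a rough sketch, the standard `timeit` module can repeat each variant outside a notebook. It assumes a plain Python list instead of the NumPy array used above, and adds a list comprehension as a third variant; the helper names `with_map`, `with_loop`, and `with_comprehension` are made up for this example.

```python
import timeit

def addition(n):
    return n + n

# Assumption: a plain list stands in for np.arange(0, 100000).
numbers = list(range(100_000))

def with_map():
    return list(map(addition, numbers))

def with_loop():
    res = []
    for i in numbers:
        res.append(addition(i))
    return res

def with_comprehension():
    return [addition(i) for i in numbers]

# All three approaches produce the same list.
assert with_map() == with_loop() == with_comprehension()

# Repeat each measurement and keep the best run to reduce noise.
runs = 5
for fn in (with_map, with_loop, with_comprehension):
    best = min(timeit.repeat(fn, number=runs, repeat=3))
    print(f"{fn.__name__}: {best / runs * 1e3:.2f} ms per call")
```

Absolute numbers depend on the machine and Python version, so the printed timings should be read as relative indications only.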


### Parallel processing

In [5]:
from multiprocessing import Pool
import multiprocessing

In [6]:
%%time
def addition(n): 
    return n + n 
  
# We double all numbers using Pool.map()
numbers = np.arange(0, 100000)
with Pool(multiprocessing.cpu_count()) as pool:
    res = list(pool.map(addition, numbers))

CPU times: user 527 ms, sys: 41.9 ms, total: 569 ms
Wall time: 652 ms
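
In the timing above, the `Pool` version is slower than the plain `map` call, most likely because starting worker processes and transferring the data outweighs such a trivial computation. The sketch below only illustrates the `map` interface of a pool: it swaps in `multiprocessing.pool.ThreadPool`, which offers the same method but uses threads, so that it runs without a `__main__` guard. The worker count and `chunksize` value are arbitrary assumptions.

```python
from multiprocessing.pool import ThreadPool

def addition(n):
    return n + n

numbers = range(100_000)

# ThreadPool exposes the same map() interface as Pool but uses threads,
# so no worker processes are started and nothing is pickled.
with ThreadPool(4) as pool:
    # map() blocks until all results are ready and returns a list, in order;
    # chunksize controls how many items are handed to a worker at once.
    res = pool.map(addition, numbers, chunksize=10_000)

print(res[:5])  # [0, 2, 4, 6, 8]
```

Since `pool.map` already returns a list, the extra `list()` call is not needed. Threads do not speed up CPU-bound pure-Python work, so this variant demonstrates the API rather than a performance gain.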


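## Element-wise operations on lists with map

Suppose we want to add two lists of the same length and element type, element by element. Naively applying the `+` operator concatenates the lists instead: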

In [7]:
l1 = [i for i in range(0, 10)]
l2 = [i for i in range(10, 30, 2)]
l1 + l2

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

Passing both lists to `map` together with a two-argument function adds them element by element:

In [8]:
res = list(map(lambda x, y: x + y, l1, l2))
print(res)

[10, 13, 16, 19, 22, 25, 28, 31, 34, 37]
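
As an alternative to the lambda, the standard library's `operator.add` can be passed to `map` directly, and a list comprehension over `zip` gives the same result. This is a small sketch of both spellings; the names `via_operator` and `via_zip` are illustrative.

```python
import operator

l1 = list(range(0, 10))
l2 = list(range(10, 30, 2))

# operator.add(x, y) is the function form of x + y.
via_operator = list(map(operator.add, l1, l2))

# zip pairs up the elements; the comprehension adds each pair.
via_zip = [x + y for x, y in zip(l1, l2)]

print(via_operator)  # [10, 13, 16, 19, 22, 25, 28, 31, 34, 37]
print(via_operator == via_zip)  # True
```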


A result is still returned when the lengths differ; `map` stops as soon as the shortest iterable is exhausted.

In [9]:
l1 = [i for i in range(0, 5)]
l2 = [i for i in range(0, 6)]
res = list(map(lambda x, y: x + y, l1, l2))
print(res)

[0, 2, 4, 6, 8]
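
In the output above, the last element of `l2` is silently dropped. If the unmatched elements should be kept, one option is `itertools.zip_longest`, which pads the shorter input. The sketch below assumes that a missing value should count as `0`.

```python
from itertools import zip_longest

l1 = list(range(0, 5))
l2 = list(range(0, 6))

# Pad the shorter list with 0 so every element of the longer list is used.
res = [x + y for x, y in zip_longest(l1, l2, fillvalue=0)]
print(res)  # [0, 2, 4, 6, 8, 5]
```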


## Computing the Cartesian product of multiple lists

### `itertools.product`

- Returns an iterator

In [10]:
l1 = [i for i in range(0, 100)]
l2 = [i for i in range(101, 201)]
itertools.product(l1, l2)

<itertools.product at 0x116b015a0>

In [11]:
res_1 = list(itertools.product(l1, l2))
res_1[:10]

[(0, 101),
 (0, 102),
 (0, 103),
 (0, 104),
 (0, 105),
 (0, 106),
 (0, 107),
 (0, 108),
 (0, 109),
 (0, 110)]

### `for loop` vs `itertools.product`

In [12]:
res_2 = []
for i in l1:
    for j in l2:
        res_2.append((i, j))
res_1 == res_2

True
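
`itertools.product` is not limited to two inputs: it accepts any number of iterables, and the `repeat` keyword takes the product of an iterable with itself. The sample lists below are made up for this sketch.

```python
import itertools

# Hypothetical sample data for illustration.
colors = ["red", "blue"]
sizes = ["S", "M"]
flags = [True, False]

# Three iterables replace three nested for loops; the rightmost varies fastest.
triples = list(itertools.product(colors, sizes, flags))
print(len(triples))  # 8
print(triples[:3])   # [('red', 'S', True), ('red', 'S', False), ('red', 'M', True)]

# repeat=2 is equivalent to product([0, 1], [0, 1]).
pairs = list(itertools.product([0, 1], repeat=2))
print(pairs)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```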

#### Speed comparison

In [13]:
%%timeit
res_1 = list(itertools.product(l1, l2))

453 µs ± 4.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [14]:
%%timeit
res_2 = []
for i in l1:
    for j in l2:
        res_2.append((i, j))

1.31 ms ± 217 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


## pandas.Series.map
### syntax

```
Series.map(self, arg, na_action=None)
```

### params
- arg: function, collections.abc.Mapping subclass or Series
- na_action: {None, 'ignore'}, default None. If 'ignore', NaN values are propagated without being passed to the mapping.


In [15]:
s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])
s

0       cat
1       dog
2       NaN
3    rabbit
dtype: object

In [16]:
s.map('I am a {}'.format, na_action='ignore')

0       I am a cat
1       I am a dog
2              NaN
3    I am a rabbit
dtype: object

In [17]:
s.map('I am a {}'.format, na_action=None)

0       I am a cat
1       I am a dog
2       I am a nan
3    I am a rabbit
dtype: object
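
Besides a function, `arg` can also be a mapping or a Series, as listed in the params above. The following sketch passes a dict and an equivalent lookup Series; the replacement values and the name `lookup` are invented for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])

# With a dict, values that have no matching key ('rabbit' and NaN here)
# are converted to NaN.
mapped_dict = s.map({'cat': 'kitten', 'dog': 'puppy'})
print(mapped_dict)

# A Series works the same way: its index plays the role of the dict keys.
lookup = pd.Series(['kitten', 'puppy'], index=['cat', 'dog'])
mapped_series = s.map(lookup)
print(mapped_series)
```

Both calls translate `'cat'` and `'dog'` and leave the other two positions as NaN.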