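# pandas data transformation functions: map, apply, applymap
Comparing the data transformation functions map, apply, and applymap:
1. map: works on a Series and maps each value to a new value;
2. apply: on a Series, processes each value; on a DataFrame, processes the Series along a given axis (each column or each row);
3. applymap: works only on a DataFrame and processes every element of that DataFrame;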
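
As a quick orientation, the three functions in the list above can be sketched side by side on toy data. The values, the English company names, and the variable names here are made up for illustration and are not part of the stock dataset used later. The `hasattr` check is an assumption about the environment: pandas 2.1 introduced `DataFrame.map` as the newer name for `applymap`, so the sketch uses whichever one is available.

```python
import pandas as pd

codes = pd.Series(['BABA', 'JD'])
prices = pd.DataFrame({'open': [1.5, 2.5], 'close': [3.5, 4.5]})  # toy values

# Series.map: value -> value, via a dict or a function
names = codes.map({'BABA': 'Alibaba', 'JD': 'JD.com'})

# Series.apply: the function is called with each value
lengths = codes.apply(len)

# DataFrame.apply: the function is called with a whole Series (each column by default)
column_sums = prices.apply(lambda col: col.sum())

# Element-wise over a DataFrame: applymap, or DataFrame.map on pandas >= 2.1
elementwise = prices.map if hasattr(pd.DataFrame, 'map') else prices.applymap
truncated = elementwise(int)
```

The difference to keep in mind is the argument the function receives: a single value for `map`, `Series.apply`, and `applymap`, but a whole Series for `DataFrame.apply`.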

## 1. map: transforming the values of a Series
Example: converting stock ticker codes to Chinese company names
`Series.map(dict)` or `Series.map(function)`

In [1]:
import pandas as pd

stock_df = pd.read_csv('./data/stock/stock.csv')
stock_df.head()

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅
0,2023/3/27,BABA,86.12,87.13,88.22,85.5,18.18M,-0.90%
1,2023/3/24,BABA,86.9,85.87,88.11,85.63,20.35M,0.44%
2,2023/3/23,BABA,86.52,87.68,88.38,85.26,26.79M,3.43%
3,2023/3/22,BABA,83.65,84.84,85.39,83.51,21.08M,-0.06%
4,2023/3/21,BABA,83.7,82.46,84.09,82.0,16.44M,3.33%


In [2]:
stock_df['公司'].unique()

array(['BABA', 'JD', 'BAIDU'], dtype=object)

In [3]:
# Mapping from company ticker code to Chinese name
dict_company_name = {
    'baidu': '百度',
    'baba': '阿里巴巴',
    'jd': '京东'
}

**Method 1: Series.map(dict)**

In [4]:
stock_df['公司中文1'] = stock_df['公司'].str.lower().map(dict_company_name)

In [5]:
stock_df.head()

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅,公司中文1
0,2023/3/27,BABA,86.12,87.13,88.22,85.5,18.18M,-0.90%,阿里巴巴
1,2023/3/24,BABA,86.9,85.87,88.11,85.63,20.35M,0.44%,阿里巴巴
2,2023/3/23,BABA,86.52,87.68,88.38,85.26,26.79M,3.43%,阿里巴巴
3,2023/3/22,BABA,83.65,84.84,85.39,83.51,21.08M,-0.06%,阿里巴巴
4,2023/3/21,BABA,83.7,82.46,84.09,82.0,16.44M,3.33%,阿里巴巴
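
One detail worth knowing about `Series.map(dict)` is what happens to values that have no key in the dictionary: they become `NaN` rather than raising an error. The sketch below illustrates this with a hypothetical extra code, `'TSLA'`, which does not appear in the dataset, and shows one possible way to fall back to the original code.

```python
import pandas as pd

# 'TSLA' is a made-up code with no entry in the mapping
codes = pd.Series(['BABA', 'JD', 'TSLA'])
names = {'baba': '阿里巴巴', 'jd': '京东'}

# Lower-case first so the upper-case codes match the lower-case keys
mapped = codes.str.lower().map(names)

# Unmatched codes are NaN; fill them with the original code instead
filled = mapped.fillna(codes)
```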


**Method 2: Series.map(function)**
The function receives each element of the Series as its argument

In [6]:
stock_df['公司中文2'] = stock_df['公司'].map(lambda d: dict_company_name.get(d.lower()) + '2')

In [7]:
stock_df

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅,公司中文1,公司中文2
0,2023/3/27,BABA,86.12,87.13,88.22,85.5,18.18M,-0.90%,阿里巴巴,阿里巴巴2
1,2023/3/24,BABA,86.9,85.87,88.11,85.63,20.35M,0.44%,阿里巴巴,阿里巴巴2
2,2023/3/23,BABA,86.52,87.68,88.38,85.26,26.79M,3.43%,阿里巴巴,阿里巴巴2
3,2023/3/22,BABA,83.65,84.84,85.39,83.51,21.08M,-0.06%,阿里巴巴,阿里巴巴2
4,2023/3/21,BABA,83.7,82.46,84.09,82.0,16.44M,3.33%,阿里巴巴,阿里巴巴2
5,2023/3/20,BABA,81.0,80.15,81.9,79.48,18.95M,-0.82%,阿里巴巴,阿里巴巴2
6,2023/3/17,BABA,81.67,84.0,84.16,80.62,23.71M,-0.67%,阿里巴巴,阿里巴巴2
7,2023/3/16,BABA,82.22,81.46,82.48,80.66,22.41M,0.87%,阿里巴巴,阿里巴巴2
8,2023/3/15,BABA,81.51,81.55,82.54,80.15,20.86M,-2.79%,阿里巴巴,阿里巴巴2
9,2023/3/14,BABA,83.85,82.86,83.91,82.16,19.05M,1.15%,阿里巴巴,阿里巴巴2
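
The lambda above works because every code in this dataset has an entry in `dict_company_name`. For an unknown code, `dict.get` returns `None`, and `None + '2'` raises a `TypeError`. A more defensive variant might look like the sketch below, where `to_chinese` is a hypothetical helper name and `'TSLA'` is a made-up code used only to show the missing-key case.

```python
import pandas as pd

dict_company_name = {'baidu': '百度', 'baba': '阿里巴巴', 'jd': '京东'}

def to_chinese(code, suffix='2'):
    """Return the Chinese name plus a suffix, or None if the code is unknown."""
    name = dict_company_name.get(code.lower())
    return name + suffix if name is not None else None

codes = pd.Series(['BABA', 'JD', 'TSLA'])  # 'TSLA' has no mapping
result = codes.map(to_chinese)
```

Unknown codes then show up as missing values in the result instead of stopping the whole transformation.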


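## 2. apply: transforming a Series or a DataFrame
- `Series.apply(function)`: the function receives each value in the Series
- `DataFrame.apply(function)`: the function receives a Series (a column by default, a row with `axis=1`)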
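
To make the two call shapes above concrete, here is a minimal sketch on a toy DataFrame. The column names `open` and `close` and the variable names are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'open': [87.13, 85.87], 'close': [86.12, 86.90]})

# Series.apply: the function sees one scalar at a time
rounded = df['close'].apply(round)

# DataFrame.apply with the default axis=0: the function sees each column as a Series
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function sees each row as a Series
change = df.apply(lambda row: row['close'] - row['open'], axis=1)
```

With `axis=0` the result is indexed by column name, while with `axis=1` it is indexed by the row labels of the original DataFrame.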


**`Series.apply(function)`**
The function receives each value in the Series

In [8]:
stock_df['公司中文3'] = stock_df['公司'].apply(lambda d: dict_company_name.get(d.lower()) + '3')

In [9]:
stock_df.head()

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅,公司中文1,公司中文2,公司中文3
0,2023/3/27,BABA,86.12,87.13,88.22,85.5,18.18M,-0.90%,阿里巴巴,阿里巴巴2,阿里巴巴3
1,2023/3/24,BABA,86.9,85.87,88.11,85.63,20.35M,0.44%,阿里巴巴,阿里巴巴2,阿里巴巴3
2,2023/3/23,BABA,86.52,87.68,88.38,85.26,26.79M,3.43%,阿里巴巴,阿里巴巴2,阿里巴巴3
3,2023/3/22,BABA,83.65,84.84,85.39,83.51,21.08M,-0.06%,阿里巴巴,阿里巴巴2,阿里巴巴3
4,2023/3/21,BABA,83.7,82.46,84.09,82.0,16.44M,3.33%,阿里巴巴,阿里巴巴2,阿里巴巴3


**`DataFrame.apply(function)`**
The function receives the Series along the chosen axis

In [10]:
stock_df['公司中文4'] = stock_df.apply(lambda s: dict_company_name.get(s['公司'].lower()) + '4', axis=1)

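Notes:
1. `apply` is called on the DataFrame `stock_df`;
2. The lambda's parameter `s` is a Series. Because `axis=1` is specified, each row is passed in as a Series, so `s['公司']` retrieves that row's company code;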
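
Row-wise `apply` is convenient when the function needs several columns at once, but when only one column is involved, as here, a column-level expression gives the same values. The sketch below compares the two on a small stand-in DataFrame. The `收盘` values are made-up toy numbers, not real quotes.

```python
import pandas as pd

dict_company_name = {'baidu': '百度', 'baba': '阿里巴巴', 'jd': '京东'}

# Toy stand-in for stock_df; the prices are invented for illustration
df = pd.DataFrame({'公司': ['BABA', 'JD', 'BAIDU'], '收盘': [1.0, 2.0, 3.0]})

# Row-wise: each row arrives as a Series, so s['公司'] picks out one cell
row_wise = df.apply(lambda s: dict_company_name.get(s['公司'].lower()) + '4', axis=1)

# Column-wise: operate on the whole '公司' column without a per-row function call
column_wise = df['公司'].str.lower().map(dict_company_name) + '4'
```

Both produce the same values. The column-wise form avoids building a Series for every row, which tends to matter on larger data.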

In [11]:
stock_df.head()

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅,公司中文1,公司中文2,公司中文3,公司中文4
0,2023/3/27,BABA,86.12,87.13,88.22,85.5,18.18M,-0.90%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
1,2023/3/24,BABA,86.9,85.87,88.11,85.63,20.35M,0.44%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
2,2023/3/23,BABA,86.52,87.68,88.38,85.26,26.79M,3.43%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
3,2023/3/22,BABA,83.65,84.84,85.39,83.51,21.08M,-0.06%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
4,2023/3/21,BABA,83.7,82.46,84.09,82.0,16.44M,3.33%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4


## 3. applymap: transforming every value in a DataFrame

In [12]:
sub_df = stock_df[['收盘', '开盘', '高', '低']]

In [13]:
sub_df.head()

Unnamed: 0,收盘,开盘,高,低
0,86.12,87.13,88.22,85.5
1,86.9,85.87,88.11,85.63
2,86.52,87.68,88.38,85.26
3,83.65,84.84,85.39,83.51
4,83.7,82.46,84.09,82.0


In [14]:
# Truncate these numbers to integers, applied to every element
sub_df.applymap(lambda s: int(s))

Unnamed: 0,收盘,开盘,高,低
0,86,87,88,85
1,86,85,88,85
2,86,87,88,85
3,83,84,85,83
4,83,82,84,82
5,81,80,81,79
6,81,84,84,80
7,82,81,82,80
8,81,81,82,80
9,83,82,83,82


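Note:
`DataFrame.applymap(function)` calls the custom function once per element, passing in each element of each column rather than a whole Series. As of pandas 2.1, `DataFrame.applymap` is deprecated in favor of `DataFrame.map`, which has the same element-wise behavior.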

In [15]:
# Replace the values in the original data
stock_df.loc[:, ['收盘', '开盘', '高', '低']] = sub_df.applymap(lambda s: int(s))

In [16]:
stock_df.head()

Unnamed: 0,日期,公司,收盘,开盘,高,低,交易量,涨跌幅,公司中文1,公司中文2,公司中文3,公司中文4
0,2023/3/27,BABA,86.0,87.0,88.0,85.0,18.18M,-0.90%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
1,2023/3/24,BABA,86.0,85.0,88.0,85.0,20.35M,0.44%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
2,2023/3/23,BABA,86.0,87.0,88.0,85.0,26.79M,3.43%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
3,2023/3/22,BABA,83.0,84.0,85.0,83.0,21.08M,-0.06%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
4,2023/3/21,BABA,83.0,82.0,84.0,82.0,16.44M,3.33%,阿里巴巴,阿里巴巴2,阿里巴巴3,阿里巴巴4
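
The output above still shows values such as `86.0`, which suggests the columns kept their float dtype after the `.loc` assignment. If integer columns are the goal, one possible alternative is `astype(int)` combined with direct column assignment, sketched here on a toy DataFrame that reuses two of the column names and a few of the values from the output above.

```python
import pandas as pd

prices = pd.DataFrame({'收盘': [86.12, 86.90], '开盘': [87.13, 85.87]})
cols = ['收盘', '开盘']

# astype(int) truncates toward zero, like int(), but works on whole columns;
# assigning with prices[cols] = ... replaces the columns, so the dtype changes too
prices[cols] = prices[cols].astype(int)
```

Note that `astype(int)` raises an error if a column contains `NaN`, so this sketch assumes the price columns have no missing values.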
