# pandas Basics

This chapter covers the fundamentals of pandas in depth: reading and writing files, the basic data structures, commonly used functions, and window objects.

In [2]:
import numpy as np
import pandas as pd

## Reading and writing files

### Reading files

pandas can read many file formats. This section focuses on CSV, Excel, and TXT files.

#### Reading a CSV file

In [34]:
df_csv = pd.read_csv('./data/01_sample.csv')  # read the CSV file
df_csv.head()  # show the first five rows

   col1 col2  col3    col4      col5
0     2    a   1.4   apple  2020/1/1
1     3    b   3.4  banana  2020/1/2
2     6    c   2.5  orange  2020/1/5
3     5    d   3.2   lemon  2020/1/7


#### Reading a TXT file

In [4]:
df_txt = pd.read_table('./data/01_sample.txt')  # read the TXT file
print(df_txt.head())  # show the first five rows

   col1 col2  col3             col4
0     2    a   1.4   apple 2020/1/1
1     3    b   3.4  banana 2020/1/2
2     6    c   2.5  orange 2020/1/5
3     5    d   3.2   lemon 2020/1/7


#### Reading an Excel file

In [5]:
df_excel = pd.read_excel('./data/01_sample.xlsx')  # read the Excel file
print(df_excel.head())  # show the first five rows

   col1 col2  col3    col4      col5
0     2    a   1.4   apple  2020/1/1
1     3    b   3.4  banana  2020/1/2
2     6    c   2.5  orange  2020/1/5
3     5    d   3.2   lemon  2020/1/7


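### Writing files

Writing a `DataFrame` back to a file is just as important. When writing, `index=False` is usually set so that the row index is left out of the output.

#### Writing a CSV file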

In [35]:
df_csv.to_csv('./data/01_sample_saved.csv', index=False)

In [36]:
df_csv.to_csv('./data/01_sample_saved_with_index.csv')
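
The effect of `index=False` is easiest to see by reading the written text back. The following is a minimal sketch that uses an in-memory buffer and a small illustrative table (`toy` is a stand-in, not the sample file) rather than files on disk:

```python
import io
import pandas as pd

# Toy data standing in for the sample file (values are illustrative).
toy = pd.DataFrame({'col1': [2, 3], 'col2': ['a', 'b']})

# With no path argument, to_csv returns the CSV text as a string.
with_index = toy.to_csv()                # row labels become an unnamed first column
without_index = toy.to_csv(index=False)  # row labels are left out

# Reading the text back shows the difference in columns.
back_with_index = pd.read_csv(io.StringIO(with_index))
back_without_index = pd.read_csv(io.StringIO(without_index))

print(back_with_index.columns.tolist())     # ['Unnamed: 0', 'col1', 'col2']
print(back_without_index.columns.tolist())  # ['col1', 'col2']
```

If the index carries real information, an alternative is to keep it when writing and pass `index_col=0` when reading.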

#### Writing an Excel file

In [7]:
df_excel.to_excel('./data/01_sample_saved.xlsx', index=False)

To quickly convert a table to Markdown or LaTeX, use the `to_markdown` and `to_latex` methods. `to_markdown` requires the `tabulate` package to be installed.

In [37]:
print(df_csv.to_markdown())

|    |   col1 | col2   |   col3 | col4   | col5     |
|---:|-------:|:-------|-------:|:-------|:---------|
|  0 |      2 | a      |    1.4 | apple  | 2020/1/1 |
|  1 |      3 | b      |    3.4 | banana | 2020/1/2 |
|  2 |      6 | c      |    2.5 | orange | 2020/1/5 |
|  3 |      5 | d      |    3.2 | lemon  | 2020/1/7 |


In [9]:
print(df_csv.to_latex())

\begin{tabular}{lrlrll}
\toprule
{} &  col1 & col2 &  col3 &    col4 &      col5 \\
\midrule
0 &     2 &    a &   1.4 &   apple &  2020/1/1 \\
1 &     3 &    b &   3.4 &  banana &  2020/1/2 \\
2 &     6 &    c &   2.5 &  orange &  2020/1/5 \\
3 &     5 &    d &   3.2 &   lemon &  2020/1/7 \\
\bottomrule
\end{tabular}





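### Common parameters

- `header=None`: use this when the file has no column names, so that the first row is not treated as the header.
- `index_col`: the column number or column name to use as the row index.
- `usecols`: the columns to read.
- `parse_dates`: the columns to parse as dates.
- `nrows`: the number of data rows to read.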
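
The parameters above can be sketched as follows. The CSV text is made up for illustration and read from an in-memory buffer, and column names are used instead of positions to keep the example unambiguous:

```python
import io
import pandas as pd

# Hypothetical file contents without a header row.
raw_no_header = "2,a,1.4,2020/1/1\n3,b,3.4,2020/1/2\n"
df_no_header = pd.read_csv(io.StringIO(raw_no_header), header=None)
print(df_no_header.columns.tolist())  # [0, 1, 2, 3]

# Hypothetical file contents with a header row.
raw = ("col1,col2,col3,col5\n"
       "2,a,1.4,2020/1/1\n"
       "3,b,3.4,2020/1/2\n"
       "6,c,2.5,2020/1/5\n")

df_demo = pd.read_csv(
    io.StringIO(raw),
    usecols=['col1', 'col3', 'col5'],  # read only these columns
    index_col='col1',                  # use col1 as the row index
    parse_dates=['col5'],              # parse col5 as dates
    nrows=2,                           # read only the first two data rows
)
print(df_demo)
print(df_demo.dtypes)
```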

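## Basic data structures

pandas has two basic data structures: `Series` and `DataFrame`.

### Series

A `Series` generally consists of four parts: the values `data`, the index `index`, the storage type `dtype`, and the name of the series `name`. The index can also be given a name of its own, which is empty by default.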

In [38]:
s = pd.Series(data=[100, 'a', {'dic1': 5}], index=['id1', 20, 'third'], dtype='object', name='my_series')
print(s)

id1              100
20                 a
third    {'dic1': 5}
Name: my_series, dtype: object


In [42]:
s['id1'],s[20]

(100, 'a')
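
Each of the four parts can be read back as an attribute. In this sketch the index name `my_idx` is a hypothetical label added only to show that the index can carry its own name:

```python
import pandas as pd

s_demo = pd.Series(data=[100, 'a', {'dic1': 5}],
                   index=pd.Index(['id1', 20, 'third'], name='my_idx'),
                   dtype='object',
                   name='my_series')

print(s_demo.values)      # the underlying values as a NumPy array
print(s_demo.index)       # the index, which carries its own name
print(s_demo.dtype)       # object
print(s_demo.name)        # my_series
print(s_demo.index.name)  # my_idx
print(s_demo.shape)       # (3,)
```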

### DataFrame

A `DataFrame` is a two-dimensional table structure, similar to an Excel sheet. A common way to create one is to pass in a dictionary together with a row index:

In [43]:
data = {'col1': [1, 2, 3], 
        'col2': ['a', 'b', 'c'],
        'col3': [1.2, 2.2, 3.2]}
df = pd.DataFrame(data,index = ['row_%d'%i for i in range(3)])
df

       col1 col2  col3
row_0     1    a   1.2
row_1     2    b   2.2
row_2     3    c   3.2


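Because a `DataFrame` maps column names to columns, `[col_name]` selects a single column and `[col_list]` selects a table made up of several columns. The results are a `Series` and a `DataFrame`, respectively:

In [46]:
# Select a single column; the result is a Series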
df['col1']

row_0    1
row_1    2
row_2    3
Name: col1, dtype: int64

In [47]:
# Select multiple columns; the result is a DataFrame
df[['col1', 'col2']]

       col1 col2
row_0     1    a
row_1     2    b
row_2     3    c
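
The difference between the two forms also shows up when the list holds only one name. A small sketch with an illustrative table (`df_demo` is a stand-in for the table built earlier):

```python
import pandas as pd

df_demo = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'c']},
                       index=['row_%d' % i for i in range(3)])

single = df_demo['col1']      # a column name returns a Series
as_frame = df_demo[['col1']]  # a one-element list returns a DataFrame

print(type(single).__name__)         # Series
print(type(as_frame).__name__)       # DataFrame
print(single.shape, as_frame.shape)  # (3,) (3, 1)
```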


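## Common basic functions

For illustration, the rest of this chapter and the remaining chapters use a synthetic dataset, `pandas_starter.csv`, which records H&M user behavior data.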


In [48]:
df = pd.read_csv('./data/pandas_starter.csv')

### Summary functions

`head` and `tail` return the first `n` and the last `n` rows of a table or series, respectively, where `n` defaults to 5:

In [49]:
df.head(3)

Unnamed: 0,t_dat,customer_id,article_id,price,sales_channel_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,...,section_name,garment_group_no,garment_group_name,detail_desc,FN,Active,club_member_status,fashion_news_frequency,age,postal_code
0,2018-09-20,03d0011487606c37c1b1ed147fc72f285a50c05f00b971...,668766002,0.042356,2,668766,Roger,258,Blouse,Garment Upper body,...,Womens Casual,1010,Blouses,Blouse in an airy modal and cotton weave with ...,1.0,1.0,ACTIVE,Regularly,51.0,8db52856d17c197683efbc9d5ef2dc873aaf7062486b2d...
1,2018-09-20,03d0011487606c37c1b1ed147fc72f285a50c05f00b971...,652946001,0.050831,2,652946,&DENIM Bootcut RW Speed,272,Trousers,Garment Lower body,...,Ladies Denim,1016,Trousers Denim,5-pocket jeans in washed stretch denim with a ...,1.0,1.0,ACTIVE,Regularly,51.0,8db52856d17c197683efbc9d5ef2dc873aaf7062486b2d...
2,2018-09-20,03d0011487606c37c1b1ed147fc72f285a50c05f00b971...,691275008,0.06778,2,691275,Waves blouse,258,Blouse,Garment Upper body,...,Womens Trend,1010,Blouses,"Blouse in an airy jacquard weave with a small,...",1.0,1.0,ACTIVE,Regularly,51.0,8db52856d17c197683efbc9d5ef2dc873aaf7062486b2d...


In [16]:
df.tail(3)

Unnamed: 0,t_dat,customer_id,article_id,price,sales_channel_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,...,section_name,garment_group_no,garment_group_name,detail_desc,FN,Active,club_member_status,fashion_news_frequency,age,postal_code
50721,2020-09-22,e97c3a6c680cd3569df10f901a61fdffaf8f70300f6adf...,896064002,0.049627,1,896064,Volpino slacks,272,Trousers,Garment Lower body,...,Womens Tailoring,1009,Trousers,Ankle-length cigarette trousers in woven fabri...,1.0,1.0,ACTIVE,Regularly,52.0,738401fe6e5701e1fc827de90e69599119e434ca477f67...
50722,2020-09-22,f137c16fd175271922dad4006565503952f24750a57388...,752814020,0.033881,2,752814,Milk RW slack,272,Trousers,Garment Lower body,...,Womens Everyday Collection,1009,Trousers,Cigarette trousers in stretch twill with a reg...,1.0,1.0,ACTIVE,Regularly,70.0,3455b39b24a47ae0262c91c5728ab9ddcfccc43628291e...
50723,2020-09-22,f137c16fd175271922dad4006565503952f24750a57388...,906633002,0.050831,2,906633,Isach Loafer,86,Ballerinas,Shoes,...,Womens Shoes,1020,Shoes,Loafers in imitation leather with a decorative...,1.0,1.0,ACTIVE,Regularly,70.0,3455b39b24a47ae0262c91c5728ab9ddcfccc43628291e...


`info` returns an overview of the table, and `describe` returns the main statistics for its numeric columns:

In [50]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50724 entries, 0 to 50723
Data columns (total 35 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   t_dat                         50724 non-null  object 
 1   customer_id                   50724 non-null  object 
 2   article_id                    50724 non-null  int64  
 3   price                         50724 non-null  float64
 4   sales_channel_id              50724 non-null  int64  
 5   product_code                  50724 non-null  int64  
 6   prod_name                     50724 non-null  object 
 7   product_type_no               50724 non-null  int64  
 8   product_type_name             50724 non-null  object 
 9   product_group_name            50724 non-null  object 
 10  graphical_appearance_no       50724 non-null  int64  
 11  graphical_appearance_name     50724 non-null  object 
 12  colour_group_code             50724 non-null  int64  
 13  c

In [51]:
df.describe()

Unnamed: 0,article_id,price,sales_channel_id,product_code,product_type_no,graphical_appearance_no,colour_group_code,perceived_colour_value_id,perceived_colour_master_id,department_no,index_group_no,section_no,garment_group_no,FN,Active,age
count,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,50724.0,33282.0,33282.0,50724.0
mean,728694600.0,0.031458,1.940028,728694.642635,243.795757,1009556.0,25.657637,3.138061,7.771469,2531.063284,2.224608,33.987501,1010.154976,1.0,1.0,41.983519
std,121382300.0,0.022794,0.237437,121382.290101,63.959872,21502.58,25.499722,1.479604,4.988244,1847.173009,4.537081,23.046094,6.281734,0.0,0.0,13.989001
min,108775000.0,0.000508,1.0,108775.0,-1.0,-1.0,-1.0,-1.0,-1.0,1201.0,1.0,2.0,1001.0,1.0,1.0,22.0
25%,671633000.0,0.016932,2.0,671633.0,253.0,1010010.0,9.0,2.0,5.0,1543.0,1.0,15.0,1005.0,1.0,1.0,28.0
50%,742058500.0,0.025407,2.0,742058.5,263.0,1010016.0,11.0,4.0,5.0,1666.0,1.0,19.0,1009.0,1.0,1.0,43.0
75%,816586000.0,0.040661,2.0,816586.0,272.0,1010016.0,42.0,4.0,11.0,3209.0,2.0,57.0,1016.0,1.0,1.0,53.0
max,956217000.0,0.50678,2.0,956217.0,761.0,1010029.0,93.0,7.0,20.0,9989.0,26.0,97.0,1025.0,1.0,1.0,70.0


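### Statistical functions

- `count()`: counts the non-null elements.
- `min()`, `max()`: compute the minimum and the maximum.
- `sum()`, `mean()`: compute the sum and the mean.
- `quantile()`: computes a quantile.
- `idxmax()`: returns the index of the largest value.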

In [19]:
df['customer_id'].count()

50724

In [20]:
df['price'].min()

0.0005084745762711

In [21]:
df['price'].max()

0.5067796610169492

In [22]:
df['price'].sum()

1595.6556101694891

In [23]:
df['price'].mean()

0.031457606067531924

In [24]:
df.quantile(0.75, numeric_only=True)  # restrict the calculation to numeric columns

article_id                    8.165860e+08
price                         4.066102e-02
sales_channel_id              2.000000e+00
product_code                  8.165860e+05
product_type_no               2.720000e+02
graphical_appearance_no       1.010016e+06
colour_group_code             4.200000e+01
perceived_colour_value_id     4.000000e+00
perceived_colour_master_id    1.100000e+01
department_no                 3.209000e+03
index_group_no                2.000000e+00
section_no                    5.700000e+01
garment_group_no              1.016000e+03
FN                            1.000000e+00
Active                        1.000000e+00
age                           5.300000e+01
Name: 0.75, dtype: float64

In [25]:
df['price'].idxmax()  # idxmin is the counterpart for the smallest value

48988
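
The same functions can be tried without the dataset. This sketch uses a few made-up prices (not taken from `pandas_starter.csv`) and includes one missing value to show how it is handled:

```python
import pandas as pd

# Illustrative prices; None becomes a missing value (NaN).
toy = pd.DataFrame({'item': ['sock', 'top', 'dress', 'scarf'],
                    'price': [1.0, 3.0, 8.0, None]})

print(toy['price'].count())        # 3: the missing value is not counted
print(toy['price'].min())          # 1.0
print(toy['price'].max())          # 8.0
print(toy['price'].sum())          # 12.0
print(toy['price'].mean())         # 4.0
print(toy['price'].quantile(0.5))  # 3.0
print(toy['price'].idxmax())       # 2
print(toy['price'].idxmin())       # 0
```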

### Unique-value functions

Calling `unique` and `nunique` on a series returns an array of its unique values and the number of unique values, respectively:

In [26]:
df['customer_id'].unique()

array(['03d0011487606c37c1b1ed147fc72f285a50c05f00b9712e0fc3da400c864296',
       '1320d4b3dd6481cde05bb80fb7ca37397f70470b9afb96aeca5d41175acaf836',
       '1f09f1593c106b2b171e201a79e922f83ddacfdb690a0d8d382c2b7d03d0a5cb',
       '30d1e9b6378a74a740f64c3d34f1686693d0430b03c6cd602d58062e604373d0',
       '49beaacac0c7801c2ce2d189efe525fe80b5d37e46ed05b50a4cd88e34d0748f',
       '6cc121e5cc202d2bf344ffe795002bdbf87178054bcda2e57161f0ef810a4b55',
       '7f0ac4394297dc4a885d3b9277ba526cbbfbf7fb7cae465b256ed8e55b864f03',
       'ac078972395fc6edd27d6db647ec2fcf149cefd4c2fae6b5076a9fa7d2124406',
       'ae414fa70eb3c2ae0ec1640bbdf588531c8d034dfc5c06000e19601be6cd34fd',
       'c3bbcf0011725dd1012eef2f6f16e28bc3d0eb2b25a4b7c9f4f98cf97821d83d',
       'f137c16fd175271922dad4006565503952f24750a57388fe24970a218c62de6a',
       '0bf4c6fd4e9d33f9bfb807bb78348cbf5c565846ff4006acf5c1b9aea77b0e54',
       '157eee38676eebb003bf97407f26e369de192997ab3902c194ce2690f060ff50',
       '6881f635c5be05506

In [52]:
df['customer_id'].nunique()

50

`value_counts` returns the unique values together with how often each one occurs:

In [53]:
df['customer_id'].value_counts()

be1981ab818cf4ef6765b2ecaea7a2cbf14ccd6e8a7ee985513d9e8e53c6d91b    1895
b4db5e5259234574edfff958e170fe3a5e13b6f146752ca066abca3c156acc71    1441
49beaacac0c7801c2ce2d189efe525fe80b5d37e46ed05b50a4cd88e34d0748f    1364
a65f77281a528bf5c1e9f270141d601d116e1df33bf9df512f495ee06647a9cc    1361
cd04ec2726dd58a8c753e0d6423e57716fd9ebcf2f14ed6012e7e5bea016b4d6    1237
55d15396193dfd45836af3a6269a079efea339e875eff42cc0c228b002548a9d    1208
c140410d72a41ee5e2e3ba3d7f5a860f337f1b5e41c27cf9bda5517c8774f8fa    1170
8df45859ccd71ef1e48e2ee9d1c65d5728c31c46ae957d659fa4e5c3af6cc076    1169
03d0011487606c37c1b1ed147fc72f285a50c05f00b9712e0fc3da400c864296    1157
6cc121e5cc202d2bf344ffe795002bdbf87178054bcda2e57161f0ef810a4b55    1143
e34f8aa5e7c8c258523ea3e5f5f13168b6c21a9e8bffccd515dd5cef56126efb    1117
3493c55a7fe252c84a9a03db338f5be7afbce1edbca12f3a908fac9b983692f2    1115
0bf4c6fd4e9d33f9bfb807bb78348cbf5c565846ff4006acf5c1b9aea77b0e54    1099
e6498c7514c61d3c24669f49753dc83fdff3ec1ba13902dd918
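
Because the customer IDs above are long hashes, the three functions are easier to compare on a short, made-up series (the `u1`, `u2`, `u3` labels are illustrative):

```python
import pandas as pd

customers = pd.Series(['u1', 'u2', 'u1', 'u3', 'u1', 'u2'], name='customer_id')

print(customers.unique())        # distinct values in order of first appearance
print(customers.nunique())       # 3
print(customers.value_counts())  # u1 -> 3, u2 -> 2, u3 -> 1, most frequent first
```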

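To inspect the unique combinations across several columns, use `drop_duplicates`. Its key parameter is `keep`: the default `'first'` keeps the row where each combination first appears, `'last'` keeps the row where it last appears, and `False` drops every row whose combination is duplicated.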

In [54]:
df.drop_duplicates(['customer_id'],keep='first')

Unnamed: 0,t_dat,customer_id,article_id,price,sales_channel_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,...,section_name,garment_group_no,garment_group_name,detail_desc,FN,Active,club_member_status,fashion_news_frequency,age,postal_code
0,2018-09-20,03d0011487606c37c1b1ed147fc72f285a50c05f00b971...,668766002,0.042356,2,668766,Roger,258,Blouse,Garment Upper body,...,Womens Casual,1010,Blouses,Blouse in an airy modal and cotton weave with ...,1.0,1.0,ACTIVE,Regularly,51.0,8db52856d17c197683efbc9d5ef2dc873aaf7062486b2d...
3,2018-09-20,1320d4b3dd6481cde05bb80fb7ca37397f70470b9afb96...,501820043,0.016932,2,501820,SIRPA,252,Sweater,Garment Upper body,...,Divided Collection,1003,Knitwear,Jumper in a soft knit with a slightly wider ne...,1.0,1.0,ACTIVE,Regularly,54.0,da2dffc9d9cb6a1449dae3835ecb74cdf826ba152df3a0...
11,2018-09-20,1f09f1593c106b2b171e201a79e922f83ddacfdb690a0d...,598848006,0.030492,2,598848,Dylan,252,Sweater,Garment Upper body,...,Womens Everyday Basics,1003,Knitwear,Jumper in a soft rib knit containing some mohair.,1.0,1.0,ACTIVE,Regularly,27.0,db877358ce25bab258bce9f89ad718a908bb28eb7171e2...
14,2018-09-20,30d1e9b6378a74a740f64c3d34f1686693d0430b03c6cd...,686631001,0.033881,2,686631,Maggie RW tapered,272,Trousers,Garment Lower body,...,Womens Everyday Collection,1009,Trousers,,,,ACTIVE,NONE,46.0,0c0552e4851095100d21ea87138b747f5a8ddad6e592ab...
19,2018-09-20,49beaacac0c7801c2ce2d189efe525fe80b5d37e46ed05...,568597012,0.023712,2,568597,Hayes,272,Trousers,Garment Lower body,...,Womens Tailoring,1009,Trousers,Suit trousers in a stretch weave with a regula...,,,ACTIVE,NONE,28.0,ab724d6cb2340bd9c5294fd7f2811349f6509a27a8bc5c...
20,2018-09-20,6cc121e5cc202d2bf344ffe795002bdbf87178054bcda2...,609605006,0.040661,2,609605,Forever sweater,252,Sweater,Garment Upper body,...,Womens Tailoring,1003,Knitwear,"Jumper in a fine, sturdy knit with buttons on ...",1.0,1.0,ACTIVE,Regularly,32.0,6480abda4aa42dcf1f5c4597c616bc53365825f1cbdf68...
22,2018-09-20,7f0ac4394297dc4a885d3b9277ba526cbbfbf7fb7cae46...,685687001,0.016932,2,685687,W YODA KNIT OL OFFER,252,Sweater,Garment Upper body,...,Womens Everyday Collection,1023,Special Offers,V-neck knitted jumper with long sleeves and ri...,1.0,1.0,ACTIVE,Regularly,33.0,2c170eef5d71c660d26538cc8ba016cf8a65274925d82a...
29,2018-09-20,ac078972395fc6edd27d6db647ec2fcf149cefd4c2fae6...,654306001,0.047441,2,654306,Spanx alot lace swimsuit,57,Swimsuit,Swimwear,...,"Womens Swimwear, beachwear",1018,Swimwear,"Fully lined, lace shaping swimsuit that has a ...",1.0,1.0,ACTIVE,Regularly,28.0,0cc530c0cb9280137cdb39c4ebadf9785c8c0c2973999c...
35,2018-09-20,ae414fa70eb3c2ae0ec1640bbdf588531c8d034dfc5c06...,685687004,0.016932,2,685687,W YODA KNIT OL OFFER,252,Sweater,Garment Upper body,...,Womens Everyday Collection,1023,Special Offers,V-neck knitted jumper with long sleeves and ri...,1.0,1.0,ACTIVE,Regularly,46.0,8a100d26b9040246e1d4331b5f695bfc054140ef4e8d4b...
40,2018-09-20,c3bbcf0011725dd1012eef2f6f16e28bc3d0eb2b25a4b7...,650710001,0.013542,2,650710,MIRANDA mini pouch,78,Other accessories,Accessories,...,Divided Accessories,1019,Accessories,Pouch bag in grained imitation leather with a ...,,,ACTIVE,NONE,23.0,3a5367bc8e356cff4cd9ad8c88c68c59522d6b8e202804...
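
The three `keep` options can be compared side by side on a small, made-up table; the surviving row labels show which occurrences are kept:

```python
import pandas as pd

toy = pd.DataFrame({'customer_id': ['u1', 'u2', 'u1', 'u3'],
                    'price': [1.0, 2.0, 3.0, 4.0]})

first = toy.drop_duplicates(['customer_id'], keep='first')
last = toy.drop_duplicates(['customer_id'], keep='last')
none = toy.drop_duplicates(['customer_id'], keep=False)

print(first.index.tolist())  # [0, 1, 3]
print(last.index.tolist())   # [1, 2, 3]
print(none.index.tolist())   # [1, 3]
```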


### Sorting functions

- `sort_values()`: sorts by value.
- `sort_index()`: sorts by index.
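
As a compact sketch of both functions, the following uses a small, made-up table. Passing a list of columns with a matching `ascending` list, as in the second call, is one way to sort by several keys at once:

```python
import pandas as pd

toy = pd.DataFrame({'group': ['b', 'a', 'b', 'a'],
                    'price': [2.0, 5.0, 1.0, 3.0]},
                   index=[10, 30, 20, 40])

# Sort by one column, ascending by default.
by_price = toy.sort_values('price')
# Sort by group ascending, then by price descending within each group.
by_group_then_price = toy.sort_values(['group', 'price'], ascending=[True, False])
# Sort by the row index in descending order.
by_index = toy.sort_index(ascending=False)

print(by_price.index.tolist())             # [20, 10, 40, 30]
print(by_group_then_price.index.tolist())  # [30, 40, 10, 20]
print(by_index.index.tolist())             # [40, 30, 20, 10]
```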

In [55]:
# Sort by value in ascending order
df.sort_values('price').head()

Unnamed: 0,t_dat,customer_id,article_id,price,sales_channel_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,...,section_name,garment_group_no,garment_group_name,detail_desc,FN,Active,club_member_status,fashion_news_frequency,age,postal_code
31056,2019-12-23,0d4fb6fb46dfe2759bcf7bc80340e8915b207aa2f74b5b...,742947003,0.000508,1,742947,20p terrys,512,Hair ties,Accessories,...,Womens Small accessories,1019,Accessories,Hair elastics without metal clips. Diameter 4....,1.0,1.0,ACTIVE,Regularly,40.0,e79ad63d14d9cb5b2f10c654229ece674a360c81672fbf...
31074,2019-12-23,0d4fb6fb46dfe2759bcf7bc80340e8915b207aa2f74b5b...,742947002,0.000508,1,742947,20p terrys,512,Hair ties,Accessories,...,Womens Small accessories,1019,Accessories,Hair elastics without metal clips. Diameter 4....,1.0,1.0,ACTIVE,Regularly,40.0,e79ad63d14d9cb5b2f10c654229ece674a360c81672fbf...
24703,2019-09-12,be1981ab818cf4ef6765b2ecaea7a2cbf14ccd6e8a7ee9...,214844002,0.000847,1,214844,30p pins,72,Hair clip,Accessories,...,Womens Small accessories,1019,Accessories,Metal hair grips. Length 5 cm.,,,ACTIVE,NONE,31.0,67851f0456e7070c20c713fe0f47eb15bcbf2a59d13b79...
46174,2020-07-20,1df07f916d7f648458702bd0b612caee88f1fb4cd1b660...,689866001,0.001,1,689866,Sofia headband,74,Hair/alice band,Accessories,...,Womens Small accessories,1019,Accessories,Hairband in jersey with a sewn-on knot-detail ...,,,ACTIVE,NONE,23.0,207d6b8e53f6efaad75e7b3b5dd51770c2285e804a4453...
45513,2020-07-11,be1981ab818cf4ef6765b2ecaea7a2cbf14ccd6e8a7ee9...,809320001,0.001254,1,809320,1p Fancy edge sock,302,Socks,Socks & Tights,...,"Womens Nightwear, Socks & Tigh",1021,Socks and Tights,"Socks in a soft rib knit with elasticated, fri...",,,ACTIVE,NONE,31.0,67851f0456e7070c20c713fe0f47eb15bcbf2a59d13b79...


In [31]:
# Sort by value in descending order
df.sort_values('price', ascending=False).head()

Unnamed: 0,t_dat,customer_id,article_id,price,sales_channel_id,product_code,prod_name,product_type_no,product_type_name,product_group_name,...,section_name,garment_group_no,garment_group_name,detail_desc,FN,Active,club_member_status,fashion_news_frequency,age,postal_code
48988,2020-08-24,863f0e03da282ae32a76775ce55d8a4605a85c84a26066...,916300002,0.50678,2,916300,PQ OLGA LEATHER DRESS,265,Dress,Garment Full body,...,Womens Premium,1001,Unknown,Knee-length shirt dress in soft leather with a...,,,ACTIVE,NONE,27.0,b6cd6d4a8a029ceb76b9bb83781c4bfabd02fdf4253092...
48437,2020-08-14,863f0e03da282ae32a76775ce55d8a4605a85c84a26066...,917509001,0.422034,2,917509,PQ BODEN LEATHER BLOUSE,258,Blouse,Garment Upper body,...,Womens Premium,1001,Unknown,Wide blouse in soft leather with a grandad col...,,,ACTIVE,NONE,27.0,b6cd6d4a8a029ceb76b9bb83781c4bfabd02fdf4253092...
27244,2019-10-19,84c34f4f564db1f437943c77af41f83bf6fd7c01701cbb...,776716001,0.422034,2,776716,D1 PE SAFFRON LEATHER JACK,262,Jacket,Garment Upper body,...,Collaborations,1001,Unknown,"Lightly padded, straight-cut leather jacket wi...",1.0,1.0,ACTIVE,Regularly,53.0,5f128cfa35124fbbaf9a782510be2b8410df3bab1f482d...
49908,2020-09-09,863f0e03da282ae32a76775ce55d8a4605a85c84a26066...,917509002,0.422034,2,917509,PQ BODEN LEATHER BLOUSE,258,Blouse,Garment Upper body,...,Womens Premium,1001,Unknown,Wide blouse in soft leather with a grandad col...,,,ACTIVE,NONE,27.0,b6cd6d4a8a029ceb76b9bb83781c4bfabd02fdf4253092...
27245,2019-10-19,84c34f4f564db1f437943c77af41f83bf6fd7c01701cbb...,776716001,0.422034,2,776716,D1 PE SAFFRON LEATHER JACK,262,Jacket,Garment Upper body,...,Collaborations,1001,Unknown,"Lightly padded, straight-cut leather jacket wi...",1.0,1.0,ACTIVE,Regularly,53.0,5f128cfa35124fbbaf9a782510be2b8410df3bab1f482d...


In [32]:
# Sort by index in descending order
print(df.sort_index(ascending=False).head())

            t_dat                                        customer_id  \
50723  2020-09-22  f137c16fd175271922dad4006565503952f24750a57388...   
50722  2020-09-22  f137c16fd175271922dad4006565503952f24750a57388...   
50721  2020-09-22  e97c3a6c680cd3569df10f901a61fdffaf8f70300f6adf...   
50720  2020-09-22  e97c3a6c680cd3569df10f901a61fdffaf8f70300f6adf...   
50719  2020-09-22  e97c3a6c680cd3569df10f901a61fdffaf8f70300f6adf...   

       article_id     price  sales_channel_id  product_code       prod_name  \
50723   906633002  0.050831                 2        906633    Isach Loafer   
50722   752814020  0.033881                 2        752814   Milk RW slack   
50721   896064002  0.049627                 1        896064  Volpino slacks   
50720   896064002  0.049627                 1        896064  Volpino slacks   
50719   892857002  0.033102                 1        892857    Florence top   

       product_type_no product_type_name  product_group_name  ...  \
50723               86 