## Pandas数据排序

Series的排序：  
***Series.sort_values(ascending=True, inplace=False)***  
参数说明：
* ascending：默认为True升序排序，为False降序排序
* inplace：是否修改原始Series

DataFrame的排序：  
***DataFrame.sort_values(by, ascending=True, inplace=False)***  
参数说明：
* by：字符串或者List<字符串>，单列排序或者多列排序
* ascending：bool或者List<bool>，升序还是降序，如果是list对应by的多列
* inplace：是否修改原始DataFrame

In [2]:
import pandas as pd

### 0、读取数据

In [3]:
fpath = "./datas/beijing_tianqi/beijing_tianqi_2018.csv"
df = pd.read_csv(fpath)

# 替换掉温度的后缀℃
df.loc[:, "bWendu"] = df["bWendu"].str.replace("℃", "").astype('int32')
df.loc[:, "yWendu"] = df["yWendu"].str.replace("℃", "").astype('int32')

In [4]:
df.head()

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
0,2018-01-01,3,-6,晴~多云,东北风,1-2级,59,良,2
1,2018-01-02,2,-5,阴~多云,东北风,1-2级,49,优,1
2,2018-01-03,2,-5,多云,北风,1-2级,28,优,1
3,2018-01-04,0,-8,阴,东北风,1-2级,28,优,1
4,2018-01-05,3,-6,多云~晴,西北风,1-2级,50,优,1


### 1、Series的排序

In [5]:
df["aqi"].sort_values()

271     21
281     21
249     22
272     22
301     22
      ... 
317    266
71     287
91     287
72     293
86     387
Name: aqi, Length: 365, dtype: int64

In [6]:
df["aqi"].sort_values(ascending=False)

86     387
72     293
91     287
71     287
317    266
      ... 
301     22
272     22
249     22
281     21
271     21
Name: aqi, Length: 365, dtype: int64

In [8]:
df["tianqi"].sort_values()

225     中雨~小雨
230     中雨~小雨
197    中雨~雷阵雨
196    中雨~雷阵雨
112        多云
        ...  
191    雷阵雨~大雨
219     雷阵雨~阴
335      雾~多云
353         霾
348         霾
Name: tianqi, Length: 365, dtype: object

### 2、DataFrame的排序

#### 2.1 单列排序

In [9]:
df.sort_values(by="aqi")

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
271,2018-09-29,22,11,晴,北风,3-4级,21,优,1
281,2018-10-09,15,4,多云~晴,西北风,4-5级,21,优,1
249,2018-09-07,27,16,晴,西北风,3-4级,22,优,1
272,2018-09-30,19,13,多云,西北风,4-5级,22,优,1
301,2018-10-29,15,3,晴,北风,3-4级,22,优,1
...,...,...,...,...,...,...,...,...,...
317,2018-11-14,13,5,多云,南风,1-2级,266,重度污染,5
71,2018-03-13,17,5,晴~多云,南风,1-2级,287,重度污染,5
91,2018-04-02,26,11,多云,北风,1-2级,287,重度污染,5
72,2018-03-14,15,6,多云~阴,东北风,1-2级,293,重度污染,5


In [10]:
df.sort_values(by="aqi", ascending=False)

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
86,2018-03-28,25,9,多云~晴,东风,1-2级,387,严重污染,6
72,2018-03-14,15,6,多云~阴,东北风,1-2级,293,重度污染,5
71,2018-03-13,17,5,晴~多云,南风,1-2级,287,重度污染,5
91,2018-04-02,26,11,多云,北风,1-2级,287,重度污染,5
317,2018-11-14,13,5,多云,南风,1-2级,266,重度污染,5
...,...,...,...,...,...,...,...,...,...
249,2018-09-07,27,16,晴,西北风,3-4级,22,优,1
301,2018-10-29,15,3,晴,北风,3-4级,22,优,1
272,2018-09-30,19,13,多云,西北风,4-5级,22,优,1
271,2018-09-29,22,11,晴,北风,3-4级,21,优,1


#### 2.2 多列排序

In [11]:
# 按空气质量等级、最高温度排序，默认升序
df.sort_values(by=["aqiLevel", "bWendu"])

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
360,2018-12-27,-5,-12,多云~晴,西北风,3级,48,优,1
22,2018-01-23,-4,-12,晴,西北风,3-4级,31,优,1
23,2018-01-24,-4,-11,晴,西南风,1-2级,34,优,1
340,2018-12-07,-4,-10,晴,西北风,3级,33,优,1
21,2018-01-22,-3,-10,小雪~多云,东风,1-2级,47,优,1
...,...,...,...,...,...,...,...,...,...
71,2018-03-13,17,5,晴~多云,南风,1-2级,287,重度污染,5
90,2018-04-01,25,11,晴~多云,南风,1-2级,218,重度污染,5
91,2018-04-02,26,11,多云,北风,1-2级,287,重度污染,5
85,2018-03-27,27,11,晴,南风,1-2级,243,重度污染,5


In [12]:
# 两个字段都是降序
df.sort_values(by=["aqiLevel", "bWendu"], ascending=False)

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
86,2018-03-28,25,9,多云~晴,东风,1-2级,387,严重污染,6
85,2018-03-27,27,11,晴,南风,1-2级,243,重度污染,5
91,2018-04-02,26,11,多云,北风,1-2级,287,重度污染,5
90,2018-04-01,25,11,晴~多云,南风,1-2级,218,重度污染,5
71,2018-03-13,17,5,晴~多云,南风,1-2级,287,重度污染,5
...,...,...,...,...,...,...,...,...,...
362,2018-12-29,-3,-12,晴,西北风,2级,29,优,1
22,2018-01-23,-4,-12,晴,西北风,3-4级,31,优,1
23,2018-01-24,-4,-11,晴,西南风,1-2级,34,优,1
340,2018-12-07,-4,-10,晴,西北风,3级,33,优,1


In [13]:
# 分别指定升序和降序
df.sort_values(by=["aqiLevel", "bWendu"], ascending=[True, False])

Unnamed: 0,ymd,bWendu,yWendu,tianqi,fengxiang,fengli,aqi,aqiInfo,aqiLevel
178,2018-06-28,35,24,多云~晴,北风,1-2级,33,优,1
149,2018-05-30,33,18,晴,西风,1-2级,46,优,1
206,2018-07-26,33,25,多云~雷阵雨,东北风,1-2级,40,优,1
158,2018-06-08,32,19,多云~雷阵雨,西南风,1-2级,43,优,1
205,2018-07-25,32,25,多云,北风,1-2级,28,优,1
...,...,...,...,...,...,...,...,...,...
317,2018-11-14,13,5,多云,南风,1-2级,266,重度污染,5
329,2018-11-26,10,0,多云,东南风,1级,245,重度污染,5
335,2018-12-02,9,2,雾~多云,东北风,1级,234,重度污染,5
57,2018-02-27,7,0,阴,东风,1-2级,220,重度污染,5


In [2]:
import pandas as pd

In [3]:
df = pd.read_excel('.\source_data.xlsx', engine='openpyxl', skiprows=1, skipfooter=7,
                       usecols=list(range(11)))

In [5]:
df

Unnamed: 0,航班日期,机型,机尾号,航班类型,起飞机场,落地机场,计划起飞,实际起飞,计划落地,实际落地,装卸机操作
0,2020-07-01,B733,B1109,进港,CAN,HGH,2020-07-01 00:05:00,2020-07-01 00:00:00,2020-07-01 01:25:00,2020-07-01 01:20:00,卸机结束:2020/07/01 01:47:23 ; 卸机开始:2020/07/01 01:...
1,2020-07-01,B733,B1110,进港,LHW,HGH,2020-07-01 00:40:00,2020-07-01 00:28:00,2020-07-01 03:20:00,2020-07-01 03:04:00,卸机结束:2020/07/01 03:29:09 ; 卸机开始:2020/07/01 03:...
2,2020-07-01,B752,B1177,离港,HGH,PEK,2020-07-01 02:05:00,2020-07-01 02:13:00,2020-07-01 04:15:00,2020-07-01 04:06:00,卸机开始:2020/07/01 04:16:51 ; 卸机结束:2020/07/01 04:...
3,2020-07-01,B752,B1252,离港,HGH,TSN,2020-07-01 04:00:00,2020-07-01 04:15:00,2020-07-01 05:50:00,2020-07-01 06:03:00,卸机结束:2020/07/01 07:00:00 ; 卸机开始:2020/07/01 06:...
4,2020-07-01,B733,B1109,离港,HGH,CAN,2020-07-01 04:10:00,2020-07-01 04:07:00,2020-07-01 05:30:00,2020-07-01 05:36:00,卸机结束:2020/07/01 06:09:17 ; 卸机开始:2020/07/01 05:...
5,2020-07-01,B733,B1110,离港,HGH,SZX,2020-07-01 05:20:00,2020-07-01 05:15:00,2020-07-01 08:05:00,2020-07-01 08:09:00,卸机结束:2020/07/01 08:33:50 ; 卸机开始:2020/07/01 08:...
6,2020-07-01,B752,B1177,进港,PEK,HGH,2020-07-01 05:25:00,2020-07-01 05:18:00,2020-07-01 07:30:00,2020-07-01 06:52:00,卸机开始:2020/07/01 07:07:30 ; 卸机结束:2020/07/01 07:...
7,2020-07-01,B752,B1252,进港,TSN,HGH,2020-07-01 23:45:00,2020-07-01 23:55:00,2020-07-02 01:25:00,2020-07-02 01:26:00,卸机开始:2020/07/02 01:33:08 ; 卸机结束:2020/07/02 01:...
8,2020-07-02,B733,B1109,进港,CAN,HGH,2020-07-02 00:05:00,2020-07-02 00:09:00,2020-07-02 01:25:00,2020-07-02 01:23:00,卸机结束:2020/07/02 01:53:33 ; 卸机开始:2020/07/02 01:...
9,2020-07-02,B733,B1110,进港,LHW,HGH,2020-07-02 00:40:00,2020-07-02 00:19:00,2020-07-02 03:20:00,2020-07-02 02:54:00,卸机结束:2020/07/02 03:38:58 ; 卸机开始:2020/07/02 03:...


In [6]:
df.sort_values(by=["机尾号", "航班类型", "实际落地", "实际起飞"], ascending=[True, False, True, True], inplace=True)

In [7]:
df

Unnamed: 0,航班日期,机型,机尾号,航班类型,起飞机场,落地机场,计划起飞,实际起飞,计划落地,实际落地,装卸机操作
0,2020-07-01,B733,B1109,进港,CAN,HGH,2020-07-01 00:05:00,2020-07-01 00:00:00,2020-07-01 01:25:00,2020-07-01 01:20:00,卸机结束:2020/07/01 01:47:23 ; 卸机开始:2020/07/01 01:...
8,2020-07-02,B733,B1109,进港,CAN,HGH,2020-07-02 00:05:00,2020-07-02 00:09:00,2020-07-02 01:25:00,2020-07-02 01:23:00,卸机结束:2020/07/02 01:53:33 ; 卸机开始:2020/07/02 01:...
4,2020-07-01,B733,B1109,离港,HGH,CAN,2020-07-01 04:10:00,2020-07-01 04:07:00,2020-07-01 05:30:00,2020-07-01 05:36:00,卸机结束:2020/07/01 06:09:17 ; 卸机开始:2020/07/01 05:...
12,2020-07-02,B733,B1109,离港,HGH,CAN,2020-07-02 04:10:00,2020-07-02 04:17:00,2020-07-02 05:30:00,2020-07-02 05:43:00,卸机结束:2020/07/02 06:15:21 ; 卸机开始:2020/07/02 06:...
1,2020-07-01,B733,B1110,进港,LHW,HGH,2020-07-01 00:40:00,2020-07-01 00:28:00,2020-07-01 03:20:00,2020-07-01 03:04:00,卸机结束:2020/07/01 03:29:09 ; 卸机开始:2020/07/01 03:...
9,2020-07-02,B733,B1110,进港,LHW,HGH,2020-07-02 00:40:00,2020-07-02 00:19:00,2020-07-02 03:20:00,2020-07-02 02:54:00,卸机结束:2020/07/02 03:38:58 ; 卸机开始:2020/07/02 03:...
5,2020-07-01,B733,B1110,离港,HGH,SZX,2020-07-01 05:20:00,2020-07-01 05:15:00,2020-07-01 08:05:00,2020-07-01 08:09:00,卸机结束:2020/07/01 08:33:50 ; 卸机开始:2020/07/01 08:...
13,2020-07-02,B733,B1110,离港,HGH,SZX,2020-07-02 05:20:00,2020-07-02 05:14:00,2020-07-02 08:05:00,2020-07-02 08:03:00,卸机结束:2020/07/02 08:36:07 ; 卸机开始:2020/07/02 08:...
6,2020-07-01,B752,B1177,进港,PEK,HGH,2020-07-01 05:25:00,2020-07-01 05:18:00,2020-07-01 07:30:00,2020-07-01 06:52:00,卸机开始:2020/07/01 07:07:30 ; 卸机结束:2020/07/01 07:...
14,2020-07-02,B752,B1177,进港,PEK,HGH,2020-07-02 05:25:00,2020-07-02 05:26:00,2020-07-02 07:30:00,2020-07-02 07:07:00,卸机开始:2020/07/02 07:17:30 ; 卸机结束:2020/07/02 08:...


In [44]:
def find_operation_time(operation_info, action):
    for each in operation_info:
        if action in each:
            return each[6:].strip()

def create_new_df(df):
    new_df_data = {
        '航班日期': [],
        '机型': [],
        '机尾号': [],
        '起飞机场': [],
        '计划落地时间': [],
        '实际落地时间': [],
        '卸机开始时间': [],
        '卸机结束时间': [],
        '落地机场': [],
        '装机开始时间': [],
        '装机结束时间': [],
        '计划起飞时间': [],
        '实际起飞时间': [],
    }
    # 获取机尾号列表
    aircraft_tail_nums = list(set([_d['机尾号'] for idx, _d in df.iterrows()]))
    aircraft_tail_nums.sort()
    for each_num in aircraft_tail_nums:
        each_aircraft_info = df.loc[df["机尾号"] == each_num, :]
        for inx, _d in each_aircraft_info.iterrows():
            operation_info = _d['装卸机操作'].split(';')
            if _d['航班日期'].day == 1:
                # 1号数据
                if _d['航班类型'] == '进港':
                    # 1号进港数据
                    new_df_data['航班日期'].append(_d['航班日期'])
                    new_df_data['机型'].append(_d['机型'])
                    new_df_data['机尾号'].append(_d['机尾号'])
                    new_df_data['起飞机场'].append(_d['起飞机场'])
                    new_df_data['计划落地时间'].append(_d['计划落地'])
                    new_df_data['实际落地时间'].append(_d['实际落地'])
                    new_df_data['卸机开始时间'].append(find_operation_time(operation_info, '卸机开始'))
                    new_df_data['卸机结束时间'].append(find_operation_time(operation_info, '卸机结束'))
                else:
                    # 1号离港数据
                    new_df_data['落地机场'].append(_d['落地机场'])
                    new_df_data['装机开始时间'].append(find_operation_time(operation_info, '装机开始'))
                    new_df_data['装机结束时间'].append(find_operation_time(operation_info, '装机结束'))
                    new_df_data['计划起飞时间'].append(_d['计划起飞'])
                    new_df_data['实际起飞时间'].append(_d['实际起飞'])
            else:
                # 2号数据
                if _d['航班类型'] == '进港':
                    # 2号进港数据
                    new_df_data['航班日期'].append(_d['航班日期'])
                    new_df_data['机型'].append(_d['机型'])
                    new_df_data['机尾号'].append(_d['机尾号'])
                    new_df_data['起飞机场'].append(_d['起飞机场'])
                    new_df_data['计划落地时间'].append(_d['计划落地'])
                    new_df_data['实际落地时间'].append(_d['实际落地'])
                    new_df_data['卸机开始时间'].append(find_operation_time(operation_info, '卸机开始'))
                    new_df_data['卸机结束时间'].append(find_operation_time(operation_info, '卸机结束'))
                else:
                    # 2号离港数据
                    new_df_data['落地机场'].append(_d['落地机场'])
                    new_df_data['装机开始时间'].append(find_operation_time(operation_info, '装机开始'))
                    new_df_data['装机结束时间'].append(find_operation_time(operation_info, '装机结束'))
                    new_df_data['计划起飞时间'].append(_d['计划起飞'])
                    new_df_data['实际起飞时间'].append(_d['实际起飞'])
    return pd.DataFrame(data=new_df_data)

In [45]:
new_df = create_new_df(df)

In [46]:
new_df

Unnamed: 0,航班日期,机型,机尾号,起飞机场,计划落地时间,实际落地时间,卸机开始时间,卸机结束时间,落地机场,装机开始时间,装机结束时间,计划起飞时间,实际起飞时间
0,2020-07-01,B733,B1109,CAN,2020-07-01 01:25:00,2020-07-01 01:20:00,2020/07/01 01:31:20,020/07/01 01:47:23,CAN,2020/07/01 02:21:14,2020/07/01 03:23:22,2020-07-01 04:10:00,2020-07-01 04:07:00
1,2020-07-02,B733,B1109,CAN,2020-07-02 01:25:00,2020-07-02 01:23:00,2020/07/02 01:34:54,020/07/02 01:53:33,CAN,2020/07/02 02:55:22,2020/07/02 03:46:52,2020-07-02 04:10:00,2020-07-02 04:17:00
2,2020-07-01,B733,B1110,LHW,2020-07-01 03:20:00,2020-07-01 03:04:00,2020/07/01 03:16:03,020/07/01 03:29:09,SZX,2020/07/01 03:50:46,2020/07/01 04:49:19,2020-07-01 05:20:00,2020-07-01 05:15:00
3,2020-07-02,B733,B1110,LHW,2020-07-02 03:20:00,2020-07-02 02:54:00,2020/07/02 03:30:33,020/07/02 03:38:58,SZX,2020/07/02 04:29:13,2020/07/02 04:53:58,2020-07-02 05:20:00,2020-07-02 05:14:00
4,2020-07-01,B752,B1177,PEK,2020-07-01 07:30:00,2020-07-01 06:52:00,020/07/01 07:07:30,2020/07/01 07:25:02,PEK,2020/07/01 01:09:26,2020/07/01 01:39:38,2020-07-01 02:05:00,2020-07-01 02:13:00
5,2020-07-02,B752,B1177,PEK,2020-07-02 07:30:00,2020-07-02 07:07:00,020/07/02 07:17:30,2020/07/02 08:14:51,PEK,2020/07/02 00:48:26,2020/07/02 01:28:05,2020-07-02 02:05:00,2020-07-02 02:00:00
6,2020-07-01,B752,B1252,TSN,2020-07-02 01:25:00,2020-07-02 01:26:00,020/07/02 01:33:08,2020/07/02 01:33:10,TSN,2020/07/01 03:08:00,2020/07/01 03:53:52,2020-07-01 04:00:00,2020-07-01 04:15:00
7,2020-07-02,B752,B1252,JJN,2020-07-03 01:25:00,2020-07-03 02:02:00,020/07/03 04:31:32,2020/07/03 02:30:16,JJN,2020/07/02 03:02:00,2020/07/02 03:40:00,2020-07-02 04:00:00,2020-07-02 04:20:00


In [47]:
new_df.loc[:, "进港延误"] = new_df["实际落地时间"] - new_df["计划落地时间"]
new_df.loc[:, "离港延误"] = new_df["实际起飞时间"] - new_df["计划起飞时间"]

In [48]:
new_df

Unnamed: 0,航班日期,机型,机尾号,起飞机场,计划落地时间,实际落地时间,卸机开始时间,卸机结束时间,落地机场,装机开始时间,装机结束时间,计划起飞时间,实际起飞时间,进港延误,离港延误
0,2020-07-01,B733,B1109,CAN,2020-07-01 01:25:00,2020-07-01 01:20:00,2020/07/01 01:31:20,020/07/01 01:47:23,CAN,2020/07/01 02:21:14,2020/07/01 03:23:22,2020-07-01 04:10:00,2020-07-01 04:07:00,-1 days +23:55:00,-1 days +23:57:00
1,2020-07-02,B733,B1109,CAN,2020-07-02 01:25:00,2020-07-02 01:23:00,2020/07/02 01:34:54,020/07/02 01:53:33,CAN,2020/07/02 02:55:22,2020/07/02 03:46:52,2020-07-02 04:10:00,2020-07-02 04:17:00,-1 days +23:58:00,0 days 00:07:00
2,2020-07-01,B733,B1110,LHW,2020-07-01 03:20:00,2020-07-01 03:04:00,2020/07/01 03:16:03,020/07/01 03:29:09,SZX,2020/07/01 03:50:46,2020/07/01 04:49:19,2020-07-01 05:20:00,2020-07-01 05:15:00,-1 days +23:44:00,-1 days +23:55:00
3,2020-07-02,B733,B1110,LHW,2020-07-02 03:20:00,2020-07-02 02:54:00,2020/07/02 03:30:33,020/07/02 03:38:58,SZX,2020/07/02 04:29:13,2020/07/02 04:53:58,2020-07-02 05:20:00,2020-07-02 05:14:00,-1 days +23:34:00,-1 days +23:54:00
4,2020-07-01,B752,B1177,PEK,2020-07-01 07:30:00,2020-07-01 06:52:00,020/07/01 07:07:30,2020/07/01 07:25:02,PEK,2020/07/01 01:09:26,2020/07/01 01:39:38,2020-07-01 02:05:00,2020-07-01 02:13:00,-1 days +23:22:00,0 days 00:08:00
5,2020-07-02,B752,B1177,PEK,2020-07-02 07:30:00,2020-07-02 07:07:00,020/07/02 07:17:30,2020/07/02 08:14:51,PEK,2020/07/02 00:48:26,2020/07/02 01:28:05,2020-07-02 02:05:00,2020-07-02 02:00:00,-1 days +23:37:00,-1 days +23:55:00
6,2020-07-01,B752,B1252,TSN,2020-07-02 01:25:00,2020-07-02 01:26:00,020/07/02 01:33:08,2020/07/02 01:33:10,TSN,2020/07/01 03:08:00,2020/07/01 03:53:52,2020-07-01 04:00:00,2020-07-01 04:15:00,0 days 00:01:00,0 days 00:15:00
7,2020-07-02,B752,B1252,JJN,2020-07-03 01:25:00,2020-07-03 02:02:00,020/07/03 04:31:32,2020/07/03 02:30:16,JJN,2020/07/02 03:02:00,2020/07/02 03:40:00,2020-07-02 04:00:00,2020-07-02 04:20:00,0 days 00:37:00,0 days 00:20:00


In [24]:
writer = pd.ExcelWriter('cleaned_data.xlsx', engine='openpyxl', encoding='utf-8')
new_df.to_excel(writer)
writer.save()

In [36]:
new_df['航班日期'].to_string()

'0   2020-07-01\n1   2020-07-02\n2   2020-07-01\n3   2020-07-02\n4   2020-07-01\n5   2020-07-02\n6   2020-07-01\n7   2020-07-02'

Unnamed: 0,航班日期,机型,机尾号,起飞机场,计划落地时间,实际落地时间,卸机开始时间,卸机结束时间,落地机场,装机开始时间,装机结束时间,计划起飞时间,实际起飞时间,进港延误,离港延误
0,2020-07-01,B733,B1109,10,2020-07-01 01:25:00,2020-07-01 01:20:00,2020/07/01 01:31:20,020/07/01 01:47:23,15,2020/07/01 02:21:14,2020/07/01 03:23:22,2020-07-01 04:10:00,2020-07-01 04:07:00,-1 days +23:55:00,-1 days +23:57:00
1,2020-07-02,B733,B1109,10,2020-07-02 01:25:00,2020-07-02 01:23:00,2020/07/02 01:34:54,020/07/02 01:53:33,15,2020/07/02 02:55:22,2020/07/02 03:46:52,2020-07-02 04:10:00,2020-07-02 04:17:00,-1 days +23:58:00,0 days 00:07:00
2,2020-07-01,B733,B1110,10,2020-07-01 03:20:00,2020-07-01 03:04:00,2020/07/01 03:16:03,020/07/01 03:29:09,15,2020/07/01 03:50:46,2020/07/01 04:49:19,2020-07-01 05:20:00,2020-07-01 05:15:00,-1 days +23:44:00,-1 days +23:55:00
3,2020-07-02,B733,B1110,10,2020-07-02 03:20:00,2020-07-02 02:54:00,2020/07/02 03:30:33,020/07/02 03:38:58,15,2020/07/02 04:29:13,2020/07/02 04:53:58,2020-07-02 05:20:00,2020-07-02 05:14:00,-1 days +23:34:00,-1 days +23:54:00
4,2020-07-01,B752,B1177,10,2020-07-01 07:30:00,2020-07-01 06:52:00,020/07/01 07:07:30,2020/07/01 07:25:02,15,2020/07/01 01:09:26,2020/07/01 01:39:38,2020-07-01 02:05:00,2020-07-01 02:13:00,-1 days +23:22:00,0 days 00:08:00
5,2020-07-02,B752,B1177,10,2020-07-02 07:30:00,2020-07-02 07:07:00,020/07/02 07:17:30,2020/07/02 08:14:51,15,2020/07/02 00:48:26,2020/07/02 01:28:05,2020-07-02 02:05:00,2020-07-02 02:00:00,-1 days +23:37:00,-1 days +23:55:00
6,2020-07-01,B752,B1252,10,2020-07-02 01:25:00,2020-07-02 01:26:00,020/07/02 01:33:08,2020/07/02 01:33:10,15,2020/07/01 03:08:00,2020/07/01 03:53:52,2020-07-01 04:00:00,2020-07-01 04:15:00,0 days 00:01:00,0 days 00:15:00
7,2020-07-02,B752,B1252,10,2020-07-03 01:25:00,2020-07-03 02:02:00,020/07/03 04:31:32,2020/07/03 02:30:16,15,2020/07/02 03:02:00,2020/07/02 03:40:00,2020-07-02 04:00:00,2020-07-02 04:20:00,0 days 00:37:00,0 days 00:20:00


nan

In [264]:
df = pd.DataFrame(data={
    '进港延误': [-420, -3201],
    '离港延误': [420, -540, ]
})

In [265]:
df

Unnamed: 0,进港延误,离港延误
0,-420,420
1,-3201,-540


In [266]:
df['进港延误'].corr(df['离港延误'])

1.0

In [267]:
df.corr()

Unnamed: 0,进港延误,离港延误
进港延误,1.0,1.0
离港延误,1.0,1.0
