# Querying Data with Pandas

## Ways to query data in Pandas
1. `df.loc`: query by row and column labels<br>
2. `df.iloc`: query by integer row and column positions<br>
3. `df.where`<br>
4. `df.query`

`loc` can both read data and overwrite it in place, so it is the method this tutorial recommends and focuses on.
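
As a rough orientation, the four methods listed above can be compared on a tiny table. This is only a sketch: the frame `toy` and its column names `temp` and `humidity` are made up for illustration and are not part of the weather dataset used below.

```python
import pandas as pd

# Hypothetical toy data; names are illustrative only.
toy = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6], "humidity": [92, 70, 86]},
    index=["d1", "d2", "d3"],
)

by_label = toy.loc["d2", "temp"]                 # label-based lookup
by_position = toy.iloc[1, 0]                     # position-based lookup (row 1, column 0)
masked = toy["temp"].where(toy["temp"] > 15)     # same length; failing entries become NaN
filtered = toy.query("temp > 15")                # string expression; keeps matching rows only

# loc can also write: overwrite a single cell in place.
toy.loc["d1", "humidity"] = 90
```

Note the difference between `where` and the others: `where` keeps the original shape and blanks out non-matching entries, while `loc` with a condition and `query` drop them.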

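## Ways to query data with df.loc

1. Query with a single `label` value<br>
2. Query in bulk with a list of values<br>
3. Query a range with a slice<br>
4. Query with a conditional expression<br>
5. Query by passing a function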

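# Notes
* All of the query methods above work on rows as well as on columns
* Watch how the result reduces in dimension: `DataFrame` -> `Series` -> scalar value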
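
The dimension reduction mentioned above can be seen directly by checking the type of each result. A minimal sketch on made-up data (the frame `toy` and its columns are hypothetical):

```python
import pandas as pd

toy = pd.DataFrame({"temp": [13.7, 23.5], "humidity": [92, 70]}, index=["d1", "d2"])

frame = toy.loc[["d1", "d2"], ["temp", "humidity"]]  # lists on both axes -> DataFrame
row = toy.loc["d1", ["temp", "humidity"]]            # single row label -> Series
column = toy.loc[["d1", "d2"], "temp"]               # single column label -> Series
value = toy.loc["d1", "temp"]                        # single labels on both axes -> scalar

# Wrapping a single label in a list keeps that axis, so this stays a 1x1 DataFrame.
kept_2d = toy.loc[["d1"], ["temp"]]
```

Each axis that receives a single label is dropped from the result; each axis that receives a list or slice is kept.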

In [3]:
import pandas as pd

## 0. Load the data

In [55]:
df = pd.read_csv("datas/weather_20230115134249.csv", encoding='utf-8')

In [56]:
df.head()

,日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
0,2015-1-1,新北市,烏來區,福山,13.7℃,92,0.0
1,2015-1-2,臺南市,安平區,安平,23.5℃,70,0.0
2,2015-1-3,臺東縣,東河鄉,七塊厝,19.6℃,86,0.0
3,2015-1-4,新北市,貢寮區,福隆,14.2℃,96,-99.0
4,2015-1-5,南投縣,仁愛鄉,小奇萊,8.3℃,57,0.0


In [57]:
# Use the date column as the index so rows can be selected by date
df.set_index('日期',inplace=True)

In [58]:
df.index

Index(['2015-1-1', '2015-1-2', '2015-1-3', '2015-1-4', '2015-1-5', '2015-1-6',
       '2015-1-7', '2015-1-8', '2015-1-9', '2015-1-10',
       ...
       '2016-4-19', '2016-4-20', '2016-4-21', '2016-4-22', '2016-4-23',
       '2016-4-24', '2016-4-25', '2016-4-26', '2016-4-27', '2016-4-28'],
      dtype='object', name='日期', length=484)
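
The index above has `dtype='object'`, so its labels are plain strings and compare as text rather than as dates. If real date semantics are wanted, one option (not used in this tutorial) is to convert the index with `pd.to_datetime`. The sketch below uses a made-up frame `toy` with a hypothetical `temp` column and relies on pandas inferring the date format.

```python
import pandas as pd

toy = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6, 14.2]},
    index=pd.Index(["2015-1-1", "2015-1-2", "2015-1-10", "2015-2-1"], name="date"),
)

# Convert the string labels to timestamps and sort them chronologically.
toy.index = pd.to_datetime(toy.index)
toy = toy.sort_index()

january = toy.loc["2015-01"]                    # partial string: every row in January 2015
window = toy.loc["2015-01-02":"2015-01-10"]     # date range, both ends included
```

With string labels, `"2015-1-10"` sorts before `"2015-1-2"`; after the conversion the rows are ordered and sliced by calendar date.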

In [61]:
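# Strip the "℃" unit suffix; the values are still strings after this step
df.loc[:,"气温(度)"]=df["气温(度)"].str.replace("℃","")

In [39]:
# Note: the cell number suggests this output comes from an earlier run; the later outputs show 气温(度) as dtype object
df.dtypes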

城市           object
行政区          object
观测站          object
气温(度)         int32
相对湿度(%)       int64
累积雨量(mm)    float64
dtype: object
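
The comparisons later in this tutorial call `.astype(float)` every time because the temperature column holds strings. An alternative is to convert the column once, right after stripping the unit. This is a sketch on made-up data with a hypothetical `temp` column, not the author's code:

```python
import pandas as pd

toy = pd.DataFrame({"temp": ["13.7℃", "23.5℃", "-99℃"]})

# Remove the unit as a literal string (regex=False), then convert to float.
toy["temp"] = toy["temp"].str.replace("℃", "", regex=False).astype(float)

# Numeric comparisons now work without a per-query conversion.
below_minus_10 = toy.loc[toy["temp"] < -10, :]
```

After this, `toy["temp"].dtype` is `float64` and conditions such as `toy["temp"] < -10` can be written directly.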

In [62]:
df.head()

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-1-1,新北市,烏來區,福山,13.7,92,0.0
2015-1-2,臺南市,安平區,安平,23.5,70,0.0
2015-1-3,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-1-4,新北市,貢寮區,福隆,14.2,96,-99.0
2015-1-5,南投縣,仁愛鄉,小奇萊,8.3,57,0.0


## 1. Query with a single label value
For rows and for columns alike, passing a single value gives an exact match.

In [69]:
# Returns a single value (a string here, because the column holds strings)
df.loc['2015-5-12','气温(度)']

'13.5'

In [68]:
# Returns a Series
df.loc['2015-5-12',['气温(度)','相对湿度(%)']]

气温(度)      13.5
相对湿度(%)      81
Name: 2015-5-12, dtype: object

## 2. Query in bulk with a list of values

In [70]:
# Returns a Series
df.loc[['2015-5-12','2015-10-10'],'气温(度)']

日期
2015-5-12     13.5
2015-10-10    22.1
Name: 气温(度), dtype: object

In [71]:
# Returns a DataFrame
df.loc[['2015-5-12','2015-10-10'],['气温(度)','相对湿度(%)']]

日期,气温(度),相对湿度(%)
2015-5-12,13.5,81
2015-10-10,22.1,60
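
One thing to watch with list queries: assuming pandas 1.0 or later, `loc` raises `KeyError` if any label in the list is missing from the index. A possible workaround, sketched here on made-up data (the names `toy`, `wanted`, and `temp` are hypothetical), is to intersect the list with the index first:

```python
import pandas as pd

toy = pd.DataFrame({"temp": [13.5, 22.1]}, index=["2015-5-12", "2015-10-10"])
wanted = ["2015-5-12", "2015-12-31"]  # the second label is not in the index

try:
    toy.loc[wanted, "temp"]
    raised = False
except KeyError:
    raised = True  # at least one requested label was missing

# Keep only the labels that actually exist, then query.
present = toy.index.intersection(wanted)
subset = toy.loc[present, "temp"]
```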


## 3. Query a range with a slice
Note: the slice includes both its start and its end label.

In [72]:
# Slice on the row index
df.loc['2015-5-12':'2015-5-13','气温(度)']

日期
2015-5-12    13.5
2015-5-13     -99
Name: 气温(度), dtype: object

In [74]:
# Slice on the column index
df.loc['2015-10-10','气温(度)':'累积雨量(mm)']

气温(度)       22.1
相对湿度(%)       60
累积雨量(mm)     0.0
Name: 2015-10-10, dtype: object

In [77]:
# Slice on both rows and columns
df.loc['2015-5-12':'2015-5-13','气温(度)':'累积雨量(mm)']

日期,气温(度),相对湿度(%),累积雨量(mm)
2015-5-12,13.5,81,6.0
2015-5-13,-99.0,-9900,0.0
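
The inclusive end point is specific to label slices with `loc`. Position slices with `iloc` follow the usual Python rule and exclude the end. A small sketch on made-up data (the frame `toy` and its column are hypothetical):

```python
import pandas as pd

toy = pd.DataFrame({"temp": [10.0, 11.0, 12.0, 13.0]}, index=["a", "b", "c", "d"])

by_label = toy.loc["b":"c", "temp"]   # labels "b" and "c": end label included
by_position = toy.iloc[1:2, 0]        # position 1 only: end position excluded
```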


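## 4. Query with a conditional expression
The boolean list must be as long as the number of rows (or columns) it selects from.

**Simple condition: rows where the temperature is below -10 degrees**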

In [90]:
df.loc[df["气温(度)"].astype(float)<-10,:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-1-16,苗栗縣,造橋鄉,造橋,-99,-9900,-99.0
2015-1-25,臺中市,西屯區,西屯,-99,-9900,-99.0
2015-2-22,南投縣,仁愛鄉,奇萊稜線A,-99,-9900,0.0
2015-5-13,臺南市,西港區,西港,-99,-9900,0.0
2015-6-8,屏東縣,新埤鄉,新埤,-99,-9900,-99.0
2015-6-11,新北市,瑞芳區,鼻頭角,-99,-9900,0.5
2015-6-12,臺中市,烏日區,烏日,-99,-9900,-99.0
2015-7-6,新北市,汐止區,汐止,-99,-9900,-99.0
2016-1-17,高雄市,梓官區,梓官,-99,-9900,-99.0
2016-3-29,屏東縣,長治鄉,長治,-99,-9900,0.0


In [91]:
# Inspect the boolean condition used here
df["气温(度)"].astype(float)<-10

日期
2015-1-1     False
2015-1-2     False
2015-1-3     False
2015-1-4     False
2015-1-5     False
             ...  
2016-4-24    False
2016-4-25    False
2016-4-26    False
2016-4-27    False
2016-4-28    False
Name: 气温(度), Length: 484, dtype: bool
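
The length rule applies to both axes: a row mask needs one flag per row, and a column mask one flag per column. A minimal sketch on made-up data (the frame `toy` and its column names are hypothetical):

```python
import pandas as pd

toy = pd.DataFrame(
    {"temp": [13.7, -99.0, 19.6], "humidity": [92, -9900, 86], "rain": [0.0, 0.0, 6.0]}
)

row_mask = [True, False, True]   # one flag per row
col_mask = [True, False, True]   # one flag per column
picked = toy.loc[row_mask, col_mask]

# A boolean Series built from the frame itself has the right length automatically.
valid = toy.loc[toy["temp"] > -10, :]
```

Building the mask from a column comparison, as the tutorial does, is usually the easiest way to satisfy the length requirement.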

**Compound condition: find the weather I have in mind**

In [94]:
# Rows where the temperature is above 15 and below 30 degrees
df.loc[(df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15),:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-1-2,臺南市,安平區,安平,23.5,70,0.0
2015-1-3,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-1-6,嘉義縣,大林鎮,大林,23.2,63,0.0
2015-1-7,花蓮縣,玉里鎮,玉里,18.5,85,0.0
2015-1-8,嘉義市,東區,嘉義市東區,25,64,0.0
...,...,...,...,...,...,...
2016-4-22,新北市,三重區,三重,16.3,76,3.5
2016-4-23,高雄市,茄萣區,興達,25.1,67,0.0
2016-4-24,新北市,板橋區,板橋,15.6,73,5.0
2016-4-26,花蓮縣,壽豐鄉,大坑,15.7,94,0.0


In [95]:
# Inspect the boolean condition again
(df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15)

日期
2015-1-1     False
2015-1-2      True
2015-1-3      True
2015-1-4     False
2015-1-5     False
             ...  
2016-4-24     True
2016-4-25    False
2016-4-26     True
2016-4-27     True
2016-4-28    False
Name: 气温(度), Length: 484, dtype: bool
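
Compound conditions are built from element-wise operators: `&` (and), `|` (or), and `~` (not). Each comparison needs its own parentheses, because these operators bind more tightly than `<` and `>`. A sketch on made-up data (the frame `toy` and its columns are hypothetical):

```python
import pandas as pd

toy = pd.DataFrame({"temp": [8.3, 16.3, 25.1, 31.0], "rain": [0.0, 3.5, 0.0, 0.0]})

mild = (toy["temp"] > 15) & (toy["temp"] < 30)       # both conditions must hold
dry_and_mild = mild & (toy["rain"] == 0.0)           # masks can be combined further
not_mild = ~mild                                     # element-wise negation
extreme = (toy["temp"] < 10) | (toy["temp"] > 30)    # either condition may hold

result = toy.loc[dry_and_mild, :]
```

Python's `and`, `or`, and `not` do not work here, since they expect a single truth value rather than a Series of them.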

## 5. Query by passing a function

In [96]:
# Pass a lambda expression directly; loc calls it with the DataFrame
df.loc[lambda df : (df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15),:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-1-2,臺南市,安平區,安平,23.5,70,0.0
2015-1-3,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-1-6,嘉義縣,大林鎮,大林,23.2,63,0.0
2015-1-7,花蓮縣,玉里鎮,玉里,18.5,85,0.0
2015-1-8,嘉義市,東區,嘉義市東區,25,64,0.0
...,...,...,...,...,...,...
2016-4-22,新北市,三重區,三重,16.3,76,3.5
2016-4-23,高雄市,茄萣區,興達,25.1,67,0.0
2016-4-24,新北市,板橋區,板橋,15.6,73,5.0
2016-4-26,花蓮縣,壽豐鄉,大坑,15.7,94,0.0


In [128]:
df["累积雨量(mm)"]==0.0

日期
2015-1-1      True
2015-1-2      True
2015-1-3      True
2015-1-4     False
2015-1-5      True
             ...  
2016-4-24    False
2016-4-25     True
2016-4-26     True
2016-4-27     True
2016-4-28    False
Name: 累积雨量(mm), Length: 484, dtype: bool

In [145]:
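# The essence of functional programming:
#     a function can be passed around like a variable

# Write our own function: rows from September 2015 with zero accumulated rainfall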
def query_my_data(df):
    return (df.index.str.startswith("2015-9")) & (df["累积雨量(mm)"]== 0.0)

df.loc[query_my_data,:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-9-1,臺南市,歸仁區,媽廟,26.2,72,0.0
2015-9-2,雲林縣,元長鄉,元長,21.4,67,0.0
2015-9-3,臺南市,後壁區,後壁,23.8,62,0.0
2015-9-6,臺中市,大甲區,大甲,18.7,67,0.0
2015-9-7,南投縣,仁愛鄉,廬山,19.4,54,0.0
2015-9-8,屏東縣,萬巒鄉,赤山,27.5,63,0.0
2015-9-10,金門縣,烏坵鄉,烏坵,11.2,88,0.0
2015-9-11,臺南市,將軍區,鯤鯓國小,20.1,82,0.0
2015-9-14,臺中市,太平區,中竹林,21.0,64,0.0
2015-9-15,臺南市,安南區,安南,23.2,81,0.0
