# Querying Data with Pandas

## Ways to query data in Pandas
1. `df.loc`: query by row and column labels<br>
2. `df.iloc`: query by integer row and column positions<br>
3. `df.where`<br>
4. `df.query`
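
As a quick orientation, the four methods listed above can be compared side by side. This is a minimal sketch on made-up data: the `demo` frame and its `temp` and `humidity` columns are assumptions for illustration, not the weather file loaded later.

```python
import pandas as pd

# Made-up sample data (not the weather file used in this notebook).
demo = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6], "humidity": [92, 70, 86]},
    index=["2015-01-01", "2015-01-02", "2015-01-03"],
)

by_label = demo.loc["2015-01-02", "temp"]          # label-based lookup
by_position = demo.iloc[1, 0]                      # position-based lookup (row 1, column 0)
masked = demo["temp"].where(demo["temp"] > 15)     # same length; non-matching entries become NaN
filtered = demo.query("temp > 15")                 # keeps only the rows matching the expression

print(by_label, by_position)
print(masked)
print(filtered)
```

`where` keeps the original shape and blanks out non-matching entries, while `query` and a boolean `loc` drop the non-matching rows.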

`loc` can both read data and overwrite it in place, so it is the strongly recommended choice.
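
The read-and-overwrite behaviour of `loc` can be sketched as follows. The `demo` frame and the cap of 20 degrees are made up for illustration.

```python
import pandas as pd

# Made-up sample data for illustration only.
demo = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6]},
    index=["2015-01-01", "2015-01-02", "2015-01-03"],
)

# Read one cell with loc ...
before = demo.loc["2015-01-02", "temp"]

# ... then overwrite that same cell through the same indexer.
demo.loc["2015-01-02", "temp"] = 24.0

# A boolean mask works for writing too: cap every reading above 20 at 20.
demo.loc[demo["temp"] > 20, "temp"] = 20.0

print(before)
print(demo)
```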

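## Ways to query data with df.loc

1. Query with a single `label` value<br>
2. Query in bulk with a list of values<br>
3. Range query with a slice of labels<br>
4. Query with a conditional expression<br>
5. Query with a callable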

## Notes
* Every query method above works on rows as well as on columns
* Watch how the result drops in dimension: `DataFrame` -> `Series` -> scalar value
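
The dimension drop mentioned in the notes can be observed directly: each axis that receives a single label instead of a list loses one dimension. A small sketch on made-up data (the `demo` frame is an assumption):

```python
import pandas as pd

# Made-up sample data for illustration only.
demo = pd.DataFrame(
    {"temp": [13.7, 23.5], "humidity": [92, 70]},
    index=["2015-01-01", "2015-01-02"],
)

frame = demo.loc[["2015-01-01"], ["temp"]]   # list  + list  -> DataFrame
series = demo.loc["2015-01-01", ["temp"]]    # label + list  -> Series
value = demo.loc["2015-01-01", "temp"]       # label + label -> scalar

print(type(frame).__name__, type(series).__name__, value)
```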

In [2]:
import pandas as pd

## 0. Read the data

In [3]:
df = pd.read_csv("../datas/weather_20230115134249.csv", encoding='utf-8')

In [3]:
df.head()

,日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
0,2015-01-01,新北市,烏來區,福山,13.7℃,92,0.0
1,2015-01-02,臺南市,安平區,安平,23.5℃,70,0.0
2,2015-01-03,臺東縣,東河鄉,七塊厝,19.6℃,86,0.0
3,2015-01-04,新北市,貢寮區,福隆,14.2℃,96,-99.0
4,2015-01-05,南投縣,仁愛鄉,小奇萊,8.3℃,57,0.0


In [4]:
# Use the date as the index so rows can be selected by date
df.set_index('日期',inplace=True)

In [5]:
df.index

Index(['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '2015-01-05',
       '2015-01-06', '2015-01-07', '2015-01-08', '2015-01-09', '2015-01-10',
       ...
       '2016-04-19', '2016-04-20', '2016-04-21', '2016-04-22', '2016-04-23',
       '2016-04-24', '2016-04-25', '2016-04-26', '2016-04-27', '2016-04-28'],
      dtype='object', name='日期', length=484)

In [6]:
df.loc[:,"气温(度)"]=df["气温(度)"].str.replace("℃","")

In [7]:
df.dtypes

城市           object
行政区          object
观测站          object
气温(度)        object
相对湿度(%)       int64
累积雨量(mm)    float64
dtype: object
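
The `dtypes` output shows that the temperature column is still `object` (strings) after the unit is stripped, which is why the later cells call `.astype(float)` before comparing. One possible way to do the conversion is sketched below on a made-up Series standing in for that column.

```python
import pandas as pd

# Made-up strings standing in for the cleaned temperature column.
temps = pd.Series(["13.7", "23.5", "-99"], name="temp")

# astype(float) raises an error if any entry is not a valid number.
as_float = temps.astype(float)

# pd.to_numeric with errors="coerce" turns invalid entries into NaN instead.
as_numeric = pd.to_numeric(temps, errors="coerce")

print(as_float.dtype)
print(as_numeric)
```

Converting once and storing the result would avoid repeating `.astype(float)` in every condition. The notebook keeps the column as strings, so its later cells convert on the fly.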

In [8]:
df.head()

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-01-01,新北市,烏來區,福山,13.7,92,0.0
2015-01-02,臺南市,安平區,安平,23.5,70,0.0
2015-01-03,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-01-04,新北市,貢寮區,福隆,14.2,96,-99.0
2015-01-05,南投縣,仁愛鄉,小奇萊,8.3,57,0.0


## 1. Query with a single label value
A single value can be passed for the row, the column, or both, and it is matched exactly.

In [9]:
# Returns a single value
df.loc['2015-05-12','气温(度)']

'13.5'

In [11]:
# Returns a Series
df.loc['2015-05-12',['气温(度)','相对湿度(%)']]

气温(度)      13.5
相对湿度(%)      81
Name: 2015-05-12, dtype: object

## 2. Query in bulk with a list of values

In [12]:
# Returns a Series
df.loc[['2015-05-12','2015-10-10'],'气温(度)']

日期
2015-05-12    13.5
2015-10-10    22.1
Name: 气温(度), dtype: object

In [13]:
# Returns a DataFrame
df.loc[['2015-05-12','2015-10-10'],['气温(度)','相对湿度(%)']]

日期,气温(度),相对湿度(%)
2015-05-12,13.5,81
2015-10-10,22.1,60


## 3. Range query with a label slice
Note: the range includes both the start label and the end label.

In [14]:
# Slice the row index
df.loc['2015-05-12':'2015-05-13','气温(度)']

日期
2015-05-12    13.5
2015-05-13     -99
Name: 气温(度), dtype: object

In [15]:
# Slice the column index
df.loc['2015-10-10','气温(度)':'累积雨量(mm)']

气温(度)       22.1
相对湿度(%)       60
累积雨量(mm)       0
Name: 2015-10-10, dtype: object

In [16]:
# Slice both rows and columns
df.loc['2015-05-12':'2015-05-13','气温(度)':'累积雨量(mm)']

日期,气温(度),相对湿度(%),累积雨量(mm)
2015-05-12,13.5,81,6.0
2015-05-13,-99.0,-9900,0.0
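
The inclusive end label is specific to label slices with `loc`. Position slices with `iloc` follow the usual Python rule and exclude the stop position. A minimal sketch contrasting the two on made-up data (the `demo` frame is an assumption):

```python
import pandas as pd

# Made-up sample data for illustration only.
demo = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6, 14.2]},
    index=["2015-01-01", "2015-01-02", "2015-01-03", "2015-01-04"],
)

# Label slice: both "2015-01-02" and "2015-01-03" are returned.
label_slice = demo.loc["2015-01-02":"2015-01-03"]

# Position slice: position 1 is returned, position 2 is excluded.
position_slice = demo.iloc[1:2]

print(len(label_slice), len(position_slice))
```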


## 4. Query with a conditional expression
The boolean list must be exactly as long as the number of rows (or columns) it filters.
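
The length rule is easiest to see with hand-written boolean lists. This sketch uses a made-up `demo` frame with three rows and two columns, so the row mask needs three flags and the column mask needs two.

```python
import pandas as pd

# Made-up sample data: 3 rows, 2 columns.
demo = pd.DataFrame(
    {"temp": [13.7, 23.5, 19.6], "humidity": [92, 70, 86]},
    index=["2015-01-01", "2015-01-02", "2015-01-03"],
)

row_mask = [True, False, True]   # one flag per row
col_mask = [True, False]         # one flag per column

# Keeps the rows and columns whose flag is True.
picked = demo.loc[row_mask, col_mask]
print(picked)
```

A mask of the wrong length is rejected with an error rather than silently padded. In practice the mask usually comes from a comparison on a column, which has the right length by construction.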

**Simple condition: rows where the temperature is below -10 degrees**

In [17]:
df.loc[df["气温(度)"].astype(float)<-10,:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-01-16,苗栗縣,造橋鄉,造橋,-99,-9900,-99.0
2015-01-25,臺中市,西屯區,西屯,-99,-9900,-99.0
2015-02-22,南投縣,仁愛鄉,奇萊稜線A,-99,-9900,0.0
2015-05-13,臺南市,西港區,西港,-99,-9900,0.0
2015-06-08,屏東縣,新埤鄉,新埤,-99,-9900,-99.0
2015-06-11,新北市,瑞芳區,鼻頭角,-99,-9900,0.5
2015-06-12,臺中市,烏日區,烏日,-99,-9900,-99.0
2015-07-06,新北市,汐止區,汐止,-99,-9900,-99.0
2016-01-17,高雄市,梓官區,梓官,-99,-9900,-99.0
2016-03-29,屏東縣,長治鄉,長治,-99,-9900,0.0


In [18]:
# Inspect the boolean condition used here
df["气温(度)"].astype(float)<-10

日期
2015-01-01    False
2015-01-02    False
2015-01-03    False
2015-01-04    False
2015-01-05    False
2015-01-06    False
2015-01-07    False
2015-01-08    False
2015-01-09    False
2015-01-10    False
2015-01-11    False
2015-01-12    False
2015-01-13    False
2015-01-14    False
2015-01-15    False
2015-01-16     True
2015-01-17    False
2015-01-18    False
2015-01-19    False
2015-01-20    False
2015-01-21    False
2015-01-22    False
2015-01-23    False
2015-01-24    False
2015-01-25     True
2015-01-26    False
2015-01-27    False
2015-01-28    False
2015-01-29    False
2015-01-30    False
              ...  
2016-03-30    False
2016-03-31    False
2016-04-01    False
2016-04-02    False
2016-04-03    False
2016-04-04    False
2016-04-05    False
2016-04-06    False
2016-04-07    False
2016-04-08    False
2016-04-09    False
2016-04-10    False
2016-04-11    False
2016-04-12    False
2016-04-13    False
2016-04-14    False
2016-04-15     True
2016-04-16    False
2016-04-17    False

**Compound condition: finding my ideal weather**

In [19]:
# Rows where the temperature is below 30 degrees and above 15 degrees
df.loc[(df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15),:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-01-02,臺南市,安平區,安平,23.5,70,0.0
2015-01-03,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-01-06,嘉義縣,大林鎮,大林,23.2,63,0.0
2015-01-07,花蓮縣,玉里鎮,玉里,18.5,85,0.0
2015-01-08,嘉義市,東區,嘉義市東區,25,64,0.0
2015-01-11,彰化縣,大城鄉,三豐,17.4,81,0.0
2015-01-12,雲林縣,斗六市,斗六,23,58,0.0
2015-01-13,南投縣,魚池鄉,魚池,26.3,50,0.0
2015-01-14,臺東縣,海端鄉,向陽,16.3,71,0.0
2015-01-15,屏東縣,竹田鄉,竹田,26.7,80,0.0
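
Two details of the compound condition above are worth a sketch. First, each comparison needs its own parentheses, because `&` binds more tightly than `<` and `>`. Second, `Series.between` can express the same open interval in one call. The example assumes pandas 1.3 or later (for the string form of `inclusive`) and uses a made-up Series.

```python
import pandas as pd

# Made-up temperatures, including the two boundary values 15 and 30.
temps = pd.Series([13.7, 23.5, 19.6, 30.0, 15.0], name="temp")

# Parenthesize each comparison before combining them with &.
with_operators = (temps < 30) & (temps > 15)

# Same open interval (15, 30); "neither" excludes both boundaries.
with_between = temps.between(15, 30, inclusive="neither")

print(with_operators.tolist())
print(with_operators.equals(with_between))
```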


In [20]:
# Inspect the boolean condition again
(df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15)

日期
2015-01-01    False
2015-01-02     True
2015-01-03     True
2015-01-04    False
2015-01-05    False
2015-01-06     True
2015-01-07     True
2015-01-08     True
2015-01-09    False
2015-01-10    False
2015-01-11     True
2015-01-12     True
2015-01-13     True
2015-01-14     True
2015-01-15     True
2015-01-16    False
2015-01-17    False
2015-01-18     True
2015-01-19     True
2015-01-20     True
2015-01-21     True
2015-01-22    False
2015-01-23     True
2015-01-24     True
2015-01-25    False
2015-01-26     True
2015-01-27     True
2015-01-28     True
2015-01-29     True
2015-01-30    False
              ...  
2016-03-30     True
2016-03-31     True
2016-04-01     True
2016-04-02     True
2016-04-03     True
2016-04-04     True
2016-04-05     True
2016-04-06     True
2016-04-07     True
2016-04-08    False
2016-04-09    False
2016-04-10     True
2016-04-11     True
2016-04-12     True
2016-04-13     True
2016-04-14     True
2016-04-15    False
2016-04-16     True
2016-04-17     True

## 5. Query with a callable

In [21]:
# Pass a lambda expression directly
df.loc[lambda df : (df["气温(度)"].astype(float)<30) & (df["气温(度)"].astype(float)>15),:]

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
2015-01-02,臺南市,安平區,安平,23.5,70,0.0
2015-01-03,臺東縣,東河鄉,七塊厝,19.6,86,0.0
2015-01-06,嘉義縣,大林鎮,大林,23.2,63,0.0
2015-01-07,花蓮縣,玉里鎮,玉里,18.5,85,0.0
2015-01-08,嘉義市,東區,嘉義市東區,25,64,0.0
2015-01-11,彰化縣,大城鄉,三豐,17.4,81,0.0
2015-01-12,雲林縣,斗六市,斗六,23,58,0.0
2015-01-13,南投縣,魚池鄉,魚池,26.3,50,0.0
2015-01-14,臺東縣,海端鄉,向陽,16.3,71,0.0
2015-01-15,屏東縣,竹田鄉,竹田,26.7,80,0.0
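
A callable is most useful inside a method chain, where the intermediate frame has no variable name to reference. `loc` calls the function with the frame it is applied to. A minimal sketch, assuming a made-up `raw` frame whose `temp` column is stored as strings:

```python
import pandas as pd

# Made-up sample data with temperatures stored as strings.
raw = pd.DataFrame(
    {"temp": ["13.7", "23.5", "19.6"]},
    index=["2015-01-01", "2015-01-02", "2015-01-03"],
)

# The lambda receives the frame returned by astype, which was never bound to a name.
mild = (
    raw
    .astype({"temp": float})
    .loc[lambda d: (d["temp"] > 15) & (d["temp"] < 30), :]
)
print(mild)
```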


In [22]:
df["累积雨量(mm)"]==0.0

日期
2015-01-01     True
2015-01-02     True
2015-01-03     True
2015-01-04    False
2015-01-05     True
2015-01-06     True
2015-01-07     True
2015-01-08     True
2015-01-09    False
2015-01-10    False
2015-01-11     True
2015-01-12     True
2015-01-13     True
2015-01-14     True
2015-01-15     True
2015-01-16    False
2015-01-17    False
2015-01-18    False
2015-01-19     True
2015-01-20     True
2015-01-21    False
2015-01-22    False
2015-01-23     True
2015-01-24     True
2015-01-25    False
2015-01-26     True
2015-01-27     True
2015-01-28     True
2015-01-29     True
2015-01-30    False
              ...  
2016-03-30     True
2016-03-31     True
2016-04-01     True
2016-04-02     True
2016-04-03     True
2016-04-04     True
2016-04-05     True
2016-04-06     True
2016-04-07     True
2016-04-08    False
2016-04-09     True
2016-04-10    False
2016-04-11     True
2016-04-12    False
2016-04-13     True
2016-04-14    False
2016-04-15    False
2016-04-16     True
2016-04-17     True

In [23]:
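# The essence of functional programming:
#    a function can be passed around like a variable

# Write our own function: rows from September 2015 with zero accumulated rainfall
def query_my_data(df):
    # Index labels are zero-padded strings such as "2015-09-01",
    # so the prefix must be "2015-09" ("2015-9" matches nothing)
    return (df.index.str.startswith("2015-09")) & (df["累积雨量(mm)"] == 0.0)

df.loc[query_my_data,:]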

日期,城市,行政区,观测站,气温(度),相对湿度(%),累积雨量(mm)
