Selection By Label
===

**Warning**

Whether a setting operation returns a copy or a reference may depend on the context. Setting values through chained indexing is sometimes called `chained assignment`, and it should be avoided. See [Returning a View versus Copy](http://pandas.pydata.org/pandas-docs/version/0.20.3/indexing.html#indexing-view-versus-copy).
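A minimal sketch of the usual alternative, assuming a small hypothetical frame `df` with columns `A` and `B` (not part of the original example): a single `.loc` call selects the rows and the column in one step, so the assignment is applied to the original object.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"A": [1, -2, 3], "B": [10, 20, 30]})

# Chained assignment (avoid): df[df["A"] > 0]["B"] = 0
# The first [] may return a copy, so the write can be lost.

# Single .loc call: row mask and column label in one indexing operation.
df.loc[df["A"] > 0, "B"] = 0

print(df)
```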

**Warning**

> `.loc` is strict when you pass slicers that are neither compatible with nor convertible to the index type, for example integers with a `DatetimeIndex`. These raise a `TypeError`.

In [3]:
import numpy as np
import pandas as pd

dfl = pd.DataFrame(np.random.randn(5,4), columns=list('ABCD'), index=pd.date_range('20130101',periods=5))

dfl

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-01 | -1.340507 | -1.037081 | 0.09712 | -0.371959 |
| 2013-01-02 | -0.488524 | -1.824555 | 1.02477 | 1.058968 |
| 2013-01-03 | 0.628215 | -1.115395 | -2.420975 | -0.580123 |
| 2013-01-04 | 0.562624 | -0.234136 | 0.190008 | 0.771886 |
| 2013-01-05 | -0.628014 | 2.285801 | -2.721657 | 0.123267 |


In [4]:
dfl.loc[2:3]

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.datetimes.DatetimeIndex'> with these indexers [2] of <class 'int'>
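The failure above can be handled, or sidestepped, as in the following sketch. It rebuilds a frame shaped like `dfl` with deterministic values (an assumption made for reproducibility) and uses `.iloc` for the positional slice that the integers were presumably meant to express.

```python
import numpy as np
import pandas as pd

dfl = pd.DataFrame(np.arange(20).reshape(5, 4),
                   columns=list("ABCD"),
                   index=pd.date_range("20130101", periods=5))

# Integers are not valid labels for a DatetimeIndex, so .loc rejects them.
try:
    dfl.loc[2:3]
    raised = False
except TypeError:
    raised = True

# Positional slicing belongs to .iloc (stop bound excluded, as in Python).
by_position = dfl.iloc[2:3]
print(raised, by_position.index[0])
```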

String-like values in a slice can be converted to the type of the index, which leads to natural slicing.

In [5]:
dfl.loc['20130102':'20130104']

| | A | B | C | D |
|---|---|---|---|---|
| 2013-01-02 | -0.488524 | -1.824555 | 1.02477 | 1.058968 |
| 2013-01-03 | 0.628215 | -1.115395 | -2.420975 | -0.580123 |
| 2013-01-04 | 0.562624 | -0.234136 | 0.190008 | 0.771886 |
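As a small check of that conversion, the sketch below compares a string slice with the same slice written using explicit `pd.Timestamp` bounds. The frame uses deterministic values instead of random ones, which is an assumption made for reproducibility.

```python
import numpy as np
import pandas as pd

dfl = pd.DataFrame(np.arange(20).reshape(5, 4),
                   columns=list("ABCD"),
                   index=pd.date_range("20130101", periods=5))

# Strings are converted to timestamps before the label slice is applied.
by_string = dfl.loc["20130102":"20130104"]

# The same slice with explicit Timestamp bounds.
by_timestamp = dfl.loc[pd.Timestamp("2013-01-02"):pd.Timestamp("2013-01-04")]

print(by_string.equals(by_timestamp))
```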


pandas provides a suite of methods for **purely label-based indexing**. This is a strict inclusion-based protocol: **at least one** of the labels you ask for must be in the index, or a `KeyError` is raised. (In pandas 1.0 and later, a list containing any missing label raises `KeyError`.) When slicing, the start bound is *included* **and** the stop bound is *included*. Integers are valid labels, but they refer to the label **and not the position**.
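To make the label-versus-position distinction concrete, here is a short sketch with a hypothetical Series whose integer labels deliberately differ from its positions.

```python
import pandas as pd

# Hypothetical Series: integer labels 10..60 at positions 0..5.
s = pd.Series(list("abcdef"), index=[10, 20, 30, 40, 50, 60])

by_label = s.loc[20:40]      # labels 20, 30 and 40: stop bound included
by_position = s.iloc[1:3]    # positions 1 and 2: stop bound excluded

# 0 is a valid position but not a label in this index.
try:
    s.loc[0]
    missing_label_raises = False
except KeyError:
    missing_label_raises = True
```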

The `.loc` attribute is the primary access method. The following are valid inputs:

- A single label, e.g. `5` or `'a'` (note that `5` is interpreted as a *label* of the index, **not** as an integer position along the index)
- A list or array of labels, e.g. `['a', 'b', 'c']`
- A slice object with labels, e.g. `'a':'f'` (note that, contrary to usual Python slices, **both** the start and the stop are included)
- A boolean array
- A `callable`, see [Selection By Callable](http://pandas.pydata.org/pandas-docs/version/0.20.3/indexing.html#indexing-callable)
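The five kinds of input listed above can be sketched on one hypothetical Series; the data and variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1.0, -2.0, 3.0, -4.0], index=list("abcd"))

single = s.loc["b"]               # single label -> scalar
several = s.loc[["a", "c"]]       # list of labels
sliced = s.loc["b":"c"]           # label slice, both ends included
masked = s.loc[(s > 0).values]    # boolean array, one entry per element
called = s.loc[lambda x: x < 0]   # callable that returns a valid indexer
```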

In [7]:
s1 = pd.Series(np.random.randn(6),index=list('abcdef'))

s1

a    0.731278
b    0.441931
c   -0.504354
d   -0.304436
e   -0.359341
f   -0.060133
dtype: float64

In [8]:
s1.loc['c':]

c   -0.504354
d   -0.304436
e   -0.359341
f   -0.060133
dtype: float64

In [9]:
s1.loc['b']

0.44193064016158773

Note that setting works as well:

In [11]:
s1.loc['c':] = 0

s1

a    0.731278
b    0.441931
c    0.000000
d    0.000000
e    0.000000
f    0.000000
dtype: float64
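Setting is not limited to slices. As a sketch with hypothetical data, a list of labels or a boolean mask can also be the target of an assignment through `.loc`:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=list("abcd"))

# List of labels on the left, one value per label on the right.
s.loc[["a", "d"]] = [10.0, 40.0]

# Boolean mask: every element where the mask is True gets the scalar.
s.loc[s < 3] = 0.0

print(s)
```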

With a DataFrame

In [12]:
df1 = pd.DataFrame(np.random.randn(6,4),
                   index=list('abcdef'),
                   columns=list('ABCD'))

df1

| | A | B | C | D |
|---|---|---|---|---|
| a | 1.142116 | 0.498276 | -0.423516 | -0.684926 |
| b | 0.311051 | 0.622657 | 0.507311 | 0.270656 |
| c | -0.799829 | 0.371249 | 0.824684 | -1.022267 |
| d | -0.630648 | -0.190966 | -0.58867 | -0.51045 |
| e | 0.032056 | -1.060193 | -0.122763 | -1.37256 |
| f | -2.010867 | 0.447449 | -1.608042 | 0.837206 |


In [13]:
df1.loc[['a', 'b', 'd'], :]

| | A | B | C | D |
|---|---|---|---|---|
| a | 1.142116 | 0.498276 | -0.423516 | -0.684926 |
| b | 0.311051 | 0.622657 | 0.507311 | 0.270656 |
| d | -0.630648 | -0.190966 | -0.58867 | -0.51045 |
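Selecting with a list of labels runs into the `KeyError` rule when some labels may be absent from the index. A hedged sketch of two ways to handle that case without relying on how a particular pandas version treats partially missing lists; the frame and the label `'z'` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=list("abc"))
wanted = ["a", "b", "z"]   # 'z' is not in the index

# Keep only the labels that exist, then select with .loc.
present = df.loc[df.index.intersection(wanted)]

# Or reindex: missing labels become rows filled with NaN.
padded = df.reindex(wanted)
```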


Accessing via label slices:

In [14]:
df1.loc['d':, 'A':'C']

| | A | B | C |
|---|---|---|---|
| d | -0.630648 | -0.190966 | -0.58867 |
| e | 0.032056 | -1.060193 | -0.122763 |
| f | -2.010867 | 0.447449 | -1.608042 |


To get a cross section using a label (equivalent to `df.xs('a')`):

In [15]:
df1.loc['a']

A    1.142116
B    0.498276
C   -0.423516
D   -0.684926
Name: a, dtype: float64
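The equivalence with `xs` can be checked directly. The sketch below uses a small hypothetical frame in place of `df1`.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]}, index=list("ab"))

row_loc = df.loc["a"]   # row 'a' as a Series indexed by column name
row_xs = df.xs("a")     # the same cross section through xs

print(row_loc.equals(row_xs))
```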

To get values with a boolean array:

In [16]:
df1.loc['a'] > 0

A     True
B     True
C    False
D    False
Name: a, dtype: bool

In [17]:
df1.loc[:, df1.loc['a'] > 0]

| | A | B |
|---|---|---|
| a | 1.142116 | 0.498276 |
| b | 0.311051 | 0.622657 |
| c | -0.799829 | 0.371249 |
| d | -0.630648 | -0.190966 |
| e | 0.032056 | -1.060193 |
| f | -2.010867 | 0.447449 |
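Boolean arrays can be supplied for both axes at once. A minimal sketch with hypothetical data, where one mask filters rows and another filters columns:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, -1.0, 2.0],
                   "B": [-3.0, 4.0, 5.0],
                   "C": [0.5, 0.5, -0.5]},
                  index=list("abc"))

col_mask = df.loc["a"] > 0   # one boolean per column, from row 'a'
row_mask = df["B"] > 0       # one boolean per row, from column 'B'

# Rows where B is positive, columns where row 'a' is positive.
subset = df.loc[row_mask, col_mask]
```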


To get a single value explicitly (equivalent to the deprecated `df.get_value('a', 'A')`):

In [18]:
# this is also equivalent to ``df1.at['a','A']``
df1.loc['a', 'A']

1.1421157493770389
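For scalar access, `.at` takes the same row and column labels as `.loc`. The sketch below, on a hypothetical frame, shows that both return the same value and that `.at` can also set one.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.5, 2.5], "B": [3.5, 4.5]}, index=list("ab"))

via_loc = df.loc["a", "A"]   # general label-based indexer
via_at = df.at["a", "A"]     # scalar-only accessor

df.at["b", "B"] = 0.0        # .at also supports setting a single value
```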