[(Link)](https://github.com/jakevdp/PythonDataScienceHandbook/blob/8a34a4f653bdbdc01415a94dc20d4e9b97438965/notebooks/03.02-Data-Indexing-and-Selection.ipynb)

In [1]:
import numpy as np
import pandas as pd

# 1. Data selection from a Series

### Series as a dictionary

In [2]:
data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])
data

a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64

- If we want to access the value in row 'b':

In [3]:
data['b']

0.5

- If we want to check whether 'a' is in the index of `data`:

In [4]:
'a' in data

True

- If we want to look at the index and value from the Series like a key and value in a dictionary:

In [6]:
list(data.items())

[('a', 0.25), ('b', 0.5), ('c', 0.75), ('d', 1.0)]
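
- Other dictionary-style methods work as well. A small sketch, which rebuilds the same Series so it runs on its own, using `keys()` and `to_dict()` (the variable names `labels` and `as_dict` are only for illustration):

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

# keys() returns the index, mirroring dict.keys()
labels = list(data.keys())

# to_dict() converts the Series into a plain Python dictionary
as_dict = data.to_dict()

print(labels)
print(as_dict)
```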

- If we want to extend the Series by adding a row 'e', the same way we would assign a value to a new key 'e' in a dictionary:

In [8]:
data['e'] = 1.25
data

a    0.25
b    0.50
c    0.75
d    1.00
e    1.25
dtype: float64

- pandas handles the memory layout and any copying of data under the hood when the Series is extended
    - We don't have to manage any of it ourselves
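
- To see the two cases side by side, here is a minimal sketch, assuming the same starting Series as above: assigning to an existing label overwrites the value, while assigning to a new label enlarges the Series (the `length_after_*` names are only for illustration)

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

# Existing label: the value is overwritten, the length is unchanged
data['b'] = 0.55
length_after_update = len(data)

# New label: the Series grows by one row
data['e'] = 1.25
length_after_insert = len(data)

print(length_after_update, length_after_insert)
```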

### Series as a 1d array

- If we want to slice the index using the names of the rows:

In [9]:
data['a':'c']

a    0.25
b    0.50
c    0.75
dtype: float64

- If we want to slice using the integer positions of the rows:

In [14]:
data[0:2]

a    0.25
b    0.50
dtype: float64

- **Wait a minute**
    - `data['a':'c']` returns **3 rows** while `data[0:2]` only returns **2 rows**
    - Slicing by label includes the final label, so `'a':'c'` covers 'a', 'b', and 'c'
    - Slicing by integer position excludes the stop, as in Python lists, so `0:2` covers positions 0 and 1 only
        - We'll see how to choose between the two explicitly in *Indexers: loc vs iloc* below
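
- The difference in row counts can be checked directly. A minimal sketch, rebuilding the five-row Series from this section (the names `by_label`, `by_position`, and `matches` are only for illustration):

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0, 1.25],
                 index=['a', 'b', 'c', 'd', 'e'])

# Label-based slice: the stop label 'c' is included
by_label = data['a':'c']

# Position-based slice: the stop position 2 is excluded, as in a Python list
by_position = data[0:2]

# Moving the positional stop up by one gives the same rows as the label slice
matches = data[0:3].equals(by_label)

print(list(by_label.index))
print(list(by_position.index))
print(matches)
```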

- If we want to use masking i.e. conditions:

In [12]:
data[(data > 0.3) & (data < 0.8)]

b    0.50
c    0.75
dtype: float64
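
- It can help to look at the mask on its own. A small sketch, assuming the same five-row Series: each comparison produces a boolean Series, and the parentheses are needed because `&` binds more tightly than `>` and `<` (the names `mask` and `selected` are only for illustration)

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0, 1.25],
                 index=['a', 'b', 'c', 'd', 'e'])

# Each comparison yields a boolean Series aligned with data's index;
# & combines them element by element
mask = (data > 0.3) & (data < 0.8)

# Indexing with the mask keeps only the rows where it is True
selected = data[mask]

print(mask.tolist())
print(selected.index.tolist())
```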

- If we want to use fancy indexing:

In [13]:
data[['a', 'e']]

a    0.25
e    1.25
dtype: float64

### Indexers: loc vs iloc

- Let's look at an example: a Series with an integer index

In [15]:
data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])
data

1    a
3    b
5    c
dtype: object

- Now, if we want the value in the row labeled 1, we can access it the normal way:

In [17]:
data[1]

'a'

- But if we try slicing from 1 to 3, we get the rows at positions 1 and 2 instead of the rows labeled 1 through 3:

In [18]:
data[1:3]

3    b
5    c
dtype: object

- This is because slicing with plain `[]` uses the implicit integer positions, while selecting a single value with `data[1]` uses the labels
    - To avoid this ambiguity, **we can use .loc**

In [19]:
data.loc[1:3]

1    a
3    b
dtype: object

- Using loc **always references the explicit index**

- If we want to use the **implicit** index (i.e. Python-style integer positions), **we can use .iloc**

In [20]:
data.iloc[1:3]

3    b
5    c
dtype: object
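
- The same contrast holds for single values as well as slices. A minimal sketch, assuming the integer-indexed Series from this section (the variable names are only for illustration):

```python
import pandas as pd

data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

# Single values
label_value = data.loc[1]        # row whose label is 1
position_value = data.iloc[1]    # row at position 1, i.e. the second row

# Slices
label_slice = data.loc[1:3]      # labels 1 through 3, stop label included
position_slice = data.iloc[1:3]  # positions 1 and 2, stop position excluded

print(label_value, position_value)
print(label_slice.tolist(), position_slice.tolist())
```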

- We could go into `.ix`, but it was deprecated in pandas 0.20 and removed in pandas 1.0

_____

# 2. Data selection from a DataFrame

## DataFrame as a dictionary