In [1]:
import numpy as np
import pandas as pd

df = pd.read_csv('../data/000015', index_col= 'Date', names=['Date', 'Open', 'Close', 'High', 'Low', 'Volume', 'Money', 'PE', 'PB'], header=None)

# Basic indexing

In [19]:
df_prices = df[['Close', 'High', 'Low']]
   
series_close = df['Close']
close_of_a_day = series_close['2010-01-04']
close_of_a_day

# [] on a DataFrame looks up columns, so a row label raises KeyError:
try:
    df_prices['2010-01-04']
except KeyError:
    pass

## Accessing attributes using dot operator

In [4]:
df.Close

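# select columns by integer position
# (df[[1, 2]] raises KeyError in recent pandas because the columns are labeled with strings)
df.iloc[:, [1, 2]]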

Date,Close,High
2010-01-04,2777.1940,2831.8680
2010-01-05,2814.0500,2818.8180
2010-01-06,2796.7760,2841.5970
2010-01-07,2726.5180,2810.6950
2010-01-08,2730.0150,2734.9920
2010-01-11,2737.0080,2841.8780
2010-01-12,2779.8700,2781.0190
2010-01-13,2676.1590,2734.9640
2010-01-14,2710.1650,2714.0530
2010-01-15,2715.0440,2735.2300


## Range slicing

The syntax of the slicing operator exactly matches that of NumPy:

```python
ar[startIndex: endIndex: stepValue]
```

where the default values if not specified are as follows:

* 0 for startIndex
* the array size for endIndex (the end index is exclusive, so the last element is included)
* 1 for stepValue
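
As a minimal sketch of these defaults, the snippet below slices a NumPy array and then the rows of a small DataFrame. The names `ar` and `frame` are illustrative and do not come from the data file used in this notebook.

```python
import numpy as np
import pandas as pd

ar = np.arange(10)  # array([0, 1, ..., 9])

# Omitting all three values is the same as ar[0:len(ar):1].
whole = ar[:]

first_three = ar[:3]   # start defaults to 0; the end index 3 is excluded
every_other = ar[::2]  # start and end default; step of 2
tail = ar[7:]          # end defaults to the array size

# The same slice syntax selects rows of a DataFrame by position.
frame = pd.DataFrame({"x": ar * 10}, index=list("abcdefghij"))
middle_rows = frame[2:5]  # rows at positions 2, 3 and 4
```

Because the end index is exclusive, `ar[:3]` and `ar[3:]` together cover the array exactly once.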

# Label, integer and mixed indexing

* The `.loc` operator: Allows label-oriented indexing
* The `.iloc` operator: Allows integer-based indexing
* The `.ix` operator: Allows mixed label and integer-based indexing (deprecated in pandas 0.20 and removed in pandas 1.0)


## Label-oriented indexing

The `.loc` operator supports pure label-based indexing. It accepts the following as valid inputs:

* A single label.
* List or array of labels.
* A slice object with labels.
* A Boolean array.

In [12]:
df.loc['2010-01-04']

# the following three expressions return the same value
df.loc['2010-01-04', 'Close']
df.loc['2010-01-04']['Close']
df['Close']['2010-01-04']

df.loc[['2010-01-04', '2010-01-05']]
df.loc['2010-01-04': '2010-02-05']

2777.1940000000004
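
One detail worth seeing in isolation is that a label slice with `.loc` includes both endpoints, unlike a positional slice. The sketch below uses a small hand-built frame named `sample` (a hypothetical stand-in whose values are copied from the first rows shown in this notebook) so that it runs without the data file.

```python
import pandas as pd

# Illustrative stand-in for df; not loaded from the CSV file.
sample = pd.DataFrame(
    {"Close": [2777.194, 2814.050, 2796.776, 2726.518]},
    index=pd.Index(
        ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"], name="Date"
    ),
)

# Label slice: both '2010-01-04' and '2010-01-06' are included.
by_label = sample.loc["2010-01-04":"2010-01-06"]

# Scalar lookup by row label and column label.
one_close = sample.loc["2010-01-05", "Close"]

# A list of labels returns the rows in the order requested.
picked = sample.loc[["2010-01-07", "2010-01-04"]]
```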

### Selection using a Boolean array

In [21]:
df.loc[df['Close'] <= df['Close'].min(),:]

Date,Open,Close,High,Low,Volume,Money,PE,PB
2014-03-10,1612.96,1574.927,1612.96,1571.858,1720594000.0,10812960000.0,1.079408,8.262363
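
Boolean masks can also be combined. The following is a sketch on a hypothetical four-row frame named `sample` (values copied from the first rows shown in this notebook); the thresholds 2780 and 2830 are arbitrary and chosen only for illustration.

```python
import pandas as pd

# Illustrative stand-in for df; not loaded from the CSV file.
sample = pd.DataFrame(
    {
        "Close": [2777.194, 2814.050, 2796.776, 2726.518],
        "High": [2831.868, 2818.818, 2841.597, 2810.695],
    },
    index=pd.Index(
        ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"], name="Date"
    ),
)

# A comparison yields a Boolean Series aligned on the index.
mask = sample["Close"] > 2780

# Combine conditions with & (and), | (or), ~ (not); parenthesize each condition.
both = sample.loc[(sample["Close"] > 2780) & (sample["High"] < 2830)]

# Row with the lowest close, selected the same way as in the cell above.
lowest = sample.loc[sample["Close"] <= sample["Close"].min(), :]
```

The parentheses matter because `&` binds more tightly than the comparison operators in Python.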


## Integer-oriented indexing

The `.iloc` operator supports integer-based positional indexing. It accepts the following as inputs:

* A single integer.
* A list or array of integers.
* A slice object with integers.

In [25]:
df.iloc[0:10,]

Date,Open,Close,High,Low,Volume,Money,PE,PB
2010-01-04,2827.015,2777.194,2831.868,2776.921,1254720000.0,12684100000.0,2.781231,18.605709
2010-01-05,2784.758,2814.05,2818.818,2758.519,1565152000.0,17214700000.0,2.81413,18.799683
2010-01-06,2807.629,2796.776,2841.597,2795.723,1419338000.0,16095860000.0,2.779297,18.604318
2010-01-07,2794.445,2726.518,2810.695,2718.568,1462168000.0,15453930000.0,2.701644,18.194545
2010-01-08,2716.81,2730.015,2734.992,2692.33,1078874000.0,11860710000.0,2.683593,18.202034
2010-01-11,2841.878,2737.008,2841.878,2725.159,1792579000.0,17535350000.0,2.753622,18.293331
2010-01-12,2730.648,2779.87,2781.019,2703.055,1808328000.0,19659490000.0,2.794849,18.623318
2010-01-13,2713.191,2676.159,2734.964,2670.343,2362136000.0,23905530000.0,2.677216,17.971852
2010-01-14,2689.674,2710.165,2714.053,2668.39,1472739000.0,15952320000.0,2.69831,18.165844
2010-01-15,2710.085,2715.044,2735.23,2688.013,1310731000.0,13999460000.0,2.706514,18.207546
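
The other input types listed above can be sketched as follows, again on a hypothetical hand-built frame named `sample` whose values are copied from the first rows of the output above.

```python
import pandas as pd

# Illustrative stand-in for df; not loaded from the CSV file.
sample = pd.DataFrame(
    {
        "Open": [2827.015, 2784.758, 2807.629, 2794.445],
        "Close": [2777.194, 2814.050, 2796.776, 2726.518],
        "High": [2831.868, 2818.818, 2841.597, 2810.695],
    },
    index=pd.Index(
        ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"], name="Date"
    ),
)

first_row = sample.iloc[0]       # single integer: one row, returned as a Series
some_rows = sample.iloc[[0, 2]]  # list of integers: rows at positions 0 and 2
head2 = sample.iloc[0:2]         # integer slice: end position is excluded
cell = sample.iloc[1, 1]         # row position 1, column position 1 (Close)
last_row = sample.iloc[-1]       # negative positions count from the end
```

Unlike a `.loc` label slice, the `.iloc` slice `0:2` stops before position 2.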


## Mixed indexing with the .ix operator

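The `.ix` operator behaves like a mixture of the `.loc` and `.iloc` operators, with the `.loc` behavior taking precedence. It was deprecated in pandas 0.20 and removed in pandas 1.0, so the cell below only runs on older versions. It takes the following as possible inputs: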

* A single label or integer
* A list of integers or labels
* An integer slice or label slice
* A Boolean array

In [38]:
df.ix['2010-01-04']
df.ix[['2010-01-04', '2010-01-05']]
df.ix[df.index[-3:]]
df.ix[0]
df.ix[[0, 2]]
df.ix[1: 3]
df.ix[df['Close'] > 4044.6640]

Date,Open,Close,High,Low,Volume,Money,PE,PB
2015-06-08,3996.849,4057.669,4068.551,3945.534,24615900000.0,216235600000.0,2.09359,20.732199
2015-06-09,4060.8198,4065.4939,4079.1026,3986.7427,21500910000.0,191766100000.0,2.070661,20.563958
2015-06-10,4004.71,4050.159,4096.841,3968.488,13430580000.0,129038500000.0,2.056192,20.386042
2015-06-11,4041.815,4049.553,4056.415,3996.274,11321040000.0,107690000000.0,2.104816,20.380379
2015-06-12,4073.435,4144.664,4144.664,4073.435,13625960000.0,136898100000.0,2.108937,20.522397
2015-06-15,4158.645,4129.638,4195.205,4105.045,13832800000.0,136647100000.0,2.047086,20.09816
2015-06-17,3978.398,4064.677,4074.225,3855.747,14189670000.0,131978400000.0,2.016946,19.622117
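
On pandas versions where `.ix` is no longer available, each call above can be written with `.loc` or `.iloc`. The sketch below shows one possible mapping on a hypothetical frame named `sample` (values copied from the first rows shown in this notebook); the threshold 2780 is arbitrary. It assumes, as in this notebook, that the index holds strings, so the integer inputs to `.ix` were interpreted as positions.

```python
import pandas as pd

# Illustrative stand-in for df; not loaded from the CSV file.
sample = pd.DataFrame(
    {
        "Close": [2777.194, 2814.050, 2796.776, 2726.518],
        "High": [2831.868, 2818.818, 2841.597, 2810.695],
    },
    index=pd.Index(
        ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"], name="Date"
    ),
)

# Label inputs map to .loc.
row = sample.loc["2010-01-04"]                     # was .ix['2010-01-04']
two_rows = sample.loc[["2010-01-04", "2010-01-05"]]
last_three = sample.loc[sample.index[-3:]]
high_close = sample.loc[sample["Close"] > 2780]    # Boolean array

# Integer inputs map to .iloc.
first = sample.iloc[0]                             # was .ix[0]
picked = sample.iloc[[0, 2]]                       # was .ix[[0, 2]]
sliced = sample.iloc[1:3]                          # was .ix[1: 3]

# Mixed access (row by position, column by label): convert one side first.
mixed_a = sample.loc[sample.index[0], "Close"]
mixed_b = sample.iloc[0, sample.columns.get_loc("Close")]
```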
