## Slicing and Indexing DataFrames

#### Slicing time series

Slicing is particularly useful for time series, since filtering for data within a date range is a common task. Add the `date` column to the index, then use `.loc[]` to perform the subsetting. The important thing to remember is to keep your dates in ISO 8601 format: `"yyyy-mm-dd"` for year-month-day, `"yyyy-mm"` for year-month, and `"yyyy"` for year. Recall from Chapter 1 that you can combine multiple Boolean conditions using logical operators such as `&`. To do so in one line of code, add parentheses `()` around each condition. `pandas` is loaded as `pd`, and `temperatures`, with no index, is available.
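As a minimal sketch of the parentheses rule, the example below filters a small made-up DataFrame (`toy` and its values are hypothetical, not the course data). It assumes the dates are stored as zero-padded ISO 8601 strings, so that comparing them as text gives the same order as comparing them as dates.

```python
import pandas as pd

# Hypothetical toy data; the values are made up for illustration only
toy = pd.DataFrame({
    "date": ["2009-12-01", "2010-01-01", "2011-12-01", "2012-01-01"],
    "avg_temp_c": [10.0, 11.0, 12.0, 13.0],
})

# Each condition sits in its own parentheses because & binds more tightly
# than >= and <=; without them the expression is grouped incorrectly
in_range = toy[(toy["date"] >= "2010-01-01") & (toy["date"] <= "2011-12-31")]

# Collect the dates that passed both conditions
selected_dates = in_range["date"].tolist()
print(selected_dates)
```

Only the 2010 and 2011 rows should pass both conditions. A format such as `"01-12-2010"` would not sort chronologically as text, which is one reason to stick to ISO 8601.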

In [1]:
# importing pandas
import pandas as pd

# importing temperatures dataset
temperatures = pd.read_csv("../datasets/temperatures.csv")
temperatures.head()

,Unnamed: 0,date,city,country,avg_temp_c
0,0,2000-01-01,Abidjan,Côte D'Ivoire,27.293
1,1,2000-02-01,Abidjan,Côte D'Ivoire,27.685
2,2,2000-03-01,Abidjan,Côte D'Ivoire,29.061
3,3,2000-04-01,Abidjan,Côte D'Ivoire,28.162
4,4,2000-05-01,Abidjan,Côte D'Ivoire,27.547


### Instructions

* Use Boolean conditions, not `.isin()` or `.loc[]`, and the full date `"yyyy-mm-dd"`, to subset `temperatures` for rows in 2010 and 2011 and print the results.
* Set the index of `temperatures` to the `date` column, sort it, and assign the result to `temperatures_ind`.
* Use `.loc[]` to subset `temperatures_ind` for rows in 2010 and 2011.
* Use `.loc[]` to subset `temperatures_ind` for rows from Aug 2010 to Feb 2011.

In [8]:
# Use Boolean conditions to subset temperatures for rows in 2010 and 2011
temperatures_bool = temperatures[(temperatures["date"] >= '2010-01-01') & (temperatures["date"] <= '2011-12-31')]
temperatures_bool

,Unnamed: 0,date,city,country,avg_temp_c
120,120,2010-01-01,Abidjan,Côte D'Ivoire,28.270
121,121,2010-02-01,Abidjan,Côte D'Ivoire,29.262
122,122,2010-03-01,Abidjan,Côte D'Ivoire,29.596
123,123,2010-04-01,Abidjan,Côte D'Ivoire,29.068
124,124,2010-05-01,Abidjan,Côte D'Ivoire,28.258
...,...,...,...,...,...
16474,16474,2011-08-01,Xian,China,23.069
16475,16475,2011-09-01,Xian,China,16.775
16476,16476,2011-10-01,Xian,China,12.587
16477,16477,2011-11-01,Xian,China,7.543


In [5]:
# Set date as the index and sort the index
temperatures_ind = temperatures.set_index("date").sort_index()
temperatures_ind

date,Unnamed: 0,city,country,avg_temp_c
2000-01-01,0,Abidjan,Côte D'Ivoire,27.293
2000-01-01,8415,Lahore,Pakistan,12.792
2000-01-01,15345,Tangshan,China,-5.406
2000-01-01,5115,Gizeh,Egypt,12.669
2000-01-01,8580,Lakhnau,India,15.152
...,...,...,...,...
2013-09-01,11549,Nanjing,China,
2013-09-01,11714,New Delhi,India,
2013-09-01,11879,New York,United States,17.408
2013-09-01,12209,Peking,China,
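Sorting the index matters because `.loc[]` can only slice between labels that are not in the index when the index is ordered. The sketch below illustrates this on a small made-up DataFrame (`toy` and its values are hypothetical). It assumes, as in this exercise, that the dates are plain strings.

```python
import pandas as pd

# Hypothetical toy data, deliberately out of order
toy = pd.DataFrame({
    "date": ["2011-01-01", "2010-01-01", "2010-06-01"],
    "avg_temp_c": [1.0, 2.0, 3.0],
})

unsorted_ind = toy.set_index("date")
sorted_ind = unsorted_ind.sort_index()

# is_monotonic_increasing reports whether the index labels are in order
unsorted_is_ordered = unsorted_ind.index.is_monotonic_increasing
sorted_is_ordered = sorted_ind.index.is_monotonic_increasing

# On the unsorted index, slicing between labels that are not in the index fails
try:
    unsorted_ind.loc["2010":"2010-12-31"]
    unsorted_slice_failed = False
except KeyError:
    unsorted_slice_failed = True

# On the sorted index, the same slice works
rows_2010 = sorted_ind.loc["2010":"2010-12-31"]
print(unsorted_is_ordered, sorted_is_ordered, unsorted_slice_failed, len(rows_2010))
```

Chaining `.set_index("date").sort_index()` in one step, as in the cell above, avoids ever holding an unsorted index.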


In [6]:
# Use .loc[] to subset temperatures_ind for rows in 2010 and 2011
temperatures_ind.loc["2010":"2011"]

date,Unnamed: 0,city,country,avg_temp_c
2010-01-01,4905,Faisalabad,Pakistan,11.810
2010-01-01,10185,Melbourne,Australia,20.016
2010-01-01,3750,Chongqing,China,7.921
2010-01-01,13155,São Paulo,Brazil,23.738
2010-01-01,5400,Guangzhou,China,14.136
...,...,...,...,...
2010-12-01,6896,Jakarta,Indonesia,26.602
2010-12-01,5246,Gizeh,Egypt,16.530
2010-12-01,11186,Nagpur,India,19.120
2010-12-01,14981,Sydney,Australia,19.559
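Note that the output above ends at `2010-12-01`. Because `date` was read from the CSV as text, the slice compares labels as strings, and `"2011-01-01"` sorts after `"2011"`, so no 2011 rows are returned. The same applies to year-month endpoints such as `"2011-02"`. One possible remedy, sketched below on made-up data (`toy` and its values are hypothetical), is to convert the column with `pd.to_datetime()` before setting the index, so that pandas interprets `"2011"` as the whole year.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
toy = pd.DataFrame({
    "date": ["2009-12-01", "2010-01-01", "2010-12-01",
             "2011-01-01", "2011-12-01", "2012-01-01"],
    "avg_temp_c": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

# String index: labels are compared as text, so the slice stops before "2011-01-01"
by_string = toy.set_index("date").sort_index()
string_slice = by_string.loc["2010":"2011"]

# Datetime index: "2011" is treated as the whole year, so 2011 rows are included
by_datetime = (
    toy.assign(date=pd.to_datetime(toy["date"]))
       .set_index("date")
       .sort_index()
)
datetime_slice = by_datetime.loc["2010":"2011"]

print(len(string_slice), len(datetime_slice))
```

Passing `parse_dates=["date"]` to `pd.read_csv()` should achieve the same conversion at load time. With a string index, spelling out the full end date, for example `"2011-12-31"`, is an alternative.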


In [7]:
# Use .loc[] to subset temperatures_ind for rows from Aug 2010 to Feb 2011
temperatures_ind.loc["2010-08":"2011-02"]

date,Unnamed: 0,city,country,avg_temp_c
2010-08-01,2602,Calcutta,India,30.226
2010-08-01,12337,Pune,India,24.941
2010-08-01,6562,Izmir,Turkey,28.352
2010-08-01,15637,Tianjin,China,25.543
2010-08-01,9862,Manila,Philippines,27.101
...,...,...,...,...
2011-01-01,4257,Dar Es Salaam,Tanzania,28.541
2011-01-01,11352,Nairobi,Kenya,17.768
2011-01-01,297,Addis Abeba,Ethiopia,17.708
2011-01-01,11517,Nanjing,China,0.144
