# Pandas II

## More indexing tricks

We'll start with some data from Beer Advocate (see [Tom Augspurger](https://github.com/TomAugspurger/pydata-chi-h2t/blob/master/3-Indexing.ipynb) for details on how he extracted this data).

In [8]:
import numpy as np
import pandas as pd
pd.options.display.max_rows = 10

In [4]:
df = pd.read_csv('data/beer_subset.csv.gz', parse_dates=['time'], compression='gzip')
df.head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
2,4.8,11098,3182,Fürstenberg Premium Pilsener,German Pilsener,4.0,3.0,3.0,3.0,biegaman,3.5,Haystack yellow with an energetic group of bu...,2009-10-05 21:32:13
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
4,5.8,398,119,Wolaver's Pale Ale,American Pale Ale (APA),4.0,3.0,4.0,3.5,champ103,3.0,A: Pours a slightly hazy golden/orange color....,2009-10-05 21:33:14


### Boolean indexing

Boolean indexing works like a `WHERE` clause in SQL.

The indexer (or boolean mask) should be 1-dimensional and the same length as the thing being indexed.
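As a minimal sketch of that rule, the snippet below builds a mask on a tiny made-up frame (the `beers` data is hypothetical, not the Beer Advocate file) and checks that the mask has one entry per row before using it with `.loc`.

```python
import pandas as pd

# Made-up data for illustration; not the Beer Advocate file used in this notebook.
beers = pd.DataFrame({
    "beer_name": ["Light Lager", "Porter", "Session Ale", "Imperial Stout"],
    "abv": [4.2, 5.7, 4.8, 9.5],
})

# Comparing a column to a scalar yields a boolean Series with one entry per row.
mask = beers["abv"] < 5
assert len(mask) == len(beers)

# Rows where the mask is True are kept; the original index labels are preserved.
weak = beers.loc[mask]
print(weak)
```

Note that the selected rows keep their original labels (0 and 2 here), which is why the filtered outputs in this notebook show gaps in the index.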

In [10]:
df.abv < 5

0      False
1      False
2       True
3      False
4      False
       ...  
994    False
995    False
996    False
997    False
998    False
Name: abv, dtype: bool

In [12]:
df.loc[df.abv < 5].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
2,4.8,11098,3182,Fürstenberg Premium Pilsener,German Pilsener,4.0,3.0,3.0,3.0,biegaman,3.5,Haystack yellow with an energetic group of bu...,2009-10-05 21:32:13
7,4.8,1669,256,Great White,Witbier,4.5,4.5,4.5,4.5,n0rc41,4.5,"Ok, for starters great white I believe will b...",2009-10-05 21:34:29
21,4.6,401,118,Dark Island,Scottish Ale,4.0,4.0,3.5,4.0,abuliarose,4.0,"Poured into a snifter, revealing black opaque...",2009-10-05 21:47:36
22,4.9,5044,18968,Kipona Fest,Märzen / Oktoberfest,4.0,3.5,4.0,4.0,drcarver,4.0,A - a medium brown body with an off white hea...,2009-10-05 21:47:56
28,4.6,401,118,Dark Island,Scottish Ale,4.0,4.0,4.5,4.0,sisuspeed,4.0,The color of this beer fits the name well. Op...,2009-10-05 21:53:38


In [15]:
df.loc[((df.abv < 5) & (df.time > pd.Timestamp('2009-06'))) | (df.review_overall >= 4.5)].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
2,4.8,11098,3182,Fürstenberg Premium Pilsener,German Pilsener,4.0,3.0,3.0,3.0,biegaman,3.5,Haystack yellow with an energetic group of bu...,2009-10-05 21:32:13
6,6.2,53128,1114,Smokin' Amber Kegs Gone Wild,American Amber / Red Ale,3.5,4.0,4.5,4.0,Deuane,4.5,An American amber with the addition of smoked...,2009-10-05 21:34:24
7,4.8,1669,256,Great White,Witbier,4.5,4.5,4.5,4.5,n0rc41,4.5,"Ok, for starters great white I believe will b...",2009-10-05 21:34:29


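Be careful with the order of operations: `&` binds more tightly than comparison operators such as `>`, so the expression below is evaluated as `2 > (1 & 0)`.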

In [16]:
2 > 1 & 0

True

It is safest to wrap each comparison in parentheses:

In [17]:
(2 > 1) & 0

0
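The same precedence rule matters when combining pandas conditions. The sketch below uses a small made-up Series `s` to show the parenthesized form working and the unparenthesized form failing; the exact error message may vary between pandas versions.

```python
import pandas as pd

# Plain Python: & binds tighter than >, so this is 2 > (1 & 0), i.e. 2 > 0.
unparenthesized = 2 > 1 & 0
parenthesized = (2 > 1) & 0

s = pd.Series([1, 2, 3])  # made-up values for illustration

# With parentheses each comparison is evaluated first, then combined element-wise.
both = (s > 1) & (s < 3)

# Without parentheses Python parses this as s > (1 & s) < 3, a chained
# comparison that needs the truth value of a whole Series, which pandas
# refuses to provide.
try:
    s > 1 & s < 3
    raised = False
except ValueError:
    raised = True

print(unparenthesized, parenthesized, both.tolist(), raised)
```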

Select just the rows where the `beer_style` contains `'IPA'`:

In [19]:
df.beer_style.str?

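In IPython, `df.beer_style.str?` shows help for the `.str` accessor, and typing `df.beer_style.str.` followed by Tab lists the available string methods. Running the unfinished expression as a cell is a `SyntaxError`.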

In [21]:
df.beer_style.str.contains('IPA')

0      False
1      False
2      False
3       True
4      False
       ...  
994    False
995    False
996    False
997    False
998    False
Name: beer_style, dtype: bool

In [24]:
df.loc[df.beer_style.str.contains('IPA')].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31
16,8.0,36179,3818,Hoppe (Imperial Extra Pale Ale),American Double / Imperial IPA,4.0,3.0,4.0,3.5,nick76,3.0,"The aroma is papery with citrus, yeast, and s...",2009-10-05 21:43:23
23,6.5,44727,596,Portsmouth 5 C's IPA,American IPA,4.5,5.0,5.0,4.5,ALeF,5.0,As a devoted drinker of American and English ...,2009-10-05 21:48:46
26,5.9,37477,140,Sierra Nevada Anniversary Ale (2007-2009),American IPA,4.5,4.5,4.5,4.5,n0rc41,4.5,Poured a great dark color with great smell! t...,2009-10-05 21:51:33
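Two details of `str.contains` are worth a small sketch: the pattern is treated as a regular expression by default, and missing values do not produce a clean boolean unless `na=` is given. The `styles` Series below is made up for illustration.

```python
import pandas as pd

# Made-up style labels, including one missing value.
styles = pd.Series(["American IPA", "Witbier", None, "English India Pale Ale (IPA)"])

# na=False turns the missing entry into False so the result is a usable mask.
mask = styles.str.contains("IPA", na=False)

# regex=False matches the text literally, so the parentheses must be present.
literal = styles.str.contains("(IPA)", regex=False, na=False)

print(mask.tolist())
print(literal.tolist())
```

Passing `na=False` is a choice, not a requirement: it treats rows with no style as non-matches, which is usually what a filter wants.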


Find the rows where the beer style is either `'American IPA'` or `'Pilsner'`:

In [25]:
df[(df.beer_style == 'American IPA') | (df.beer_style == 'Pilsner')].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31
23,6.5,44727,596,Portsmouth 5 C's IPA,American IPA,4.5,5.0,5.0,4.5,ALeF,5.0,As a devoted drinker of American and English ...,2009-10-05 21:48:46
26,5.9,37477,140,Sierra Nevada Anniversary Ale (2007-2009),American IPA,4.5,4.5,4.5,4.5,n0rc41,4.5,Poured a great dark color with great smell! t...,2009-10-05 21:51:33
32,7.5,6076,651,Flower Power India Pale Ale,American IPA,3.5,4.5,4.0,3.5,OnThenIn,4.0,Appearance: The beer pours a rather cloudy da...,2009-10-05 22:02:11
48,6.7,44749,140,Sierra Nevada Chico Estate Harvest Wet Hop Ale...,American IPA,4.5,3.5,4.0,4.5,mikey711,4.0,I love this concept. Way to go Sierra Nevada!...,2009-10-05 22:19:33


Or more succinctly:

In [26]:
df[df.beer_style.isin(['American IPA', 'Pilsner'])].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31
23,6.5,44727,596,Portsmouth 5 C's IPA,American IPA,4.5,5.0,5.0,4.5,ALeF,5.0,As a devoted drinker of American and English ...,2009-10-05 21:48:46
26,5.9,37477,140,Sierra Nevada Anniversary Ale (2007-2009),American IPA,4.5,4.5,4.5,4.5,n0rc41,4.5,Poured a great dark color with great smell! t...,2009-10-05 21:51:33
32,7.5,6076,651,Flower Power India Pale Ale,American IPA,3.5,4.5,4.0,3.5,OnThenIn,4.0,Appearance: The beer pours a rather cloudy da...,2009-10-05 22:02:11
48,6.7,44749,140,Sierra Nevada Chico Estate Harvest Wet Hop Ale...,American IPA,4.5,3.5,4.0,4.5,mikey711,4.0,I love this concept. Way to go Sierra Nevada!...,2009-10-05 22:19:33
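A short sketch of `isin` on made-up labels follows. It compares whole values, so a label has to match exactly (for example, `'Pilsner'` would not match `'German Pilsener'`), and the mask can be inverted with `~`.

```python
import pandas as pd

# Made-up style labels for illustration.
styles = pd.Series(["American IPA", "Witbier", "German Pilsener", "American IPA"])
wanted = ["American IPA", "German Pilsener"]

# True where the value equals any element of `wanted`.
in_wanted = styles.isin(wanted)

# ~ flips the mask, selecting everything that is NOT in the list.
not_wanted = styles[~in_wanted]

# A near-miss spelling matches nothing, because isin is an exact comparison.
near_miss = styles.isin(["Pilsner"])

print(in_wanted.tolist(), not_wanted.tolist(), near_miss.any())
```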


#### Mini Exercise

- Select the rows where the scores of the 5 review_cols ('review_appearance', 'review_aroma', 'review_overall', 'review_palate', 'review_taste') are all at least 4.0.

- _Hint_: Like NumPy arrays, DataFrames have `any` and `all` methods that check whether they contain any or all `True` values. These methods also take an `axis` argument for the dimension to remove.
    - `0` or `'index'` removes (aggregates over) the vertical dimension, giving one value per column.
    - `1` or `'columns'` removes (aggregates over) the horizontal dimension, giving one value per row.
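To make the `axis` argument concrete, here is a sketch on a tiny made-up frame (`scores` is hypothetical and has only two review columns).

```python
import pandas as pd

# Made-up review scores for three reviews.
scores = pd.DataFrame({
    "review_aroma": [4.0, 3.0, 4.5],
    "review_taste": [4.5, 4.0, 4.0],
})
high = scores >= 4  # boolean frame with the same shape as `scores`

# axis=0 collapses the rows: one answer per column.
per_column = high.all(axis=0)

# axis=1 collapses the columns: one answer per row, usable as a row mask.
per_row = high.all(axis=1)

# The string aliases work the same way; 'columns' is equivalent to 1.
any_row = high.any(axis="columns")

print(per_column.to_dict(), per_row.tolist(), any_row.tolist())
```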

In [42]:
df.columns

Index(['abv', 'beer_id', 'brewer_id', 'beer_name', 'beer_style',
       'review_appearance', 'review_aroma', 'review_overall', 'review_palate',
       'profile_name', 'review_taste', 'text', 'time'],
      dtype='object')

In [44]:
review_cols = [c for c in df.columns if c[0:6] == 'review']
review_cols

['review_appearance',
 'review_aroma',
 'review_overall',
 'review_palate',
 'review_taste']

In [46]:
df[(df.review_appearance >= 4) &
   (df.review_aroma >= 4) &
   (df.review_overall >= 4) &
   (df.review_palate >= 4) &
   (df.review_taste >= 4)].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
7,4.8,1669,256,Great White,Witbier,4.5,4.5,4.5,4.5,n0rc41,4.5,"Ok, for starters great white I believe will b...",2009-10-05 21:34:29
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31


Or the short way:

In [48]:
df[(df[review_cols] >= 4).all(axis=1)].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
7,4.8,1669,256,Great White,Witbier,4.5,4.5,4.5,4.5,n0rc41,4.5,"Ok, for starters great white I believe will b...",2009-10-05 21:34:29
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31


In [53]:
(df[review_cols] >= 4)

,review_appearance,review_aroma,review_overall,review_palate,review_taste
0,True,True,True,True,True
1,True,True,True,True,True
2,True,False,False,False,False
3,True,True,True,True,True
4,True,False,True,False,False
...,...,...,...,...,...
994,True,True,False,True,True
995,True,False,True,True,True
996,False,True,False,False,True
997,True,False,False,False,True


In [52]:
(df[review_cols] >= 4).all(axis=1)

0       True
1       True
2      False
3       True
4      False
       ...  
994    False
995    False
996    False
997    False
998    False
dtype: bool

In [54]:
df[(df[review_cols] >= 4).all(axis=1)].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
7,4.8,1669,256,Great White,Witbier,4.5,4.5,4.5,4.5,n0rc41,4.5,"Ok, for starters great white I believe will b...",2009-10-05 21:34:29
8,6.7,6549,140,Northern Hemisphere Harvest Wet Hop Ale,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31


Now select rows where the _average_ of the 5 `review_cols` is at least 4.

In [57]:
df[review_cols].mean(axis=1)

0      4.3
1      4.2
2      3.3
3      4.0
4      3.5
      ... 
994    4.0
995    4.2
996    3.5
997    3.7
998    3.7
dtype: float64

In [56]:
df[df[review_cols].mean(axis=1) >= 4].head()

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
0,7.0,2511,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,blaheath,4.5,Batch 8144\tPitch black in color with a 1/2 f...,2009-10-05 21:31:48
1,5.7,19736,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,GJ40,4.0,Sampled from a 12oz bottle in a standard pint...,2009-10-05 21:32:09
3,9.5,28577,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
5,7.0,966,365,Pike Street XXXXX Stout,American Stout,4.0,4.0,3.5,4.0,sprucetip,4.5,"From notes. Pours black, thin mocha head fade...",2009-10-05 21:33:48
6,6.2,53128,1114,Smokin' Amber Kegs Gone Wild,American Amber / Red Ale,3.5,4.0,4.5,4.0,Deuane,4.5,An American amber with the addition of smoked...,2009-10-05 21:34:24
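The two criteria select different rows: a single low score fails the "all at least 4" test but can still be outweighed in the average. A small sketch with made-up scores shows the difference.

```python
import pandas as pd

# Made-up scores: the second review has one high and one low score.
scores = pd.DataFrame({
    "review_aroma": [4.0, 5.0, 3.5],
    "review_taste": [4.0, 3.5, 3.5],
})

# Every score in the row must reach 4.
all_high = (scores >= 4).all(axis=1)

# Only the row average must reach 4.
mean_high = scores.mean(axis=1) >= 4

print(all_high.tolist())
print(mean_high.tolist())
```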


## Hierarchical Indexing

- One of the most powerful and most complicated features of pandas
- Lets you represent high-dimensional datasets in a table
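As a minimal sketch of the idea, the made-up frame below (`flat` and `reviews_toy` are hypothetical names) moves two key columns into a two-level index, after which rows can be addressed by one or both keys.

```python
import pandas as pd

# Made-up reviews keyed by reviewer and beer.
flat = pd.DataFrame({
    "profile_name": ["bob", "alice", "alice", "bob"],
    "beer_id": [10, 20, 10, 30],
    "review_overall": [4.0, 3.5, 4.5, 5.0],
})

# set_index with a list builds a MultiIndex; sorting it makes slicing reliable.
reviews_toy = flat.set_index(["profile_name", "beer_id"]).sort_index()

# Partial key: all of alice's rows, now indexed by beer_id alone.
alice = reviews_toy.loc["alice"]

# Full key plus a column label: a single value.
one = reviews_toy.loc[("alice", 10), "review_overall"]

print(reviews_toy)
print(alice.index.tolist(), one)
```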

In [58]:
reviews = df.set_index(['profile_name', 'beer_id', 'time'])
reviews.head()

profile_name,beer_id,time,abv,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,review_taste,text
blaheath,2511,2009-10-05 21:31:48,7.0,287,Bell's Cherry Stout,American Stout,4.5,4.0,4.5,4.0,4.5,Batch 8144\tPitch black in color with a 1/2 f...
GJ40,19736,2009-10-05 21:32:09,5.7,9790,Duck-Rabbit Porter,American Porter,4.5,4.0,4.5,4.0,4.0,Sampled from a 12oz bottle in a standard pint...
biegaman,11098,2009-10-05 21:32:13,4.8,3182,Fürstenberg Premium Pilsener,German Pilsener,4.0,3.0,3.0,3.0,3.5,Haystack yellow with an energetic group of bu...
nick76,28577,2009-10-05 21:32:37,9.5,3818,Unearthly (Imperial India Pale Ale),American Double / Imperial IPA,4.0,4.0,4.0,4.0,4.0,"The aroma has pine, wood, citrus, caramel, an..."
champ103,398,2009-10-05 21:33:14,5.8,119,Wolaver's Pale Ale,American Pale Ale (APA),4.0,3.0,4.0,3.5,3.0,A: Pours a slightly hazy golden/orange color....


In [59]:
reviews = reviews.sort_index()
reviews.head()

profile_name,beer_id,time,abv,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,review_taste,text
05Harley,1307,2009-10-06 00:10:06,8.5,428,Der Weisse Bock,Weizenbock,4.0,4.0,4.0,4.0,4.0,Can't find the date on this one.\t\tPurchased...
ADZA,50994,2009-10-06 11:08:30,,11611,Saison De Coing (Quince Saison),Saison / Farmhouse Ale,4.0,4.0,3.5,3.5,3.5,I tried this breweries normal Saison ages ago...
ALeF,44727,2009-10-05 21:48:46,6.5,596,Portsmouth 5 C's IPA,American IPA,4.5,5.0,5.0,4.5,5.0,As a devoted drinker of American and English ...
ATPete,945,2009-10-06 22:46:54,10.0,173,Adam,Old Ale,4.0,4.5,4.0,4.0,4.5,12oz bottle\t\tPours a deep copper brown colo...
ATPete,5428,2009-10-06 22:53:26,10.0,335,New Holland Dragon's Milk Oak Barrel Ale,American Stout,3.5,4.5,4.0,4.0,4.0,22oz bottle\t\tPours a muddy brown color with...


In [61]:
reviews.loc['05Harley']

beer_id,time,abv,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,review_taste,text
1307,2009-10-06 00:10:06,8.5,428,Der Weisse Bock,Weizenbock,4.0,4.0,4.0,4.0,4.0,Can't find the date on this one.\t\tPurchased...


In [62]:
reviews.loc['05Harley',1307]

time,abv,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,review_taste,text
2009-10-06 00:10:06,8.5,428,Der Weisse Bock,Weizenbock,4.0,4.0,4.0,4.0,4.0,Can't find the date on this one.\t\tPurchased...


In [71]:
reviews.loc[:,1307]

TypeError: cannot do label indexing on <class 'pandas.indexes.base.Index'> with these indexers [1307] of <class 'int'>
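The call above fails because the second position of `.loc` addresses columns, so pandas looks for a column labelled `1307` rather than a `beer_id`. Two ways to select on an inner index level are sketched below on a made-up two-level frame (`toy` is hypothetical); both are standard pandas calls, but this is an illustration rather than the notebook's own solution.

```python
import pandas as pd

# Made-up two-level index: (profile_name, beer_id).
idx = pd.MultiIndex.from_tuples(
    [("alice", 10), ("alice", 20), ("bob", 10)],
    names=["profile_name", "beer_id"],
)
toy = pd.DataFrame({"review_overall": [4.5, 3.5, 4.0]}, index=idx)

# IndexSlice builds the row key (all reviewers, beer 10); the trailing ":"
# says "all columns", which keeps the row key from being read as columns.
by_slice = toy.loc[pd.IndexSlice[:, 10], :]

# xs selects a cross-section by level name and drops that level by default.
by_xs = toy.xs(10, level="beer_id")

print(by_slice)
print(by_xs)
```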

### Top Reviewers

Let's select all the reviews by the top reviewers, by label.

In [72]:
top_reviewers = df['profile_name'].value_counts().head(5).index
top_reviewers

Index(['corby112', 'Anthony1', 'nickd717', 'rfgetz', 'BigMcLargeHuge'], dtype='object')

In [73]:
reviews.loc[top_reviewers, :, :].head()

profile_name,beer_id,time,abv,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,review_taste,text
Anthony1,99,2009-10-06 13:38:40,7.6,142,Spaten Optimator,Doppelbock,3.5,3.5,3.5,3.0,3.5,Dark reddish brown body with a one finger tan...
Anthony1,101,2009-10-06 13:54:26,5.5,35,Samuel Adams Winter Lager,Bock,3.5,3.0,3.0,3.0,3.5,"has a nice brownish/amber appearance, filtere..."
Anthony1,102,2009-10-06 13:51:49,5.3,35,Samuel Adams Octoberfest,Märzen / Oktoberfest,3.5,4.0,3.5,3.5,3.5,"Pours a bright, clear copper with a fluffy, l..."
Anthony1,103,2009-10-06 15:18:17,5.3,35,Samuel Adams Summer Ale,American Pale Wheat Ale,3.0,3.0,4.0,3.5,4.0,Pours a somewhat hazy medium gold color with ...
Anthony1,104,2009-10-06 13:51:02,4.9,35,Samuel Adams Boston Lager,Vienna Lager,3.5,3.0,3.5,3.0,3.5,Pours a into a pint glass with a light copper...


The syntax is a bit trickier when you want to specify a row indexer *and* a column indexer:

In [74]:
reviews.loc[(top_reviewers, 99, :), ['beer_name', 'brewer_name']]

SyntaxError: invalid syntax (<ipython-input-74-725966efb09e>, line 1)

In [75]:
reviews.loc[pd.IndexSlice[top_reviewers, 99, :], ['beer_name', 'brewer_id']]

profile_name,beer_id,time,beer_name,brewer_id
Anthony1,99,2009-10-06 13:38:40,Spaten Optimator,142


Use `.loc` to select the `beer_name` and `beer_style` for the 10 most popular beers, as measured by number of reviews:

In [76]:
top_beers = df['beer_id'].value_counts().head(10).index
top_beers

Int64Index([52077, 38394, 53159, 52371, 6260, 13896, 52535, 102, 44932, 35738], dtype='int64')

In [77]:
reviews.loc[pd.IndexSlice[:, top_beers], ['beer_name', 'beer_style']]

profile_name,beer_id,time,beer_name,beer_style
ATPete,44932,2009-10-06 22:15:41,Autumn Maple,Fruit / Vegetable Beer
ATPete,52371,2009-10-06 23:05:28,Sierra Nevada Estate Brewers Harvest Ale,American IPA
Anthony1,102,2009-10-06 13:51:49,Samuel Adams Octoberfest,Märzen / Oktoberfest
Anthony1,38394,2009-10-06 16:54:54,Pumking,Pumpkin Ale
ArrogantB,52371,2009-10-06 01:51:24,Sierra Nevada Estate Brewers Harvest Ale,American IPA
...,...,...,...,...
spartanfan,52077,2009-10-06 22:48:27,A Little Sumpin' Extra! Ale,American Double / Imperial IPA
stewart124,6260,2009-10-06 03:37:57,Punkin Ale,Pumpkin Ale
tobyandgina,52371,2009-10-06 05:21:57,Sierra Nevada Estate Brewers Harvest Ale,American IPA
ujsplace,38394,2009-10-05 23:41:32,Pumking,Pumpkin Ale


### Beware "chained indexing"

You can sometimes get away with using `[...][...]`, but try to avoid it!

In [78]:
df.loc[df.beer_style.str.contains('IPA')]['beer_name']

3            Unearthly (Imperial India Pale Ale)
8        Northern Hemisphere Harvest Wet Hop Ale
16               Hoppe (Imperial Extra Pale Ale)
23                          Portsmouth 5 C's IPA
26     Sierra Nevada Anniversary Ale (2007-2009)
                         ...                    
959                  A Little Sumpin' Extra! Ale
962                               Hop-a-lot-amus
971                        Founders Devil Dancer
972                              Dreadnaught IPA
984                   15th Anniversary Wood Aged
Name: beer_name, dtype: object

In [79]:
df.loc[df.beer_style.str.contains('IPA')]['beer_name'] = 'yummy'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':


In [80]:
df.loc[df.beer_style.str.contains('IPA')]['beer_name']

3            Unearthly (Imperial India Pale Ale)
8        Northern Hemisphere Harvest Wet Hop Ale
16               Hoppe (Imperial Extra Pale Ale)
23                          Portsmouth 5 C's IPA
26     Sierra Nevada Anniversary Ale (2007-2009)
                         ...                    
959                  A Little Sumpin' Extra! Ale
962                               Hop-a-lot-amus
971                        Founders Devil Dancer
972                              Dreadnaught IPA
984                   15th Anniversary Wood Aged
Name: beer_name, dtype: object

In [84]:
df.loc[df.beer_style.str.contains('IPA'), 'beer_name'] = 'yummy'
df.loc[df.beer_style.str.contains('IPA')].head(2)

,abv,beer_id,brewer_id,beer_name,beer_style,review_appearance,review_aroma,review_overall,review_palate,profile_name,review_taste,text,time
3,9.5,28577,3818,yummy,American Double / Imperial IPA,4.0,4.0,4.0,4.0,nick76,4.0,"The aroma has pine, wood, citrus, caramel, an...",2009-10-05 21:32:37
8,6.7,6549,140,yummy,American IPA,4.0,4.0,4.0,4.0,david18,4.0,I like all of Sierra Nevada's beers but felt ...,2009-10-05 21:34:31
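The difference between the two assignments can be reproduced on a small made-up frame (`toy` is hypothetical). The chained form writes to a temporary copy, so the original is left untouched; the single `.loc[rows, column]` call writes to the frame itself. Which warning the chained form emits depends on the pandas version, so this sketch silences warnings around it.

```python
import warnings

import pandas as pd

# Made-up data for illustration.
toy = pd.DataFrame({
    "beer_name": ["A", "B", "C"],
    "beer_style": ["American IPA", "Witbier", "English IPA"],
})
mask = toy["beer_style"].str.contains("IPA")

# Chained indexing: toy.loc[mask] is a new object, and the assignment
# only modifies that temporary copy.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    toy.loc[mask]["beer_name"] = "yummy"
after_chained = toy["beer_name"].tolist()

# One .loc call with both row and column indexers modifies toy in place.
toy.loc[mask, "beer_name"] = "yummy"
after_loc = toy["beer_name"].tolist()

print(after_chained)
print(after_loc)
```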


See [the docs](http://pandas.pydata.org/pandas-docs/stable/timeseries.html) for more information on pandas' time and date functionality.