# In This Notebook

Exploratory data analysis of the brewery-location relation object from the brewery-db API. This object provides the number of locations for each brewery.

# Setup

In [1]:
import os

from bkcharts import BoxPlot, Histogram, output_notebook, show
from bokeh.models import Range1d
import numpy as np
import pandas as pd

In [2]:
output_notebook()

In [3]:
wrk = '../../../data/wrk/brewery-db/'

In [4]:
# quick summary helper: returns the frame's shape and the unique values of each column
def rstr(df):
    return df.shape, df.apply(lambda x: [x.unique()])

# Load Data

In [50]:
# read csv data into dataframe object
locpath = os.path.abspath(os.path.join(wrk, 'locations.csv'))
brewpath = os.path.abspath(os.path.join(wrk, 'breweries.csv'))
blpath = os.path.abspath(os.path.join(wrk, 'brewery_location.csv'))
locations = pd.read_csv(locpath)
breweries = pd.read_csv(brewpath)
bl = pd.read_csv(blpath)

In [51]:
bl.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6879 entries, 0 to 6878
Data columns (total 2 columns):
brewery_id     6879 non-null object
location_id    6879 non-null object
dtypes: object(2)
memory usage: 107.6+ KB


# Transform Data

In [52]:
# count the number of locations for each brewery
count = bl.location_id.groupby(bl.brewery_id).count()

In [53]:
count = pd.DataFrame(count)

In [54]:
count

brewery_id,location_id
00i2Hl,2
00wHoo,1
01Bp2T,2
01trKE,1
02Ne4w,13
02btbz,1
02rQpe,2
035gMh,1
03ZT7P,1
05F4ua,1
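
The counting step above can also be written as a single expression. The following is a minimal sketch on a made-up relation table: `bl_demo`, its IDs, and the `n_locations` column name are illustrative assumptions, not part of the original data. Note that `size()` counts rows while `count()` counts non-null values; the two agree here because `bl.info()` reports no nulls.

```python
import pandas as pd

# Hypothetical toy relation table; the IDs are invented for illustration.
bl_demo = pd.DataFrame({
    "brewery_id": ["a", "a", "b", "c", "c", "c"],
    "location_id": ["l1", "l2", "l3", "l4", "l5", "l6"],
})

# size() counts rows per brewery; reset_index(name=...) returns a
# two-column DataFrame with an explicit name for the count column.
loc_counts = (
    bl_demo.groupby("brewery_id")
    .size()
    .reset_index(name="n_locations")
)
print(loc_counts)
```

Naming the count column explicitly avoids reusing `location_id` as the name of a column that actually holds counts.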


# Exploratory Data Analysis

The distribution of location counts per brewery is strongly positively skewed. The median and the 75th percentile are both 1, while the mean is 1.197 and the maximum is 125, so most breweries have a single location and a relatively small number of breweries account for many locations.

In [55]:
count['location_id'].describe()

count    5746.000000
mean        1.197181
std         2.020762
min         1.000000
25%         1.000000
50%         1.000000
75%         1.000000
max       125.000000
Name: location_id, dtype: float64
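
The summary statistics suggest a long right tail. One way to quantify that is sketched below, using a made-up series of counts (`demo` and its values are assumptions for illustration, not the real data): the share of breweries with exactly one location, the sample skewness, and the largest values.

```python
import pandas as pd

# Hypothetical per-brewery location counts, not the real data.
demo = pd.Series([1, 1, 1, 1, 1, 1, 1, 2, 3, 13], name="location_id")

# Fraction of breweries with exactly one location.
single_share = (demo == 1).mean()

# Sample skewness; a positive value indicates a long right tail.
skewness = demo.skew()

# The few largest counts, which drive the mean above the median.
top = demo.nlargest(3)

print(single_share, skewness)
print(top)
```

Applied to the notebook's data, the same calls would be made on `count['location_id']`.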

In [58]:
hist = Histogram(count['location_id'], bins=100)

In [59]:
show(hist)
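
Because the counts are small integers, a frequency table can be easier to read than a 100-bin histogram dominated by its first bin. A minimal sketch on made-up values (`demo` is a hypothetical stand-in for `count['location_id']`):

```python
import pandas as pd

# Hypothetical per-brewery location counts, not the real data.
demo = pd.Series([1, 1, 1, 1, 1, 2, 2, 3, 13])

# value_counts() tallies how many breweries share each count;
# sort_index() orders the table by number of locations.
freq = demo.value_counts().sort_index()
print(freq)
```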

In [65]:
high_counts = count[count.location_id > 1]
high_counts = high_counts.reset_index()
high_counts = pd.merge(breweries, high_counts, how='inner', left_on='id', right_on='brewery_id')

In [66]:
high_counts

Unnamed: 0,brandClassification,createDate,description,established,forwardingId,id,images,isMassOwned,isOrganic,locations,mailingListUrl,name,nameShortDisplay,status,statusDisplay,updateDate,website,brewery_id,location_id
0,craft,2012-01-03 02:41:57,We started the Harpoon Brewery in 1986 because...,1986.0,,RzvedX,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,[{u'website': u'http://www.harpoonbrewery.com/...,,Harpoon Brewery,Harpoon,verified,Verified,2015-12-22 14:47:47,http://www.harpoonbrewery.com/,RzvedX,2
1,craft,2012-01-03 02:41:44,At one of the most difficult times in the hist...,1998.0,,Rx4Dnt,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,[{u'website': u'http://www.thebrewworks.com/al...,,Fegley's Brew Works,Fegley's Brew Works,verified,Verified,2015-12-22 14:38:31,http://www.thebrewworks.com,Rx4Dnt,2
2,craft,2012-01-03 02:42:02,The Milwaukee Brewing Company is the 'grown up...,1997.0,,BU4IJP,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,"[{u'website': u'http://Mkebrewing.com', u'open...",http://mkebrewing.com/email-signup/,Milwaukee Brewing Company,Milwaukee,verified,Verified,2015-12-22 15:21:38,http://mkebrewing.com/,BU4IJP,3
3,craft,2014-06-11 11:39:46,Evil Czech has been operating since 2011 and o...,2011.0,,Gnjzbl,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,[{u'website': u'http://www.evilczechbrewery.co...,,Evil Czech Brewery,Evil Czech,verified,Verified,2015-12-22 15:54:36,http://www.evilczechbrewery.com/,Gnjzbl,4
4,craft,2012-01-03 02:41:46,"Hand crafted and brewed to perfection, Big Riv...",,,mDVWws,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,[{u'website': u'http://www.bigrivergrille.com/...,,Big River Grille & Brewing Works,Big River Grille & Brewing Works,verified,Verified,2015-12-22 15:27:30,http://www.bigrivergrille.com/,mDVWws,4
5,craft,2012-01-03 02:42:12,"Having trained in Germany, we appreciate the a...",1996.0,,VoKbnS,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,"[{u'website': u'http://www.victorybeer.com/', ...",,Victory Brewing Company,Victory,verified,Verified,2015-12-22 15:01:55,http://www.victorybeer.com/,VoKbnS,3
6,craft,2012-01-03 02:41:52,DESTIHL was founded on the promise of supporti...,2007.0,,PBXvz0,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,[{u'website': u'http://www.destihlbrewery.com/...,https://visitor.r20.constantcontact.com/d.jsp?...,DESTIHL Brewery,DESTIHL,verified,Verified,2017-05-17 15:29:40,http://www.DESTIHL.com,PBXvz0,3
7,craft,2012-01-03 02:42:01,The Saturnalian Saga of Magic Hat began in 199...,1994.0,,qIqpZc,{u'large': u'https://s3.amazonaws.com/breweryd...,Y,N,"[{u'website': u'http://www.magichat.net/', u'o...",,Magic Hat Brewing Company,Magic Hat,verified,Verified,2017-06-22 14:13:11,http://www.magichat.net/,qIqpZc,2
8,craft,2012-01-03 02:41:51,Cigar City Brewing was founded with two goals ...,2008.0,,EYuZg3,{u'large': u'https://s3.amazonaws.com/breweryd...,N,N,"[{u'website': u'http://cigarcitybeer.com/', u'...",,Cigar City Brewing,Cigar City,verified,Verified,2015-12-22 14:43:16,http://cigarcitybeer.com/,EYuZg3,3
9,craft,2012-01-03 02:41:43,We started back in 2006 with three guys and on...,2006.0,,snQlvg,{u'large': u'https://s3.amazonaws.com/breweryd...,Y,N,"[{u'openToPublic': u'N', u'isPrimary': u'N', u...",,10 Barrel Brewing Company,10 Barrel,verified,Verified,2017-06-22 13:54:34,http://www.10barrel.com/,snQlvg,4
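
An inner merge silently drops any `brewery_id` that has no matching `id` in `breweries`. The sketch below shows one way to check for such rows with the `indicator` option of `pd.merge`; the two small frames and their values are hypothetical stand-ins for `breweries` and `high_counts`.

```python
import pandas as pd

# Hypothetical stand-ins for the breweries table and the per-brewery counts.
breweries_demo = pd.DataFrame({"id": ["a", "b"], "name": ["Alpha", "Beta"]})
counts_demo = pd.DataFrame({"brewery_id": ["a", "c"], "location_id": [2, 4]})

# A right merge keeps every row of counts_demo; indicator=True adds a
# "_merge" column recording whether each row found a match on the left.
check = pd.merge(
    breweries_demo,
    counts_demo,
    how="right",
    left_on="id",
    right_on="brewery_id",
    indicator=True,
)

# Brewery IDs present in the counts but missing from the breweries table.
unmatched = check.loc[check["_merge"] == "right_only", "brewery_id"].tolist()
print(unmatched)
```

An empty `unmatched` list would indicate that the inner merge loses no breweries.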


In [67]:
hist = Histogram(high_counts['location_id'], bins=100)

In [68]:
show(hist)