## Data Profiling

Gathering descriptive statistics by hand can be tedious. Fortunately, some libraries do the number crunching for you and output a clear profile of your data. *pandas-profiling* is one of them: it offers out-of-the-box statistical profiling of a dataset. Because the dataset we are using is tidy and standardized, we can run the library on it right away.
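To make the tedium concrete, here is a minimal sketch that computes a handful of descriptive statistics for a single column using only Python's standard library. The values are hypothetical alcohol-by-volume readings invented for illustration, not rows taken from any dataset.

```python
import statistics

# Hypothetical alcohol-by-volume values, made up for illustration only.
abv = [0.045, 0.049, 0.048, 0.060, 0.060, 0.052]

# Each statistic has to be requested explicitly, one by one.
summary = {
    "count": len(abv),
    "mean": statistics.mean(abv),
    "median": statistics.median(abv),
    "stdev": statistics.stdev(abv),  # sample standard deviation
    "min": min(abv),
    "max": max(abv),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```

Repeating this for every column, and adding missing-value counts and distributions on top, is the kind of work a profiling library takes over.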

We will use a dataset of craft beers from the CraftCans website. It contains only canned beers from breweries in the United States. The website does not make clear whether the dataset covers every canned beer brewed in the US, so to be safe we will treat it as a sample that may contain biases.

In [1]:
import pandas as pd

In [2]:
beers = pd.read_csv("data/beers.csv")

breweries = pd.read_csv("data/breweries.csv")

beers_and_breweries = pd.merge(beers,
                               breweries,
                               how="inner",
                               on="brewery_id",
                               sort=True,
                               suffixes=("_beer", "_brewery"))

beers_and_breweries.head()

|   | tid | abv | ibu | id | name_beer | style | brewery_id | ounces | name_brewery | city | state |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1493 | 0.045 | 50.0 | 2692 | Get Together | American IPA | 0 | 16.0 | NorthGate Brewing | Minneapolis | MN |
| 1 | 1494 | 0.049 | 26.0 | 2691 | Maggie's Leap | Milk / Sweet Stout | 0 | 16.0 | NorthGate Brewing | Minneapolis | MN |
| 2 | 1495 | 0.048 | 19.0 | 2690 | Wall's End | English Brown Ale | 0 | 16.0 | NorthGate Brewing | Minneapolis | MN |
| 3 | 1496 | 0.06 | 38.0 | 2689 | Pumpion | Pumpkin Ale | 0 | 16.0 | NorthGate Brewing | Minneapolis | MN |
| 4 | 1497 | 0.06 | 25.0 | 2688 | Stronghold | American Porter | 0 | 16.0 | NorthGate Brewing | Minneapolis | MN |
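An inner join silently drops any beer whose `brewery_id` has no match in the breweries table, so it can be worth checking how many rows would be lost. The sketch below is one possible check, not part of the original notebook: it uses two tiny hand-built frames (the first two beers reuse values from the rows above, while "Orphan Ale" and its `brewery_id` of 99 are hypothetical) together with the `indicator` and `validate` options of `pd.merge`.

```python
import pandas as pd

beers = pd.DataFrame({
    "name": ["Get Together", "Maggie's Leap", "Orphan Ale"],  # "Orphan Ale" is hypothetical
    "brewery_id": [0, 0, 99],  # 99 is a made-up id with no matching brewery
})
breweries = pd.DataFrame({
    "brewery_id": [0],
    "name": ["NorthGate Brewing"],
})

# A left join keeps every beer; indicator=True adds a "_merge" column that
# records whether each row found a match. validate="many_to_one" raises an
# error if a brewery_id appears more than once in the breweries table.
check = pd.merge(
    beers,
    breweries,
    on="brewery_id",
    how="left",
    suffixes=("_beer", "_brewery"),
    indicator=True,
    validate="many_to_one",
)

# Rows marked "left_only" are the ones an inner join would have dropped.
unmatched = check[check["_merge"] == "left_only"]
print(len(unmatched))
```

If `unmatched` is empty on the real data, the inner join above loses nothing.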


In [3]:
beers_and_breweries.dtypes

tid               int64
abv             float64
ibu             float64
id                int64
name_beer        object
style            object
brewery_id        int64
ounces          float64
name_brewery     object
city             object
state            object
dtype: object
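With the column types confirmed, generating the profile should take only a couple of lines. The sketch below assumes the `pandas-profiling` package is installed and importable as `pandas_profiling` with its `ProfileReport` class. The helper name `write_profile` and the output filename are hypothetical choices. So that the block runs on its own, it works on a few rows copied from the merged frame shown earlier and also prints pandas' built-in `describe` summary, which is much coarser than a full profile.

```python
import pandas as pd

# A few rows copied from the merged frame, so the sketch is self-contained.
sample = pd.DataFrame({
    "abv": [0.045, 0.049, 0.048, 0.06, 0.06],
    "ibu": [50.0, 26.0, 19.0, 38.0, 25.0],
    "style": ["American IPA", "Milk / Sweet Stout", "English Brown Ale",
              "Pumpkin Ale", "American Porter"],
    "state": ["MN"] * 5,
})


def write_profile(df, path="beers_profile.html"):
    """Write an HTML profile of df; return False if pandas-profiling is missing.

    Hypothetical helper: the name and default path are illustrative only.
    """
    try:
        from pandas_profiling import ProfileReport
    except ImportError:
        return False
    ProfileReport(df).to_file(path)
    return True


# Built-in, coarser alternative: one summary column per input column.
summary = sample.describe(include="all")
print(summary)
```

In the notebook, calling `write_profile(beers_and_breweries)` would be the equivalent step on the full dataset.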

In [7]:
import sys
# Environment-specific workaround: add the local site-packages directory so seaborn can be found.
sys.path.append("/usr/local/lib/python2.7/site-packages")
import seaborn as sbn