# Breweries Data from BreweryDB API

We will use the BreweryDB API (developer portal: https://www.brewerydb.com/developers/apps) to extract the information needed to build a dataframe of breweries and their characteristics.

In [None]:
import requests
import json
import pandas as pd
import numpy as np

## Load API key
Go to "Developers" in the menu on the https://www.brewerydb.com/developers/apps website and select "My API Keys" from the dropdown list. In the "Sandbox API Keys" section, copy the Sandbox key string.

**CAUTION**: When using an authentication credential for an API, treat it like a password. You do not want anybody else to see or use your API credentials! There are many ways to store API credentials and load them into a program. This example uses a plain text file, which is simple but is one of the least secure methods.

In [None]:
with open("brewDB_key.txt") as file:
    # strip surrounding whitespace so a trailing newline does not end up in the request URL
    api_key = file.read().strip()
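
A somewhat safer alternative to the plain text file is to read the key from an environment variable and only fall back to the file when the variable is not set. The following is a minimal sketch under assumptions: the function name `load_api_key` and the variable name `BREWERYDB_API_KEY` are hypothetical choices for illustration, not part of the BreweryDB API.

```python
import os
from pathlib import Path


def load_api_key(env_var="BREWERYDB_API_KEY", fallback_file="brewDB_key.txt"):
    """Return the API key from the environment, else from a text file."""
    # BREWERYDB_API_KEY is a hypothetical variable name chosen for this sketch
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # fall back to the plain text file used in this notebook
    return Path(fallback_file).read_text().strip()
```

With this in place, `api_key = load_api_key()` would replace the `open(...)` cell, and the key never has to be stored next to the notebook.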

In [None]:
# DO NOT LEAVE YOUR API KEY SHOWING IN YOUR CODE
api_key

## Connect to API

We will use the `requests` library to send a request to the API (along with the authentication key), then load the JSON data from the response.

In [4]:
# endpoint structure for BreweryDB API
url = r"https://sandbox-api.brewerydb.com/v2/breweries/?key="

In [5]:
# send the request to the API
response = requests.get(url + api_key)

In [6]:
# check the status code
response.status_code

200
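
The request and status check above can also be wrapped in a small helper that fails loudly on a non-success status instead of relying on a manual look at `status_code`. This is a sketch under assumptions: the name `fetch_breweries`, the injectable `get` parameter, and the 10 second timeout are illustrative choices, not something the notebook or the API prescribes. Passing the key through `params` builds the same `?key=...` query string as the string concatenation used above.

```python
BASE_URL = "https://sandbox-api.brewerydb.com/v2/breweries/"


def fetch_breweries(api_key, get=None, timeout=10):
    """Request the breweries endpoint and return the decoded JSON payload."""
    if get is None:
        # imported lazily so a stand-in `get` can be supplied without requests installed
        import requests
        get = requests.get
    # requests appends params to the URL as ?key=<api_key>
    response = get(BASE_URL, params={"key": api_key}, timeout=timeout)
    # raise an exception for 4xx/5xx responses instead of continuing silently
    response.raise_for_status()
    return response.json()
```

In the notebook this would be used as `breweriesdata = fetch_breweries(api_key)`.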

In [7]:
# load JSON data from API
breweriesdata = response.json()

In [8]:
# identify root data structure
type(breweriesdata)

dict

## Explore data structure

Because the root data structure of the JSON sent back from the API is a dictionary, we can check which keys are available. From there, we can trace the path to the data that we want to extract into the dataframe.

In [9]:
breweriesdata.keys()

dict_keys(['currentPage', 'numberOfPages', 'totalResults', 'data', 'status'])

In [11]:
breweriesdata['currentPage']

1

In [12]:
breweriesdata['numberOfPages']

1

In [13]:
breweriesdata['totalResults']

19

In [14]:
breweriesdata['status']

'success'

`currentPage`, `numberOfPages`, `totalResults`, and `status` each hold a single value (an integer or a string) rather than a nested structure, so there is nothing further to explore under those keys.
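
Instead of checking each key by hand, the value types can be surveyed in one pass. This is a minimal sketch: `summarize_types` is a hypothetical helper name, and `sample` is a stand-in payload that copies the top-level values shown above with a shortened `data` entry.

```python
def summarize_types(payload):
    """Map each top-level key to the type name of its value."""
    return {key: type(value).__name__ for key, value in payload.items()}


# stand-in for breweriesdata; only the shape matters here
sample = {
    "currentPage": 1,
    "numberOfPages": 1,
    "totalResults": 19,
    "data": [{"id": "BznahA"}],
    "status": "success",
}

summary = summarize_types(sample)
print(summary)
```

Keys whose type is `int` or `str` hold plain values, while a `list` or `dict` marks a container worth exploring further.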

In [103]:
# breweriesdata['data']

In [16]:
type(breweriesdata['data'])

list

In [17]:
len(breweriesdata['data'])

19
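
The length of the `data` list matches `totalResults`, and `numberOfPages` is 1, so this single response appears to contain every brewery available to the sandbox key. That check can be sketched as a small function; `holds_all_results` is a hypothetical name, and the sample payload uses placeholder records.

```python
def holds_all_results(payload):
    """Return True when one response page contains every result."""
    return (
        payload["numberOfPages"] == 1
        and len(payload["data"]) == payload["totalResults"]
    )


# stand-in payload with 19 placeholder records, mirroring the counts above
sample_payload = {
    "currentPage": 1,
    "numberOfPages": 1,
    "totalResults": 19,
    "data": [{"id": str(i)} for i in range(19)],
    "status": "success",
}

print(holds_all_results(sample_payload))
```

If this returned `False`, additional pages would have to be requested before building the dataframe.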

In [19]:
breweriesdata['data'][0]

{'id': 'BznahA',
 'name': 'Anheuser-Busch InBev',
 'nameShortDisplay': 'Anheuser-Busch InBev',
 'description': "Anheuser-Busch operates 12 breweries in the United States, 14 in China and one in the United Kingdom. Anheuser-Busch's operations and resources are focused on adding to life's enjoyment not only through the responsible consumption of beer by adults, but through theme park entertainment and packaging.  In the United States, the company holds a 48.5 percent share of U.S. beer sales. Worldwide, Anheuser-Busch's beer sales volume was 128.4 million barrels in 2007.  The St. Louis-based company's subsidiaries include one of the largest U.S. manufacturers of aluminum beverage containers and one of the world's largest recyclers of aluminum beverage cans. Anheuser-Busch also has interests in malt production, rice milling, real estate development, turf farming, metalized and paper label printing, bottle production and transportation services.",
 'website': 'http://www.anheuser-busch.co

In [25]:
breweriesdata['data'][0].keys()

dict_keys(['id', 'name', 'nameShortDisplay', 'description', 'website', 'established', 'isOrganic', 'images', 'status', 'statusDisplay', 'createDate', 'updateDate', 'isMassOwned', 'isInBusiness', 'isVerified'])

In [61]:
breweriesdata['data'][0]['id']

'BznahA'

In [62]:
breweriesdata['data'][0]['name']

'Anheuser-Busch InBev'

In [63]:
breweriesdata['data'][0]['statusDisplay']

'Verified'

In [64]:
breweriesdata['data'][0]['isInBusiness']

'Y'

In [65]:
breweriesdata['data'][0]['isMassOwned']

'Y'

In [66]:
breweriesdata['data'][0]['website']

'http://www.anheuser-busch.com/'

In [67]:
breweriesdata['data'][0]['description']

"Anheuser-Busch operates 12 breweries in the United States, 14 in China and one in the United Kingdom. Anheuser-Busch's operations and resources are focused on adding to life's enjoyment not only through the responsible consumption of beer by adults, but through theme park entertainment and packaging.  In the United States, the company holds a 48.5 percent share of U.S. beer sales. Worldwide, Anheuser-Busch's beer sales volume was 128.4 million barrels in 2007.  The St. Louis-based company's subsidiaries include one of the largest U.S. manufacturers of aluminum beverage containers and one of the world's largest recyclers of aluminum beverage cans. Anheuser-Busch also has interests in malt production, rice milling, real estate development, turf farming, metalized and paper label printing, bottle production and transportation services."

In [68]:
breweriesdata['data'][1]

{'id': 'rd8LRZ',
 'name': 'Boston Beer Company (Samuel Adams)',
 'nameShortDisplay': 'Boston Beer Company (Samuel Adams)',
 'isOrganic': 'N',
 'status': 'new_unverified',
 'statusDisplay': 'New, Unverified',
 'createDate': '2018-12-09 18:05:53',
 'updateDate': '2018-12-09 18:05:53',
 'isMassOwned': 'N',
 'isInBusiness': 'Y',
 'isVerified': 'N'}

In [73]:
breweriesdata['data'][0]['established']

'1852'

In [74]:
breweriesdata['data'][0]['isOrganic']

'N'

The first and second items (nested dictionaries) in the list under the `breweriesdata['data']` key are similar in structure. At a quick glance, each dictionary holds the information for one brewery.

In [33]:
type(breweriesdata['data'][0]['images'])

dict

Many of the keys in this dictionary map directly to plain values, which we can extract for the dataframe. However, the `images` key accesses a nested dictionary.

In [41]:
breweriesdata['data'][0]['images'].keys()

dict_keys(['icon', 'medium', 'large', 'squareMedium', 'squareLarge'])

In [45]:
type(breweriesdata['data'][0]['images']['squareLarge'])

str
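
If the image URLs were wanted in the dataframe as well, the nested `images` dictionary could be flattened into dotted column names. The sketch below is one possible approach, not something the notebook does: `flatten` is a hypothetical helper, and the `example.com` URLs are placeholders rather than real values from the API.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # recurse into nested dictionaries such as 'images'
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# placeholder record; the image URLs are made up for illustration
sample_record = {
    "id": "BznahA",
    "name": "Anheuser-Busch InBev",
    "images": {
        "icon": "https://example.com/icon.png",
        "squareLarge": "https://example.com/square-large.png",
    },
}

flat_record = flatten(sample_record)
print(sorted(flat_record))
```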

In [46]:
breweriesdata['data'][1].keys()

dict_keys(['id', 'name', 'nameShortDisplay', 'isOrganic', 'status', 'statusDisplay', 'createDate', 'updateDate', 'isMassOwned', 'isInBusiness', 'isVerified'])

In [55]:
len(breweriesdata['data'][1])

11

The second brewery in the dataset lacks several dictionary keys that exist in the first brewery dictionary, such as `website` and `description`. This data is ***missing*** from the second brewery, which we will need to account for when extracting the data.
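
Comparing key lists by eye gets tedious with 19 records, so the gaps can be computed with set arithmetic. This is a minimal sketch: `missing_keys` is a hypothetical helper, and the two sample records reuse the key names printed above with placeholder values.

```python
def missing_keys(records):
    """For each record index, list the keys found elsewhere but absent here."""
    all_keys = set()
    for record in records:
        all_keys.update(record)
    return {i: sorted(all_keys - set(record)) for i, record in enumerate(records)}


first_keys = ["id", "name", "nameShortDisplay", "description", "website",
              "established", "isOrganic", "images", "status", "statusDisplay",
              "createDate", "updateDate", "isMassOwned", "isInBusiness", "isVerified"]
second_keys = ["id", "name", "nameShortDisplay", "isOrganic", "status",
               "statusDisplay", "createDate", "updateDate", "isMassOwned",
               "isInBusiness", "isVerified"]

# placeholder values; only the key names are taken from the output above
sample_records = [dict.fromkeys(first_keys, "x"), dict.fromkeys(second_keys, "x")]

gaps = missing_keys(sample_records)
print(gaps)
```

On the real data, `missing_keys(breweriesdata['data'])` would report the gaps for all 19 breweries at once.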

## Extract Data

Now that we have identified the structure of the breweries dataset, we can collect the following information for each brewery:

- ID
- Name
- Year of Establishment
- Status
- Is it in business?
- Is it mass owned?
- Is it organic?
- Website
- Description

In [96]:
brew_info = {
    'id': [],
    'name': [],
    'established': [],
    'statusDisplay': [],
    'isInBusiness': [],
    'isMassOwned': [],
    'isOrganic': [],
    'website': [],
    'description': [],
}

In [97]:
breweries = breweriesdata['data']

In [98]:
# collect each field for every brewery; use NaN where a brewery lacks the key
for brew in breweries:
    for key in brew_info:
        try:
            brew_info[key].append(brew[key])
        except KeyError:
            brew_info[key].append(np.nan)
    
        

In [99]:
# check collected information in "name" key of brew_info dictionary
brew_info['name']

['Anheuser-Busch InBev',
 'Boston Beer Company (Samuel Adams)',
 'Breckenridge Brewery',
 'Brouwerij De Leite',
 'Dock Street Brewery',
 'Guinness',
 'Harmon Brewing Company',
 'Jackalope Brewing Company',
 'Lagunitas Brewing Company',
 'Last Name Brewing',
 'Laughing Dog Brewing',
 'Miller Brewing Company',
 'New Holland Brewing Company',
 'Oskar Blues Brewery',
 'Portsmouth Brewery',
 'Sierra Nevada Brewing Company',
 'SweetWater Brewing Company',
 'Wachusett Brewing Company',
 'Zero Gravity Craft Brewery']
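
The extraction loop above can also be written with `dict.get`, which returns a default when a key is absent and so avoids the `try`/`except`. This is an alternative sketch rather than the notebook's method: `extract_fields` and `FIELDS` are hypothetical names, the default placeholder here is `None`, and the two sample records are abbreviated from the output shown earlier.

```python
FIELDS = ["id", "name", "established", "statusDisplay", "isInBusiness",
          "isMassOwned", "isOrganic", "website", "description"]


def extract_fields(records, fields=FIELDS, missing=None):
    """Build a dict of columns, filling `missing` where a record lacks a field."""
    return {
        field: [record.get(field, missing) for record in records]
        for field in fields
    }


# abbreviated versions of the first two breweries
sample_breweries = [
    {"id": "BznahA", "name": "Anheuser-Busch InBev", "established": "1852",
     "website": "http://www.anheuser-busch.com/"},
    {"id": "rd8LRZ", "name": "Boston Beer Company (Samuel Adams)"},
]

columns = extract_fields(sample_breweries)
print(columns["established"])
```

Passing `missing=np.nan` would mirror the placeholder used by the loop in this notebook.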

## Create Dataframe

Now that the data is collected, we can put it into a dataframe.

In [100]:
# use brew_info dictionary to make dataframe
brew_df = pd.DataFrame(data=brew_info)
brew_df.head(20)

|   | id | name | established | statusDisplay | isInBusiness | isMassOwned | isOrganic | website | description |
|---|---|---|---|---|---|---|---|---|---|
| 0 | BznahA | Anheuser-Busch InBev | 1852.0 | Verified | Y | Y | N | http://www.anheuser-busch.com/ | Anheuser-Busch operates 12 breweries in the Un... |
| 1 | rd8LRZ | Boston Beer Company (Samuel Adams) | NaN | New, Unverified | Y | N | N | NaN | NaN |
| 2 | IImUD9 | Breckenridge Brewery | 1990.0 | Verified | Y | Y | N | http://www.breckbrew.com/ | Breckenridge Brewery was founded in 1990 in Br... |
| 3 | uM2jeT | Brouwerij De Leite | 2008.0 | Verified | Y | N | N | http://www.deleite.be/ | Brewing since 1997. Officially transformed in... |
| 4 | p3YrOa | Dock Street Brewery | 1985.0 | Verified | Y | N | N | http://www.dockstreetbeer.com | Founded in 1985, Dock Street Brewing Co. was t... |
| 5 | HaPdSL | Guinness | 1759.0 | Verified | Y | Y | N | http://www.guinness.com/ | St. James's Gate Brewery (Irish: Grúdlann Ghea... |
| 6 | DMU2Kf | Harmon Brewing Company | NaN | Verified | Y | N | N | http://harmonbrewingco.com/ | This 15 barrel microbrewery and restaurant is ... |
| 7 | p1tFbP | Jackalope Brewing Company | 2011.0 | Verified | Y | N | N | http://www.jackalopebrew.com/ | Jackalope Brewing Company is owned by Bailey S... |
| 8 | nLsoQ9 | Lagunitas Brewing Company | 1993.0 | Verified | Y | Y | N | http://www.lagunitas.com/ | From our earliest days of striving to make con... |
| 9 | 941OaA | Last Name Brewing | 2003.0 | Verified | Y | N | N | http://www.lastnamebrewing.com/ | Last Name Brewing is a craft brewery and tap r... |
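
The `isInBusiness`, `isMassOwned`, and `isOrganic` columns hold the strings `'Y'` and `'N'`. For later analysis it may be convenient to turn these into booleans. The following is a sketch of one way to do that: `FLAG_VALUES` and `flag_to_bool` are hypothetical names, and mapping unknown or missing values to `None` is an assumption about the desired behavior.

```python
FLAG_VALUES = {"Y": True, "N": False}


def flag_to_bool(value):
    """Convert a 'Y'/'N' flag to a bool; any other value becomes None."""
    return FLAG_VALUES.get(value)


# the first three isMassOwned values from the table above
mass_owned = [flag_to_bool(flag) for flag in ["Y", "N", "Y"]]
print(mass_owned)
```

On the dataframe itself, `brew_df['isMassOwned'].map(FLAG_VALUES)` should give the equivalent column, with values outside the mapping becoming `NaN`.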


In [101]:
brew_df.to_csv("Breweries.csv", index=False)

In [102]:
len(brew_df)

19
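
As a sanity check on the export, the file can be read back to confirm the row count and to see how missing values are stored. The sketch below illustrates the idea with the standard library `csv` module and an in-memory buffer standing in for `Breweries.csv`. The helper `to_csv_text` is hypothetical and only imitates the index-free layout of `to_csv(..., index=False)`; it is not the pandas implementation.

```python
import csv
import io


def to_csv_text(columns):
    """Write a dict of equal-length columns as CSV text without an index column."""
    fields = list(columns)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(fields)
    for row in zip(*(columns[field] for field in fields)):
        # missing values are written as empty fields
        writer.writerow(["" if value is None else value for value in row])
    return buffer.getvalue()


# abbreviated columns for the first two breweries
sample_columns = {
    "id": ["BznahA", "rd8LRZ"],
    "name": ["Anheuser-Busch InBev", "Boston Beer Company (Samuel Adams)"],
    "established": ["1852", None],
}

csv_text = to_csv_text(sample_columns)
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(len(rows), repr(rows[1]["established"]))
```

The header row starts directly with `id`, and the missing `established` value for the second brewery comes back as an empty string.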