# Breweries Association Data
This notebook:
* scrapes the brewery listings for each state from [the Brewers Association's website](https://www.brewersassociation.org/directories/breweries/ "Official brewery directory, curated by the Brewers Association").
* extracts the following fields into a DataFrame:
    * Brewery Name
    * Address
    * City
    * State
    * ZIP Code
* saves the data to a CSV file

### Import dependencies

In [12]:
import requests
import bs4 as bs
import re
import pandas as pd
import json

### Define the state list for iteration

In [2]:
state_list= [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", "Massachusetts",
    "Michigan", "Minnesota", "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island",
    "South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington", "West Virginia",
    "Wisconsin", "Wyoming",
]

state_list_test= ["Alabama","Alaska","North Carolina","South Dakota","Colorado"]

### Request the brewery data for each state

In [5]:
# Test the request on a single state before looping over all of them.
response= requests.post('https://www.brewersassociation.org/wp-admin/admin-ajax.php',
                        data= {
                            "action": "get_breweries",
                            "_id": "Nevada",
                            "search_by": "statename"
                        })

In [15]:
print(response.text)

<div id='status-bar' class='well well-small'><p>We found <strong>51</strong> Breweries in Nevada</p></div><div class="brewery">
						<ul class="vcard simple brewery-info">
							<li class="name">10 Torr Distilling and Brewing</li><li class="address">490 Mill St </li><li>Reno, NV 89502 | <a href='http://www.google.com/maps/place/490 Mill St++Reno+NV+United States' target='_blank'>Map</a></li><li class="telephone">Phone: (775) 530-7014</li><li class="brewery_type">Type: <a href="https://www.brewersassociation.org/statistics/market-segments#micro" target="_blank">Micro</a></li><li class="url"><a href="http://www.10torr.com" target="_blank" >www.10torr.com</a></li></ul><ul class="vcard simple col2 logos"><a href="https://www.brewersassociation.org/business-tools/marketing-advertising/independent-craft-brewer-seal/" target="_blank"><img src="https://s3-us-west-2.amazonaws.com/brewersassoc/wp-content/uploads/2017/06/independent-craft-brewer-seal62x120.png" alt="Independent Craft Brewers Se
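The status bar at the start of each response states how many breweries the directory found for that state. As a minimal sketch, assuming every response carries a status bar in the same form as the Nevada one above, the count could be pulled out and later compared with the number of parsed `brewery` blocks. `reported_count` is a hypothetical helper, not part of the notebook.

```python
import re

# Assumes the status bar looks like the one in the Nevada response:
# "<p>We found <strong>51</strong> Breweries in Nevada</p>"
STATUS_PATTERN = re.compile(r"We found <strong>(\d+)</strong> Breweries in")


def reported_count(html):
    """Return the brewery count announced in the status bar, or None if absent."""
    match = STATUS_PATTERN.search(html)
    return int(match.group(1)) if match else None


sample = ("<div id='status-bar' class='well well-small'>"
          "<p>We found <strong>51</strong> Breweries in Nevada</p></div>")
print(reported_count(sample))  # 51
```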

In [4]:
response_list= []

for state in state_list:
    response= requests.post('https://www.brewersassociation.org/wp-admin/admin-ajax.php',
                            data= {
                                "action": "get_breweries",
                                "_id": state,
                                "search_by": "statename"
                            })
    if response.status_code == 200:
        response_list.append(response)
        print(f"{state} found!")
    else:
        print(f"Not found. Skipping {state}...")

print("/\\"*30)
print("REQUEST COMPLETE")

Alabama found!
Alaska found!
Arizona found!
Arkansas found!
California found!
Colorado found!
Connecticut found!
Delaware found!
Florida found!
Georgia found!
Hawaii found!
Idaho found!
Illinois found!
Indiana found!
Iowa found!
Kansas found!
Kentucky found!
Louisiana found!
Maine found!
Maryland found!
Massachusetts found!
Michigan found!
Minnesota found!
Mississippi found!
Missouri found!
Montana found!
Nebraska found!
Nevada found!
New Hampshire found!
New Jersey found!
New Mexico found!
New York found!
North Carolina found!
North Dakota found!
Ohio found!
Oklahoma found!
Oregon found!
Pennsylvania found!
Rhode Island found!
South Carolina found!
South Dakota found!
Tennessee found!
Texas found!
Utah found!
Vermont found!
Virginia found!
Washington found!
West Virginia found!
Wisconsin found!
Wyoming found!
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
REQUEST COMPLETE
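The loop above could also be wrapped in a small function. The following is only a sketch under a few assumptions: `build_payload` and `fetch_states` are hypothetical names, the 30-second timeout and the pause between requests are illustrative choices rather than requirements of the site, and `post` is any callable with the same interface as `requests.post`. Keying the results by state keeps each response tied to the state it was requested for.

```python
import time

URL = "https://www.brewersassociation.org/wp-admin/admin-ajax.php"


def build_payload(state):
    """Form fields the notebook sends when requesting one state."""
    return {"action": "get_breweries", "_id": state, "search_by": "statename"}


def fetch_states(states, post, pause=1.0):
    """Request each state with `post` (for example requests.post).

    Returns (responses, failed): a dict mapping state -> response for
    status 200, and a list of states that returned any other status.
    """
    responses, failed = {}, []
    for state in states:
        response = post(URL, data=build_payload(state), timeout=30)
        if response.status_code == 200:
            responses[state] = response
        else:
            failed.append(state)
        time.sleep(pause)  # assumed courtesy delay between requests
    return responses, failed
```

With `requests` imported, it might be called as `responses, failed = fetch_states(state_list_test, requests.post)`.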


### Convert to BeautifulSoup objects

In [5]:
soup_list= [bs.BeautifulSoup(i.text,'html.parser') for i in response_list]

### Locate all elements with the `brewery` class

In [6]:
breweries= [soup.find_all(attrs= {"class":"brewery"}) for soup in soup_list]

### Convert to DataFrame

`breweries` is a list of lists (one list of brewery elements per state), so this step iterates over the states and then over each state's breweries.
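One way to avoid the nested loop is to flatten the list of lists first. The sketch below uses `itertools.chain.from_iterable` on a hypothetical stand-in for `breweries`; the same call should work on the real object, since each inner result of `find_all` is iterable.

```python
from itertools import chain

# Hypothetical stand-in for `breweries`: one inner list per state.
nested = [["brewery A", "brewery B"], [], ["brewery C"]]

# Concatenate the inner lists into a single flat list, preserving order.
flat = list(chain.from_iterable(nested))
print(flat)  # ['brewery A', 'brewery B', 'brewery C']
```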

In [7]:
# Matches "City, ST" followed by an optional five-digit ZIP Code.
p= re.compile(r"<li>(.*?), ([A-Z][A-Z]) (\d{5})?")

Brewery_list= []
Address_list= []
City_list= []
State_list= []
ZIP_Code_list= []

fail_list= []

for state in breweries:
    for i in state:
        # Entries whose third <li> does not match the "City, ST" pattern make re.search
        # return None, which raises AttributeError; those breweries go to fail_list.
        try:
            city_state_zip= i.find_all('li')[2]
            match= re.search(p, str(city_state_zip))
            # Extract every field before appending so a failure cannot leave the lists
            # with different lengths. A missing ZIP Code is stored as None.
            city, state_abbr, zip_code= match.groups()
            name= i.find_all(attrs={"class":"name"})[0].string
            address= i.find_all(attrs={"class":"address"})[0].string.rstrip()
        except AttributeError:
            fail_list.append(i.find_all(attrs={"class":"name"})[0].string)
            continue
        Brewery_list.append(name)
        Address_list.append(address)
        City_list.append(city)
        State_list.append(state_abbr)
        ZIP_Code_list.append(zip_code)
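To make the pattern's behavior concrete, here is a small sketch that applies it to three strings. The first is shortened from the Nevada response shown earlier; the other two are hypothetical inputs made up for illustration, and `split_location` is a hypothetical helper.

```python
import re

p = re.compile(r"<li>(.*?), ([A-Z][A-Z]) (\d{5})?")


def split_location(li_html):
    """Return (city, state, zip_code), or None when the pattern does not match."""
    match = p.search(li_html)
    return match.groups() if match else None


# Shortened from the Nevada response: all three groups are filled.
print(split_location("<li>Reno, NV 89502 | <a href='...'>Map</a></li>"))
# Hypothetical entry without a ZIP Code: the optional group yields None.
print(split_location("<li>Reno, NV | <a href='...'>Map</a></li>"))
# Hypothetical non-US entry: no two-letter state code, so nothing matches.
print(split_location("<li>Nassau, Bahamas | <a href='...'>Map</a></li>"))
```

Because the ZIP Code group is optional, a missing ZIP Code alone does not raise an error; only a string with no match at all does.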

In [8]:
state_df= pd.DataFrame(
    {
            "Brewery": Brewery_list,
            "Address": Address_list,
            "City": City_list,
            "State": State_list,
            "ZIP Code": ZIP_Code_list
    })

In [9]:
state_df.head()

| | Brewery | Address | City | State | ZIP Code |
|---|---|---|---|---|---|
| 0 | 5 Rivers Brewing LLC | | Spanish Fort | AL | 36527 |
| 1 | Avondale Brewing Co | 201 41st St S | Birmingham | AL | 35222 |
| 2 | Back Forty Beer Co | 200 N 6th St | Gadsden | AL | 35901 |
| 3 | Back Forty Beer Company - Birmingham | 3201 1st Avenue N | Birmingham | AL | 35222 |
| 4 | Below the Radar Brewing Co | 220 Holmes Ave NE | Huntsville | AL | 35801 |


In [10]:
state_df.describe()

| | Brewery | Address | City | State | ZIP Code |
|---|---|---|---|---|---|
| count | 8503 | 8503 | 8503 | 8503 | 8429 |
| unique | 8340 | 7452 | 3154 | 50 | 4960 |
| top | Ballast Point Brewing Company | | Portland | CA | 92121 |
| freq | 6 | 983 | 100 | 961 | 18 |


In [11]:
print(len(fail_list))
fail_list

8


['Cerveza Aldarra',
 'Pirate Republic Brewing',
 'Mount Vernon Brewery',
 'Cheeky Monkey Brewery and Cidery',
 'Cowaramup Brewing Company',
 'Eagle Bay Brewing Co',
 'Feral Brewing Company',
 'Northbridge Brewing Co']

### Save to CSV

In [12]:
state_df.to_csv("Resources/Brewers_Association_Data.csv")
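When the file is read back, two details may matter: by default `to_csv` also writes the index as an unnamed first column, and `read_csv` parses ZIP Codes as numbers unless told otherwise, which drops leading zeros. The sketch below illustrates both on a tiny in-memory example; the first row comes from the output above, while the second row is hypothetical and chosen only because its ZIP Code starts with a zero.

```python
import io

import pandas as pd

# First row is from the notebook's output; the second row is hypothetical.
df = pd.DataFrame({
    "Brewery": ["Avondale Brewing Co", "Example Brewing Co"],
    "Address": ["201 41st St S", "1 Example St"],
    "City": ["Birmingham", "Boston"],
    "State": ["AL", "MA"],
    "ZIP Code": ["35222", "02134"],
})

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the unnamed index column
buffer.seek(0)

# dtype=str keeps ZIP Codes as text so leading zeros survive the round trip.
restored = pd.read_csv(buffer, dtype={"ZIP Code": str})
print(list(restored.columns))
print(restored.loc[1, "ZIP Code"])
```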