## Applied Data Science Capstone Project - Battle of the Neighbourhoods

### THE GOLD DREGGER

**Sudharshan P.R.**

***There is more left after a beer***

### PROBLEM DESCRIPTION AND BACKGROUND DISCUSSION

***The Interesting Background***

In the last decade, major cities in the U.S. have seen beer drinkers become beer lovers. This has led to the emergence of craft breweries that cater to the beer connoisseurs in their respective cities. Undoubtedly, craft beer is being guzzled by the gallon. This presents an interesting case for anyone who looks beyond the last sip of their beer.

The brewing process involves wheat or barley, hops and yeast, along with a myriad of flavours. Once the beer has been extracted, a waste portion remains. As with waste from any organic process, this waste is actually wealth: it is nutritious, since it still contains the solid portion of the ingredients that went into the beer.

The spent grain left over from brewing is what this project calls dreg. (Brewers also speak of trub, the sediment that settles out during brewing; beer is a global drink and each region has its own terms.) This dreg isn't as bad as it sounds. From ancient to pre-modern times it was fed to animals, and that is how it was used up instead of rotting in a landfill. Interestingly, there are other uses for this dreg that can be more valuable than animal feed.

Here are a few examples of using spent grain from beer making:

    1) Animal feed on farms
    2) Dreg to baking goods
    3) Wastewater treatment
    4) Dreg to compost
    5) Fish food
    6) Energy generation (experimental and too technical, hence I am not interested too much in this application)



***The Challenging Problem Description***

The big-brand commercial brewers already have their waste handled. Since their factories are located in rural areas, it is easy for them to sell it directly to animal farms. The problem is faced by the craft breweries, which have to link up with various other businesses to sell off their dreg.

Here is an opportunity to collect the dreg, gather it in one place and sell it as raw material to some of the businesses mentioned above. Part of the dreg can be sold to bakeries, to urban farms for composting, to off-shore fish farms and to wastewater treatment plants.

***The missing piece of information that we need is how many craft breweries are there in a given city, say New York City.***

### DESCRIPTION OF THE DATA

The data we require is simple. 
    
    - How many craft breweries are operational in New York City?
    - Which ones are the most popular?
    - Which ones have on-site brewing facilities?

We can use the FourSquare API to gather all the breweries operating in New York City. Craft breweries are primarily located in Manhattan and Brooklyn, but we can expand the search to all boroughs if required. Using this list of breweries along with their ratings, we can choose a smaller set of breweries to launch a pilot and evaluate the business. The FourSquare API might not be able to answer the third question, for which we might need to have a beer or two at the place of interest. Plotting the data on a map should help us locate a few candidate sites for our dreg warehouse in or close to New York City.

Reference Link: https://modernfarmer.com/2015/08/recycled-brewery-waste/

### METHODOLOGY

We use the FourSquare API, which provides the option to list all venues in a particular category. Each category has a specific ID in the FourSquare API, which we can use to find out how many venues of interest there are in a particular city or neighborhood.

For this particular problem, we need to get the category ID for breweries from the FourSquare developer site.

The FourSquare developer website is: https://developer.foursquare.com/docs/build-with-foursquare/categories/

From here, we get the category ID for breweries: 50327c8591d4c4b30a586d5d


Using this category ID, we then list out all breweries in an area of interest using the following steps:
        
     1) For the above business application, we will be looking at all NYC boroughs.
     
     2) We then extract the list of all breweries along with their ratings, if available.
     
     3) We then choose the 3 or 5 top-rated venues to organize a field visit, get a few questions answered and guzzle down a few beers.
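
Step 3 can be sketched as follows. This is a minimal illustration, assuming ratings have already been obtained separately (the search results shown later in this notebook do not include a rating column). The venue names and ratings are hypothetical placeholders, and `top_rated` is a helper introduced only for this example.

```python
# Hypothetical records: the names and ratings below are placeholders, not API output.
venues = [
    {"name": "Brewery A", "rating": 8.9},
    {"name": "Brewery B", "rating": None},  # no rating available
    {"name": "Brewery C", "rating": 9.2},
    {"name": "Brewery D", "rating": 7.5},
    {"name": "Brewery E", "rating": 8.1},
]


def top_rated(records, n=3):
    """Return the n highest-rated records, ignoring those without a rating."""
    rated = [r for r in records if r.get("rating") is not None]
    return sorted(rated, key=lambda r: r["rating"], reverse=True)[:n]


shortlist = top_rated(venues, n=3)
print([v["name"] for v in shortlist])
```

With the venues held in a pandas dataframe, `DataFrame.nlargest` on the rating column would give the same shortlist.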
  
Let's begin.

    1) Importing all required packages

In [1]:
import requests
import pandas as pd 
import numpy as np 
import random 

from geopy.geocoders import Nominatim 
from IPython.display import Image 
from IPython.core.display import HTML 
    
import json 
from pandas import json_normalize  # the pandas.io.json import path is deprecated since pandas 1.0

import folium

    2) Setting up credentials for accessing the FourSquare API

In [2]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 10000
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET:YOUR_CLIENT_SECRET


    3) Using the GeoLocator from Lab Exercises of Week 2 to get the co-ordinates of interest for New York City 

In [3]:
address = '20 W 34th St, New York'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

40.65774729310345 -74.00725812068966


    4) Using the Category ID of Breweries, we get a list of all breweries in NYC 

In [4]:
radius = 100000  # in metres (100 km): going big to get as many breweries as possible
brewery_Category_id = '50327c8591d4c4b30a586d5d'

brewery_url ='https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&categoryId={}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            latitude,
            longitude,
            ACCESS_TOKEN,
            VERSION,
            brewery_Category_id ,
            radius,
            LIMIT
            )
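
As an alternative to the long format string, the same query can be assembled from a parameter dictionary, which makes each argument easier to read and takes care of URL encoding. The sketch below uses only the standard library and builds the URL without sending a request. `build_search_url` and the placeholder credentials are assumptions for illustration; the parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, access_token, lat, lng,
                     version, category_id, radius, limit):
    """Assemble the venues/search URL from a parameter dictionary."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",  # latitude,longitude of the search centre
        "oauth_token": access_token,
        "v": version,
        "categoryId": category_id,
        "radius": radius,
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder credentials; the remaining values are those used in this notebook.
url = build_search_url(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "YOUR_ACCESS_TOKEN",
    40.65774729310345, -74.00725812068966,
    "20180604", "50327c8591d4c4b30a586d5d", 100000, 10000,
)

# Parse the query string back to check what would be sent.
query = parse_qs(urlsplit(url).query)
print(query["categoryId"], query["ll"])
```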

**Sending the GET Request and getting all the results**

In [5]:
results = requests.get(brewery_url).json()['response']
results

{'venues': [{'id': '5c6f5ada835c9a0039edf02c',
   'name': 'Big Alice Barrel Room',
   'location': {'address': '52 34th St',
    'crossStreet': '2nd Ave',
    'lat': 40.6574153788505,
    'lng': -74.00696695747743,
    'labeledLatLngs': [{'label': 'display',
      'lat': 40.6574153788505,
      'lng': -74.00696695747743}],
    'distance': 44,
    'postalCode': '11232',
    'cc': 'US',
    'neighborhood': 'Greenwood Heights',
    'city': 'Brooklyn',
    'state': 'NY',
    'country': 'United States',
    'formattedAddress': ['52 34th St (2nd Ave)',
     'Brooklyn, NY 11232',
     'United States']},
   'categories': [{'id': '50327c8591d4c4b30a586d5d',
     'name': 'Brewery',
     'pluralName': 'Breweries',
     'shortName': 'Brewery',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/brewery_',
      'suffix': '.png'},
     'primary': True}],
   'delivery': {'id': '2477717',
    'url': 'https://www.seamless.com/menu/big-alice-barrel-room-52-34th-street-brooklyn/2477717?a

**Get relevant part of JSON and transform it into a pandas dataframe**

In [6]:
# assign relevant part of JSON to venues
venues = results["venues"]


# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe


Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,...,location.state,location.country,location.formattedAddress,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,venuePage.id
0,5c6f5ada835c9a0039edf02c,Big Alice Barrel Room,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,52 34th St,2nd Ave,40.657415,-74.006967,"[{'label': 'display', 'lat': 40.6574153788505,...",...,NY,United States,"[52 34th St (2nd Ave), Brooklyn, NY 11232, Uni...",2477717.0,https://www.seamless.com/menu/big-alice-barrel...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,
1,5151c128e4b026975853f870,The Bronx Brewery,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,856 E 136th St,between Willow Ave and Walnut Ave,40.801774,-73.910297,"[{'label': 'display', 'lat': 40.80177432001329...",...,NY,United States,[856 E 136th St (between Willow Ave and Walnut...,2039435.0,https://www.seamless.com/menu/the-bronx-brewer...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,140525663.0
2,5e1e2003b57bd10008a2e101,Dockside Brewery,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,40 Bridgeport Ave,,41.200338,-73.107288,"[{'label': 'display', 'lat': 41.2003375, 'lng'...",...,CT,United States,"[40 Bridgeport Ave, Milford, CT 06460, United ...",2269810.0,https://www.grubhub.com/restaurant/dockside-br...,grubhub,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_grubhub_20180129.png,
3,572e1b2dcd10a820394a507b,Twin Elephant Brewing Company,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,,,40.729238,-74.380353,"[{'label': 'display', 'lat': 40.72923819029399...",...,NJ,United States,"[Chatham Township, NJ 07928, United States]",,,,,,,
4,588fc9eed8f3e90ab908bc48,Hudson Valley Brewery,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,7 E Main St,,41.501543,-73.962993,"[{'label': 'display', 'lat': 41.50154311094066...",...,NY,United States,"[7 E Main St, Beacon, NY 12508, United States]",,,,,,,
5,57eaf6a7498ee4d106a06ebf,Evil Twin Brewing NYC,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,1616 George St,Wyckoff Ave.,40.696225,-73.904225,"[{'label': 'display', 'lat': 40.69622528178149...",...,NY,United States,"[1616 George St (Wyckoff Ave.), Ridgewood, NY ...",2045157.0,https://www.seamless.com/menu/evil-twin-brewin...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,
6,60303c4749e1267fe66b53ff,TALEA Beer Co,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,87 Richardson St,Leonard St,40.718573,-73.948276,"[{'label': 'display', 'lat': 40.718573, 'lng':...",...,NY,United States,"[87 Richardson St (Leonard St), Brooklyn, NY 1...",,,,,,,
7,57a52544498e70074cf7a79b,Ship Bottom Brewery,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,830 N Bay Avenue Store #23,,39.569417,-74.236787,"[{'label': 'display', 'lat': 39.56941682974368...",...,NJ,United States,"[830 N Bay Avenue Store #23, Beach Haven, NJ 0...",,,,,,,
8,56f6cd56cd102dca40d553c9,Kings County Brewers Collective,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,381 Troutman St,,40.705974,-73.923487,"[{'label': 'display', 'lat': 40.70597434893818...",...,NY,United States,"[381 Troutman St, Brooklyn, NY 11237, United S...",2478522.0,https://www.seamless.com/menu/kings-county-bre...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,
9,5ee52222830eaa0008a81c31,The Keg & Lantern Brewing Company,"[{'id': '50327c8591d4c4b30a586d5d', 'name': 'B...",v-1628443191,False,158 Beard St,Van Brunt St,40.67551,-74.015746,"[{'label': 'display', 'lat': 40.67551, 'lng': ...",...,NY,United States,"[158 Beard St (Van Brunt St), Brooklyn, NY 112...",2304967.0,https://www.seamless.com/menu/keg--lantern-bre...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,
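
The dotted column names in the table above (`location.address`, `delivery.provider.name` and so on) come from flattening the nested JSON. The pure-Python sketch below imitates that flattening on a trimmed copy of the first venue in the response. It is an illustration of the idea only, not the pandas implementation, and `flatten` is a helper introduced for this example.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level with dotted keys; lists are kept as-is."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Trimmed copy of the first venue in the response shown earlier.
venue = {
    "id": "5c6f5ada835c9a0039edf02c",
    "name": "Big Alice Barrel Room",
    "location": {
        "address": "52 34th St",
        "lat": 40.6574153788505,
        "lng": -74.00696695747743,
        "city": "Brooklyn",
        "state": "NY",
    },
    "categories": [{"id": "50327c8591d4c4b30a586d5d", "name": "Brewery"}],
}

row = flatten(venue)
print(sorted(row))
```

The `categories` value stays a list of dictionaries, which is why a separate step is needed later to pull out the category name.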


**Keeping only columns of interest**

In [7]:
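filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
# take an explicit copy so that later column assignments do not trigger SettingWithCopyWarning
dataframe_cleaned = dataframe.loc[:, filtered_columns].copy()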

In [8]:
def get_category_type(row):
    # fall back to 'venue.categories' when the row has no 'categories' column
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_cleaned['categories'] = dataframe_cleaned.apply(get_category_type, axis=1)

In [9]:
dataframe_cleaned.columns = [column.split('.')[-1] for column in dataframe_cleaned.columns]

dataframe_cleaned

Unnamed: 0,name,categories,address,crossStreet,lat,lng,labeledLatLngs,distance,postalCode,cc,neighborhood,city,state,country,formattedAddress,id
0,Big Alice Barrel Room,Brewery,52 34th St,2nd Ave,40.657415,-74.006967,"[{'label': 'display', 'lat': 40.6574153788505,...",44,11232.0,US,Greenwood Heights,Brooklyn,NY,United States,"[52 34th St (2nd Ave), Brooklyn, NY 11232, Uni...",5c6f5ada835c9a0039edf02c
1,The Bronx Brewery,Brewery,856 E 136th St,between Willow Ave and Walnut Ave,40.801774,-73.910297,"[{'label': 'display', 'lat': 40.80177432001329...",17998,10454.0,US,,Bronx,NY,United States,[856 E 136th St (between Willow Ave and Walnut...,5151c128e4b026975853f870
2,Dockside Brewery,Brewery,40 Bridgeport Ave,,41.200338,-73.107288,"[{'label': 'display', 'lat': 41.2003375, 'lng'...",96835,6460.0,US,,Milford,CT,United States,"[40 Bridgeport Ave, Milford, CT 06460, United ...",5e1e2003b57bd10008a2e101
3,Twin Elephant Brewing Company,Brewery,,,40.729238,-74.380353,"[{'label': 'display', 'lat': 40.72923819029399...",32480,7928.0,US,,Chatham Township,NJ,United States,"[Chatham Township, NJ 07928, United States]",572e1b2dcd10a820394a507b
4,Hudson Valley Brewery,Brewery,7 E Main St,,41.501543,-73.962993,"[{'label': 'display', 'lat': 41.50154311094066...",94003,12508.0,US,,Beacon,NY,United States,"[7 E Main St, Beacon, NY 12508, United States]",588fc9eed8f3e90ab908bc48
5,Evil Twin Brewing NYC,Brewery,1616 George St,Wyckoff Ave.,40.696225,-73.904225,"[{'label': 'display', 'lat': 40.69622528178149...",9695,11385.0,US,,Ridgewood,NY,United States,"[1616 George St (Wyckoff Ave.), Ridgewood, NY ...",57eaf6a7498ee4d106a06ebf
6,TALEA Beer Co,Brewery,87 Richardson St,Leonard St,40.718573,-73.948276,"[{'label': 'display', 'lat': 40.718573, 'lng':...",8404,11211.0,US,,Brooklyn,NY,United States,"[87 Richardson St (Leonard St), Brooklyn, NY 1...",60303c4749e1267fe66b53ff
7,Ship Bottom Brewery,Brewery,830 N Bay Avenue Store #23,,39.569417,-74.236787,"[{'label': 'display', 'lat': 39.56941682974368...",122717,8008.0,US,,Beach Haven,NJ,United States,"[830 N Bay Avenue Store #23, Beach Haven, NJ 0...",57a52544498e70074cf7a79b
8,Kings County Brewers Collective,Brewery,381 Troutman St,,40.705974,-73.923487,"[{'label': 'display', 'lat': 40.70597434893818...",8878,11237.0,US,Bushwick,Brooklyn,NY,United States,"[381 Troutman St, Brooklyn, NY 11237, United S...",56f6cd56cd102dca40d553c9
9,The Keg & Lantern Brewing Company,Brewery,158 Beard St,Van Brunt St,40.67551,-74.015746,"[{'label': 'display', 'lat': 40.67551, 'lng': ...",2103,11231.0,US,,Brooklyn,NY,United States,"[158 Beard St (Van Brunt St), Brooklyn, NY 112...",5ee52222830eaa0008a81c31


In [10]:
df = dataframe_cleaned

In [11]:
df.groupby(['state', 'city'])['name'].count()

state  city            
CT     Milford              1
NJ     Beach Haven          1
       Chatham Township     1
       Jersey City          1
       Little Ferry         1
       Montclair            1
       Newark               1
NY     Astoria              1
       Bay Shore            1
       Beacon               1
       Bronx                1
       Brooklyn            19
       Chester              1
       Elmsford             1
       Far Rockaway         1
       Lindenhurst          1
       Long Island City     1
       Manorville           1
       Middletown           2
       New York             5
       Peekskill            1
       Queens               1
       Ridgewood            1
       Sunnyside            1
       Warwick              1
PA     Easton               1
       New Hope             1
Name: name, dtype: int64

As expected, the majority of the breweries are in New York State (41 of the 50 returned), with the largest clusters in Brooklyn (19) and Manhattan (5, listed as New York).
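
Because the 100 km search radius also returns venues in Connecticut, New Jersey, Pennsylvania and upstate New York, a pilot limited to the city would need a filter. The sketch below uses the `state` and `distance` values of the ten rows shown in the cleaned table above. The 20 km cut-off and the `within` helper are assumptions chosen for illustration.

```python
# (name, state, distance in metres) copied from the cleaned table above
rows = [
    ("Big Alice Barrel Room", "NY", 44),
    ("The Bronx Brewery", "NY", 17998),
    ("Dockside Brewery", "CT", 96835),
    ("Twin Elephant Brewing Company", "NJ", 32480),
    ("Hudson Valley Brewery", "NY", 94003),
    ("Evil Twin Brewing NYC", "NY", 9695),
    ("TALEA Beer Co", "NY", 8404),
    ("Ship Bottom Brewery", "NJ", 122717),
    ("Kings County Brewers Collective", "NY", 8878),
    ("The Keg & Lantern Brewing Company", "NY", 2103),
]

MAX_DISTANCE_M = 20_000  # hypothetical cut-off for a city-only pilot


def within(records, max_distance_m, state="NY"):
    """Keep the names of venues in the given state within max_distance_m of the search point."""
    return [name for name, st, dist in records
            if st == state and dist <= max_distance_m]


nearby = within(rows, MAX_DISTANCE_M)
print(len(nearby), nearby)
```

On the full dataframe the same filter would be a boolean mask on the `state` and `distance` columns.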

### RESULTS

**VISUALIZING THESE LOCATIONS ON A NYC MAP**

In [12]:
latitude=40.65774729310345 
longitude=-74.00725812068966
map_nyc = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df['lat'], df['lng'], df['city']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_nyc)
map_nyc

### DISCUSSION

The Tri-borough area of Brooklyn, Manhattan and Queens (including Long Island City) is where the breweries are clustered.
A location that is able to cover this area presents a valuable opportunity to source 'THE DREG' from these breweries. 

Now that we have the dreg, what next?

As an extension to the same project, we can use the FourSquare API to get the locations of farms (urban farms), wastewater plants, fisheries and bakeries to explore our business venture.

Here are a few data science problems to solve:
    
    - Once we have decided which businesses to partner with and which breweries to source from, we can build a route optimizer for the sourcing and delivery processes
    - If we decide upon more than one business, we have to optimize how sourcing and selling are split between the different businesses
    - Design different marketing campaigns for the sourcing breweries and business customers
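
As a starting point for the route optimizer, a greedy nearest-neighbour heuristic gives a quick baseline pickup order. The sketch below is an assumption-laden illustration: it uses straight-line (haversine) distances rather than road distances, it does not guarantee the shortest route, and the depot location is hypothetical. The coordinates are copied from the cleaned table above.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(h))


def nearest_neighbour_route(start, stops):
    """Greedy route: from the current point, always go to the closest unvisited stop."""
    remaining = dict(stops)
    route, current = [], start
    while remaining:
        name = min(remaining, key=lambda n: haversine_m(current, remaining[n]))
        route.append(name)
        current = remaining.pop(name)
    return route


# Coordinates copied from the cleaned table above
stops = {
    "The Bronx Brewery": (40.801774, -73.910297),
    "Evil Twin Brewing NYC": (40.696225, -73.904225),
    "TALEA Beer Co": (40.718573, -73.948276),
    "Kings County Brewers Collective": (40.705974, -73.923487),
    "The Keg & Lantern Brewing Company": (40.67551, -74.015746),
}

# Hypothetical warehouse site: the coordinates of Big Alice Barrel Room from the table
depot = (40.657415, -74.006967)

route = nearest_neighbour_route(depot, stops)
print(route)
```

A production version would swap in road travel times and a proper vehicle-routing solver; the greedy order is only a baseline to compare against.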

### CONCLUSION

**Not all waste deserves to end up in a landfill. Useful waste, such as kitchen scraps and the by-products of industrial food production, often has other applications that need to be explored and scaled up. Doing so should help clean up the environment a little, stretch our resources to the fullest extent possible and contribute to building the community.**