# Yelp API - Gathering Data

## In this Notebook:
 - Restaurant information for Dallas, TX (within 25 miles of the city) was retrieved with the Yelp Fusion API and saved to a JSON file

### Using offset and limit parameters in Yelp API

Each API call returns at most 50 places. Combining the offset and limit parameters lets you page through the results, up to a total of 1000 places.

Example: with OFFSET = 50 and LIMIT = 50, you will receive results 51-100.
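
To make the paging arithmetic concrete, here is a minimal sketch that lists which results each offset covers. The helper name `page_windows` is hypothetical and not part of the Yelp API; the 1000-result cap and the limit of 50 are taken from the text above.

```python
def page_windows(total=1000, limit=50):
    """Yield (offset, first_result, last_result) per page; results are counted from 1."""
    for offset in range(0, total, limit):
        yield offset, offset + 1, min(offset + limit, total)

windows = list(page_windows())
print(windows[1])    # the page requested with offset=50
print(len(windows))  # number of API calls needed to reach the cap
```

With the defaults, the second page corresponds to results 51-100, matching the example above.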


### Steps taken to retrieve data

- Request a maximum of 1000 restaurants within 25 miles (40,000 m) of Dallas, Texas, 50 per call
- Remove duplicate entries
- Save the results to `restaurants.json`

## Testing the endpoint

In [10]:
import os
import requests

# Read credentials from the environment rather than hardcoding them in the notebook.
# The variable names YELP_API_KEY and YELP_CLIENT_ID are assumptions; use whatever names you export.
API_KEY = os.environ['YELP_API_KEY']
CLIENT_ID = os.environ['YELP_CLIENT_ID']

ENDPOINT = "https://api.yelp.com/v3/businesses/search"

HEADERS = {'Authorization': 'Bearer %s' % API_KEY}

PARAMETERS = {'term': 'restaurants',
              'offset': 0,
              'limit': 50,
              'radius': 40000,
              'location': 'Dallas, TX'}

response = requests.get(url=ENDPOINT, params=PARAMETERS, headers=HEADERS)
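
The test call above does not inspect what came back. A minimal sketch of one way to check it is shown below. The helper `summarize_search` is hypothetical, and it assumes that successful responses carry `businesses` and `total` fields and that error bodies carry an `error` object with a `description`. The sample payload is made up for illustration and is not real API output.

```python
def summarize_search(status_code, payload):
    """Return a one-line summary of a business-search response."""
    if status_code != 200:
        # Assumption: error bodies look like {'error': {'description': ...}}
        error = payload.get('error', {})
        return "Request failed ({}): {}".format(
            status_code, error.get('description', 'unknown error'))
    businesses = payload.get('businesses', [])
    return "{} businesses on this page, {} total matches".format(
        len(businesses), payload.get('total'))

# Stand-in payload for illustration only
sample = {'businesses': [{'name': 'A'}, {'name': 'B'}], 'total': 2}
print(summarize_search(200, sample))
```

With the live response from the cell above, the call would be `summarize_search(response.status_code, response.json())`.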

## Next, let's gather the data for all restaurants within a radius of 40,000 m (about 25 miles) of Dallas, TX

In [21]:
import json

In [18]:
import time

PARAMETERS = {'term': 'restaurants',
              'offset': 0, # start at 0
              'limit': 50, # maximum is 50
              'radius': 40000, # in m
              'location': 'Dallas, TX'}

restaurants_in_dallas = []


# Page through the results 50 at a time, up to the 1000-result cap
for offset_number in range(0, 1000, 50):
    PARAMETERS['offset'] = offset_number

    response = requests.get(url=ENDPOINT, params=PARAMETERS, headers=HEADERS)

    # Parse the body once; stop when a page has no 'businesses' (empty page or error response)
    businesses = response.json().get('businesses')
    if not businesses:
        break

    restaurants_in_dallas.extend(businesses)

    print("{}-{}".format(offset_number, offset_number+50))

    time.sleep(0.5)  # Pause between calls to avoid being rate-limited by the Yelp API

0-50
50-100
100-150
150-200
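
The paging loop can also be separated from the HTTP call so the stopping logic can be tried without hitting the API. The sketch below assumes a hypothetical `fetch_page(offset, limit)` callable that returns a list of businesses; `collect_pages`, `fake_fetch`, and the fake data are illustrative names and values, not part of the notebook or of Yelp's API.

```python
import time

def collect_pages(fetch_page, max_results=1000, limit=50, pause=0.0):
    """Call fetch_page(offset, limit) until a page is empty or the cap is reached."""
    collected = []
    for offset in range(0, max_results, limit):
        businesses = fetch_page(offset, limit)
        if not businesses:
            break  # no more results available
        collected.extend(businesses)
        time.sleep(pause)  # optional pause between calls
    return collected

# Hypothetical stand-in for the API: 120 fake businesses served in pages
FAKE = [{'id': str(i)} for i in range(120)]

def fake_fetch(offset, limit):
    return FAKE[offset:offset + limit]

results = collect_pages(fake_fetch)
print(len(results))
```

Against the real API, `fetch_page` would wrap the `requests.get` call and return `response.json().get('businesses', [])`.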


In [25]:
restaurants_in_dallas

[{'id': '2jxPJngbOs5YU7keUJ1LyA',
  'alias': 'hawaiian-bros-island-grill-dallas',
  'name': 'Hawaiian Bros Island Grill',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/IEiM4a4EcWxY193rvpotyQ/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/hawaiian-bros-island-grill-dallas?adjust_creative=zC-Jrqxt7h6piT-XMntPwg&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=zC-Jrqxt7h6piT-XMntPwg',
  'review_count': 15,
  'categories': [{'alias': 'hawaiian', 'title': 'Hawaiian'},
   {'alias': 'hotdogs', 'title': 'Fast Food'}],
  'rating': 4.5,
  'coordinates': {'latitude': 32.8586594, 'longitude': -96.76862},
  'transactions': ['delivery', 'pickup'],
  'location': {'address1': '6011 Greenville Ave',
   'address2': None,
   'address3': '',
   'city': 'Dallas',
   'zip_code': '75206',
   'country': 'US',
   'state': 'TX',
   'display_address': ['6011 Greenville Ave', 'Dallas, TX 75206']},
  'phone': '+12142068646',
  'display_phone': '(214) 206-8646',
  'd

In [20]:
# This number includes duplicates
print(len(restaurants_in_dallas))

200


In [29]:
# Remove duplicate entries (an entry is dropped if an identical one appears later in the list)
res_list = [i for n, i in enumerate(restaurants_in_dallas) if i not in restaurants_in_dallas[n + 1:]]

In [28]:
len(res_list)

200
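
The list comprehension above compares whole dictionaries and rescans the rest of the list for every entry. An alternative sketch is to deduplicate on each business's `id` field using a set. Note that this is a different technique from the cell above: it keeps the first occurrence rather than the last, and it treats two records with the same `id` as duplicates even if other fields differ. The helper name `dedupe_by_id` and the sample records are made up for illustration.

```python
def dedupe_by_id(businesses):
    """Keep the first occurrence of each business 'id', preserving order."""
    seen = set()
    unique = []
    for business in businesses:
        if business['id'] not in seen:
            seen.add(business['id'])
            unique.append(business)
    return unique

# Stand-in records for illustration only
sample = [{'id': 'a', 'name': 'X'}, {'id': 'b', 'name': 'Y'}, {'id': 'a', 'name': 'X'}]
print(len(dedupe_by_id(sample)))
```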

In [30]:
newlist = sorted(res_list, key=lambda k: k['name']) 

In [31]:
len(newlist)

200

In [36]:
# Write the restaurant list to disk; the with-block closes the file automatically
with open("restaurants.json", "w") as restaurants_file:
    json.dump(res_list, restaurants_file, indent=6)
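
As a quick sanity check, the saved file can be read back and compared with the list in memory. The sketch below uses stand-in records and a temporary directory so it does not overwrite `restaurants.json`; the variable names are illustrative.

```python
import json
import os
import tempfile

# Stand-in data for illustration only
records = [{'id': 'a', 'name': 'X'}, {'id': 'b', 'name': 'Y'}]

path = os.path.join(tempfile.mkdtemp(), 'restaurants.json')
with open(path, 'w') as f:
    json.dump(records, f, indent=6)

# Load the file again and confirm nothing was lost in the round trip
with open(path) as f:
    loaded = json.load(f)

print(loaded == records)
```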