# Getting Your Data From Yelp!

To make sure you are on track to complete the project, you will complete this workbook first. Below are the steps you need to take to make sure you have your data from Yelp and are ready to analyze it. Your cohort lead will review this workbook with you the Wednesday before your project is due.

## Part 1 - Understanding your data and question

You will be pulling data from the Yelp API to complete your analysis. The API, however, provides a lot of information that will not be pertinent to your analysis. You will pull data from the API and parse it to keep only the data you need. To help you identify that information, look at the API documentation and understand what data the API will provide.

Identify which data fields you will want to keep for your analysis. 

https://www.yelp.com/developers/documentation/v3/get_started

In [1]:
import json
import requests
import data
import sys
import pandas as pd
import csv


In [2]:

print(sys.path)

['/Users/leratsayukova/Documents/Flatiron/Bikes', '/Users/leratsayukova/opt/anaconda3/lib/python38.zip', '/Users/leratsayukova/opt/anaconda3/lib/python3.8', '/Users/leratsayukova/opt/anaconda3/lib/python3.8/lib-dynload', '', '/Users/leratsayukova/opt/anaconda3/lib/python3.8/site-packages', '/Users/leratsayukova/opt/anaconda3/lib/python3.8/site-packages/aeosa', '/Users/leratsayukova/opt/anaconda3/lib/python3.8/site-packages/IPython/extensions', '/Users/leratsayukova/.ipython']


In [3]:
url= 'https://api.yelp.com/v3/businesses/search'

In [4]:
import os
# Credentials come from the environment (variable names are placeholders), not the notebook.
client_id = os.environ['YELP_CLIENT_ID']
api_key = os.environ['YELP_API_KEY']




___

## Part 2 - Create ETL pipeline for the business data from the API

Now that you know what data you need from the API, you want to write code that will execute an API call, parse the results, and then insert the results into the DB.

It is helpful to break this up into three different functions (*API call, parse results, and insert into DB*); you can then write a function/script that pulls the three functions together.

Let's first do this for the Business endpoint.

- Write a function to make a call to the Yelp API

In [5]:
headers = {'Authorization':'Bearer {}'.format(api_key),}
 

In [6]:

term= 'Bike Shop'
location='Austin'
categories='Bikes'

In [7]:
url_params = {
                "term": term.replace(' ', '+'),
                "location": location.replace(' ', '+'),
                "categories" : categories,
                "limit": 50,
    
            }

In [8]:
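def yelp_call(url_params, api_key):
    # build the auth header from the key passed in rather than relying on a global
    headers = {'Authorization': 'Bearer {}'.format(api_key)}
    response = requests.get(url, headers=headers, params=url_params)

    if response.status_code == 200:
        return response.json()
    else:
        # callers must check for this: a failed call returns the integer status code
        return response.status_code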
    


In [9]:
response= yelp_call(url_params, api_key)
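
`yelp_call` returns a parsed dict on success and a bare integer status code otherwise, so indexing the result with `['businesses']` raises a `TypeError` whenever a request fails. A minimal sketch of a guard is shown below. The helper name `businesses_or_raise` is an assumption and not part of the workbook.

```python
def businesses_or_raise(result):
    """Return the 'businesses' list, or raise if the call did not succeed.

    `result` is whatever yelp_call returned: a dict on HTTP 200,
    otherwise an integer status code.
    """
    if not isinstance(result, dict):
        raise RuntimeError("Yelp request failed with status {}".format(result))
    # Fall back to an empty list if the 'businesses' key is missing.
    return result.get('businesses', [])


# Usage with a stand-in value (no network access needed):
ok = {'businesses': [{'id': 'abc'}], 'total': 1}
print(businesses_or_raise(ok))
```

Raising early keeps a failed request from surfacing later as a confusing indexing error.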

## can we make a loop for the function to call itself ?

In [10]:
business_data= response['businesses']

In [11]:
response['total']

140

In [12]:
business_data[0]

{'id': 'WT_d47o-V5xlMNx8trI0-A',
 'alias': 'monkey-wrench-bicycles-austin',
 'name': 'Monkey Wrench Bicycles',
 'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/a4Tl6tvBcmXsdAM5Paq7FA/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/monkey-wrench-bicycles-austin?adjust_creative=UPd8KVfQybexrmKSjNF-mA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=UPd8KVfQybexrmKSjNF-mA',
 'review_count': 94,
 'categories': [{'alias': 'bikes', 'title': 'Bikes'},
  {'alias': 'bike_repair_maintenance', 'title': 'Bike Repair/Maintenance'}],
 'rating': 5.0,
 'coordinates': {'latitude': 30.3224761537188, 'longitude': -97.7254818941176},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '5555 N Lamar',
  'address2': 'Ste L131',
  'address3': '',
  'city': 'Austin',
  'zip_code': '78751',
  'country': 'US',
  'state': 'TX',
  'display_address': ['5555 N Lamar', 'Ste L131', 'Austin, TX 78751']},
 'phone': '+15129682509',
 'display_phone': '(512) 968-2509',


In [13]:
# business_data[0].keys()

**What data do we want from each business?**

In [14]:
# business_data[0]['name']
# business_data[0]['rating']
# business_data[0]['review_count']
# business_data[0]['location']['zip_code']


    

- Write a function to parse the API response so that you can easily insert the data into the DB

In [15]:
def parse_results(results):
    parsed_results =[]
    for biz in results:
        biz_info= ( biz['id'],
                   biz['name'],
                   biz['rating'],
                   biz['review_count'],
                   biz['location']['zip_code'])
        parsed_results.append(biz_info)
    
    return parsed_results
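
Part 4 asks about price labels, and not every business record carries a `price` key: the first record shown above has `'price': '$$'`, while the motorcycle dealer record printed later in this notebook has none. A possible variant of the parser, sketched under the assumption that a missing label should be stored as `None` (the function name `parse_results_with_price` is hypothetical):

```python
def parse_results_with_price(results):
    """Like parse_results, but also keeps the optional 'price' label."""
    parsed = []
    for biz in results:
        parsed.append((
            biz['id'],
            biz['name'],
            biz['rating'],
            biz['review_count'],
            biz['location']['zip_code'],
            biz.get('price'),  # None when the record has no price label
        ))
    return parsed


# Two trimmed-down records taken from the outputs in this notebook.
sample = [
    {'id': 'WT_d47o-V5xlMNx8trI0-A', 'name': 'Monkey Wrench Bicycles',
     'rating': 5.0, 'review_count': 94, 'price': '$$',
     'location': {'zip_code': '78751'}},
    {'id': 'ybprq0tFMCCF3yDNqmzGTw', 'name': 'DC-Choppers Motorcycle Supply',
     'rating': 5.0, 'review_count': 1,
     'location': {'zip_code': '78669'}},
]
print(parse_results_with_price(sample))
```

Using `biz['price']` directly would raise a `KeyError` on the second record, which is why `dict.get` is used here.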



In [16]:
# parse_results(business_data)

In [17]:
parsed_results= parse_results(business_data)
parsed_results

[('WT_d47o-V5xlMNx8trI0-A', 'Monkey Wrench Bicycles', 5.0, 94, '78751'),
 ('wfKxBxJ8RFZj8jOB6Lpn-Q', 'Trek Bicycle Lamar', 4.5, 280, '78704'),
 ('oTujfSf88bPOcUWEmDBzjQ', 'The Peddler Bike Shop', 4.5, 176, '78751'),
 ('KVPUN4yU-2juc8Pc4sxKVQ', 'Clown Dog Bikes', 5.0, 134, '78705'),
 ('-Cza7JtBZZ7nuXZeStwPAA', 'Bike Farm', 4.5, 104, '78756'),
 ('Ved9jiedoOFf39iFHSyQbQ', 'ATX Bikes', 4.5, 72, '78749'),
 ('ZtuzXaoMnY1gd0kMZFdpcw', "Mellow Johnny's Bike Shop", 4.0, 205, '78701'),
 ('4EdSNL5cShH-ZNsUbwWSJQ', 'Trek Bicycle Research', 4.5, 93, '78759'),
 ('2lIEXCMqbUaYJ98_cwAd2A', 'East Side Pedal Pushers', 4.5, 128, '78702'),
 ('-4SfHHiTVTLeOEt8TF0nTQ', 'Bikealot', 4.5, 30, '78745'),
 ('gWhCMZVm0ITZC7KxI3M2Pw', 'Texas Cycle Werks', 4.5, 37, '78735'),
 ('lXodVpk5ZUOVymDBlb10Zg', 'Trek Bicycle Guadalupe', 4.5, 10, '78705'),
 ('xxFBa5ZuMb0S92wxYeAOcQ', 'Trek Bicycle Bee Cave', 4.0, 13, '78733'),
 ('JN_AiBjGmF4dDdJIYKT7SA', 'Cycle Progression', 4.5, 31, '78751'),
 ('18UGjpTexL3nRJ4UrMSM7w', 'Tre

- Write a function to take your parsed data and add it to the csv file where you will store all of your results. 

In [18]:
# pr_df= pd.DataFrame(parsed_results, columns= 
#              ['name', 'rating', 'review_count', 'zipcode'])

In [19]:
# pr_csv=pr_df.to_csv(path_or_buf='/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_data')

In [20]:
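import os

def df_save(csv_filepath, parsed_results):
    pr_df = pd.DataFrame(parsed_results, columns=
             ['id', 'name', 'rating', 'review_count', 'zipcode'])

    # header must be a real boolean: the string "False" is truthy, so the column
    # names were re-written on every append. Write them only for a new file.
    write_header = not os.path.exists(csv_filepath)
    pr_df.to_csv(path_or_buf=csv_filepath, mode="a", header=write_header, index=False)

    print("Results added!")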
   

In [21]:
# df_save('/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_data', parsed_results)

- Write a script that combines the three functions above into a single process.

While it will take some experimentation to write the functions above, once you get them working it will be best to put them in a `.py` file and then import the functions to use in a script 
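
One possible layout for such a file is sketched below. The module name `helpers.py` matches the import in the next cell, but the `run_pipeline` function and its arguments are hypothetical. The three steps are passed in as callables so the flow can be tried with stand-ins before touching the API.

```python
# helpers.py (hypothetical layout)

def run_pipeline(fetch, parse, save):
    """Run one extract-transform-load pass.

    fetch: callable returning the raw API payload (a dict)
    parse: callable turning a list of businesses into a list of tuples
    save:  callable that persists the parsed rows
    Returns the number of rows saved.
    """
    payload = fetch()
    rows = parse(payload['businesses'])
    save(rows)
    return len(rows)


if __name__ == "__main__":
    # Stand-ins so the module can be run directly without network access.
    stored = []
    fake_fetch = lambda: {'businesses': [{'id': 'a', 'name': 'Shop A'}]}
    fake_parse = lambda bizs: [(b['id'], b['name']) for b in bizs]
    count = run_pipeline(fake_fetch, fake_parse, stored.extend)
    print(count, stored)
```

An import such as `from helpers import run_pipeline` only succeeds when `helpers.py` sits in one of the directories listed in `sys.path`, which is worth checking if the import fails.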

In [22]:
from helpers import *

**^^ Not working**

In [23]:
def yelp_data(url_params, api_key):
    csv_filepath = '/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_data'
    # keep each step's return value and pass it on, instead of relying on notebook globals
    results = yelp_call(url_params, api_key)
    parsed = parse_results(results['businesses'])
    df_save(csv_filepath, parsed)
    our_data = pd.read_csv(csv_filepath)
    return our_data

In [24]:
# yelp_data(url_params,api_key)

In [25]:


# create a variable  to keep track of which result you are in. 
cur = 0
num= response['total']

#set up a while loop to go through and grab the result 
while cur < num and cur < 1000:
    #set the offset parameter to be where you currently are in the results 
    url_params['offset'] = cur
    #make your API call with the new offset number
    results =  yelp_call(url_params, api_key)
    
    #after you get your results you can now use your function to parse those results
    new_parsed_results = parse_results(results['businesses'])
    
    # use your function to append your parsed results to the csv file
    df_save('/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_data', new_parsed_results)
    
    
    #increment the counter by 50 to move on to the next results
    cur += 50

Results added!
Results added!
Results added!


In [26]:
results

{'businesses': [{'id': 'ybprq0tFMCCF3yDNqmzGTw',
   'alias': 'dc-choppers-motorcycle-supply-spicewood',
   'name': 'DC-Choppers Motorcycle Supply',
   'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/ROEIymUUaNO6XQRSGnNZxw/o.jpg',
   'is_closed': False,
   'url': 'https://www.yelp.com/biz/dc-choppers-motorcycle-supply-spicewood?adjust_creative=UPd8KVfQybexrmKSjNF-mA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=UPd8KVfQybexrmKSjNF-mA',
   'review_count': 1,
   'categories': [{'alias': 'motorcycledealers',
     'title': 'Motorcycle Dealers'}],
   'rating': 5.0,
   'coordinates': {'latitude': 30.4086, 'longitude': -98.055714635582},
   'transactions': [],
   'location': {'address1': '9408 A Hwy 71',
    'address2': None,
    'address3': '',
    'city': 'Spicewood',
    'zip_code': '78669',
    'country': 'US',
    'state': 'TX',
    'display_address': ['9408 A Hwy 71', 'Spicewood, TX 78669']},
   'phone': '+15122005284',
   'display_phone': '(512) 200-5284',



In [27]:
Bikes_csv= pd.read_csv('/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_data')
pd.set_option('display.max_rows', None, 'display.max_columns', None)
Bikes_csv

Unnamed: 0.1,Unnamed: 0,id,name,rating,review_count,zipcode
0,0.0,WT_d47o-V5xlMNx8trI0-A,Monkey Wrench Bicycles,5.0,94,78751
1,1.0,wfKxBxJ8RFZj8jOB6Lpn-Q,Trek Bicycle Lamar,4.5,280,78704
2,2.0,oTujfSf88bPOcUWEmDBzjQ,The Peddler Bike Shop,4.5,176,78751
3,3.0,KVPUN4yU-2juc8Pc4sxKVQ,Clown Dog Bikes,5.0,134,78705
4,4.0,-Cza7JtBZZ7nuXZeStwPAA,Bike Farm,4.5,104,78756
5,5.0,Ved9jiedoOFf39iFHSyQbQ,ATX Bikes,4.5,72,78749
6,6.0,ZtuzXaoMnY1gd0kMZFdpcw,Mellow Johnny's Bike Shop,4.0,205,78701
7,7.0,4EdSNL5cShH-ZNsUbwWSJQ,Trek Bicycle Research,4.5,93,78759
8,8.0,2lIEXCMqbUaYJ98_cwAd2A,East Side Pedal Pushers,4.5,128,78702
9,9.0,gWhCMZVm0ITZC7KxI3M2Pw,Texas Cycle Werks,4.5,37,78735


In [28]:
biz_ids=Bikes_csv.loc[:,'id'].to_list()
biz_ids_lst=list(set(biz_ids))
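
`set` removes duplicates but does not keep the ids in CSV order, so the position of any given id in `biz_ids_lst` can change between runs. That matters for index-based operations such as `pop(13)` further down. If a stable order is wanted, a minimal sketch (the helper name `unique_in_order` is an assumption):

```python
def unique_in_order(ids):
    """Drop duplicate ids while keeping first-seen order."""
    # dict keys are unique and, since Python 3.7, keep insertion order.
    return list(dict.fromkeys(ids))


ids = ['WT_d47o-V5xlMNx8trI0-A', 'KVPUN4yU-2juc8Pc4sxKVQ', 'WT_d47o-V5xlMNx8trI0-A']
print(unique_in_order(ids))
```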


In [29]:
biz_ids_lst

['yuNrhGEVy_5sxu_MOooI1Q',
 'bt6AjtQvgzJDQ7KTG_PgwQ',
 'I3hl1maFk4RDTrAetttUEg',
 'rDEpN-TQGeMnkfm1mDU8ZQ',
 'gWhCMZVm0ITZC7KxI3M2Pw',
 'EXVlG72y8UwvZtr-l9J2aQ',
 'KXVkCQ_u-8MfoA3iCAivhQ',
 '-k_6pPEW11RbveUExNT9WQ',
 'lPD6aRJwqxtgxFwa2SXuHQ',
 'Y5RaFaBi8rsL2NAkne17gQ',
 'npWp_SATDZ0q0TADjDs58w',
 'jUcwnPiSENIBqt7Yw87_Zw',
 '18UGjpTexL3nRJ4UrMSM7w',
 'lXodVpk5ZUOVymDBlb10Zg',
 'JN_AiBjGmF4dDdJIYKT7SA',
 'qb81dk8ByC_LlNqMNkRhJQ',
 'EwnnAPpLbN7EI1026nthxQ',
 'HliPtTKSvqoM8TmbT3aB7w',
 '7s-l-xIWiYy6AxwzxWvn-Q',
 'XqtWbuVX3RHCizKmIEYDow',
 'nqF6cR_WUjl1Y-k66Fgdow',
 '-Ijgv6vbojWToDadfgTrDw',
 '-DeoafeWHFwpMlfCYCtLXw',
 'go_CpQY6UGh5_gSPN2E93Q',
 'Xe2dyC5eNT62LJ4wBmzdBw',
 'MKHYQFle8uE7q8qcjpjqcg',
 'Wm2Dtg0_ceoTpVrHfrhNUw',
 'TgXHDpideTpqUwb3_s_28Q',
 'dZ8Qv1Xz2ZHIqhg0X5sSRQ',
 'lruhpI6HUVtgnSme-IPvaQ',
 'UnxujXYrUYgRScXarDSbNQ',
 '9e3kryPr-Eyok1hrVWsTSA',
 'KVPUN4yU-2juc8Pc4sxKVQ',
 '2CdRYaeOjM7V4mzg2HKVMQ',
 '4sNCemcKk5IvqkK0P-SyAA',
 'ZZYRwgJuOEYpqA80HwtwRQ',
 '4SLk7-r4gaRFUOd3aaVPeQ',
 

In [30]:
biz_id_1=biz_ids[0]
biz_id_1

'WT_d47o-V5xlMNx8trI0-A'

___

## Part 3 -  Create ETL pipeline for the restaurant review data from the API

You've done this for the businesses; now you need to do it for the reviews. You will follow the same process, but your functions will be specific to reviews. Above you have a model of the functions you will need to write and how to pull them together in one script. For this part, follow the process below:

- In order to pull the reviews, you will need the business ids. So your first step will be to get all of the business ids from your businesses csv. 

- Write a function that takes a business id and makes a call to the API for reviews


- Write a function to parse out the relevant information from the reviews

- Write a function to save the parsed data into a csv file containing all of the reviews.

- Combine the functions above into a single script  
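
Part 4 asks for the most recent review of a business, so it may help to keep the `time_created` field that each review record carries. The sketch below is one way to do that. The function names and the two sample reviews are made-up stand-ins; only the timestamp format is taken from the review output in this notebook.

```python
from datetime import datetime


def parse_reviews_with_time(bid, reviews):
    """Return (business_id, review_id, text, rating, time_created) tuples."""
    return [
        (bid, r['id'], r['text'], r['rating'], r['time_created'])
        for r in reviews
    ]


def most_recent(parsed):
    """Pick the newest review; timestamps look like '2018-05-24 10:26:05'."""
    return max(parsed, key=lambda row: datetime.strptime(row[4], '%Y-%m-%d %H:%M:%S'))


# Made-up reviews for illustration only.
sample = [
    {'id': 'r1', 'text': 'older review', 'rating': 4,
     'time_created': '2018-05-24 10:26:05'},
    {'id': 'r2', 'text': 'newer review', 'rating': 5,
     'time_created': '2019-01-02 08:00:00'},
]
rows = parse_reviews_with_time('some-business-id', sample)
print(most_recent(rows)[2])
```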

In [31]:
# url_params_reviews = {
#                 "term": term.replace(' ', '+'),
#                 "location": location.replace(' ', '+'),
#                 "categories" : categories,
#                 "limit": 50,
    
#             }

In [32]:
biz_ids_lst.pop(13)

'lXodVpk5ZUOVymDBlb10Zg'

In [33]:
biz_ids_lst

['yuNrhGEVy_5sxu_MOooI1Q',
 'bt6AjtQvgzJDQ7KTG_PgwQ',
 'I3hl1maFk4RDTrAetttUEg',
 'rDEpN-TQGeMnkfm1mDU8ZQ',
 'gWhCMZVm0ITZC7KxI3M2Pw',
 'EXVlG72y8UwvZtr-l9J2aQ',
 'KXVkCQ_u-8MfoA3iCAivhQ',
 '-k_6pPEW11RbveUExNT9WQ',
 'lPD6aRJwqxtgxFwa2SXuHQ',
 'Y5RaFaBi8rsL2NAkne17gQ',
 'npWp_SATDZ0q0TADjDs58w',
 'jUcwnPiSENIBqt7Yw87_Zw',
 '18UGjpTexL3nRJ4UrMSM7w',
 'JN_AiBjGmF4dDdJIYKT7SA',
 'qb81dk8ByC_LlNqMNkRhJQ',
 'EwnnAPpLbN7EI1026nthxQ',
 'HliPtTKSvqoM8TmbT3aB7w',
 '7s-l-xIWiYy6AxwzxWvn-Q',
 'XqtWbuVX3RHCizKmIEYDow',
 'nqF6cR_WUjl1Y-k66Fgdow',
 '-Ijgv6vbojWToDadfgTrDw',
 '-DeoafeWHFwpMlfCYCtLXw',
 'go_CpQY6UGh5_gSPN2E93Q',
 'Xe2dyC5eNT62LJ4wBmzdBw',
 'MKHYQFle8uE7q8qcjpjqcg',
 'Wm2Dtg0_ceoTpVrHfrhNUw',
 'TgXHDpideTpqUwb3_s_28Q',
 'dZ8Qv1Xz2ZHIqhg0X5sSRQ',
 'lruhpI6HUVtgnSme-IPvaQ',
 'UnxujXYrUYgRScXarDSbNQ',
 '9e3kryPr-Eyok1hrVWsTSA',
 'KVPUN4yU-2juc8Pc4sxKVQ',
 '2CdRYaeOjM7V4mzg2HKVMQ',
 '4sNCemcKk5IvqkK0P-SyAA',
 'ZZYRwgJuOEYpqA80HwtwRQ',
 '4SLk7-r4gaRFUOd3aaVPeQ',
 'yL3bIVFc6t5tW66MRbdU7A',
 

In [34]:
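def yelp_call_reviews(url_reviews, api_key):
    # the parameter name now matches the name used in the body, so the function
    # requests the URL it is given instead of the global url_reviews
    headers = {'Authorization': 'Bearer {}'.format(api_key)}
    response = requests.get(url_reviews, headers=headers)
    if response.status_code == 200:
        return response.json()
    else:
        return response.status_code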


In [77]:
for bid in biz_ids_lst:
    url_reviews='https://api.yelp.com/v3/businesses/' + bid + '/reviews'
    url_data = yelp_call_reviews(url_reviews, api_key)

In [78]:
url_data = yelp_call_reviews(url_reviews, api_key)

In [117]:
reviews

[{'id': 'SxeQl8DzOfbc4IQyqbCACQ',
  'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=SxeQl8DzOfbc4IQyqbCACQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrmKSjNF-mA',
  'text': 'It\'s funny how most "Sporting Goods" stores these days just sell clothing.\n\nAcademy doesn\'t have a very good selection of tennis gear, but I guess that\'s...',
  'rating': 4,
  'time_created': '2018-05-24 10:26:05',
  'user': {'id': 'UTXobrN3nD6tyqynVVCjqg',
   'profile_url': 'https://www.yelp.com/user_details?userid=UTXobrN3nD6tyqynVVCjqg',
   'image_url': 'https://s3-media3.fl.yelpcdn.com/photo/S31zNfn4hTpAXJzyUmDuvA/o.jpg',
   'name': 'Glenn R.'}},
 {'id': '1F8c32t_C4ae4cCWcBLUJw',
  'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=1F8c32t_C4ae4cCWcBLUJw&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrm

In [86]:
url_data['reviews']

[{'id': 'SxeQl8DzOfbc4IQyqbCACQ',
  'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=SxeQl8DzOfbc4IQyqbCACQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrmKSjNF-mA',
  'text': 'It\'s funny how most "Sporting Goods" stores these days just sell clothing.\n\nAcademy doesn\'t have a very good selection of tennis gear, but I guess that\'s...',
  'rating': 4,
  'time_created': '2018-05-24 10:26:05',
  'user': {'id': 'UTXobrN3nD6tyqynVVCjqg',
   'profile_url': 'https://www.yelp.com/user_details?userid=UTXobrN3nD6tyqynVVCjqg',
   'image_url': 'https://s3-media3.fl.yelpcdn.com/photo/S31zNfn4hTpAXJzyUmDuvA/o.jpg',
   'name': 'Glenn R.'}},
 {'id': '1F8c32t_C4ae4cCWcBLUJw',
  'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=1F8c32t_C4ae4cCWcBLUJw&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrm

In [40]:
review_data = yelp_call_reviews(url_reviews, api_key)  # pass the reviews URL, not the previous response
reviews= review_data['reviews']


In [41]:
for review in reviews:
    print(review)

{'id': 'SxeQl8DzOfbc4IQyqbCACQ', 'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=SxeQl8DzOfbc4IQyqbCACQ&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrmKSjNF-mA', 'text': 'It\'s funny how most "Sporting Goods" stores these days just sell clothing.\n\nAcademy doesn\'t have a very good selection of tennis gear, but I guess that\'s...', 'rating': 4, 'time_created': '2018-05-24 10:26:05', 'user': {'id': 'UTXobrN3nD6tyqynVVCjqg', 'profile_url': 'https://www.yelp.com/user_details?userid=UTXobrN3nD6tyqynVVCjqg', 'image_url': 'https://s3-media3.fl.yelpcdn.com/photo/S31zNfn4hTpAXJzyUmDuvA/o.jpg', 'name': 'Glenn R.'}}
{'id': '1F8c32t_C4ae4cCWcBLUJw', 'url': 'https://www.yelp.com/biz/academy-sports-outdoors-sunset-valley?adjust_creative=UPd8KVfQybexrmKSjNF-mA&hrid=1F8c32t_C4ae4cCWcBLUJw&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_reviews&utm_source=UPd8KVfQybexrmKSjNF-mA', 'text': 'I ha


In [42]:
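def yelp_reviews(biz_ids, api_key):
    csv_filepath = '/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_reviews'
    # call, parse, and save one business at a time
    for bid in biz_ids:
        url_reviews = 'https://api.yelp.com/v3/businesses/' + bid + '/reviews'
        review_data = yelp_call_reviews(url_reviews, api_key)
        parsed = parse_reviews(bid, review_data['reviews'])
        df_save_reviews(csv_filepath, parsed)
    our_data = pd.read_csv(csv_filepath, names=['business_id', 'review_id', 'text', 'rating'])
    return our_data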

In [126]:
def parse_reviews(bid, reviews):
    parsed_reviews =[]
    for review in reviews:
        review_info= ( bid,
                    review['id'],
                    review['text'],
                    review['rating'] 
                           )
        parsed_reviews.append(review_info)
        
    return parsed_reviews


In [134]:
def df_save_reviews(csv_filepath, parsed_reviews):
    
    
    pr_df= pd.DataFrame(parsed_reviews, columns= 
             ['business_id','id','text', 'rating'])
    
    pr_csv=pr_df.to_csv(path_or_buf=csv_filepath, mode="a", header=False)
   
    return print("Results added!")
            

In [137]:
for bid in biz_ids_lst[:5]:
    url_reviews = 'https://api.yelp.com/v3/businesses/' + bid + '/reviews'
    # make a single API call per business, passing the reviews URL (not the bare id)
    new_reviews = yelp_call_reviews(url_reviews, api_key)
    # after you get your results you can now use your function to parse those results
    new_parsed_reviews = parse_reviews(bid, new_reviews['reviews'])
    # use your function to append your parsed results to the csv file
    df_save_reviews('/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_reviews', new_parsed_reviews)
   


Results added!
Results added!
Results added!
Results added!
Results added!



In [66]:
parsed_reviews= parse_reviews(reviews)

In [67]:
set(parsed_reviews)

set()

In [144]:
pd.read_csv('/Users/leratsayukova/Documents/Flatiron/Bikes/data/csv_reviews', 
            names=['business_id', 'review_id', 'text', 'rating']).reset_index(drop=True)

Unnamed: 0,business_id,review_id,text,rating
0,yuNrhGEVy_5sxu_MOooI1Q,BqZcJHMW2_I6Vnu6Cke_SA,Dude! These guys are amazing! Great pricing an...,5
1,yuNrhGEVy_5sxu_MOooI1Q,Akb-CaOXueiaN0o2FiqgLg,"The shop is never open, the work never done co...",1
2,yuNrhGEVy_5sxu_MOooI1Q,F8uc4_dKupu94oyortyuoA,Took my bike in to get a jet kit put in after ...,5
3,bt6AjtQvgzJDQ7KTG_PgwQ,0KOtl-DRffz-tTVxPHfeMA,I went to the Whole Earth on South Lamar regul...,5
4,bt6AjtQvgzJDQ7KTG_PgwQ,-4wyj3KTQLfJkgv_p8FAeA,Was in need of some hiking boots. Was greeting...,5
5,bt6AjtQvgzJDQ7KTG_PgwQ,rAtQrSZiErKSJBux147tRg,I had very bad experience in this store . I wa...,1
6,I3hl1maFk4RDTrAetttUEg,oRQ1JwyFgSqncHFsQ6y21Q,"The team is great, they are truly in it to tak...",5
7,I3hl1maFk4RDTrAetttUEg,9mWXJt8vMK3kPIW6vVVSLA,I have never had such a great experience from ...,5
8,I3hl1maFk4RDTrAetttUEg,fSjdIbskK1JCQYUiERLmVw,went there on a couple of friends recommendati...,5
9,rDEpN-TQGeMnkfm1mDU8ZQ,RyomZ8B4LQb0SZ3EhpbSxQ,Thank you Platinum Motorcycle for such an amaz...,5


## Part 4 - Using Python and pandas, write code to answer the questions below.


- Which are the 5 most reviewed businesses in your dataset?
- What is the highest rating received in your dataset, and how many businesses have that rating?
- What percentage of businesses have a rating greater than or equal to 4.5?
- What percentage of businesses have a rating less than 3?
- What percentage of your businesses have a price label of one dollar sign? Two dollar signs? Three dollar signs? No dollar signs?
- Return the text of the reviews for the most reviewed business.
- Find the highest rated business and return the text of its most recent review. If multiple businesses have the same rating, select the business with the most reviews.
- Find the lowest rated business and return the text of its most recent review. If multiple businesses have the same rating, select the business with the fewest reviews.
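
A minimal sketch for the first three questions, using a handful of rows copied from the parsed business output earlier in this workbook. The column names follow `df_save`. Your own dataset will be larger, so the answers here are only illustrative.

```python
import pandas as pd

# A few rows copied from the parsed business output earlier in this workbook.
df = pd.DataFrame(
    [
        ('Monkey Wrench Bicycles', 5.0, 94),
        ('Trek Bicycle Lamar', 4.5, 280),
        ('The Peddler Bike Shop', 4.5, 176),
        ('Clown Dog Bikes', 5.0, 134),
        ('Bike Farm', 4.5, 104),
        ('ATX Bikes', 4.5, 72),
        ("Mellow Johnny's Bike Shop", 4.0, 205),
    ],
    columns=['name', 'rating', 'review_count'],
)

# Five most reviewed businesses.
top5 = df.nlargest(5, 'review_count')['name'].tolist()

# Highest rating and how many businesses share it.
best = df['rating'].max()
n_best = int((df['rating'] == best).sum())

# Percentage of businesses rated 4.5 or higher.
pct_high = (df['rating'] >= 4.5).mean() * 100

print(top5)
print(best, n_best)
print(round(pct_high, 1))
```

Taking the mean of a boolean Series gives the fraction of `True` values, which is a compact way to compute the percentage questions.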


___

# Reference help

###  Pagination

Returning to the Yelp API, the [documentation](https://www.yelp.com/developers/documentation/v3/business_search) also provides details regarding the API limits. These often include the number of requests a user is allowed to make within a specified time limit and the maximum number of results returned. In this case, each request returns at most 50 results and defaults to 20. Furthermore, any search is limited to a total of 1000 results. To retrieve all 1000 of these results, we would have to page through the results piece by piece, retrieving 50 at a time. Processes such as these are often referred to as pagination.

Now that you have an initial response, you can examine the contents of the JSON container. For example, you might start with ```response.json().keys()```. Here, you'll see a key for `'total'`, which tells you the full number of matching results given your query parameters. Write a loop (or ideally a function) which then makes successive API calls using the offset parameter to retrieve all of the results (or the first 1000 for a particularly large result set) for the original query. As you do this, be mindful of how you store the data.
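
The offsets such a loop needs can be worked out up front. A small sketch, assuming a page size of 50 and the 1000-result ceiling described above (the helper name `page_offsets` is hypothetical):

```python
def page_offsets(total, limit=50, cap=1000):
    """Return the offsets of the pages needed to cover `total` results.

    `cap` reflects the 1000-result ceiling on any single search.
    """
    return range(0, min(total, cap), limit)


# With the 140 results reported earlier, three pages are needed.
print(list(page_offsets(140)))
```

Each offset can then be assigned to `url_params['offset']` before the next call.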

**Note: be mindful of the API rate limits. You can only make 5000 requests per day, and your code can send requests too fast. Start prototyping small before running a loop that could be faulty. You can also use `time.sleep(n)` to add delays. For more details see https://www.yelp.com/developers/documentation/v3/rate_limiting.**
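
One way to add such delays is to retry with a growing wait whenever a call is throttled. The sketch below assumes the API signals throttling with HTTP 429 (Too Many Requests), the standard status code for it, and that the request function follows the `yelp_call` convention of returning a dict on success and an integer status code on failure. The name `call_with_backoff` and its parameters are hypothetical.

```python
import time


def call_with_backoff(make_request, max_tries=3, base_delay=1.0, sleep=time.sleep):
    """Call `make_request` until it stops returning 429, waiting between tries.

    make_request: zero-argument callable returning a dict on success or an
                  integer status code on failure.
    sleep:        injectable so the delay can be skipped while testing.
    """
    result = None
    for attempt in range(max_tries):
        result = make_request()
        if result != 429:
            return result
        if attempt < max_tries - 1:
            # Wait longer after each rate-limited attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
    return result


# Stand-in request that is throttled twice before succeeding.
responses = iter([429, 429, {'businesses': []}])
waits = []
print(call_with_backoff(lambda: next(responses), sleep=waits.append), waits)
```

In the notebook this could wrap a call such as `lambda: yelp_call(url_params, api_key)`.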

***Below is sample code that you can use to help you deal with the pagination parameter and bring all of the functions together.***


***Also, something might cause your code to break while it is running. You don't want to repeatedly re-pull the same data when this happens, so you should insert the data into the database as you call and parse it, not after you have all of the data.***
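
A related safeguard is to skip rows that are already on disk when a run is restarted. The sketch below uses the standard `csv` module and assumes a CSV file with no header row and the business id in the first column. Both function names are hypothetical.

```python
import csv
import os


def already_saved_ids(csv_filepath, id_column=0):
    """Return the set of ids already present in the CSV (empty if no file yet)."""
    if not os.path.exists(csv_filepath):
        return set()
    with open(csv_filepath, newline='') as f:
        return {row[id_column] for row in csv.reader(f) if row}


def append_new_rows(csv_filepath, rows, id_column=0):
    """Append only rows whose id is not in the file yet; return how many were added."""
    seen = already_saved_ids(csv_filepath, id_column)
    fresh = [r for r in rows if r[id_column] not in seen]
    with open(csv_filepath, 'a', newline='') as f:
        csv.writer(f).writerows(fresh)
    return len(fresh)
```

Calling `append_new_rows` after each page means an interrupted run can be restarted without writing the same businesses twice.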


In [None]:
# create a variable to keep track of which result you are in.
cur = 0
num = response['total']
# set up a while loop to go through and grab the results
while cur < num and cur < 1000:
    # set the offset parameter to be where you currently are in the results
    url_params['offset'] = cur
    # make your API call with the new offset number
    results = yelp_call(url_params, api_key)

    # after you get your results you can now use your function to parse those results
    parsed_results = parse_results(results['businesses'])

    # use your function to insert your parsed results into the db
    # (db_insert is a placeholder for your own save function)
    db_insert(parsed_results)
    # increment the counter by the page size (limit = 50) to move on to the next results
    cur += 50