# Data

Our objective is to search within a **5 km radius of Chennai**, centred at **latitude 13.0827° N** and **longitude 80.2707° E**. The data described below is retrieved using the Foursquare API and the Zomato API: the venue name, latitude, and longitude obtained from the Foursquare API are used to request the corresponding data from the Zomato API. We collected a total of **119 venues within a 5 km radius** of the geographical coordinates of Chennai. Since the two datasets come from different APIs, there may be some noise. To clean the data, we eliminate venues whose latitude or longitude values differ between the two sources by more than 0.0004. After cleaning, we are left with **74 venues** with which to build a working model.
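The cleaning step can be read as comparing the coordinates that each API reports for the same venue and dropping the venue when they disagree by more than 0.0004. The sketch below illustrates that reading only. The field names (`fs_lat`, `fs_lng`, `z_lat`, `z_lng`) and the sample records are hypothetical and are not taken from the collected data.

```python
# Hypothetical field names: fs_* values come from Foursquare, z_* values from Zomato.
MAX_COORD_DIFF = 0.0004  # threshold stated in the text, in degrees


def coordinates_match(venue, max_diff=MAX_COORD_DIFF):
    """Return True when both APIs place the venue at nearly the same point."""
    lat_diff = abs(venue["fs_lat"] - venue["z_lat"])
    lng_diff = abs(venue["fs_lng"] - venue["z_lng"])
    return lat_diff <= max_diff and lng_diff <= max_diff


# Invented records for illustration; these are not real API results.
venues = [
    {"name": "Venue A", "fs_lat": 13.0590, "fs_lng": 80.2640,
     "z_lat": 13.0591, "z_lng": 80.2642},
    {"name": "Venue B", "fs_lat": 13.0720, "fs_lng": 80.2480,
     "z_lat": 13.0800, "z_lng": 80.2480},
]

# Keep only the venues whose two coordinate pairs agree.
cleaned = [v for v in venues if coordinates_match(v)]
print([v["name"] for v in cleaned])  # ['Venue A']
```

Venue B is dropped because its two latitudes differ by 0.008, well above the threshold, which suggests the Zomato lookup returned a different place.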

The following data has been collected from the **Foursquare API**:
* Venue Name
* Category
* Latitude
* Longitude

The following data has been collected from the **Zomato API**:
* Average Price for Two People
* Price Range
* Rating
* Address
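As a rough illustration of how a Foursquare venue might be looked up on Zomato, the sketch below builds a search request from a venue name and its coordinates. The endpoint URL, the query parameters (`q`, `lat`, `lon`, `count`, `sort`), and the `user-key` header are assumptions about the Zomato v2.1 search API and should be checked against its documentation. The helper name `build_zomato_request` and the key placeholder are hypothetical. No request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Zomato v2.1 search API; verify before use.
ZOMATO_SEARCH_URL = "https://developers.zomato.com/api/v2.1/search"


def build_zomato_request(name, latitude, longitude, user_key):
    """Build the URL and headers for looking up one Foursquare venue on Zomato."""
    params = {
        "q": name,                # venue name from Foursquare
        "lat": latitude,          # venue latitude from Foursquare
        "lon": longitude,         # venue longitude from Foursquare
        "count": 1,               # only the best match is needed
        "sort": "real_distance",  # assumed option: nearest result first
    }
    headers = {"user-key": user_key}  # assumed authentication header
    return f"{ZOMATO_SEARCH_URL}?{urlencode(params)}", headers


url, headers = build_zomato_request(
    "Escape Cinemas", 13.058746, 80.26417, "YOUR_ZOMATO_KEY"
)
print(url)
```

The returned URL and headers could then be passed to `requests.get(url, headers=headers)`.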

### Example of Data

import requests
import pandas as pd

# Foursquare API credentials and request parameters
client_id = 'SPJ5ZZH2JCBKLK3Y5HF51JBNPK5GGGPI2J43MDND3Z3MFD2L'
client_secret = 'Y1J35FIYYFC1LUW3ATM0TVBTB1E3TVY4ALG2F2FZXSXYNCFB'
version = '20180605'
chennai_latitude = 13.0827
chennai_longitude = 80.2707
radius = 5000  # search radius in metres (5 km)
limit = 3      # venues per request; kept small for this example
offset = 3     # index of the first venue to return
pages = []     # one DataFrame per page of results

def get_category_type(row):
    """Return the name of the venue's first category, or None if it has none."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

while True:
    url = ('https://api.foursquare.com/v2/venues/explore?client_id={}'
           '&client_secret={}&v={}&ll={},{}&radius={}&limit={}&offset={}').format(
               client_id,
               client_secret,
               version,
               chennai_latitude,
               chennai_longitude,
               radius,
               limit,
               offset)
    result = requests.get(url).json()
    items = result['response']['groups'][0]['items']
    venues_fetched = len(items)

    # An empty page has none of the expected columns, so stop here
    if venues_fetched == 0:
        break

    # Flatten the nested JSON into one row per venue
    venues = pd.json_normalize(items)

    # Keep only the columns we need
    filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
    venues = venues.loc[:, filtered_columns].copy()

    # Reduce each venue's category list to the name of its first category
    venues['venue.categories'] = venues.apply(get_category_type, axis=1)

    # Strip the dotted prefixes, e.g. 'venue.location.lat' -> 'lat'
    venues.columns = [col.split(".")[-1] for col in venues.columns]
    pages.append(venues)

    # A page with fewer than 100 venues ends the loop;
    # with limit = 3 this happens after the first request
    if venues_fetched < 100:
        break
    else:
        offset = offset + 100

# Combine all pages and give the columns readable names
fs_venues = pd.concat(pages, ignore_index=True)
fs_venues = fs_venues.rename(columns={'name': 'Venue Name', 'categories': 'Category', 'lat': 'Latitude', 'lng': 'Longitude'})
fs_venues

| | Venue Name | Category | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Seena bhai tiffen centre | Restaurant | 13.08967 | 80.278455 |
| 1 | Escape Cinemas | Multiplex | 13.058746 | 80.26417 |
| 2 | Shree Mithai | Indian Restaurant | 13.072118 | 80.247865 |
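As a sanity check on the 5 km search radius, the great-circle distance from the Chennai reference point to each sample venue can be computed with the haversine formula. This is an independent check sketched under the assumption of a spherical Earth with a mean radius of 6371 km. It is not how Foursquare applies the `radius` parameter internally, and the helper name `haversine_km` is hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


CHENNAI = (13.0827, 80.2707)

# The three sample venues shown above
sample = {
    "Seena bhai tiffen centre": (13.08967, 80.278455),
    "Escape Cinemas": (13.058746, 80.26417),
    "Shree Mithai": (13.072118, 80.247865),
}

for name, (lat, lng) in sample.items():
    distance = haversine_km(*CHENNAI, lat, lng)
    print(f"{name}: {distance:.2f} km, within 5 km: {distance <= 5}")
```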
