<h1 align=center><font size = 5>Attractions in Ahmedabad City, Gujrat, India</font></h1>

# Table of Contents

1. <a href="#item1">Introduction</a>
2. <a href="#item2">Data Collection using Foursquare API</a>  
3. <a href="#item3">Data Cleaning</a>  
4. <a href="#item4">Methodology</a>  
5. <a href="#item5">Analysis & Result</a>  
6. <a href="#item5">Conclusion</a>  

## Introduction

Ahmedabad is the largest city and former capital of the Indian state of Gujarat. It has emerged as an important economic and industrial hub in India. It is the second-largest producer of cotton in India, and its stock exchange is the country's second oldest. Cricket is a popular sport in Ahmedabad. It is a major tourist attraction.

Whenever a user is visiting a city they start looking for places to visit during their stay. They primarily look for places based on the venue ratings across all venues and the average prices such that the locations fits in their budget.

##### The aim of the project is to identify venues in Ahmedabad, India based on their rating and average prices. 

#### Description of the Problem

To start a new venture in a city, it is important to first identify the places which can draw maximum customers.
Here, we'll identify places that are frequented most by the visitors based on the information collected from Foursquare API and Data Science. Once we have the plot with the venues, any company can launch an application using the same data and suggest users such information.

## Data Collection Using Foursquare API

Firstly we will have a look at Ahmedabad Map using the folium library.

#### Ahmedabad, India

Ahmedabad is located on the banks of the Sabarmati River, 23 km (14 mi) from the state capital Gandhinagar, which is its twin city. Ahmedabad has emerged as an important economic and industrial hub in India. Being an important historical town, it has two clear distinct segments - the old town area and the newly developed commercial centers in the east.

We can use the geopy library to extract the latitude and longitude values of Ahmedabad but it seems off and thus, we'll directly supply the values in this case.

In [2]:
AMD_LATITUDE = '23.0225'
AMD_LONGITUDE = '72.5714'
print('The geograpical coordinates of Ahmedabad are {}, {}.'.format(AMD_LATITUDE, AMD_LONGITUDE))

The geograpical coordinates of Ahmedabad are 23.0225, 72.5714.


In [11]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium

print('Folium installed and imported!')

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    altair-4.1.0               |             py_1         614 KB  conda-forge
    branca-0.4.1               |             py_0          26 KB  conda-forge
    brotlipy-0.7.0             |py36h8c4c3a4_1000         346 KB  conda-forge
    chardet-3.0.4              |py36h9f0ad1d_1006         188 KB  conda-forge
    cryptography-2.9.2         |   py36h45558ae_0         613 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    pandas-1.0.3               |   py36h83

In [14]:
ahmedabad_map = folium.Map(location = [23.0225, 72.5714], zoom_start=13)
ahmedabad_map

### Collect Data Using Foursquare API¶
We begin by fetching a total of all venues in Ahmedabad upto a range of 4 Kilometers using the Foursquare API. The Foursquare API has the explore API which allows us to find venue recommendations within a given radius from the given coordinates. We will use this API to find all the venues we need.

In [15]:
FOURSQUARE_CLIENT_ID = 'ZON3J3KGEICTDB0LUVQZDZYJCBNAYGQ13H3TXHLQ3QGK5OHU'
FOURSQUARE_CLIENT_SECRET = 'RU3XRV3A3WKW05NWQCU5D12HGSFXF0KZ0LX5ZMPSFTLDBTUU'
RADIUS = 4000 # 4 Km
NO_OF_VENUES = 100
VERSION = '20200527' # Current date

In [16]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
    

In [17]:
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

from pandas.io.json import json_normalize
import requests

pd.set_option('display.max_rows', None)

offset = 0
total_venues = 0
foursquare_venues = pd.DataFrame(columns = ['name', 'categories', 'lat', 'lng'])

while (True):
    url = ('https://api.foursquare.com/v2/venues/explore?client_id={}'
           '&client_secret={}&v={}&ll={},{}&radius={}&limit={}&offset={}').format(FOURSQUARE_CLIENT_ID, 
                                                                        FOURSQUARE_CLIENT_SECRET, 
                                                                        VERSION, 
                                                                        AMD_LATITUDE, 
                                                                        AMD_LONGITUDE, 
                                                                        RADIUS,
                                                                        NO_OF_VENUES,
                                                                        offset)
    result = requests.get(url).json()
    venues_fetched = len(result['response']['groups'][0]['items'])
    total_venues = total_venues + venues_fetched
    print("Total {} venues fetched within a total radius of {} Km".format(venues_fetched, RADIUS/1000))

    venues = result['response']['groups'][0]['items']
    venues = json_normalize(venues)

    # Filter the columns
    filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
    venues = venues.loc[:, filtered_columns]

    # Filter the category for each row
    venues['venue.categories'] = venues.apply(get_category_type, axis = 1)

    # Clean all column names
    venues.columns = [col.split(".")[-1] for col in venues.columns]
    foursquare_venues = pd.concat([foursquare_venues, venues], axis = 0, sort = False)
    
    if (venues_fetched < 100):
        break
    else:
        offset = offset + 100

foursquare_venues = foursquare_venues.reset_index(drop = True)
print("\nTotal {} venues fetched".format(total_venues))


Total 80 venues fetched within a total radius of 4.0 Km

Total 80 venues fetched




## Plotting the Venues

We will first plot the Foursquare data on the map.

In [36]:
ahmedabad_map = folium.Map(location = [23.0225, 72.5714], zoom_start = 13)

for name, latitude, longitude in zip(foursquare_venues['name'], foursquare_venues['lat'], foursquare_venues['lng']):
    label = '{}'.format(name)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [latitude, longitude],
        radius = 5,
        popup = label,
        color = 'green',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        parse_html = False).add_to(ahmedabad_map)  

ahmedabad_map

Sabarmati River flows through the Ahmedabad town and nearly divides it into two parts - **East Ahmedabad** & **West Ahmedabad**.
Now we generate the dataframe for the venues

In [91]:
# assign relevant part of JSON to venues
venues = result['response']['groups'][0]['items']
   
# tranform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.head()


  """


Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.crossStreet,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,...,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.location.postalCode,venue.venuePage.id,venue.location.neighborhood
0,e-0-4bf8c05492d19521074a5a1f-0,0,"[{'summary': 'This spot is popular', 'type': '...",4bf8c05492d19521074a5a1f,Crossword,Shree Krishna Center,Mithakhali Six Roads,23.032656,72.565404,"[{'label': 'display', 'lat': 23.03265560009132...",...,Ahmedabad,Gujarāt,India,"[Shree Krishna Center (Mithakhali Six Roads), ...","[{'id': '4bf58dd8d48988d114951735', 'name': 'B...",0,[],,,
1,e-0-4baa1849f964a5204d4a3ae3-1,0,"[{'summary': 'This spot is popular', 'type': '...",4baa1849f964a5204d4a3ae3,Swati Snacks,"Nr Gandhi Baug Society, Law Garden",Panchvati Road,23.024438,72.559087,"[{'label': 'display', 'lat': 23.02443798062042...",...,Ahmedabad,Gujarāt,India,"[Nr Gandhi Baug Society, Law Garden (Panchvati...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],,,
2,e-0-4f2c2b84e4b0e6b070ef1352-2,0,"[{'summary': 'This spot is popular', 'type': '...",4f2c2b84e4b0e6b070ef1352,Manek Chowk Khau Gali,Manek Chowk,Manekchowk,23.023505,72.588539,"[{'label': 'display', 'lat': 23.02350527643007...",...,Ahmedabad,Gujarāt,India,"[Manek Chowk (Manekchowk), Ahmedabad 380001, G...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],380001.0,,
3,e-0-4bc574270a30d13a40125a9c-3,0,"[{'summary': 'This spot is popular', 'type': '...",4bc574270a30d13a40125a9c,TOMATO'S,1-3 Mardia Plaza,C G Road,23.026693,72.557488,"[{'label': 'display', 'lat': 23.02669296963554...",...,Ahmedabad,Gujarāt,India,"[1-3 Mardia Plaza (C G Road), Ahmedabad, Gujar...","[{'id': '4bf58dd8d48988d1c1941735', 'name': 'M...",0,[],,,
4,e-0-4ed13a649adf254457e88251-4,0,"[{'summary': 'This spot is popular', 'type': '...",4ed13a649adf254457e88251,Manek Chowk,Manek chowk,Mandvi's pole,23.023626,72.588553,"[{'label': 'display', 'lat': 23.02362593356704...",...,Ahmedabad,Gujarāt,India,"[Manek chowk (Mandvi's pole), Ahmedabad 380001...","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],380001.0,,


In [93]:
dataframe_filtered = dataframe

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['famous'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,referralId,count,items,id,name,address,crossStreet,lat,lng,labeledLatLngs,...,state,country,formattedAddress,categories,count.1,groups,postalCode,id.1,neighborhood,famous
0,e-0-4bf8c05492d19521074a5a1f-0,0,"[{'summary': 'This spot is popular', 'type': '...",4bf8c05492d19521074a5a1f,Crossword,Shree Krishna Center,Mithakhali Six Roads,23.032656,72.565404,"[{'label': 'display', 'lat': 23.03265560009132...",...,Gujarāt,India,"[Shree Krishna Center (Mithakhali Six Roads), ...","[{'id': '4bf58dd8d48988d114951735', 'name': 'B...",0,[],,,,Bookstore
1,e-0-4baa1849f964a5204d4a3ae3-1,0,"[{'summary': 'This spot is popular', 'type': '...",4baa1849f964a5204d4a3ae3,Swati Snacks,"Nr Gandhi Baug Society, Law Garden",Panchvati Road,23.024438,72.559087,"[{'label': 'display', 'lat': 23.02443798062042...",...,Gujarāt,India,"[Nr Gandhi Baug Society, Law Garden (Panchvati...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],,,,Snack Place
2,e-0-4f2c2b84e4b0e6b070ef1352-2,0,"[{'summary': 'This spot is popular', 'type': '...",4f2c2b84e4b0e6b070ef1352,Manek Chowk Khau Gali,Manek Chowk,Manekchowk,23.023505,72.588539,"[{'label': 'display', 'lat': 23.02350527643007...",...,Gujarāt,India,"[Manek Chowk (Manekchowk), Ahmedabad 380001, G...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],380001.0,,,Snack Place
3,e-0-4bc574270a30d13a40125a9c-3,0,"[{'summary': 'This spot is popular', 'type': '...",4bc574270a30d13a40125a9c,TOMATO'S,1-3 Mardia Plaza,C G Road,23.026693,72.557488,"[{'label': 'display', 'lat': 23.02669296963554...",...,Gujarāt,India,"[1-3 Mardia Plaza (C G Road), Ahmedabad, Gujar...","[{'id': '4bf58dd8d48988d1c1941735', 'name': 'M...",0,[],,,,Mexican Restaurant
4,e-0-4ed13a649adf254457e88251-4,0,"[{'summary': 'This spot is popular', 'type': '...",4ed13a649adf254457e88251,Manek Chowk,Manek chowk,Mandvi's pole,23.023626,72.588553,"[{'label': 'display', 'lat': 23.02362593356704...",...,Gujarāt,India,"[Manek chowk (Mandvi's pole), Ahmedabad 380001...","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],380001.0,,,Fast Food Restaurant


## Data Cleaning

**We don't want to use all the data in the data frame. This project aims at identifying the categories of venues in Ahmedabad & their rating. This would enable any visitor to identify the venues he/she wants to visit based on their rating.**

**Hence, we will drop certain columns which are not of our interest**

In [98]:
dataframe_filtered.dtypes

referralId           object
count                 int64
items                object
id                   object
name                 object
address              object
crossStreet          object
lat                 float64
lng                 float64
labeledLatLngs       object
distance              int64
cc                   object
city                 object
state                object
country              object
formattedAddress     object
categories           object
count                 int64
groups               object
postalCode           object
id                   object
neighborhood         object
famous               object
Type                 object
dtype: object

#### **There is a need to clean the data in the category and items to clearly understand the type of venue and for what it is famous for?.**

In [104]:
# function that extracts the type of the venue from items column
def get_items_type(row):
    try:
        items_list = row['categories']
    except:
        items_list = row['items']
        
    if len(items_list) == 0:
        return None
    else:
        return items_list[0]['name']

# filter the category for each row
dataframe_filtered['Type'] = dataframe_filtered.apply(get_items_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered.head()

Unnamed: 0,referralId,count,items,id,name,address,crossStreet,lat,lng,labeledLatLngs,...,country,formattedAddress,categories,count.1,groups,postalCode,id.1,neighborhood,famous,Type
0,e-0-4bf8c05492d19521074a5a1f-0,0,"[{'summary': 'This spot is popular', 'type': '...",4bf8c05492d19521074a5a1f,Crossword,Shree Krishna Center,Mithakhali Six Roads,23.032656,72.565404,"[{'label': 'display', 'lat': 23.03265560009132...",...,India,"[Shree Krishna Center (Mithakhali Six Roads), ...","[{'id': '4bf58dd8d48988d114951735', 'name': 'B...",0,[],,,,Bookstore,Bookstore
1,e-0-4baa1849f964a5204d4a3ae3-1,0,"[{'summary': 'This spot is popular', 'type': '...",4baa1849f964a5204d4a3ae3,Swati Snacks,"Nr Gandhi Baug Society, Law Garden",Panchvati Road,23.024438,72.559087,"[{'label': 'display', 'lat': 23.02443798062042...",...,India,"[Nr Gandhi Baug Society, Law Garden (Panchvati...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],,,,Snack Place,Snack Place
2,e-0-4f2c2b84e4b0e6b070ef1352-2,0,"[{'summary': 'This spot is popular', 'type': '...",4f2c2b84e4b0e6b070ef1352,Manek Chowk Khau Gali,Manek Chowk,Manekchowk,23.023505,72.588539,"[{'label': 'display', 'lat': 23.02350527643007...",...,India,"[Manek Chowk (Manekchowk), Ahmedabad 380001, G...","[{'id': '4bf58dd8d48988d1c7941735', 'name': 'S...",0,[],380001.0,,,Snack Place,Snack Place
3,e-0-4bc574270a30d13a40125a9c-3,0,"[{'summary': 'This spot is popular', 'type': '...",4bc574270a30d13a40125a9c,TOMATO'S,1-3 Mardia Plaza,C G Road,23.026693,72.557488,"[{'label': 'display', 'lat': 23.02669296963554...",...,India,"[1-3 Mardia Plaza (C G Road), Ahmedabad, Gujar...","[{'id': '4bf58dd8d48988d1c1941735', 'name': 'M...",0,[],,,,Mexican Restaurant,Mexican Restaurant
4,e-0-4ed13a649adf254457e88251-4,0,"[{'summary': 'This spot is popular', 'type': '...",4ed13a649adf254457e88251,Manek Chowk,Manek chowk,Mandvi's pole,23.023626,72.588553,"[{'label': 'display', 'lat': 23.02362593356704...",...,India,"[Manek chowk (Mandvi's pole), Ahmedabad 380001...","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],380001.0,,,Fast Food Restaurant,Fast Food Restaurant


In [105]:
selected_venues = dataframe_filtered[['name', 'address','count','famous', 'Type']]
selected_venues.head()

Unnamed: 0,name,address,count,count.1,famous,Type
0,Crossword,Shree Krishna Center,0,0,Bookstore,Bookstore
1,Swati Snacks,"Nr Gandhi Baug Society, Law Garden",0,0,Snack Place,Snack Place
2,Manek Chowk Khau Gali,Manek Chowk,0,0,Snack Place,Snack Place
3,TOMATO'S,1-3 Mardia Plaza,0,0,Mexican Restaurant,Mexican Restaurant
4,Manek Chowk,Manek chowk,0,0,Fast Food Restaurant,Fast Food Restaurant


In [106]:
selected_venues.shape

(80, 6)

## Finally now we have a filtered dataframe of famous venues in Ahmedabad. It displays the type of venue and what it is famous for?