<h1 align=center><font size = 5>Venues/Business Analysis in India</font></h1>

#### Introduction

This project is about choosing what kind of business to start and where to locate it. Deciding where to invest is often difficult for investors. This project analyses existing businesses for them and supports better business decisions.

#### Introduction/Business Problem

Anyone planning a business faces the challenge of starting from scratch and choosing both a location and a type of business. Most of the time people rely on word of mouth about a business or location without doing any analysis, and many ventures fail for exactly this reason.

Investors want to put their money into promising startups so that both sides succeed, but they can make poor decisions when they do not know the competition in a given city or area. This project gives insight to anyone who wants to explore and study existing businesses, so that the analysis comes first and the decision second.

It also identifies similar cities within a region, so that the same kind of opportunity can be spotted in different locations.

####  Data section

For this purpose the project uses city data for India and, for a chosen state, presents the results as DataFrames and maps. Andhra Pradesh is taken as the example state here: the notebook retrieves the businesses found in its cities, together with the city information.

The analysis finds the top 10 venue/business categories in each city and groups the cities with the K-Means algorithm, so that a person can see which businesses are worth considering. It provides a report on the cities and helps to identify cities that are similar in terms of business.

The data comes from https://simplemaps.com/data/in-cities, which lists major cities with their state, population, latitude and longitude. This information is used to analyse the existing businesses in a particular state and to help determine where there is scope for more.

The state can be changed to get results for any other state. Analysing all of India at once is also possible, but the output becomes a little clumsy.
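
Switching states amounts to changing one filter value. The following is a minimal sketch, assuming a DataFrame with the `State` and `City` columns that this notebook builds later on; the helper name `select_state` and the toy rows are hypothetical and only illustrate the idea:

```python
import pandas as pd

def select_state(cities, state):
    """Return the rows of `cities` whose 'State' column equals `state`."""
    subset = cities[cities['State'] == state]
    if subset.empty:
        # fail loudly instead of silently analysing an empty frame
        raise ValueError('No cities found for state: {}'.format(state))
    return subset.reset_index(drop=True)

# toy rows with city and state names taken from the tables in this notebook
toy = pd.DataFrame({
    'State': ['Andhra Pradesh', 'Andhra Pradesh', 'West Bengal'],
    'City': ['Guntūr', 'Khammam', 'Kolkata'],
})
ap_only = select_state(toy, 'Andhra Pradesh')
print(ap_only['City'].tolist())
```

State names have to match the spelling used in the data file, including diacritics (for example `Tamil Nādu`).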

The Folium library is used to plot the maps, and the Foursquare API is used to retrieve venue information and extract venue features.
The Foursquare API can be explored further to get details about a particular venue, such as ratings and comments.

In [518]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
print('Libraries import completed')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries import completed


### 1. Data Pre-processing steps
Reading downloaded data from https://simplemaps.com/data/in-cities

In [519]:
df = pd.read_csv(r'in.csv')
df.dtypes

city                  object
lat                  float64
lng                  float64
country               object
iso2                  object
admin                 object
capital               object
population           float64
population_proper    float64
dtype: object

In [520]:
df.shape

(212, 9)

In [521]:
df.head()

Unnamed: 0,city,lat,lng,country,iso2,admin,capital,population,population_proper
0,Mumbai,18.987807,72.836447,India,IN,Mahārāshtra,admin,18978000.0,12691836.0
1,Delhi,28.651952,77.231495,India,IN,Delhi,admin,15926000.0,7633213.0
2,Kolkata,22.562627,88.363044,India,IN,West Bengal,admin,14787000.0,4631392.0
3,Chennai,13.084622,80.248357,India,IN,Tamil Nādu,admin,7163000.0,4328063.0
4,Bengalūru,12.977063,77.587106,India,IN,Karnātaka,admin,6787000.0,5104047.0


Let's re-structure the data.

In [522]:
state_data = df[['admin','city','lat','lng','population']].sort_values(by = ['admin'], ascending=True)
state_data.columns = (['State','City','Latitude','Longitude','Population'])
state_data

Unnamed: 0,State,City,Latitude,Longitude,Population
189,Andaman and Nicobar Islands,Port Blair,11.666667,92.750000,127562.0
156,Andhra Pradesh,Proddatūr,14.750200,78.548129,197451.0
170,Andhra Pradesh,Hindupur,13.828065,77.491425,168312.0
125,Andhra Pradesh,Khammam,17.247672,80.143682,290839.0
82,Andhra Pradesh,Guntūr,16.299737,80.457293,530577.0
...,...,...,...,...,...
153,West Bengal,Haldia,22.025278,88.058333,200762.0
7,West Bengal,Hāora,22.576882,88.318566,4841638.0
190,West Bengal,Alīpur Duār,26.483500,89.522855,127342.0
2,West Bengal,Kolkata,22.562627,88.363044,14787000.0


Use geopy library to get the latitude and longitude values of Andhra Pradesh.

In [523]:
address = 'Andhra Pradesh, India'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Andhra Pradesh are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Andhra Pradesh are 15.9240905, 80.1863809.


In [524]:
# keep only the Andhra Pradesh rows; the original row labels are retained
ap_data = state_data[state_data['State'] == 'Andhra Pradesh'].copy()
ap_data.head()

Unnamed: 0,State,City,Latitude,Longitude,Population
156,Andhra Pradesh,Proddatūr,14.7502,78.548129,197451.0
170,Andhra Pradesh,Hindupur,13.828065,77.491425,168312.0
125,Andhra Pradesh,Khammam,17.247672,80.143682,290839.0
82,Andhra Pradesh,Guntūr,16.299737,80.457293,530577.0
159,Andhra Pradesh,Machilīpatnam,16.187466,81.13888,192827.0


In [525]:
print("We have {} major cities in Andhra Pradesh".format(ap_data.shape[0]))

We have 20 major cities in Andhra Pradesh


In [555]:
# create a map centred on Andhra Pradesh using the latitude and longitude values
map_ap = folium.Map(location=[latitude, longitude], zoom_start=5)

# add one marker per city to the map
for lat, lng, state, city in zip(ap_data['Latitude'], ap_data['Longitude'], ap_data['State'], ap_data['City']):
    label = '{}, {}'.format(city, state)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_ap)
map_ap

**Folium** is a great visualization library. Feel free to zoom into the map above and click on each circle marker to reveal the name of the city and its state.

Define Foursquare Credentials and Version

In [554]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180604'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 
CLIENT_SECRET:


### 2. Explore Cities in Andhra Pradesh, India
Let's create a function that repeats the same process for every city in Andhra Pradesh.

In [528]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Return one row per venue that Foursquare reports around each city's coordinates."""
    LIMIT = 200
    columns = ['City',
               'City Latitude',
               'City Longitude',
               'Venue',
               'Venue Latitude',
               'Venue Longitude',
               'Venue Category']
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue,
        # skipping venues that come back without a category
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results if v['venue']['categories']])

    # flatten the per-city lists; passing the columns here also works when no venues were found
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=columns)

    return nearby_venues
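
The extraction step can be tried without calling the API. The sketch below assumes the payload shape implied by the indexing in `getNearbyVenues` (a list of items, each with a `venue` holding `name`, `location` and `categories`); that shape is inferred from this notebook, not from the Foursquare documentation, and the helper name `extract_venue_rows` and the second sample venue are hypothetical:

```python
def extract_venue_rows(city, city_lat, city_lng, items):
    """Turn Foursquare-style items into flat tuples, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        if not categories:
            continue  # nothing to cluster on for this venue
        location = venue.get('location', {})
        rows.append((city, city_lat, city_lng,
                     venue.get('name'),
                     location.get('lat'), location.get('lng'),
                     categories[0]['name']))
    return rows

# hypothetical payload shaped like response['groups'][0]['items']
sample_items = [
    {'venue': {'name': 'Cine Hub',
               'location': {'lat': 14.750211, 'lng': 78.551526},
               'categories': [{'name': 'Multiplex'}]}},
    {'venue': {'name': 'Unnamed stall',
               'location': {'lat': 14.75, 'lng': 78.55},
               'categories': []}},
]
rows = extract_venue_rows('Proddatūr', 14.7502, 78.548129, sample_items)
print(rows)
```

Skipping venues with an empty `categories` list avoids an `IndexError` on `categories[0]`, at the cost of silently dropping those venues.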

In [529]:
ap_venues = getNearbyVenues(names=ap_data['City'],
                                   latitudes=ap_data['Latitude'],
                                   longitudes=ap_data['Longitude']
                                  )

Proddatūr
Hindupur
Khammam
Guntūr
Machilīpatnam
Kākināda
Rājahmundry
Karīmnagar
Vishākhapatnam
Tirupati
Warangal
Kurnool
Kagaznāgār
Nizāmābād
Vizianagaram
Hyderabad
Nandyāl
Nellore
Chīrāla
Ongole


#### Get the list of Venues/Businesses in Andhra Pradesh, city by city

In [530]:
print(ap_venues.shape)
ap_venues.head()

(67, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Proddatūr,14.7502,78.548129,Cine Hub,14.750211,78.551526,Multiplex
1,Proddatūr,14.7502,78.548129,ibaco,14.751263,78.548719,Ice Cream Shop
2,Hindupur,13.828065,77.491425,Sundaram Soda,13.830011,77.492712,Juice Bar
3,Khammam,17.247672,80.143682,Khammam Bus Station,17.25014,80.142381,Bus Station
4,Khammam,17.247672,80.143682,Hotel Vishnu Residency,17.250873,80.142643,Hotel


#### Let's check how many Venues/Businesses were returned for each city

In [531]:
ap_venues.groupby('City').count()

City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Chīrāla,4,4,4,4,4,4
Guntūr,1,1,1,1,1,1
Hindupur,1,1,1,1,1,1
Karīmnagar,4,4,4,4,4,4
Khammam,6,6,6,6,6,6
Kurnool,4,4,4,4,4,4
Kākināda,4,4,4,4,4,4
Machilīpatnam,4,4,4,4,4,4
Nandyāl,1,1,1,1,1,1
Nellore,8,8,8,8,8,8


#### Let's find out how many unique categories can be curated from all the returned venues

In [532]:
print('There are {} unique categories.'.format(len(ap_venues['Venue Category'].unique())))

There are 35 unique categories.


### 3. Analyze each City

In [533]:
# one hot encoding
ap_onehot = pd.get_dummies(ap_venues[['Venue Category']], prefix="", prefix_sep="")

# add the city column back to the dataframe
ap_onehot['City'] = ap_venues['City'] 

# move the city column to the first position
fixed_columns = [ap_onehot.columns[-1]] + list(ap_onehot.columns[:-1])
ap_onehot = ap_onehot[fixed_columns]

ap_onehot.head()

Unnamed: 0,City,ATM,Arts & Crafts Store,Asian Restaurant,Bakery,Boat or Ferry,Breakfast Spot,Buffet,Bus Station,Café,...,Pharmacy,Pizza Place,Restaurant,Shoe Store,Shopping Mall,Snack Place,Tourist Information Center,Train Station,Vegetarian / Vegan Restaurant,Watch Shop
0,Proddatūr,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Proddatūr,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Hindupur,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Khammam,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
4,Khammam,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [534]:
ap_onehot.shape

(67, 36)

Next, let's group the rows by city and take the mean of the frequency of occurrence of each category.

In [535]:
ap_grouped = ap_onehot.groupby('City').mean().reset_index()
ap_grouped.head()

Unnamed: 0,City,ATM,Arts & Crafts Store,Asian Restaurant,Bakery,Boat or Ferry,Breakfast Spot,Buffet,Bus Station,Café,...,Pharmacy,Pizza Place,Restaurant,Shoe Store,Shopping Mall,Snack Place,Tourist Information Center,Train Station,Vegetarian / Vegan Restaurant,Watch Shop
0,Chīrāla,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0
1,Guntūr,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Hindupur,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Karīmnagar,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.25,0.0,0.0,0.25,0.0,0.0,0.25,0.0,0.0,0.0
4,Khammam,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [536]:
temp = ap_grouped[ap_grouped['City'] == 'Hyderabad'].T.reset_index().head()
temp.head()

Unnamed: 0,index
0,City
1,ATM
2,Arts & Crafts Store
3,Asian Restaurant
4,Bakery


Let's put that into a *pandas* dataframe
First, let's write a function to sort the venues in descending order.

In [537]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
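
One caveat: the function always returns `num_top_venues` names, even for a city with fewer distinct categories, because categories with a frequency of 0 are sorted in after the real ones. Hindupur, for instance, returned a single venue in the counts above. A possible variant, sketched here under the assumption that only categories actually present should be listed (the name `top_nonzero_venues` is hypothetical):

```python
import pandas as pd

def top_nonzero_venues(row, num_top_venues):
    """Like return_most_common_venues, but ignore categories with zero frequency."""
    frequencies = row.iloc[1:].astype(float)  # skip the leading 'City' entry
    present = frequencies[frequencies > 0].sort_values(ascending=False)
    return present.index.tolist()[:num_top_venues]

# toy row in the layout of ap_grouped: city name first, then category frequencies
row = pd.Series({'City': 'Hindupur', 'ATM': 0.0, 'Juice Bar': 1.0, 'Watch Shop': 0.0})
print(top_nonzero_venues(row, 10))
```

The result can be shorter than `num_top_venues`, so it would need padding before being written into a fixed-width table.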

Now let's create the new dataframe and display the top 10 venues for each city.

In [538]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
cities_venues_sorted = pd.DataFrame(columns=columns)
cities_venues_sorted['City'] = ap_grouped['City']

for ind in np.arange(ap_grouped.shape[0]):
    cities_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ap_grouped.iloc[ind, :], num_top_venues)

cities_venues_sorted.head()

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Chīrāla,Vegetarian / Vegan Restaurant,Clothing Store,Gift Shop,Breakfast Spot,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Watch Shop
1,Guntūr,Bus Station,Café,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop,Hotel
2,Hindupur,Juice Bar,Watch Shop,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
3,Karīmnagar,ATM,Tourist Information Center,Shoe Store,Pharmacy,Asian Restaurant,Bakery,Arts & Crafts Store,Boat or Ferry,Breakfast Spot,Gift Shop
4,Khammam,Bus Station,Fried Chicken Joint,Electronics Store,Hotel,Café,Food Court,Fast Food Restaurant,Deli / Bodega,Clothing Store,Watch Shop
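
The suffix logic used for the column names gives every position after the third a "th", which is correct up to 10 but would produce "21th" if `num_top_venues` were raised above 20. A small sketch of a general helper (the name `ordinal` is hypothetical):

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# build the same column names as above for the top 10 venues
columns = ['City'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[:4])
```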


### 4. Cluster Cities
Run *k*-means to cluster the cities into 5 clusters.

In [539]:
# set number of clusters
kclusters = 5

ap_grouped_clustering = ap_grouped.drop(columns='City')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ap_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 2, 0, 1, 2, 3, 3, 1, 4, 3], dtype=int32)
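
The label numbers are arbitrary identifiers: what matters is which cities share a label. A toy sketch, assuming made-up frequency vectors over three categories, shows cities with similar venue profiles landing in the same cluster:

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical frequency vectors; columns stand for (Bus Station, Café, Hotel)
profiles = np.array([
    [1.0, 0.0, 0.0],   # transport-heavy city
    [0.9, 0.1, 0.0],   # similar to the first
    [0.0, 0.0, 1.0],   # hotel-heavy city
    [0.0, 0.1, 0.9],   # similar to the third
])

toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(profiles)
labels = toy_kmeans.labels_
print(labels)
```

Which of the two groups receives label 0 depends on the initialisation, so the printed numbers should not be read as a ranking.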

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each city.

In [540]:
# add clustering labels
cities_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

ap_merged = ap_data

# join cities_venues_sorted onto ap_data so each city keeps its latitude/longitude
ap_merged = ap_merged.join(cities_venues_sorted.set_index('City'), on='City')
ap_merged.head() # check the last columns!

Unnamed: 0,State,City,Latitude,Longitude,Population,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
156,Andhra Pradesh,Proddatūr,14.7502,78.548129,197451.0,3.0,Ice Cream Shop,Multiplex,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
170,Andhra Pradesh,Hindupur,13.828065,77.491425,168312.0,0.0,Juice Bar,Watch Shop,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
125,Andhra Pradesh,Khammam,17.247672,80.143682,290839.0,2.0,Bus Station,Fried Chicken Joint,Electronics Store,Hotel,Café,Food Court,Fast Food Restaurant,Deli / Bodega,Clothing Store,Watch Shop
82,Andhra Pradesh,Guntūr,16.299737,80.457293,530577.0,2.0,Bus Station,Café,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop,Hotel
159,Andhra Pradesh,Machilīpatnam,16.187466,81.13888,192827.0,1.0,Pharmacy,Tourist Information Center,Indian Restaurant,Movie Theater,Bus Station,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café


In [541]:
ap_merged.dropna(inplace=True)
ap_merged

Unnamed: 0,State,City,Latitude,Longitude,Population,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
156,Andhra Pradesh,Proddatūr,14.7502,78.548129,197451.0,3.0,Ice Cream Shop,Multiplex,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
170,Andhra Pradesh,Hindupur,13.828065,77.491425,168312.0,0.0,Juice Bar,Watch Shop,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
125,Andhra Pradesh,Khammam,17.247672,80.143682,290839.0,2.0,Bus Station,Fried Chicken Joint,Electronics Store,Hotel,Café,Food Court,Fast Food Restaurant,Deli / Bodega,Clothing Store,Watch Shop
82,Andhra Pradesh,Guntūr,16.299737,80.457293,530577.0,2.0,Bus Station,Café,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop,Hotel
159,Andhra Pradesh,Machilīpatnam,16.187466,81.13888,192827.0,1.0,Pharmacy,Tourist Information Center,Indian Restaurant,Movie Theater,Bus Station,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
123,Andhra Pradesh,Kākināda,16.960361,82.238086,292923.0,3.0,Hotel,Multiplex,Vegetarian / Vegan Restaurant,Men's Store,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop
119,Andhra Pradesh,Rājahmundry,17.005171,81.777839,304804.0,1.0,Movie Theater,Indie Movie Theater,Clothing Store,Watch Shop,Gift Shop,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Café
127,Andhra Pradesh,Karīmnagar,18.436738,79.13222,288251.0,1.0,ATM,Tourist Information Center,Shoe Store,Pharmacy,Asian Restaurant,Bakery,Arts & Crafts Store,Boat or Ferry,Breakfast Spot,Gift Shop
22,Andhra Pradesh,Vishākhapatnam,17.704052,83.297663,1529000.0,1.0,Snack Place,Boat or Ferry,Movie Theater,Watch Shop,Café,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store
128,Andhra Pradesh,Tirupati,13.635505,79.419888,287035.0,1.0,Indian Restaurant,Fried Chicken Joint,Fast Food Restaurant,Watch Shop,Hotel,Food Court,Electronics Store,Deli / Bodega,Clothing Store,Café


Finally, let's visualize the resulting clusters

In [542]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=6)

# set color scheme for the clusters: one color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(ap_merged['Latitude'], ap_merged['Longitude'], ap_merged['City'], ap_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Most of the cities fall under cluster label 1, several under label 3, and only a few under labels 0, 2 and 4.
Feel free to click on the circles for details and to zoom in or out.
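
The cluster sizes behind that statement can be tallied directly. The sketch below hard-codes the labels as they are listed in the per-cluster tables of section 5, purely for illustration:

```python
from collections import Counter

# cluster label of each city, copied from the per-cluster tables in section 5
city_labels = {
    'Hindupur': 0,
    'Machilīpatnam': 1, 'Rājahmundry': 1, 'Karīmnagar': 1, 'Vishākhapatnam': 1,
    'Tirupati': 1, 'Warangal': 1, 'Nizāmābād': 1, 'Vizianagaram': 1, 'Ongole': 1,
    'Khammam': 2, 'Guntūr': 2,
    'Proddatūr': 3, 'Kākināda': 3, 'Kurnool': 3, 'Nellore': 3, 'Chīrāla': 3,
    'Nandyāl': 4,
}

cluster_sizes = Counter(city_labels.values())
for label, size in sorted(cluster_sizes.items()):
    print('label {}: {} cities'.format(label, size))
```

On the notebook's own data, `ap_merged['Cluster Labels'].value_counts()` should give the same tally.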

### 5. Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, you can then assign a name to each cluster.

In [543]:
def create_frequency_df(cluster):
    """Return the distinct venue categories found in a cluster's top-10 columns."""
    # columns 2 to 11 hold the '1st' to '10th Most Common Venue' entries
    top_venues = cluster.iloc[:, 2:12]
    # flatten column by column so that higher-ranked categories are listed first
    businesses = pd.unique(top_venues.to_numpy().ravel(order='F'))
    return pd.DataFrame(businesses)

#### Cluster 1 (label 0)

In [544]:
cluster = ap_merged.loc[ap_merged['Cluster Labels'] == 0, ap_merged.columns[[1] + list(range(5, ap_merged.shape[1]))]]
cluster

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
170,Hindupur,0.0,Juice Bar,Watch Shop,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café


In [545]:
df = create_frequency_df(cluster)
print("List of businesses which are most common in {} are as:- \n{}".format(cluster.loc[:,cluster.columns[0]].values , df.to_string(index=False)))

List of businesses which are most common in ['Hindupur'] are as:- 
                    0
            Juice Bar
           Watch Shop
                Hotel
  Fried Chicken Joint
           Food Court
 Fast Food Restaurant
    Electronics Store
        Deli / Bodega
       Clothing Store
                 Café


Any of these businesses could be considered as a starting point. The map displayed above gives a better idea of the location.

#### Cluster 2 (label 1)

In [546]:
cluster = ap_merged.loc[ap_merged['Cluster Labels'] == 1, ap_merged.columns[[1] + list(range(5, ap_merged.shape[1]))]]
cluster

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
159,Machilīpatnam,1.0,Pharmacy,Tourist Information Center,Indian Restaurant,Movie Theater,Bus Station,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
119,Rājahmundry,1.0,Movie Theater,Indie Movie Theater,Clothing Store,Watch Shop,Gift Shop,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Café
127,Karīmnagar,1.0,ATM,Tourist Information Center,Shoe Store,Pharmacy,Asian Restaurant,Bakery,Arts & Crafts Store,Boat or Ferry,Breakfast Spot,Gift Shop
22,Vishākhapatnam,1.0,Snack Place,Boat or Ferry,Movie Theater,Watch Shop,Café,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store
128,Tirupati,1.0,Indian Restaurant,Fried Chicken Joint,Fast Food Restaurant,Watch Shop,Hotel,Food Court,Electronics Store,Deli / Bodega,Clothing Store,Café
36,Warangal,1.0,Pharmacy,Multiplex,Asian Restaurant,Indian Restaurant,Deli / Bodega,Food Court,Fast Food Restaurant,Electronics Store,Clothing Store,Café
101,Nizāmābād,1.0,Pharmacy,Indian Restaurant,Electronics Store,Bus Station,Café,Food Court,Fast Food Restaurant,Deli / Bodega,Clothing Store,Watch Shop
166,Vizianagaram,1.0,Train Station,Indian Restaurant,Fast Food Restaurant,Buffet,Watch Shop,Café,Food Court,Electronics Store,Deli / Bodega,Clothing Store
150,Ongole,1.0,Arts & Crafts Store,Shopping Mall,Restaurant,Mountain,Watch Shop,Café,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega


In [547]:
df = create_frequency_df(cluster)
print("List of businesses which are most common in {} are as:- \n\n{}".format(cluster.loc[:,cluster.columns[0]].values , df.to_string(index=False)))

List of businesses which are most common in ['Machilīpatnam' 'Rājahmundry' 'Karīmnagar' 'Vishākhapatnam' 'Tirupati'
 'Warangal' 'Nizāmābād' 'Vizianagaram' 'Ongole'] are as:- 

                          0
                   Pharmacy
              Movie Theater
                        ATM
                Snack Place
          Indian Restaurant
              Train Station
        Arts & Crafts Store
 Tourist Information Center
        Indie Movie Theater
              Boat or Ferry
        Fried Chicken Joint
                  Multiplex
              Shopping Mall
             Clothing Store
                 Shoe Store
       Fast Food Restaurant
           Asian Restaurant
          Electronics Store
                 Restaurant
                 Watch Shop
                Bus Station
                     Buffet
                   Mountain
                  Gift Shop
                       Café
                      Hotel
              Deli / Bodega
                 Food Court
            

#### Cluster 3 (label 2)

In [548]:
cluster = ap_merged.loc[ap_merged['Cluster Labels'] ==2 , ap_merged.columns[[1] + list(range(5, ap_merged.shape[1]))]]
cluster

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
125,Khammam,2.0,Bus Station,Fried Chicken Joint,Electronics Store,Hotel,Café,Food Court,Fast Food Restaurant,Deli / Bodega,Clothing Store,Watch Shop
82,Guntūr,2.0,Bus Station,Café,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop,Hotel


In [549]:
df = create_frequency_df(cluster)
print("List of businesses which are most common in {} are as:- \n\n{}".format(cluster.loc[:,cluster.columns[0]].values , df.to_string(index=False)))

List of businesses which are most common in ['Khammam' 'Guntūr'] are as:- 

                    0
          Bus Station
  Fried Chicken Joint
                 Café
    Electronics Store
                Hotel
           Food Court
 Fast Food Restaurant
        Deli / Bodega
       Clothing Store
           Watch Shop


#### Cluster 4 (label 3)

In [550]:
cluster = ap_merged.loc[ap_merged['Cluster Labels'] ==3 , ap_merged.columns[[1] + list(range(5, ap_merged.shape[1]))]]
cluster

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
156,Proddatūr,3.0,Ice Cream Shop,Multiplex,Hotel,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Café
123,Kākināda,3.0,Hotel,Multiplex,Vegetarian / Vegan Restaurant,Men's Store,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Watch Shop
94,Kurnool,3.0,Hotel,Train Station,Pizza Place,Multiplex,Bus Station,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store
66,Nellore,3.0,Food Court,Shopping Mall,Men's Store,Multiplex,Pizza Place,Watch Shop,Bakery,Boat or Ferry,Breakfast Spot,Fried Chicken Joint
138,Chīrāla,3.0,Vegetarian / Vegan Restaurant,Clothing Store,Gift Shop,Breakfast Spot,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Watch Shop


In [551]:
df = create_frequency_df(cluster)
print("List of businesses which are most common in {} are as:- \n\n{}".format(cluster.loc[:,cluster.columns[0]].values , df.to_string(index=False)))

List of businesses which are most common in ['Proddatūr' 'Kākināda' 'Kurnool' 'Nellore' 'Chīrāla'] are as:- 

                             0
                Ice Cream Shop
                         Hotel
                    Food Court
 Vegetarian / Vegan Restaurant
                     Multiplex
                 Train Station
                 Shopping Mall
                Clothing Store
                   Pizza Place
                   Men's Store
                     Gift Shop
           Fried Chicken Joint
                Breakfast Spot
                   Bus Station
          Fast Food Restaurant
                    Watch Shop
             Electronics Store
                        Bakery
                 Deli / Bodega
                 Boat or Ferry
                          Café


#### Cluster 5 (label 4)

In [552]:
cluster= ap_merged.loc[ap_merged['Cluster Labels'] ==4 , ap_merged.columns[[1] + list(range(5, ap_merged.shape[1]))]]
cluster

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
161,Nandyāl,4.0,Café,Watch Shop,Fried Chicken Joint,Food Court,Fast Food Restaurant,Electronics Store,Deli / Bodega,Clothing Store,Bus Station,Hotel


In [553]:
df = create_frequency_df(cluster)
print("List of businesses which are most common in {} are as:- \n\n{}".format(cluster.loc[:,cluster.columns[0]].values , df.to_string(index=False)))

List of businesses which are most common in ['Nandyāl'] are as:- 

                    0
                 Café
           Watch Shop
  Fried Chicken Joint
           Food Court
 Fast Food Restaurant
    Electronics Store
        Deli / Bodega
       Clothing Store
          Bus Station
                Hotel
