# Capstone Project - Get the best venues in Mumbai for opening a Gym
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Results](#results) 
* [Discussion](#discussion)
* [Conclusion](#conclusion)



## Introduction: Business Problem <a name="introduction"></a>

People are increasingly interested in living healthy and active lives, which depends not only on a proper diet and conscious food choices but also on regular training. Gyms play a fundamental role in our lives, but workouts have to be planned carefully: after the usual working hours and the time spent in traffic or on public transportation, there is not much time left in a day. Gyms are becoming more and more popular, and a growing number of customers spend several hours per week training in gym facilities.

In this project we will try to find an optimal location for a Gym/Fitness Centre. Specifically, this report is targeted at stakeholders interested in opening a **Gym/Fitness Centre** in **Mumbai**, India.

Since there are many gyms in Mumbai, we will try to detect **locations that are not already crowded with gyms**. We are also particularly interested in **areas with no Gym/Fitness Centre in the vicinity**. We would also prefer locations **as close to the city center as possible**, assuming that the first two conditions are met.

We will use data science techniques to identify a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly stated so that stakeholders can choose the best possible final location.


## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* number of existing Gyms/Fitness Centres in the neighborhood
* number of and distance to Gyms/Fitness Centres in the neighborhood, if any

The dataset that will be used is:
1. Mumbai neighborhood data: the list of neighborhood names with their associated latitude and longitude, available at https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Mumbai

Other sources used to get the geographic and geocoding data are:
1. Foursquare API to get the best venues for a given neighborhood
2. Geocoder package for latitude and longitude coordinates

### How will the data be used to solve the problem?

The Wikipedia page stores Mumbai’s neighborhood data in table form. The pandas library will be used to scrape the table and store it in a pandas DataFrame.

Once the data and the geocoding coordinates are available, we will use Foursquare’s geographic data to get the best venues in each neighborhood. We will then collect the gym location data and cluster it using the k-means clustering technique.
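The venue lookup described above can be sketched as a request URL for one neighborhood. This is a minimal sketch, not the project's actual code: it assumes the legacy Foursquare v2 `venues/explore` endpoint (the report does not say which endpoint was used, and newer Foursquare API versions differ), and `build_explore_url`, the version date and the placeholder credentials are hypothetical. The coordinates are the approximate center of Mumbai.

```python
from urllib.parse import urlencode

# Assumption: legacy Foursquare v2 "explore" endpoint; the report does not name it.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius_m=2000, limit=100, version="20180605"):
    """Build the request URL for venues around one neighborhood (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date (assumed value)
        "ll": f"{lat},{lng}",    # latitude,longitude of the neighborhood
        "radius": radius_m,      # search radius in meters
        "limit": limit,          # maximum number of venues to return
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Placeholder credentials; approximate coordinates of central Mumbai.
url = build_explore_url(19.0760, 72.8777, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
```

The URL is only built here, not requested, so the sketch runs without credentials or network access; in practice it would be passed to an HTTP client and the JSON response parsed into a DataFrame.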


## Methodology  <a name="methodology"></a>

The data used in the project is Mumbai's neighborhood data. The objective of the project is to find the markets where setting up a gym won’t be too competitive.
To achieve this objective, we need to collect data about the neighborhoods of Mumbai and visualize it using folium maps.

In this project we will direct our efforts toward detecting areas of Mumbai that have a low Gym/Fitness Center density. We will limit our analysis to the area within ~2 km of the city center.
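Both the density criterion and the distance limit rest on a distance between two coordinates. The sketch below uses the haversine great-circle formula for this. It is illustrative only: `haversine_m` and `count_within` are hypothetical helper names, the sample gym coordinates are invented, and the report does not state how distances were actually computed.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def count_within(point, others, radius_m=2000):
    """Count how many of `others` lie within `radius_m` meters of `point`."""
    return sum(haversine_m(*point, *other) <= radius_m for other in others)


center = (19.0760, 72.8777)                            # approximate center of Mumbai
sample_gyms = [(19.0800, 72.8800), (19.1500, 72.9000)]  # invented coordinates
nearby = count_within(center, sample_gyms)              # only the first gym is within 2 km
```

A low `count_within` value for a candidate location corresponds to the low gym density the analysis is looking for.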

In the first step we collected the required data: the location and trending venues within 2 km of Mumbai's center. Foursquare’s Developer API helps us get the venues of the different neighborhoods in Mumbai. Approximately 7022 venues were found through the Foursquare API; these are then used to identify Gyms/Fitness Centers (according to Foursquare categorization).
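Identifying gyms among the collected venues amounts to filtering on the venue category and counting per neighborhood. The following is a rough sketch under stated assumptions: the record layout (`neighborhood`, `category` keys), the category labels in `GYM_CATEGORIES`, the helper name `gyms_per_neighborhood` and the sample venues are all hypothetical and not taken from the project's data.

```python
from collections import Counter

# Assumed Foursquare category labels for gyms; check them against the real data.
GYM_CATEGORIES = {"Gym", "Gym / Fitness Center"}


def gyms_per_neighborhood(venues, neighborhoods):
    """Return {neighborhood: number of gym venues}, including zeros."""
    counts = Counter(
        v["neighborhood"] for v in venues if v["category"] in GYM_CATEGORIES
    )
    return {name: counts.get(name, 0) for name in neighborhoods}


# Invented sample records, for illustration only.
sample_venues = [
    {"neighborhood": "Andheri", "category": "Gym"},
    {"neighborhood": "Andheri", "category": "Gym / Fitness Center"},
    {"neighborhood": "Andheri", "category": "Café"},
    {"neighborhood": "Colaba", "category": "Gym"},
    {"neighborhood": "Chembur", "category": "Indian Restaurant"},
]
gym_counts = gyms_per_neighborhood(sample_venues, ["Andheri", "Colaba", "Chembur"])
```

Keeping neighborhoods with a count of zero in the result matters here, since those are exactly the areas with no nearby competition.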

In the second step we focus on gym areas and create clusters of locations within them. We take into consideration the locations with very few or no Gyms/Fitness Centers within a radius of 2000 meters. We present a map of all such locations and also create clusters (using k-means clustering) of those locations to identify general zones / neighborhoods / addresses which should be a starting point for final 'street level' exploration and the search for an optimal venue location by stakeholders.
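To show what the clustering step does, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on latitude/longitude pairs. It is not the implementation used in the project, which would more typically rely on a library such as scikit-learn; the function name `kmeans`, its parameters and the sample points are assumptions. Treating degrees as plane coordinates is an approximation that is usually acceptable at city scale.

```python
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Invented candidate locations forming two well-separated groups.
candidates = [(19.00, 72.80), (19.01, 72.81), (19.02, 72.80),
              (19.20, 72.95), (19.21, 72.96), (19.22, 72.95)]
centers, labels = kmeans(candidates, k=2)
```

Each resulting centroid can be read as the center of a candidate zone, which is the starting point for the street-level search mentioned above.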


## Results <a name="results"></a>

Our analysis shows that although there is a great number of gyms in Mumbai, there are pockets of low Gym/Fitness Center density fairly close to the city center. The highest concentration of gyms was detected near the western suburbs of Mumbai, with a moderate concentration in South Mumbai. On the other hand, the Eastern and Harbour suburbs of Mumbai have very few or no Gyms/Fitness Centers in their neighborhoods.
These are high-potential areas for opening new gyms, as there is little to no competition from existing gyms. Therefore, this project recommends that property developers capitalize on these findings and open new Gyms/Fitness Centers in neighborhoods in the Eastern and Harbour suburbs of Mumbai. Property developers with unique selling propositions that stand out from the competition can also open new Gyms/Fitness Centers in neighborhoods in South Mumbai, where competition is moderate.

The purpose of this analysis was only to provide information on areas close to the center of Mumbai that are not crowded with existing Gyms or Fitness Centers. It is entirely possible that there is a very good reason for the small number of gyms in any of those areas, one that would make them unsuitable for a new gym regardless of the lack of competition. The recommended zones should therefore be considered only as a starting point for a more detailed analysis, which could eventually result in a location that not only has no nearby competition but also meets all other relevant conditions.

![image.png](attachment:image.png)


## Discussion <a name="discussion"></a>

Several factors determine the success of a business at a location, and there are different strategies for opening one. One strategy is to open a business in an area where there is little to no competition. This is a less risky way of setting up a new business, since it avoids problems such as intense competition and the need to raise product quality very quickly to keep up with growing consumer demands and rival firms. However, it comes at the cost of building the consumer base from scratch, which means almost no returns in the first few months after setting up the business.
Businesses of this type are located in areas with average or low real estate rates and very few competitors.

The other strategy is to open a business where big markets are already established. Businesses of this kind have the advantage of tapping into a large number of consumers, since demand already exists (popular markets have a large number of consumers). On the other hand, it is a very risky strategy. Because of the intense competition in these markets, the quality of the products must be good; otherwise the business will lose out to the big businesses that are already selling in the market. This is a high-risk, high-return strategy. Nonetheless, many other factors should be considered before operating in these kinds of neighborhoods (for example, assessing whether it is even worth spending that much money on the expensive shops).
Andheri and Colaba are among the most expensive neighborhoods in Mumbai, yet they are also among those with a good number of Gyms/Fitness Centers. One possible explanation is that other factors are at play, such as the emergence of big markets in those areas where large numbers of consumers shop, which could make opening Gyms and Fitness Centers there profitable due to high sales.



## Conclusion  <a name="conclusion"></a>

The main objective of our project was to find the best venues in the different neighborhoods of Mumbai and to assess which of these locations would be best for opening a Gym or Fitness Center.
Our project helped us answer these kinds of questions.
For any given search query, we can find how many shops related to that query are in a particular neighborhood of Mumbai. This data helps make the decision about where to open a Gym or Fitness Center in Mumbai easier and more straightforward.