# Introduction

Calgary, Alberta is often noted as one of Canada's fastest-growing cities, which puts it in a position to become one of the three largest cities in the country.

The client is an entrepreneur who runs a successful yoga studio in Vancouver and wishes to open a second studio in Calgary, believing the city's growth makes it an ideal location. They have enlisted your help to determine whether Calgary is indeed a good place to expand the business.


**Business Problem**

They want you to collect data on the businesses that currently exist in Calgary and to help them determine whether expansion is a good idea and, if so, where in Calgary it is best to expand.

They want you to account for the following points: 
<ol>
<li> Density - how many other yoga studios are located in the surrounding area? </li>
<li> Proximity to Downtown Core - how close is it to the Downtown Core? </li>
<li> Risk Factor - overall, how risky is it to open a yoga studio in Calgary? </li>
</ol> 



# Data

The neighbourhood information for Calgary can be found on the Wikipedia page linked below. The page lists the postal codes along with their corresponding boroughs, neighbourhoods, latitudes and longitudes.

Postal Codes in Calgary: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T

Below is an example of the information as displayed on Wikipedia.
<table>
  <thead>
    <tr>
      <th>Postal Code</th>
      <th>Borough</th>
      <th>Neighbourhood</th>
      <th>Latitude</th>
      <th>Longitude</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>T2A</td>
      <td>Calgary</td>
      <td>Penbrooke Meadows, Marlborough </td>  
      <td>51.049680</td>  
      <td>-113.964320</td>    
    </tr>
  </tbody>
</table>

Once the data has been cleaned and organized, Foursquare will be used to obtain the top venues for each neighbourhood. This information will then be used to create clusters throughout the city.

The top venues and their corresponding clusters will be analyzed to determine how similar/dissimilar they are to each other, and subsequently, provide more information as to the types of investors and business owners that would be interested in doing business in the city.


# Methodology

##### Loading Calgary Data Using Beautiful Soup and pandas DataFrames

Using Beautiful Soup, the Wikipedia page https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T was scraped. The data was then cleaned by dropping null values and any neighbourhoods or boroughs marked "not assigned", keeping only the information relevant to the city of Calgary. Boroughs outside Calgary were also removed, as they are not relevant to the analysis.
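The cleaning step could look roughly like the sketch below. It assumes the scraped table has already been loaded into a pandas DataFrame with the column names shown in the Data section, and that unassigned entries carry the literal text "Not assigned". The function name `clean_postal_codes` is hypothetical, and every row other than T2A is a made-up placeholder rather than scraped data.

```python
import pandas as pd

# Illustrative rows shaped like the Wikipedia table.
# Only the T2A row comes from the example above; the others are placeholders.
rows = [
    {"Postal Code": "T2A", "Borough": "Calgary",
     "Neighbourhood": "Penbrooke Meadows, Marlborough",
     "Latitude": 51.049680, "Longitude": -113.964320},
    {"Postal Code": "PLACEHOLDER-1", "Borough": "Not assigned",
     "Neighbourhood": "Not assigned", "Latitude": None, "Longitude": None},
    {"Postal Code": "PLACEHOLDER-2", "Borough": "Edmonton",
     "Neighbourhood": "Example Area", "Latitude": 53.5, "Longitude": -113.5},
]


def clean_postal_codes(df):
    """Keep only fully populated rows that belong to Calgary."""
    assigned = (df["Borough"].ne("Not assigned")
                & df["Neighbourhood"].ne("Not assigned"))
    in_calgary = df["Borough"].eq("Calgary")
    # dropna() removes any remaining rows with missing values.
    return df[assigned & in_calgary].dropna().reset_index(drop=True)


calgary = clean_postal_codes(pd.DataFrame(rows))
print(calgary[["Postal Code", "Neighbourhood"]])
```

Filtering on the borough name after removing unassigned rows keeps the two cleaning rules separate, which makes it easier to check how many rows each rule discards.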

##### Foursquare
Foursquare was then used to explore the neighbourhoods of Calgary and find the top 100 venues located within a 1,500 m (1.5 km) radius of each neighbourhood. Each unique venue type was counted per neighbourhood, then tabulated and charted.
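As a rough illustration of such a query, the sketch below builds a request URL with the radius and limit described above. It assumes the legacy Foursquare v2 `venues/explore` endpoint was used, which this report does not state; the helper name `build_explore_url`, the version date and the credential strings are all placeholders. No request is actually sent.

```python
from urllib.parse import urlencode

# Assumption: the legacy v2 "explore" endpoint. Credentials are placeholders.
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=1500, limit=100):
    """Build an explore URL for venues within `radius` metres of a point."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date (assumed value)
        "ll": f"{lat},{lng}",    # latitude,longitude of the neighbourhood
        "radius": radius,        # search radius in metres
        "limit": limit,          # maximum number of venues returned
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"


# Coordinates of T2A from the example table in the Data section.
url = build_explore_url(51.049680, -113.964320, "YOUR_ID", "YOUR_SECRET")
print(url)
```

The resulting URL could then be fetched once per neighbourhood with an HTTP client, and the venue names and categories extracted from the JSON response.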

The following headers were used:

<table>
  <thead>
    <tr>
      <th>Neighbourhood</th>
      <th>1st Most Common Venue</th>
      <th>2nd Most Common Venue</th>
      <th>3rd Most Common Venue</th>
      <th>4th Most Common Venue</th>
      <th>5th Most Common Venue</th>  
      <th>6th Most Common Venue</th>  
      <th>7th Most Common Venue</th>  
      <th>8th Most Common Venue</th>  
      <th>9th Most Common Venue</th>
      <th>10th Most Common Venue</th>
    </tr>
  </thead>
</table>
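One way to fill a table with these headers is to count venue categories per neighbourhood and keep the most frequent ones. The sketch below uses only the standard library; the function name `top_venue_categories` and the sample records are hypothetical and stand in for the venues returned by Foursquare.

```python
from collections import Counter


def top_venue_categories(venues, n=10):
    """Return the n most frequent categories for each neighbourhood.

    `venues` is an iterable of (neighbourhood, category) pairs.
    """
    counts = {}
    for neighbourhood, category in venues:
        counts.setdefault(neighbourhood, Counter())[category] += 1
    return {
        neighbourhood: [category for category, _ in counter.most_common(n)]
        for neighbourhood, counter in counts.items()
    }


# Hypothetical sample records, not data from the analysis.
sample = [
    ("Neighbourhood A", "Coffee Shop"),
    ("Neighbourhood A", "Coffee Shop"),
    ("Neighbourhood A", "Coffee Shop"),
    ("Neighbourhood A", "Pub"),
    ("Neighbourhood A", "Pub"),
    ("Neighbourhood A", "Yoga Studio"),
    ("Neighbourhood B", "Hardware Store"),
]

top = top_venue_categories(sample, n=3)
print(top)
```

A neighbourhood with fewer than `n` distinct categories simply gets a shorter list, so the corresponding table cells would be left empty.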


##### k-means Clusters

The Elbow Method was used to determine the best "k" value before proceeding with k-means clustering. Although the elbow was not clearly defined in this step, a value of k=7 was chosen for the clustering and analysis.

Each neighbourhood was assigned to a cluster, giving information on the most common venues per neighbourhood and, consequently, per cluster.

Below is an example of the information obtained for Cluster 1:
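The idea behind the Elbow Method can be sketched as follows: run k-means for a range of k values and record the within-cluster sum of squares (inertia) for each, then look for the point where the curve flattens. The code below is a plain NumPy stand-in for a library implementation such as scikit-learn's `KMeans`, and it runs on synthetic points rather than the venue features used in this analysis; `kmeans_inertia` and the data are assumptions made for illustration.

```python
import numpy as np


def kmeans_inertia(X, k, n_iter=50, seed=0):
    """Run a basic Lloyd's k-means and return the within-cluster sum of squares."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows of X.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(d2.min(axis=1).sum())


# Synthetic data: three blobs in two dimensions (illustration only).
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centres])

inertias = {k: kmeans_inertia(X, k) for k in range(1, 7)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

On well-separated data the inertia drops sharply up to the true number of groups and then levels off. When no such bend is visible, as was the case here, the choice of k becomes a judgment call.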
<table>
  <thead>
    <tr>
      <th>Neighbourhood</th>
      <th>Cluster Labels</th> 
      <th>1st Most Common Venue</th>
      <th>2nd Most Common Venue</th>
      <th>3rd Most Common Venue</th>
      <th>4th Most Common Venue</th>
      <th>5th Most Common Venue</th>  
      <th>6th Most Common Venue</th>  
      <th>7th Most Common Venue</th>  
      <th>8th Most Common Venue</th>  
      <th>9th Most Common Venue</th>
      <th>10th Most Common Venue</th>           
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Calgary</td>
      <td>1</td>
      <td>Hardware Store</td>  
      <td>Fast Food Restaurant</td>  
      <td>Convenience Store</td>
      <td>Cosmetics Shop</td> 
      <td>Deli/Bodega</td> 
      <td>Department Store</td>     
      <td>Dim Sum Restaurant</td>     
      <td>Diner</td>    
      <td>Discount Store</td>     
      <td>Dog Run</td>        
    </tr>
  </tbody>
</table>

Once the information has been collected for each cluster, it can be examined as a whole to determine whether opening a yoga studio in Calgary is a wise business move for the client.


# Results

By looking at the clusters, the following are found to be the Top 3 most common venues (in descending order) per cluster in the city of Calgary:

* Cluster 0 - Coffee Shop, Convenience Store / Pub / Chinese Restaurant, Liquor Store
* Cluster 1 - Hardware Store / Fast Food Restaurant / Convenience Store
* Cluster 2 - American Restaurant / Skating Rink, Hotel / Fast Food Restaurant
* Cluster 3 - Coffee Shop / {Various} Restaurants / {Various} Restaurants
* Cluster 4 - Dog Run / Yoga Studio / Fast Food Restaurant
* Cluster 5 - Convenience Store / Home Service / {Various Stores}
* Cluster 6 - Flea Market / Fast Food Restaurant / Convenience Store

Most of these are restaurants or other food-service venues.


# Discussion

One of the first things that can be deduced from the results is that the most common venues in Calgary, in general, are businesses in the food and dining industry. Yoga Studio was rarely among the Top 3 Most Common Venues for the neighbourhoods in each cluster.

In Cluster 0, Yoga Studio appeared five times, ranking no higher than 3rd Most Common Venue.

In Cluster 3, it was the 5th most common venue for one of the neighbourhoods. It was the 2nd most common venue for the only neighbourhood in Cluster 4, and the 3rd most common venue for one of the two neighbourhoods in Cluster 5 that list a "Yoga Studio".

Across all clusters, its highest ranking was 2nd most common venue, in Cluster 4, in the neighbourhood of Brentwood, Collingwood, Nose Hill.

From this information, a few things can be noted based on the criteria the client provided.
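Rankings like these can be read off the ranked venue lists with a small lookup. The sketch below is illustrative only: `category_rank` is a hypothetical helper, and the two lists contain just the top three venues reported for Cluster 4 and Cluster 1 in the Results section.

```python
def category_rank(ranked, category="Yoga Studio"):
    """Return {neighbourhood: 1-based rank} where `category` appears."""
    return {
        neighbourhood: categories.index(category) + 1
        for neighbourhood, categories in ranked.items()
        if category in categories
    }


# Top three venues per neighbourhood, taken from the Results section.
ranked = {
    "Brentwood, Collingwood, Nose Hill": [
        "Dog Run", "Yoga Studio", "Fast Food Restaurant",
    ],
    "Calgary": [
        "Hardware Store", "Fast Food Restaurant", "Convenience Store",
    ],
}

ranks = category_rank(ranked)
print(ranks)
```

Neighbourhoods whose lists do not contain the category are left out of the result, which makes it easy to see how few neighbourhoods feature a yoga studio at all.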

   * Density
   * Proximity to Downtown Core
   * Risk Factor

With respect to Density, existing yoga studios are far less abundant than food and dining venues in Calgary. They also appear to be well spaced from each other, as they are scattered across multiple clusters. There are advantages and disadvantages to opening a studio in a location with no other yoga studios in the vicinity.

One advantage is that there would be less competition in the general area. Another is that a yoga studio in a neighbourhood far from the existing studios could reach customers who have avoided the other studios because of their distance. The convenience of a studio located closer to home would be appealing to new customers.

However, a disadvantage of opening a yoga studio where there are few other studios nearby could be a lack of visibility. As most people are creatures of habit, it can be assumed that they tend to travel to and visit the same area(s) in their day-to-day lives. This could prove challenging in terms of visibility, as fewer yoga studio users may pass by the new studio. This could, of course, be mitigated with different marketing strategies, but it is something to consider.

Since the existing studios are scattered throughout Calgary, their proximity to downtown Calgary varies. Selecting a location in or close to downtown could make the studio accessible to people who work downtown and wish to attend before or after work, minimizing extra commuting on their part. However, it can be assumed that the closer the building is to downtown, the higher the cost to rent or buy the property.

That being said, selecting a location far from downtown, in the suburbs, could have its perks and setbacks as well. It opens the opportunity for family or group yoga classes, should the client wish. It could also promote a more "relaxing" or "calming" outside environment for users who are entering or leaving the studio. However, choosing a location far into the suburbs (for example, in the deep south, such as the neighbourhood in Cluster 1) could deter those who live far in the north from attending the studio.

Given the lack of competition, it can be argued that the biggest risk lies in the chosen location and its associated cost.
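Proximity to the Downtown Core can be quantified with a great-circle distance between a neighbourhood's coordinates and a downtown reference point. The sketch below uses the haversine formula; the downtown coordinates are an approximate, assumed reference point and were not part of the original analysis, while the T2A coordinates come from the example table in the Data section.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Assumed, approximate reference point for the Downtown Core.
DOWNTOWN_CALGARY = (51.0447, -114.0719)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# T2A (Penbrooke Meadows, Marlborough) from the example table.
t2a = (51.049680, -113.964320)
distance = haversine_km(*t2a, *DOWNTOWN_CALGARY)
print(f"T2A to downtown reference point: {distance:.1f} km")
```

Computing this distance for every candidate neighbourhood would allow locations to be ranked by how far customers working downtown would need to travel.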






# Conclusion

Should the client be willing to take on a larger risk in expanding their business, opening close to the Downtown Core would be their best approach. With relatively few competitors, they would be able to establish good visibility and, with the right marketing, attract new customers to the studio.