
# LeNike Pop-up Store Evaluation
*This is a project for the IBM Data Science Certification*

## Introduction & Business Problem

Pop-up stores are a useful marketing tool for many brands, particularly those that do not traditionally operate brick-and-mortar stores.  
A pop-up store is a store that a brand opens in a particular area for a short period of time. Brands such as Nike and BarkShop have had great success with them. Pop-up stores can generate substantial social media exposure and rapid sales, because shoppers know the store will soon be gone and fear missing out.

**The problem:** _Where an online brand should open a pop-up store_

Toronto is North America's fastest-growing city, and this project tackles the business problem of helping an online clothing brand decide where in Toronto to open a pop-up store.

The brand is called LeNike. The ideal location is trendy and has a high concentration of other retail businesses, which should ensure heavy foot traffic.

Commercial real estate in Toronto is some of the most expensive in North America, so LeNike needs to pick the right spot for its investment to pay off.

The stakeholders for this project are LeNike's executives, who will need this data in order to decide where in Toronto to open the store.

Picking the best location matters: 2020 was a difficult year for retail, and the company needs a successful 2021 in order to survive.


## Data

This project uses data from Foursquare to help the executives make a decision.

Foursquare has a robust API, and we'll mainly use its _explore_ endpoint to gather data on trending venues and neighbourhoods in the city.

The API returns various information about venues, including their category and location, and organizes the results into groups.
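
To make the shape of that data concrete, here is a minimal sketch of pulling the name, category, and coordinates out of an explore-style response using plain Python. The `sample_response` dictionary and the `extract_venues` helper are illustrative assumptions, not part of the Foursquare API; the nesting mirrors the keys this notebook reads, and the two venues are taken from the notebook's own results table.

```python
# Hypothetical, trimmed response in the shape this notebook reads:
# response -> groups -> items -> venue. Values come from the notebook's output table.
sample_response = {
    "response": {
        "groups": [
            {
                "items": [
                    {"venue": {"name": "High Park",
                               "categories": [{"name": "Park"}],
                               "location": {"lat": 43.646479, "lng": -79.463425}}},
                    {"venue": {"name": "Northern Belle",
                               "categories": [{"name": "Cocktail Bar"}],
                               "location": {"lat": 43.650906, "lng": -79.41284}}},
                ]
            }
        ]
    }
}


def extract_venues(api_response):
    """Return a list of flat dicts with name, category, lat, and lng."""
    rows = []
    for item in api_response["response"]["groups"][0]["items"]:
        venue = item["venue"]
        categories = venue.get("categories", [])
        rows.append({
            "name": venue["name"],
            # Use the first listed category, or None when the list is empty
            "category": categories[0]["name"] if categories else None,
            "lat": venue["location"]["lat"],
            "lng": venue["location"]["lng"],
        })
    return rows


if __name__ == "__main__":
    for row in extract_venues(sample_response):
        print(row)
```

Only the first group is read here, which matches how the notebook indexes `groups[0]`; a response with several groups would need a loop over all of them.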

We'll use the location and category data to cluster the venues in the city. This will ultimately help us decide which location offers the best exposure: one with a large number of different businesses clustered together, so as to attract the most diverse population of consumers, since LeNike is an inclusive brand.
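
One rough way to look for such a location, before any formal clustering, is to bucket venues into a coordinate grid and rank the cells by how many distinct categories they contain. The sketch below does this with the standard library only; it is a simple grid count standing in for a clustering algorithm such as k-means, and the `grid_cell` and `rank_cells` helpers and the 0.05-degree cell size are assumptions chosen for illustration. The sample rows are copied from this notebook's venue table.

```python
import math
from collections import defaultdict

# Sample rows copied from the notebook's venue table: (name, category, lat, lng)
VENUES = [
    ("High Park", "Park", 43.646479, -79.463425),
    ("Waterfront Trail", "Trail", 43.635859, -79.467529),
    ("Humber Bay Park", "Park", 43.622396, -79.478389),
    ("Humber River Footbridge", "Bridge", 43.631851, -79.471321),
    ("Northern Belle", "Cocktail Bar", 43.650906, -79.41284),
    ("Maryam Hotel", "Hotel", 43.766961, -79.401199),
]


def grid_cell(lat, lng, cell_deg):
    """Map a coordinate to an integer (row, col) cell of size cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lng / cell_deg))


def rank_cells(venues, cell_deg=0.05):
    """Group venues by grid cell; rank by distinct categories, then venue count."""
    cells = defaultdict(list)
    for name, category, lat, lng in venues:
        cells[grid_cell(lat, lng, cell_deg)].append((name, category))
    summary = [
        {
            "cell": cell,
            "venues": len(members),
            "categories": len({category for _, category in members}),
        }
        for cell, members in cells.items()
    ]
    return sorted(summary, key=lambda s: (s["categories"], s["venues"]), reverse=True)


if __name__ == "__main__":
    for row in rank_cells(VENUES):
        print(row)
```

A fixed grid is sensitive to where the cell boundaries fall, so two neighbouring venues can land in different cells; that is one reason a distance-based method such as k-means may be preferable on a larger dataset.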

## Methodology

In [2]:
# Import necessary libraries
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis; pd.json_normalize flattens JSON into a dataframe

import json # library to handle JSON files

import requests # library to handle HTTP requests

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


### Sourcing data from Foursquare

In [31]:
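# Foursquare API housekeeping
import os # standard library; used to read credentials from the environment

# Assumption: credentials are stored in environment variables with these names
# rather than hard-coded in the notebook
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20200201' # using a pre-COVID 2020 version date, as a number of venues closed afterwards
NEAR = 'Toronto, ON'

url = 'https://api.foursquare.com/v2/venues/explore'
params = {
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
    'near': NEAR,
    'v': VERSION,
}
# Getting venue information from Foursquare
response = requests.get(url, params=params)
response.raise_for_status() # stop early on an HTTP error
api_results = response.json()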

In [28]:
# function that extracts the category name of a venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # flattened Foursquare rows use the 'venue.categories' key instead
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

In [38]:
venues = api_results['response']['groups'][0]['items']

nearby_venues = pd.json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(100)

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | High Park | Park | 43.646479 | -79.463425 |
| 1 | Northern Belle | Cocktail Bar | 43.650906 | -79.41284 |
| 2 | Riverdale Park East | Park | 43.669951 | -79.355493 |
| 3 | Downtown Toronto | Neighborhood | 43.653232 | -79.385296 |
| 4 | Cedarvale Park | Field | 43.692535 | -79.428705 |
| 5 | Waterfront Trail | Trail | 43.635859 | -79.467529 |
| 6 | Maryam Hotel | Hotel | 43.766961 | -79.401199 |
| 7 | Humber Bay Park | Park | 43.622396 | -79.478389 |
| 8 | Humber River Footbridge | Bridge | 43.631851 | -79.471321 |
| 9 | The Distillery Historic District | Historic Site | 43.650244 | -79.359323 |


In [30]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

30 venues were returned by Foursquare.


### Visualizing the data

In [54]:
m = folium.Map(location=[43.6532, -79.3832], zoom_start=15)

In [55]:
venues_viz = pd.DataFrame(nearby_venues)
venues_viz.head()

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | High Park | Park | 43.646479 | -79.463425 |
| 1 | Northern Belle | Cocktail Bar | 43.650906 | -79.41284 |
| 2 | Riverdale Park East | Park | 43.669951 | -79.355493 |
| 3 | Downtown Toronto | Neighborhood | 43.653232 | -79.385296 |
| 4 | Cedarvale Park | Field | 43.692535 | -79.428705 |


In [56]:
# loop over the rows (not the column names) and add one marker per venue
for _, venue in venues_viz.iterrows():
    folium.Marker(
        location=[venue['lat'], venue['lng']],
        popup=venue['name'], # show the venue name when the marker is clicked
    ).add_to(m)

In [57]:
#show the map
m

## Results

There is a high concentration of businesses along Queen St in Toronto, so this would be an ideal place for LeNike to set up shop.

## Discussion
The Foursquare API did not return a large number of venues for Toronto, so the analysis had to run on a small dataset. For future projects, stitching together multiple data sources through their APIs may work better for large-scale analysis.
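
As a starting point for gathering more data, the sketch below shows two pieces under stated assumptions: building paged query parameters for the _explore_ endpoint, and merging venue lists from several sources without duplicates. The helper names `explore_page_params` and `merge_venues`, the page size of 50, and the duplicate rule (same name and coordinates rounded to four decimals) are all assumptions for illustration; whether the endpoint accepts these `limit` and `offset` values should be checked against the Foursquare documentation.

```python
def explore_page_params(near, version, page_size=50, max_results=150):
    """Yield one query-parameter dict per page (credentials left out)."""
    for offset in range(0, max_results, page_size):
        yield {"near": near, "v": version, "limit": page_size, "offset": offset}


def merge_venues(*sources):
    """Concatenate venue lists, keeping the first copy of each venue.

    Venues count as duplicates when the lower-cased name and the
    coordinates rounded to four decimals both match (an assumed rule).
    """
    seen = set()
    merged = []
    for source in sources:
        for venue in source:
            key = (venue["name"].lower(), round(venue["lat"], 4), round(venue["lng"], 4))
            if key not in seen:
                seen.add(key)
                merged.append(venue)
    return merged


if __name__ == "__main__":
    for params in explore_page_params("Toronto, ON", "20200201"):
        print(params)
```

Each parameter dict could be passed to `requests.get` alongside the credentials, and the flattened results from each page or provider fed to `merge_venues` before any clustering.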

## Conclusion
Overall, LeNike can set up a pop-up store in Toronto, as a number of venues in the city are close together and generate a good deal of foot traffic.