<a align=left href="https://cognitiveclass.ai"><img src = "https://ibm.box.com/shared/static/9gegpsmnsoo25ikkbl4qzlvlyjbgxs5x.png" width = 400> </a>

<h1 align=center ><font size = 4>CAPSTONE: PROJECT PROPOSAL</font></h1>
<h1 align=center ><font size = 4>Eric Longomo, Bothell, WA, USA </font></h1>
<h1 align=center ><font size = 5>Selecting the location for brand new Fashion Boutiques in high traffic areas in Seattle, Washington, USA </font></h1>


## 1.0 Business Problem
The location of a fashion boutique is very important: it can mean the difference between steady profits and a steady loss in revenue. A good understanding of the target market is key to finding the best retail location. Once the environments where the target audience shops, lives and works have been determined, the location most likely to attract these potential customers can be selected. A location that is close by and convenient to stop in makes it easy for customers to find the boutique.

Established in 1970 as one of the first multi-brand boutiques in the UK, **Browns Fashion**, headquartered in Mayfair, London, has a reputation as a fashion talent scout that is second to none. The company’s founder, Joan Burstein, employed Manolo Blahnik and Osman Yousefzada, and discovered Alexander McQueen and John Galliano, some of the top names in the fashion industry. Now owned by online giant Farfetch, the original South Molton Street store has powerful backing and a big digital engine, as well as a second store, **Browns East**. With a substantial e-commerce footprint, the company has begun opening fashion boutiques in major cities as part of its omnichannel retail strategy. After rolling out stores in a few selected cities by guessing at the best locations, the company has decided to be more informed and selective for its Seattle expansion and to do some research before opening a store there.

As a data scientist, I have been tasked with assisting Browns in making data-driven decisions about the most suitable locations for the new stores in Seattle. This exploratory work constitutes a major part of their decision-making process. Once the results of my analysis and report have been reviewed, the company will conduct its own on-the-ground qualitative analysis of the shortlisted districts.

### 1.1 Business Understanding (Discussion of the Background)
In general, most fashion boutiques are not necessarily located in premium upmarket strips but rather in high-traffic areas where consumers go for shopping, restaurants and entertainment. Foursquare data are useful for making data-driven decisions about which of those areas are most likely to reproduce the success **Browns Fashion** has experienced in London. To this end, the latitude and longitude of **Browns'** London boutique have been used to compile and analyse the 100 venues returned within a radius of 500 meters of the store in Mayfair, London, UK.

The analysis of the current surroundings of Browns' London store (**see the analysis in subsection 1.2**) shows that the best locations for new fashion retail stores may not only be where other clothing stores are located, but also areas near the following venue categories:
1. Art Gallery   
1. French Restaurant   
1. Coffee Shop   
1. Juice Bar   
1. Italian Restaurant   
1. Hotel   
1. Cafés   
1. Cosmetics Shop   

Thus, opening new stores in areas with the venue categories listed above might attract people who frequent these places and bring success similar to that experienced in London.

The analysis and recommendations for new store locations in **Seattle** will focus on general districts with these establishments, not on specific store addresses. Narrowing down the best district options allows the company to conduct further research, brief real estate agents on the chosen districts, or have its own personnel search on the ground for specific sites.

### 1.2 Analysis of Browns Fashion current location in Mayfair, London, UK

In this subsection, we show, with the accompanying code, which venue categories surround the current store in London. The presence of similar venues in Seattle will guide our selection of the best location for the new store there, which will be the focus of the data science workflow in Week 2.

In [61]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; older versions expose it in pandas.io.json)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


The Foursquare API is then used to explore the surroundings of the Browns Fashion store in London.

#### Defining credentials. 

In [62]:
CLIENT_ID = 'OR2A1F3IR522FBM4SN4F3S21WNFXZTFIUHT2LEA1YJYFMV55' # your Foursquare ID
CLIENT_SECRET = 'YCKMUFHEX4CQCURKEPBS4OJ5LKBRENO3HQKX41YA4ZGEFRE3' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

# Coordinates of the store, as returned by the geocoding step in section 1.3
Brown_latitude = 51.5134978
Brown_longitude = -0.1474765
neighborhood_name = 'Browns Fashion Store'
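
Hard-coding API credentials in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of one alternative is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this illustration, not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables.

    The variable names are assumptions made for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty credential.
        raise RuntimeError(f"Set the environment variable {missing} first") from None


# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print(demo_id)
```

In the notebook, calling `load_foursquare_credentials()` with no argument would read the real environment.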


Now, let's get the top 100 venues within a radius of 500 meters of Browns Fashion's current location in Mayfair, London. First, let's create the GET request URL and name it `url`.

In [63]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    Brown_latitude, 
    Brown_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=OR2A1F3IR522FBM4SN4F3S21WNFXZTFIUHT2LEA1YJYFMV55&client_secret=YCKMUFHEX4CQCURKEPBS4OJ5LKBRENO3HQKX41YA4ZGEFRE3&v=20180605&ll=51.5134978,-0.1474765&radius=500&limit=100'
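
The URL above is assembled with positional `str.format` arguments, which makes it easy to swap two values by accident. As a hedged alternative, the same query string can be built from named parameters with the standard library. The helper name `build_explore_url` and the placeholder credentials are assumptions for this sketch; the endpoint and parameter names are the ones used above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"  # endpoint used above


def build_explore_url(client_id, client_secret, lat, lng,
                      radius=500, limit=100, version="20180605"):
    """Assemble the explore URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "lat,lng" readable instead of percent-encoding it.
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"


# Placeholder credentials; the coordinates are the store's, as printed above.
demo_url = build_explore_url("YOUR_ID", "YOUR_SECRET", 51.5134978, -0.1474765)
print(demo_url)
```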

In [64]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e7a9f3c60ba08001b533eb7'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Mayfair',
  'headerFullLocation': 'Mayfair, London',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 201,
  'suggestedBounds': {'ne': {'lat': 51.517997804500006,
    'lng': -0.14025910623788412},
   'sw': {'lat': 51.5089977955, 'lng': -0.1546938937621159}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '50c4e405498eb4ce47af2f04',
       'name': 'The Foyer & Reading Room',
       'location': {'address': '49 Brook St',
        'lat': 51.512577493385585,
        'lng': -0.14766319231811606,
        'labeledLatLngs': [{'label': 'display',
          'lat': 5

In [65]:
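def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Return a DataFrame of Foursquare venues near each (name, lat, lng) location.

    Relies on the globals CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT defined above.
    """
    venues_list=[]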
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
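
The function above indexes `v['venue']['categories'][0]` directly, which would raise an `IndexError` should a venue come back with an empty category list. The sketch below shows one defensive way to pull the same fields out of the `items` list. The helper `extract_venue_rows`, the `"Unknown"` fallback label and the second sample venue are assumptions made for illustration; the first sample venue mirrors the response shown earlier.

```python
def extract_venue_rows(items, fallback_category="Unknown"):
    """Turn the 'items' list of an explore response into (name, lat, lng, category) tuples."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use the first category when present, otherwise a labelled fallback.
        category = categories[0].get("name", fallback_category) if categories else fallback_category
        rows.append((venue.get("name"), location.get("lat"), location.get("lng"), category))
    return rows


# Hand-written sample shaped like the response above; the second venue is invented.
sample_items = [
    {"venue": {"name": "The Foyer & Reading Room",
               "location": {"lat": 51.512577493385585, "lng": -0.14766319231811606},
               "categories": [{"name": "Lounge"}]}},
    {"venue": {"name": "Hypothetical venue without a category",
               "location": {"lat": 51.5130, "lng": -0.1470},
               "categories": []}},
]
venue_rows = extract_venue_rows(sample_items)
print(venue_rows)
```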

In [66]:
latitudes = np.array([Brown_latitude])
longitudes = np.array([Brown_longitude])
neighborhood_name = ['Browns Fashion Store']

Venues_around_Browns = getNearbyVenues(names=neighborhood_name,
                                   latitudes=latitudes,
                                   longitudes=longitudes
                                  )

Browns Fashion Store


In [68]:
print(Venues_around_Browns.shape)
Venues_around_Browns.head()

(100, 7)


|   | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Browns Fashion Store | 51.513498 | -0.147477 | The Foyer & Reading Room | 51.512577 | -0.147663 | Lounge |
| 1 | Browns Fashion Store | 51.513498 | -0.147477 | La Petite Maison | 51.5126 | -0.146113 | French Restaurant |
| 2 | Browns Fashion Store | 51.513498 | -0.147477 | Claridge's | 51.512656 | -0.147813 | Hotel |
| 3 | Browns Fashion Store | 51.513498 | -0.147477 | JOE & THE JUICE | 51.513831 | -0.149524 | Juice Bar |
| 4 | Browns Fashion Store | 51.513498 | -0.147477 | Victoria's Secret | 51.51317 | -0.145313 | Lingerie Store |
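
Because every row carries both the store's and the venue's coordinates, the 500 meter radius can be sanity-checked by hand. The sketch below uses the haversine great-circle formula on the first row of the output above. The helper name `haversine_m` and the mean Earth radius of 6,371 km are choices made for this illustration, and Foursquare may measure distance slightly differently.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2, earth_radius_m=6_371_000):
    """Great-circle distance in meters between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


store = (51.513498, -0.147477)        # Browns Fashion Store
first_venue = (51.512577, -0.147663)  # The Foyer & Reading Room
distance_m = haversine_m(*store, *first_venue)
within_radius = distance_m <= 500
print(round(distance_m), within_radius)
```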


In [69]:
Venues_around_Browns.groupby('Neighborhood').count()

| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|
| Browns Fashion Store | 100 | 100 | 100 | 100 | 100 | 100 |


In [70]:
print('There are {} unique categories.'.format(len(Venues_around_Browns['Venue Category'].unique())))

There are 51 unique categories.


Let's analyze the Browns Fashion neighbourhood in London in detail.

In [71]:
# one hot encoding
Browns_onehot = pd.get_dummies(Venues_around_Browns[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Browns_onehot['Neighborhood'] = Venues_around_Browns['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Browns_onehot.columns[-1]] + list(Browns_onehot.columns[:-1])
Browns_onehot = Browns_onehot[fixed_columns]

Browns_onehot.head()

Unnamed: 0,Neighborhood,Art Gallery,Bakery,Boutique,Burger Joint,Café,Camera Store,Cantonese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Electronics Store,English Restaurant,Food Court,French Restaurant,Garden,Hotel,Hotel Bar,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Juice Bar,Leather Goods Store,Lingerie Store,Lounge,Men's Store,Modern European Restaurant,Nightclub,Park,Pedestrian Plaza,Pharmacy,Pizza Place,Sandwich Place,Shoe Store,Social Club,Spa,Sporting Goods Shop,Sri Lankan Restaurant,Steakhouse,Supermarket,Tea Room,Thai Restaurant,Toy / Game Store,Turkish Restaurant,Wine Bar,Wine Shop
0,Browns Fashion Store,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Browns Fashion Store,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Browns Fashion Store,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Browns Fashion Store,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Browns Fashion Store,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [72]:
Browns_onehot.shape

(100, 52)

Next, let's group the rows by neighbourhood and take the mean of each one-hot column, which gives the frequency of occurrence of each venue category.

In [73]:
Browns_grouped = Browns_onehot.groupby('Neighborhood').mean().reset_index()
Browns_grouped

Unnamed: 0,Neighborhood,Art Gallery,Bakery,Boutique,Burger Joint,Café,Camera Store,Cantonese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Electronics Store,English Restaurant,Food Court,French Restaurant,Garden,Hotel,Hotel Bar,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Juice Bar,Leather Goods Store,Lingerie Store,Lounge,Men's Store,Modern European Restaurant,Nightclub,Park,Pedestrian Plaza,Pharmacy,Pizza Place,Sandwich Place,Shoe Store,Social Club,Spa,Sporting Goods Shop,Sri Lankan Restaurant,Steakhouse,Supermarket,Tea Room,Thai Restaurant,Toy / Game Store,Turkish Restaurant,Wine Bar,Wine Shop
0,Browns Fashion Store,0.07,0.02,0.04,0.02,0.03,0.01,0.01,0.05,0.01,0.05,0.01,0.03,0.01,0.01,0.02,0.01,0.02,0.02,0.02,0.05,0.01,0.04,0.03,0.02,0.04,0.02,0.04,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01
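
Taking the mean of one-hot columns is the same as computing each category's share of all venues. The sketch below checks this on a small toy table (hypothetical data, not the Foursquare results) by comparing the one-hot route used in this notebook with `value_counts(normalize=True)`.

```python
import pandas as pd

# Toy data (hypothetical): one row per venue, all in a single neighbourhood.
venues = pd.DataFrame({
    "Neighborhood": ["Store"] * 5,
    "Venue Category": ["Art Gallery", "Art Gallery", "Hotel", "Café", "Hotel"],
})

# Route 1: one-hot encode, then average per neighbourhood (as in the notebook).
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
by_mean = onehot.groupby("Neighborhood").mean().loc["Store"]

# Route 2: relative frequencies computed directly from the category column.
by_counts = venues["Venue Category"].value_counts(normalize=True)

# Both routes give the same share for every category.
matches = all(abs(by_mean[c] - by_counts[c]) < 1e-12 for c in by_counts.index)
print(matches)
```

The one-hot route remains useful when several neighbourhoods are grouped at once, which is the case planned for Seattle.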


Let's confirm the new size

In [74]:
Browns_grouped.shape

(1, 52)

Let's print the neighbourhood along with its top 10 most common venue categories.

In [75]:
num_top_venues = 10

for hood in Browns_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Browns_grouped[Browns_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Browns Fashion Store----
                venue  freq
0         Art Gallery  0.07
1      Clothing Store  0.05
2   French Restaurant  0.05
3         Coffee Shop  0.05
4            Boutique  0.04
5           Juice Bar  0.04
6  Italian Restaurant  0.04
7               Hotel  0.04
8                Café  0.03
9      Cosmetics Shop  0.03
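
The printed top 10 stops at a frequency of 0.03, yet the grouped table above also lists Hotel Bar at 0.03, so which of the tied categories makes the list depends on sort order. A minimal sketch of a ranking that keeps every category tied with the last place is shown below. The helper `top_with_ties` is hypothetical; the frequencies are copied from the grouped output above (a subset, for brevity).

```python
def top_with_ties(shares, n):
    """Return the n largest entries plus any others tied with the n-th one."""
    ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(name, share) for name, share in ranked if share >= cutoff]


# Frequencies taken from the grouped table above (subset of the 51 categories).
category_share = {
    "Art Gallery": 0.07, "Bakery": 0.02, "Boutique": 0.04, "Burger Joint": 0.02,
    "Café": 0.03, "Clothing Store": 0.05, "Coffee Shop": 0.05, "Cosmetics Shop": 0.03,
    "French Restaurant": 0.05, "Hotel": 0.04, "Hotel Bar": 0.03,
    "Italian Restaurant": 0.04, "Juice Bar": 0.04,
}
top_ten = top_with_ties(category_share, 10)
print([name for name, _ in top_ten])
```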




### 1.3 Visualizing Browns Fashion store's surrounding in Mayfair, London, UK 
In this section, we create a map of the area surrounding Browns Fashion in London, UK, using latitude and longitude values, in order to explore the area and replicate in Seattle, as much as possible, the success experienced in London.

In [76]:
address = '24-27 S Molton St, Mayfair, London W1K 5RD, United Kingdom'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Browns Fashion are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Browns Fashion are 51.5134978, -0.1474765.


In [77]:
# create map of the Browns Fashion area using latitude and longitude values
map_Brown_Location = folium.Map(location=[latitude, longitude], zoom_start=10)

# add a marker for the store itself
label = folium.Popup(address, parse_html=True)
folium.CircleMarker(
        [latitude, longitude],
        radius=10,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Brown_Location)  

# add markers for the surrounding venues to the same map
for lat, lng, venue in zip(Venues_around_Browns['Venue Latitude'], Venues_around_Browns['Venue Longitude'], Venues_around_Browns['Venue Category']):
    label = '{}, {}'.format('Mayfair', venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Brown_Location)  
    
map_Brown_Location

### 1.4 Summary of the first section
In this section, we leveraged Foursquare data for the Browns Fashion store in London to produce the 10 most common venue categories surrounding the store. These top categories will inform the next stage of the analysis, where we will use Seattle's data to determine the best possible locations for the new fashion boutiques.

## 2.0 Description of the Data Acquisition Process for Week 2
In this section, we explain the importance of leveraging data to aid decisions and the data analysis process that will be undertaken.

### 2.1 Importance of leveraging data
Without data to aid decisions about new store locations, Browns could spend countless hours walking around districts and consulting real estate agents with their own district biases, and still end up opening in a location that is not ideal. Exploring different neighbourhoods around Seattle that might replicate the success experienced in Mayfair, London, will therefore provide better answers about potential new store locations.

The aim is to identify the best neighbourhoods in which to open new stores as part of the company's plan. The results will be presented to management in a simple form that conveys the data-driven analysis behind the recommended locations.

### 2.2 Data analysis workflow
Neighbourhood data for Seattle were researched online. Based on the dataset obtained, the city of Seattle and its surroundings have been subdivided into 90 neighbourhoods, assembled into 19 neighbourhood groups. These data have been wrangled, cleaned and converted into a **.csv file** format suitable for analysis. Foursquare location data will be leveraged to explore and compare neighbourhoods around Seattle, identifying the high-traffic areas where consumers go for shopping, dining and entertainment, which are the areas where the fashion brand is most interested in opening new stores, as discussed in section 1.1.
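
One possible way to compare a Seattle neighbourhood with Mayfair is to score the similarity of their venue-category profiles. The sketch below uses cosine similarity, which is a technique chosen for this illustration and not necessarily the method that will be applied in Week 2 (the imports in section 1.2 suggest k-means clustering). The Mayfair frequencies come from the analysis in section 1.2; the two candidate districts and their numbers are entirely hypothetical.

```python
import math


def cosine_similarity(profile_a, profile_b):
    """Cosine similarity between two {category: frequency} profiles."""
    categories = set(profile_a) | set(profile_b)
    dot = sum(profile_a.get(c, 0.0) * profile_b.get(c, 0.0) for c in categories)
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Frequencies of the target categories around the London store (section 1.2).
mayfair = {
    "Art Gallery": 0.07, "French Restaurant": 0.05, "Coffee Shop": 0.05,
    "Juice Bar": 0.04, "Italian Restaurant": 0.04, "Hotel": 0.04,
    "Café": 0.03, "Cosmetics Shop": 0.03,
}

# Hypothetical candidate districts with made-up frequencies.
candidates = {
    "District A": {"Coffee Shop": 0.10, "Art Gallery": 0.06, "Hotel": 0.05, "Italian Restaurant": 0.03},
    "District B": {"Gas Station": 0.08, "Fast Food Restaurant": 0.07, "Coffee Shop": 0.02},
}

# Rank candidates from most to least similar to the Mayfair profile.
ranking = sorted(candidates, key=lambda name: cosine_similarity(mayfair, candidates[name]), reverse=True)
print(ranking)
```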

The main tasks for Week 2 will consist of the following:
1. **Outline of Data Acquisition**
    1. Neighbourhood data for Seattle, including longitude and latitude, and other related details.
2. **Data Wrangling and Cleaning**
    1. Converting the data to a usable format.
3. **Data Analysis and Location Data**
    1. Foursquare location data will be leveraged to explore and compare neighbourhoods around Seattle.
    1. Data manipulation and analysis to derive subsets of the initial data.
    1. Identifying the high-traffic areas using data visualisation and statistical analysis.
4. **Visualization**
    1. Data visualization using the geospatial library in Python (folium).
5. **Discussion**
6. **Conclusions**
    1. Achievements.
    1. Recommendations and results based on the data analysis.
    1. Discussion of any limitations, how the results can be used, and any conclusions that can be drawn.
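
The data acquisition and wrangling steps could start along the following lines. This is a sketch under stated assumptions: the column names (`Neighbourhood`, `Group`, `Latitude`, `Longitude`), the helper `load_neighbourhoods` and the placeholder rows are invented for illustration, since the layout of the actual Seattle file is not described here.

```python
import csv
import io

# Placeholder rows standing in for the real Seattle file (hypothetical layout).
SAMPLE_CSV = """Neighbourhood,Group,Latitude,Longitude
Placeholder One,Group A,47.61,-122.33
Placeholder Two,Group A,not-a-number,-122.35
Placeholder Three,Group B,47.66,-122.31
"""


def load_neighbourhoods(handle):
    """Return (clean_rows, skipped_rows); clean rows carry float coordinates."""
    clean, skipped = [], []
    for row in csv.DictReader(handle):
        try:
            record = {
                "name": row["Neighbourhood"],
                "group": row["Group"],
                "lat": float(row["Latitude"]),
                "lng": float(row["Longitude"]),
            }
        except (KeyError, TypeError, ValueError):
            skipped.append(row)  # missing column or non-numeric coordinate
            continue
        if not (-90 <= record["lat"] <= 90 and -180 <= record["lng"] <= 180):
            skipped.append(row)  # coordinate outside the valid range
            continue
        clean.append(record)
    return clean, skipped


neighbourhoods, rejected = load_neighbourhoods(io.StringIO(SAMPLE_CSV))
print(len(neighbourhoods), len(rejected))
```

Keeping the rejected rows, rather than silently dropping them, makes it easier to report data limitations in the discussion step.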
