## Eataly Case Study 

### New Store Opening Analysis

This notebook combines real store locations with dummy (anonymized) revenue data to suggest, using machine learning, a location for a new store opening. The intent is for the suggestion to reflect both revenue and location, although the clustering step in this notebook is fitted on longitude and latitude only. Like all AI predictions, the result should be reviewed by a human to ensure we're leveraging AI responsibly.

In [20]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.





In [9]:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from sklearn.cluster import KMeans


# Read in Original Dataset

In [3]:
df = pd.read_csv('Eataly_Locations.csv')
df.head()

Unnamed: 0,Store,Lat,Long,Rev (annonymized)
0,"Piazza XXV Aprile 10, 20121 Milano",45.480546,9.185924,5119110
1,"Via Ermanno Fenoglietti, 14, 10126 Turin - Met...",45.040852,7.675395,7454667
2,"Piazzale XII Ottobre 1492, 00154 Rome",41.875706,12.479535,6293900
3,"Via Santa Teresa 12, 37135 Verona",45.4196,10.993312,5686253
4,"Edificio Millo Porto Antico, Calata Cattaneo 1...",44.409225,8.928206,7239911


# Pre-model Visualization

In [4]:
# Create a "text" column so each marker's hover label shows the store name and revenue
df['text'] = 'Name: ' + df['Store'] + ', Total Rev: ' + df['Rev (annonymized)'].astype(str)

# Primary visual of all current Eataly store locations
fig = go.Figure(data=go.Scattergeo(lon=df['Long'],
                                  lat=df['Lat'],
                                  text=df['text'],
                                  mode='markers'))

fig.update_layout(
        title_text = '2023 Eataly Locations Worldwide',
        showlegend = False)

fig.show()

# Data Prep: Round 2
## Feature Selection/Reduction

In [6]:
df_locations = df[['Long', 'Lat']]
df_locations.head()

Unnamed: 0,Long,Lat
0,9.185924,45.480546
1,7.675395,45.040852
2,12.479535,41.875706
3,10.993312,45.4196
4,8.928206,44.409225
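One caveat with using raw `Long`/`Lat` as features: KMeans measures straight-line distance in degrees, which ignores the curvature of the Earth and the wrap-around at ±180° longitude. A minimal sketch of one common workaround follows, assuming it is acceptable to convert each coordinate to a 3D unit vector before averaging and to convert the result back afterwards. The helper names `to_unit_vectors` and `to_lat_lon` are hypothetical and not part of the original notebook, and the two test points are made up.

```python
import numpy as np

def to_unit_vectors(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to 3D unit vectors (x, y, z)."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))

def to_lat_lon(vec):
    """Convert a 3D vector (not necessarily unit length) back to degrees."""
    x, y, z = vec
    lat = np.degrees(np.arctan2(z, np.hypot(x, y)))
    lon = np.degrees(np.arctan2(y, x))
    return lat, lon

# Two made-up points on the equator, on either side of the antimeridian
lats = [0.0, 0.0]
lons = [179.0, -179.0]

# Naive average of the longitudes in degrees
naive_lon = np.mean(lons)

# Average on the sphere: mean of the unit vectors, converted back to degrees
sphere_lat, sphere_lon = to_lat_lon(to_unit_vectors(lats, lons).mean(axis=0))

print(naive_lon, sphere_lat, sphere_lon)
```

The naive mean lands on the opposite side of the globe from both points, while the unit-vector mean stays between them. The output of `to_unit_vectors` could be passed to `KMeans` in place of `df_locations`, with each center converted back through `to_lat_lon`.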


# Clustering using KMeans & scikit-learn

In [13]:
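# Number of clusters, i.e. the number of proposed locations
num_clusters = 1

# n_init and random_state are set explicitly so results are reproducible
kmeans = KMeans(n_clusters=num_clusters, n_init=10, random_state=42)

# Fit the model, then assign each store to its nearest cluster center
# (there is no held-out test set; predictions are made on the training data)
kmeans.fit(df_locations)
kmeans_predictions = kmeans.predict(df_locations)

# Retrieve the num_clusters cluster centers and store them in a DataFrame
print(kmeans.cluster_centers_)
numpy_data = kmeans.cluster_centers_
cluster_dfs = pd.DataFrame(data=numpy_data, columns=['Long', 'Lat'])
cluster_dfs.head()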

[[-0.94648366 40.10803862]]


Unnamed: 0,Long,Lat
0,-0.946484,40.108039
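With `num_clusters = 1`, every store belongs to the same cluster, so the single center is simply the arithmetic mean of the coordinates and revenue plays no role. The sketch below illustrates this and shows one possible way to bring revenue in, via the `sample_weight` argument of `KMeans.fit`, assuming annual revenue is a reasonable weight for each store. It uses only the five rows shown by `df.head()` earlier, so its numbers will differ from the full-dataset result above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# The five rows shown by df.head(); not the full dataset
sample = pd.DataFrame({
    'Long': [9.185924, 7.675395, 12.479535, 10.993312, 8.928206],
    'Lat': [45.480546, 45.040852, 41.875706, 45.4196, 44.409225],
    'Rev (annonymized)': [5119110, 7454667, 6293900, 5686253, 7239911],
})
coords = sample[['Long', 'Lat']]
revenue = sample['Rev (annonymized)']

# Unweighted: a single cluster center is the plain mean of the coordinates
plain = KMeans(n_clusters=1, n_init=10, random_state=42).fit(coords)
plain_center = plain.cluster_centers_[0]

# Weighted (assumption): higher-revenue stores pull the center toward them
weighted = KMeans(n_clusters=1, n_init=10, random_state=42)
weighted.fit(coords, sample_weight=revenue)
weighted_center = weighted.cluster_centers_[0]

print('Unweighted center (Long, Lat):', plain_center)
print('Revenue-weighted center (Long, Lat):', weighted_center)
```

Whether revenue should attract a new store toward strong markets or push it away from them (to avoid cannibalizing sales) is a business decision that the weighting alone does not settle.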


# Cluster Visualization
## Ideal location for a new Eataly store

In [15]:
fig2 = go.Figure(data=go.Scattergeo(lon=cluster_dfs['Long'],
                             lat=cluster_dfs['Lat'],
                             mode='markers',
                             marker = dict(
                                size = 14,
                                opacity = 1,
                                reversescale = True,
                                autocolorscale = False,
                                symbol = 'square',
                                line = dict(width=1,
                                            color='rgb(102, 102, 102)'))))
fig2.update_layout(
        title_text = 'Eataly Ideal Location for 2023',
        showlegend = True)
fig2.show()
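Because the proposed point is a coordinate average, it may land far from any existing store, or somewhere unsuitable for a store at all. Below is a minimal sketch of a sanity check a reviewer might run: it computes the great-circle (haversine) distance from the candidate point to each store. The function name `haversine_km`, the spherical-Earth radius of 6371 km, and the use of only the five `df.head()` rows are assumptions made for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption: spherical Earth)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# The five stores shown by df.head(); not the full dataset
store_lat = np.array([45.480546, 45.040852, 41.875706, 45.4196, 44.409225])
store_lon = np.array([9.185924, 7.675395, 12.479535, 10.993312, 8.928206])

# Cluster center printed by the notebook (note it is stored in Long, Lat order)
center_lon, center_lat = -0.94648366, 40.10803862

# Distance from the proposed point to every sampled store
distances = haversine_km(center_lat, center_lon, store_lat, store_lon)
nearest = int(np.argmin(distances))
print(f'Nearest sampled store index: {nearest}, distance: {distances[nearest]:.0f} km')
```

A large distance to the nearest store is a signal that the centroid reflects the geometry of the average rather than a practical site, which is the kind of result the human review step is meant to catch.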

# Combined Visualization

In [19]:
fig3 = go.Figure()
fig3.add_trace(go.Scattergeo(lon=cluster_dfs['Long'],
                             lat=cluster_dfs['Lat'],
                             mode='markers',
                             name='Proposed Location(s)',
                             marker = dict(
                                size = 14,
                                opacity = 1,
                                reversescale = True,
                                autocolorscale = False,
                                symbol = 'square',
                                line = dict(width=1,
                                            color='rgb(102, 102, 102)'))))

fig3.add_trace(go.Scattergeo(lon=df['Long'],
                             lat=df['Lat'],
                             text=df['text'],
                             mode='markers',
                             name='Current Locations',
                             opacity=1,
                             marker = dict(
                                size = 4,
                                opacity = 0.7,
                                reversescale = True,
                                autocolorscale = False,
                                line = dict(width=1,
                                            color='rgb(255, 255, 255)'))))

fig3.update_layout(
        title_text = '2023 Eataly Locations and Potential New Location',
        showlegend = True)

fig3.show()
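The notebook fixes `num_clusters = 1`, which yields a single proposed location. If several candidate locations were wanted, one common heuristic for choosing the number of clusters is to compare the model's inertia (the within-cluster sum of squared distances) across values of k and look for the point where improvements level off. A minimal sketch follows, again using only the five `df.head()` rows as stand-in data, so the range of k is kept small.

```python
import pandas as pd
from sklearn.cluster import KMeans

# The five rows shown by df.head(); not the full dataset
sample_locations = pd.DataFrame({
    'Long': [9.185924, 7.675395, 12.479535, 10.993312, 8.928206],
    'Lat': [45.480546, 45.040852, 41.875706, 45.4196, 44.409225],
})

# Fit one model per candidate k and record its inertia
inertias = {}
for k in range(1, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(sample_locations)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f'k={k}: inertia={inertia:.4f}')
```

On the full dataset, the same loop could run over `df_locations` with a wider range of k, and the inertia values could be plotted to look for an elbow.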