# What Strategy?
### Based on yelp challenge <a href='https://www.yelp.com.au/dataset/challenge'>dataset </a>

#### Author: Ignacio Recasens

### Abstract

To determine the business strategy that would most improve the chances of success, we performed three complementary analyses: first, sentiment analysis of customer reviews; second, identification of the business attributes most significantly and positively correlated with a higher business rating (stars); and finally, a **qualitative analysis through a conjoint analysis** of a survey sent to people residing mainly in Las Vegas.

This notebook presents the steps taken to perform a Conjoint Analysis.

Based on the attributes identified in the previous two sections (Sentiment Analysis and Attributes Regression), five attributes were selected for further qualitative analysis: service, broth (number of options), price, menu (Ramen toppings), and portion size. They were examined through a survey prepared as part of a conjoint analysis, in order to rank attribute importance and decide on the most attractive attribute set for our restaurant. A conjoint analysis helps us understand the tradeoffs that consumers make when they decide whether or not to eat at a Ramen restaurant.

The 5 attributes that we took into consideration and their respective levels are:
<img src="Table1.PNG">

With a total of 48 (2x2x3x2x2) possible combinations of attribute levels, we decided to incorporate 10 of these combinations into our survey. The choice of 10 questions balanced identifying the effect of each attribute against ensuring that respondents completed the survey. These 10 combinations were chosen to minimise multicollinearity between attributes and to maximise the D-efficiency metric via optimisation. The design was generated in SAS.
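
As a rough illustration of the design space, the full factorial can be enumerated and the ten surveyed profiles checked against it. This sketch only counts combinations and level frequencies; the D-efficient selection itself was done in SAS and is not reproduced here. The level tokens are the ones used later in this notebook.

```python
from collections import Counter
from itertools import product

# Attribute levels, using the tokens from the cleaning step of this notebook
levels = {
    "Service": ["table_service", "self_service"],
    "Serving": ["pork_and_chicken_broth", "pork_broth_standard"],
    "Price": ["price_13", "price_15", "price_20"],
    "Menu": ["custom_menu_ramen_toppings", "standard_menu_ramen_toppings"],
    "Portion": ["small_and_standard_sizes", "standard_sizes"],
}

# Full factorial: every combination of one level per attribute
full_factorial = list(product(*levels.values()))

# The ten profiles shown to respondents, in question order
survey_profiles = [
    ("table_service", "pork_and_chicken_broth", "price_15", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
    ("table_service", "pork_and_chicken_broth", "price_15", "standard_menu_ramen_toppings", "small_and_standard_sizes"),
    ("self_service", "pork_and_chicken_broth", "price_13", "standard_menu_ramen_toppings", "standard_sizes"),
    ("self_service", "pork_broth_standard", "price_20", "standard_menu_ramen_toppings", "small_and_standard_sizes"),
    ("table_service", "pork_and_chicken_broth", "price_20", "custom_menu_ramen_toppings", "standard_sizes"),
    ("self_service", "pork_and_chicken_broth", "price_13", "custom_menu_ramen_toppings", "standard_sizes"),
    ("table_service", "pork_broth_standard", "price_13", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
    ("self_service", "pork_and_chicken_broth", "price_20", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
    ("self_service", "pork_broth_standard", "price_15", "custom_menu_ramen_toppings", "standard_sizes"),
    ("table_service", "pork_broth_standard", "price_20", "standard_menu_ramen_toppings", "standard_sizes"),
]

# How often each level appears across the ten profiles
level_counts = Counter(level for profile in survey_profiles for level in profile)

print(len(full_factorial))  # 48
for attribute, attribute_levels in levels.items():
    print(attribute, {level: level_counts[level] for level in attribute_levels})
```

The level counts show how balanced the ten-profile design is. Service, for example, is split five and five between table service and self service.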

Our survey was then distributed to respondents in the US, mainly in Las Vegas, via the online survey platform Typeform, asking them to rate how likely they would be to visit a restaurant with a specific set of attributes, on a scale of 1 to 9 (1 = very unlikely, 9 = highly likely).

The 10 questions for the survey were as follows:

        1) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with table service, pork and chicken broth options , price of $15.00 ,                                      
           a menu that has customisable Ramen toppings , and small and standard sizes?                                                      
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        2) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with table service, pork and chicken broth options , price of $15.00 ,                                      
           a menu that has standardised Ramen toppings , and small and standard sizes?                                                      
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        3) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with self service, pork and chicken broth options , price of $13.00 ,                                       
           a menu that has standardised Ramen toppings , and only standard sizes?                                                           
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        4) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with self service, pork broth (standard) only , price of $20.00 ,                                           
           a menu that has standardised Ramen toppings , and small and standard sizes?                                                     
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        5) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with table service, pork and chicken broth options , price of $20.00 ,                                      
           a menu that has customisable Ramen toppings , and only standard sizes?                                                         
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        6) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with self service, pork and chicken broth options , price of $13.00 ,                                       
           a menu that has customisable Ramen toppings , and only standard sizes?                                                           

        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        7) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with table service, pork broth (standard) only , price of $13.00 ,                                          
           a menu that has customisable Ramen toppings , and small and standard sizes?                                                     
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9

        8) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with self service, pork and chicken broth options , price of $20.00 ,                                       
           a menu that has customisable Ramen toppings , and small and standard sizes?                                                     
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9


        9) For your next Ramen restaurant visit, how likely are you to visit:                                                               
           A restaurant with self service, pork broth (standard) only , price of $15.00 ,                                           
           a menu that has customisable Ramen toppings , and only standard sizes?                                                          
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9


        10) For your next Ramen restaurant visit, how likely are you to visit:                                                              
            A restaurant with table service, pork broth (standard) only , price of $20.00 ,                                         
            a menu that has standardised Ramen toppings , and only standard sizes?                                                        
        Definitely Would Not Purchase				Definitely Would Purchase
        1	2	3	4	5	6	7	8	9



# Restaurant Chain

<a id='top'></a>
#### Outline: 
#### 1. <a href='#load'>Load Libraries</a>

#### 2. <a href='#survey'>Clean Survey</a>

#### 3. <a href='#patworths'>Get Partworths by Respondent</a>

#### 4. <a href='#competitors'>Simulate Competitors</a>

#### 5. <a href='#utils'>Get Utilities and Share</a>
1. <a href='#utilities'>Utilities by respondent</a>
2. <a href='#arithmetic'>Arithmetic Share</a>
3. <a href='#logit'>Logit Share</a>
4. <a href='#choice'>Choice Share</a>

<a id='load'></a>
## 1 Load libraries

In [25]:
import pandas as pd
import numpy as np
import math
from collections import defaultdict
import itertools

from IPython.display import display

# NLP
import re
import nltk
from nltk import word_tokenize

<a id='survey'></a>
## 2 Clean Survey

The survey was created using Typeform. Below we load the .xlsx file containing all the results, as downloaded from Typeform.


In [2]:
survey = pd.read_excel('C:/Users/Ignacio/Project/Conjoint_Survey.xlsx')
survey

Unnamed: 0,#,"For your next Ramen restaurant visit; how likely are you to visit: A restaurant with table service; pork and chicken broth options ; price of $15,00; a menu that has customisable Ramen toppings; and small and standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with table service; pork and chicken broth options ; price of $15,00; a menu that has standardised Ramen toppings; and small and standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with self service; pork and chicken broth options; price of $13,00; a menu that has standardised Ramen toppings; and only standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with self service; pork broth (standard) only ; price of $20,00; a menu that has standardised Ramen toppings ; and small and standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with table service; pork and chicken broth options ; price of $20,00; a menu that has customisable Ramen toppings; and only standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with self service; pork and chicken broth options; price of $13,00; a menu that has customisable Ramen toppings; and only standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with table service; pork broth (standard) only ; price of $13,00; a menu that has customisable Ramen toppings; and small and standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with self service; pork and chicken broth options; price of $20,00; a menu that has customisable Ramen toppings; and small and standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with self service; pork broth (standard) only ; price of $15,00; a menu that has customisable Ramen toppings; and only 
standard sizes?","For your next Ramen restaurant visit; how likely are you to visit: A restaurant with table service; pork broth (standard) only; price of $20,00; a menu that has standardised Ramen toppings; and only standard sizes?",Start Date (UTC),Submit Date (UTC),Network ID
0,08dbf69a7384c32d3c2ec2d834b0dcba,6,4,7,2,1,9,4,1,3,1,2017-11-08 12:23:53,2017-11-08 12:25:47,878b1d5959
1,591690f65e3c7171880284c5668df3ac,8,7,9,6,5,7,8,5,7,9,2017-11-08 12:40:35,2017-11-08 12:42:44,878b1d5959
2,877f91ca3fcea712a65fa59a31dc5482,5,5,3,9,8,4,4,3,6,6,2017-11-08 15:16:41,2017-11-08 15:19:34,d0ce5c7452
3,7e912efc32dd8d35a7f4b3d58254b4fd,5,4,6,6,4,6,7,4,5,3,2017-11-09 01:20:28,2017-11-09 01:21:20,bb777eb0aa
4,ce79d14d074c13dc2e67e8319e12d100,8,7,6,6,6,7,8,6,8,6,2017-11-09 01:20:30,2017-11-09 01:22:22,5b8716d979
5,1cd932994448e48b283b1450ec7a181d,6,5,4,2,2,6,7,2,5,3,2017-11-09 01:49:08,2017-11-09 01:50:29,d06a79968d
6,88cbfeb0f93b260e133ef0f446669d17,6,6,6,7,7,5,7,7,6,7,2017-11-09 01:49:44,2017-11-09 01:52:38,7b7fa4c64a
7,5de66118c1e8d828b25567e0c9e9ad26,5,5,4,3,3,4,6,5,5,5,2017-11-09 02:06:24,2017-11-09 02:07:36,1054341075
8,a42a1684372857bdc49a46f2eb57452c,5,5,4,4,3,5,4,4,3,4,2017-11-09 02:07:14,2017-11-09 02:10:22,141a732ba0
9,ce05a3715e3a6bbe383abe48e77a3a2a,6,6,6,6,6,6,6,6,6,6,2017-11-09 02:11:32,2017-11-09 02:12:01,951b23b112


In [3]:
# Get all questions, each in one row, and a column for each type
terms = ["table_service", "self_service",
        "pork_and_chicken_broth","pork_broth_standard",
        "price_15","price_13","price_20",
        "custom_menu_ramen_toppings", "standard_menu_ramen_toppings",
        "small_and_standard_sizes",  "standard_sizes"]

questions = list(survey)[1:len(list(survey))-3]

def clean_survey(questions, terms):    
    clean_tokens = []
       
    for respondent in range(len(list(survey.index))):  
                
        for q in questions:
            tokens = [respondent] + [q]

            q = q.replace("table service", "table_service")
            q = q.replace("self service", "self_service")

            q = q.replace("pork and chicken broth", "pork_and_chicken_broth")    
            q = q.replace("pork broth (standard)", "pork_broth_standard")

            q = q.replace("price of $15,00", "price_15")
            q = q.replace("price of $13,00", "price_13")
            q = q.replace("price of $20,00", "price_20")

            q = q.replace("menu that has customisable Ramen toppings", "custom_menu_ramen_toppings")
            q = q.replace("menu that has standardised Ramen toppings", "standard_menu_ramen_toppings")

            q = q.replace("small and standard sizes", "small_and_standard_sizes")
            q = q.replace("only standard sizes", "standard_sizes")       


            for token in word_tokenize(q):
                if token in terms:
                    tokens.append(token)  

            clean_tokens.append(tokens)            
            
        
    return clean_tokens                

clean_tokens = clean_survey(questions, terms)

clean_questions = pd.DataFrame(clean_tokens)
clean_questions.columns = ['respondent','question' ,'Service','Serving', 'Price', 'Menu', 'Portion']
clean_questions


Unnamed: 0,respondent,question,Service,Serving,Price,Menu,Portion
0,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_15,custom_menu_ramen_toppings,small_and_standard_sizes
1,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_15,standard_menu_ramen_toppings,small_and_standard_sizes
2,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_13,standard_menu_ramen_toppings,standard_sizes
3,0,For your next Ramen restaurant visit; how like...,self_service,pork_broth_standard,price_20,standard_menu_ramen_toppings,small_and_standard_sizes
4,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_20,custom_menu_ramen_toppings,standard_sizes
5,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_13,custom_menu_ramen_toppings,standard_sizes
6,0,For your next Ramen restaurant visit; how like...,table_service,pork_broth_standard,price_13,custom_menu_ramen_toppings,small_and_standard_sizes
7,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_20,custom_menu_ramen_toppings,small_and_standard_sizes
8,0,For your next Ramen restaurant visit; how like...,self_service,pork_broth_standard,price_15,custom_menu_ramen_toppings,standard_sizes
9,0,For your next Ramen restaurant visit; how like...,table_service,pork_broth_standard,price_20,standard_menu_ramen_toppings,standard_sizes
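
The cleaning step above maps question wording to level tokens. A minimal alternative sketch, assuming the question wording stays exactly as in the survey export, uses a phrase dictionary per attribute and needs no tokenizer. `PHRASES` and `parse_question` are hypothetical names introduced for this illustration.

```python
# Phrase -> token mapping per attribute; phrases follow the survey wording
PHRASES = {
    "Service": {"table service": "table_service",
                "self service": "self_service"},
    "Serving": {"pork and chicken broth": "pork_and_chicken_broth",
                "pork broth (standard)": "pork_broth_standard"},
    "Price": {"$13,00": "price_13", "$15,00": "price_15", "$20,00": "price_20"},
    "Menu": {"customisable Ramen toppings": "custom_menu_ramen_toppings",
             "standardised Ramen toppings": "standard_menu_ramen_toppings"},
    "Portion": {"small and standard sizes": "small_and_standard_sizes",
                "only standard sizes": "standard_sizes"},
}


def parse_question(question):
    """Return {attribute: level token} for one survey question string."""
    # Collapse line breaks and repeated spaces so phrases match reliably
    text = " ".join(question.split())
    profile = {}
    for attribute, phrases in PHRASES.items():
        matches = [token for phrase, token in phrases.items() if phrase in text]
        if len(matches) != 1:
            raise ValueError(f"expected one {attribute} level, found {matches}")
        profile[attribute] = matches[0]
    return profile


example = ("For your next Ramen restaurant visit; how likely are you to visit: "
           "A restaurant with self service; pork broth (standard) only ; "
           "price of $20,00; a menu that has standardised Ramen toppings ; "
           "and small and standard sizes?")
print(parse_question(example))
```

Raising on zero or multiple matches makes a change in question wording fail loudly instead of silently producing a row with missing levels.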


In [4]:
survey2 = survey.iloc[:, :-3].T  # drop the three trailing metadata columns by position
survey2.index.names = ['question']
survey2 = survey2[1:].reset_index()
survey2 = pd.melt(survey2, ["question"], var_name="respondent", value_name="score")
survey2

Unnamed: 0,question,respondent,score
0,For your next Ramen restaurant visit; how like...,0,6
1,For your next Ramen restaurant visit; how like...,0,4
2,For your next Ramen restaurant visit; how like...,0,7
3,For your next Ramen restaurant visit; how like...,0,2
4,For your next Ramen restaurant visit; how like...,0,1
5,For your next Ramen restaurant visit; how like...,0,9
6,For your next Ramen restaurant visit; how like...,0,4
7,For your next Ramen restaurant visit; how like...,0,1
8,For your next Ramen restaurant visit; how like...,0,3
9,For your next Ramen restaurant visit; how like...,0,1


In [6]:
clean_questions["id"] = clean_questions["respondent"].astype(str) + clean_questions["question"].astype(str)
survey2["id"] = survey2["respondent"].astype(str) +survey2["question"].astype(str)
survey2 = survey2[["id", "score"]]

survey3 = pd.merge(clean_questions, survey2, on="id")
survey3["score"] = survey3["score"].astype(int)
survey3


Unnamed: 0,respondent,question,Service,Serving,Price,Menu,Portion,id,score
0,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_15,custom_menu_ramen_toppings,small_and_standard_sizes,0For your next Ramen restaurant visit; how lik...,6
1,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_15,standard_menu_ramen_toppings,small_and_standard_sizes,0For your next Ramen restaurant visit; how lik...,4
2,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_13,standard_menu_ramen_toppings,standard_sizes,0For your next Ramen restaurant visit; how lik...,7
3,0,For your next Ramen restaurant visit; how like...,self_service,pork_broth_standard,price_20,standard_menu_ramen_toppings,small_and_standard_sizes,0For your next Ramen restaurant visit; how lik...,2
4,0,For your next Ramen restaurant visit; how like...,table_service,pork_and_chicken_broth,price_20,custom_menu_ramen_toppings,standard_sizes,0For your next Ramen restaurant visit; how lik...,1
5,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_13,custom_menu_ramen_toppings,standard_sizes,0For your next Ramen restaurant visit; how lik...,9
6,0,For your next Ramen restaurant visit; how like...,table_service,pork_broth_standard,price_13,custom_menu_ramen_toppings,small_and_standard_sizes,0For your next Ramen restaurant visit; how lik...,4
7,0,For your next Ramen restaurant visit; how like...,self_service,pork_and_chicken_broth,price_20,custom_menu_ramen_toppings,small_and_standard_sizes,0For your next Ramen restaurant visit; how lik...,1
8,0,For your next Ramen restaurant visit; how like...,self_service,pork_broth_standard,price_15,custom_menu_ramen_toppings,standard_sizes,0For your next Ramen restaurant visit; how lik...,3
9,0,For your next Ramen restaurant visit; how like...,table_service,pork_broth_standard,price_20,standard_menu_ramen_toppings,standard_sizes,0For your next Ramen restaurant visit; how lik...,1
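
The reshape-and-join above builds a concatenated string `id` as the merge key. As a sketch on hypothetical toy data (the column names `Q1 text` and `Q2 text` are placeholders, not the real survey questions), the same join can be expressed by merging directly on the two key columns:

```python
import pandas as pd

# Toy wide table: one row per respondent, one column per question (made-up data)
wide = pd.DataFrame({
    "Q1 text": [6, 8],
    "Q2 text": [4, 7],
})
wide["respondent"] = wide.index

# Wide -> long: one row per (respondent, question) pair
long_scores = wide.melt(id_vars=["respondent"],
                        value_vars=["Q1 text", "Q2 text"],
                        var_name="question",
                        value_name="score")

# Toy profile table keyed by the same two columns
profiles = pd.DataFrame({
    "respondent": [0, 0, 1, 1],
    "question": ["Q1 text", "Q2 text", "Q1 text", "Q2 text"],
    "Price": ["price_15", "price_13", "price_15", "price_13"],
})

# Merge on both keys; validate raises if a key pair is duplicated on either side
merged = profiles.merge(long_scores,
                        on=["respondent", "question"],
                        validate="one_to_one")
print(merged)
```

The `validate="one_to_one"` argument is a cheap guard against duplicated respondent and question pairs inflating the merged table.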


<a id='patworths'></a>
## 3 Get Partworths by respondent

In [7]:
concepts = list(survey3)
concepts.remove("respondent")
concepts.remove("question")
concepts.remove("id")
concepts.remove("score")

def get_raw_util(concepts, survey):
    raw_utils = pd.DataFrame()
    
    for i in survey["respondent"].unique():
        respondent_results = survey.loc[survey["respondent"] == i]
        respondent_utils = pd.DataFrame()
        
        for concept in concepts:
            temp = respondent_results.groupby(["respondent",concept],as_index = False)[["score"]].agg(['mean'])
            temp.columns = temp.columns.droplevel(-1)
            temp = temp.reset_index()
            temp["concept"] = [concept]*len(temp)
            temp = temp.rename(columns={concept: 'level'})
            min_val = min(temp["score"])
            temp["scaled within"] = temp["score"] - min_val
            max_val = max(temp["scaled within"])
            temp["max_scaled_within"] = 0
            temp.loc[temp["scaled within"] == max_val, 'max_scaled_within'] = max_val 

            respondent_utils = pd.concat([respondent_utils,temp])
        
        sum_max_val = sum(respondent_utils["max_scaled_within"])
        respondent_utils["scaled across"] = 100 * respondent_utils["scaled within"] / sum_max_val
        
        
        raw_utils = pd.concat([raw_utils, respondent_utils])
    
    return raw_utils[["respondent", "concept", "level", "score", "scaled within", "scaled across"]]

raw_utils = get_raw_util(concepts, survey3)
raw_utils

Unnamed: 0,respondent,concept,level,score,scaled within,scaled across
0,0,Service,self_service,4.400000,1.200000,11.900826
1,0,Service,table_service,3.200000,0.000000,0.000000
0,0,Serving,pork_and_chicken_broth,4.666667,2.166667,21.487603
1,0,Serving,pork_broth_standard,2.500000,0.000000,0.000000
0,0,Price,price_13,6.666667,5.416667,53.719008
1,0,Price,price_15,4.333333,3.083333,30.578512
2,0,Price,price_20,1.250000,0.000000,0.000000
0,0,Menu,custom_menu_ramen_toppings,4.000000,0.500000,4.958678
1,0,Menu,standard_menu_ramen_toppings,3.500000,0.000000,0.000000
0,0,Portion,small_and_standard_sizes,3.400000,0.000000,0.000000
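
To make the partworth calculation concrete, the sketch below recomputes respondent 0's values from the ten profiles and that respondent's scores shown earlier. It follows the same idea as `get_raw_util` (level means, shifted to zero within each attribute, then rescaled across attributes) but is a simplified single-respondent version, not a drop-in replacement. The attribute importance at the end is an extra summary added for illustration.

```python
import pandas as pd

attributes = ["Service", "Serving", "Price", "Menu", "Portion"]

# The ten surveyed profiles in question order, with respondent 0's scores
profiles = pd.DataFrame(
    [
        ("table_service", "pork_and_chicken_broth", "price_15", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
        ("table_service", "pork_and_chicken_broth", "price_15", "standard_menu_ramen_toppings", "small_and_standard_sizes"),
        ("self_service", "pork_and_chicken_broth", "price_13", "standard_menu_ramen_toppings", "standard_sizes"),
        ("self_service", "pork_broth_standard", "price_20", "standard_menu_ramen_toppings", "small_and_standard_sizes"),
        ("table_service", "pork_and_chicken_broth", "price_20", "custom_menu_ramen_toppings", "standard_sizes"),
        ("self_service", "pork_and_chicken_broth", "price_13", "custom_menu_ramen_toppings", "standard_sizes"),
        ("table_service", "pork_broth_standard", "price_13", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
        ("self_service", "pork_and_chicken_broth", "price_20", "custom_menu_ramen_toppings", "small_and_standard_sizes"),
        ("self_service", "pork_broth_standard", "price_15", "custom_menu_ramen_toppings", "standard_sizes"),
        ("table_service", "pork_broth_standard", "price_20", "standard_menu_ramen_toppings", "standard_sizes"),
    ],
    columns=attributes,
)
profiles["score"] = [6, 4, 7, 2, 1, 9, 4, 1, 3, 1]

rows = []
for attribute in attributes:
    # Mean score per level, shifted so the least-preferred level is zero
    level_means = profiles.groupby(attribute)["score"].mean()
    within = level_means - level_means.min()
    for level, value in within.items():
        rows.append({"attribute": attribute, "level": level, "scaled_within": value})

partworths = pd.DataFrame(rows)

# Range of each attribute = its largest within-attribute partworth
ranges = partworths.groupby("attribute")["scaled_within"].max()

# Rescale so that the best level of every attribute adds up to 100
partworths["scaled_across"] = 100 * partworths["scaled_within"] / ranges.sum()

# Relative importance of each attribute: its share of the total range
importance = 100 * ranges / ranges.sum()

print(partworths)
print(importance.sort_values(ascending=False))
```

For this respondent the price range dominates, which matches the `price_13` value of about 53.7 in the table above.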


In [116]:
utilities = raw_utils.pivot(index='respondent', columns='level')['scaled across'].reset_index()
utilities = utilities.dropna(axis=0, how='any') # Some respondents gave the same answer to all questions.
utilities

level,respondent,custom_menu_ramen_toppings,pork_and_chicken_broth,pork_broth_standard,price_13,price_15,price_20,self_service,small_and_standard_sizes,standard_menu_ramen_toppings,standard_sizes,table_service
0,0,4.958678,21.487603,0.0,53.719008,30.578512,0.0,11.900826,0.0,0.0,7.933884,0.0
1,1,0.0,0.0,14.184397,37.234043,23.049645,0.0,0.0,0.0,23.049645,12.765957,12.765957
2,2,0.0,0.0,26.536313,0.0,27.932961,47.486034,0.0,0.0,12.569832,3.351955,10.055866
3,3,10.121457,0.0,10.121457,50.607287,10.121457,0.0,19.433198,9.716599,0.0,0.0,0.0
4,4,24.663677,0.0,8.96861,26.90583,44.843049,0.0,0.0,10.762332,0.0,0.0,10.762332
5,5,19.886364,0.0,1.420455,58.238636,52.556818,0.0,0.0,6.818182,0.0,0.0,13.636364
6,6,0.0,0.0,22.875817,0.0,0.0,39.215686,0.0,15.686275,6.535948,0.0,15.686275
7,7,13.736264,0.0,13.736264,21.978022,32.967033,0.0,0.0,19.78022,0.0,0.0,19.78022
8,8,0.0,20.833333,0.0,20.833333,20.833333,0.0,0.0,21.428571,8.928571,0.0,7.142857
10,10,18.248175,0.0,0.0,72.992701,36.49635,0.0,4.379562,0.0,0.0,4.379562,0.0


In [117]:
utilities = utilities.T.reset_index()[1:]
utilities["attribute_name"] = ["Menu", "Serving", "Serving", "Price", "Price", "Price", "Service", "Portion", "Menu", "Portion", "Service" ]
cols = list(utilities)[-1:] + list(utilities)[:-1]
df = utilities[cols]
df

Unnamed: 0,attribute_name,level,0,1,2,3,4,5,6,7,...,40,41,42,43,44,45,46,47,48,49
1,Menu,custom_menu_ramen_toppings,4.958678,0.0,0.0,10.121457,24.663677,19.886364,0.0,13.736264,...,11.547344,13.274336,8.395522,7.952286,9.499136,9.940358,3.405995,17.064846,8.865248,0.0
2,Serving,pork_and_chicken_broth,21.487603,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,17.321016,24.336283,13.059701,22.862823,9.499136,14.910537,20.435967,25.59727,4.432624,13.157895
3,Serving,pork_broth_standard,0.0,14.184397,26.536313,10.121457,8.96861,1.420455,22.875817,13.736264,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Price,price_13,53.719008,37.234043,0.0,50.607287,26.90583,58.238636,0.0,21.978022,...,60.046189,46.460177,60.634328,59.642147,56.131261,65.606362,40.190736,40.955631,61.170213,59.210526
5,Price,price_15,30.578512,23.049645,27.932961,10.121457,44.843049,52.556818,0.0,32.967033,...,32.332564,24.336283,30.783582,35.785288,14.680484,33.797217,21.117166,10.238908,18.617021,0.0
6,Price,price_20,0.0,0.0,47.486034,0.0,0.0,0.0,39.215686,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,49.342105
7,Service,self_service,11.900826,0.0,0.0,19.433198,0.0,0.0,0.0,0.0,...,5.542725,7.964602,15.671642,4.771372,14.507772,4.771372,22.888283,0.0,10.638298,3.947368
8,Portion,small_and_standard_sizes,0.0,0.0,0.0,9.716599,10.762332,6.818182,15.686275,19.78022,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Menu,standard_menu_ramen_toppings,0.0,23.049645,12.569832,0.0,0.0,0.0,6.535948,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,19.736842
10,Portion,standard_sizes,7.933884,12.765957,3.351955,0.0,0.0,0.0,0.0,0.0,...,5.542725,7.964602,2.238806,4.771372,10.362694,4.771372,13.079019,2.047782,14.893617,3.947368


<a id='competitors'></a>
## 4 Simulate Competitors

In [82]:
# GET COMPETITORS

def get_combinations(df):
    # Map each attribute to its levels, in order of first appearance
    attributes = df["attribute_name"].unique()
    attributes_dict = {}
    
    for attribute in attributes:
        attributes_dict[attribute] = df.loc[df["attribute_name"] == attribute]["level"].unique()

    # Cartesian product of the levels: one row per possible restaurant profile.
    # This replaces the exec/eval construction, which relies on writing to
    # function locals and is not dependable across Python versions.
    output = pd.DataFrame(list(itertools.product(*attributes_dict.values())))

    output.columns = attributes
    
    return output

competitors = get_combinations(df)
competitors["Competitor"] = "Competitor_" + competitors.index.astype(str)
cols = list(competitors)[-1:] + list(competitors)[:-1]
competitors = competitors[cols]
competitors = competitors.T
competitors.columns = competitors.iloc[0]
competitors = competitors[1:]

competitors

Competitor,Competitor_0,Competitor_1,Competitor_2,Competitor_3,Competitor_4,Competitor_5,Competitor_6,Competitor_7,Competitor_8,Competitor_9,...,Competitor_38,Competitor_39,Competitor_40,Competitor_41,Competitor_42,Competitor_43,Competitor_44,Competitor_45,Competitor_46,Competitor_47
Menu,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,custom_menu_ramen_toppings,...,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings,standard_menu_ramen_toppings
Serving,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,pork_and_chicken_broth,...,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard,pork_broth_standard
Price,price_13,price_13,price_13,price_13,price_15,price_15,price_15,price_15,price_20,price_20,...,price_13,price_13,price_15,price_15,price_15,price_15,price_20,price_20,price_20,price_20
Service,self_service,self_service,table_service,table_service,self_service,self_service,table_service,table_service,self_service,self_service,...,table_service,table_service,self_service,self_service,table_service,table_service,self_service,self_service,table_service,table_service
Portion,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,...,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes,small_and_standard_sizes,standard_sizes


<a id='utils'></a>
## 5 Get Utilities and Share

<a id='utilities'></a>
### 5.1 Calculate Utility by Respondent

In [199]:
def calc_respondent_util(df, competitors):
    respondents = list(df)[2:]
    final_output = pd.DataFrame()
        
    for respondent in respondents:
        output = pd.DataFrame()
            
        for competitor in list(competitors):
            temp_competitor = competitors[[competitor]]
            temp_competitor.columns = ["level"]
            respondent_competitor = pd.merge(temp_competitor, \
                                             df[["level",respondent]],\
                                             on="level",
                                             how = "left")
            
            respondent_competitor["Competitor"] = competitor
            cols = list(respondent_competitor)[-1:] + list(respondent_competitor)[:-1]
            respondent_competitor = respondent_competitor[cols]
            
            output = pd.concat([output, respondent_competitor], axis=0)

        output = output.groupby(["Competitor"],as_index = False)[[respondent]].agg('sum')
        competitor_col = output["Competitor"]
        
        final_output = pd.concat([final_output, output[[respondent]]], axis=1)
    
    final_output["Competitor"] = competitor_col
    cols = list(final_output)[-1:] + list(final_output)[:-1]
    
    return final_output[cols]
    
final_output = calc_respondent_util(df, competitors)
final_output

Unnamed: 0,Competitor,0,1,2,3,4,5,6,7,8,...,40,41,42,43,44,45,46,47,48,49
0,Competitor_0,92.066116,37.234043,0.0,89.878543,62.331839,84.943182,15.686275,55.494505,63.095238,...,94.457275,92.035398,97.761194,95.228628,89.637306,95.228628,86.920981,83.617747,85.106383,76.315789
1,Competitor_1,100.0,50.0,3.351955,80.161943,51.569507,78.125,0.0,35.714286,41.666667,...,100.0,100.0,100.0,100.0,100.0,100.0,100.0,85.665529,100.0,80.263158
2,Competitor_10,26.446281,12.765957,57.541899,19.838057,46.188341,40.340909,70.588235,53.296703,49.404762,...,28.86836,37.610619,21.455224,30.815109,18.998273,24.850895,23.841962,56.996587,13.297872,62.5
3,Competitor_11,34.380165,25.531915,60.893855,10.121457,35.426009,33.522727,54.901961,33.516484,27.97619,...,34.411085,45.575221,23.69403,35.586481,29.360967,29.622266,36.920981,59.044369,28.191489,66.447368
4,Competitor_12,70.578512,51.41844,26.536313,100.0,71.300448,86.363636,38.562092,69.230769,42.261905,...,77.136259,67.699115,84.701493,72.365805,80.138169,80.318091,66.485014,58.020478,80.673759,63.157895
5,Competitor_13,78.512397,64.184397,29.888268,90.283401,60.538117,79.545455,22.875817,49.450549,20.833333,...,82.678984,75.663717,86.940299,77.137177,90.500864,85.089463,79.564033,60.068259,95.567376,67.105263
6,Competitor_14,58.677686,64.184397,36.592179,80.566802,82.06278,100.0,54.248366,89.010989,49.404762,...,71.593533,59.734513,69.029851,67.594433,65.630397,75.54672,43.59673,72.354949,70.035461,59.210526
7,Competitor_15,66.61157,76.950355,39.944134,70.850202,71.300448,93.181818,38.562092,69.230769,27.97619,...,77.136259,67.699115,71.268657,72.365805,75.993092,80.318091,56.675749,74.40273,84.929078,63.157895
8,Competitor_16,47.438017,37.234043,54.469274,59.51417,89.237668,80.681818,38.562092,80.21978,42.261905,...,49.422633,45.575221,54.850746,48.508946,38.687392,48.508946,47.411444,27.303754,38.120567,3.947368
9,Competitor_17,55.371901,50.0,57.821229,49.797571,78.475336,73.863636,22.875817,60.43956,20.833333,...,54.965358,53.539823,57.089552,53.280318,49.050086,53.280318,60.490463,29.351536,53.014184,7.894737
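
The utility of a profile for one respondent is the sum of that respondent's partworths over the profile's levels. As a small check using respondent 0's values from the utilities table above (`total_utility` is a hypothetical helper written for this sketch):

```python
# Respondent 0's partworths ("scaled across" values from the utilities table)
partworths = {
    "custom_menu_ramen_toppings": 4.958678,
    "standard_menu_ramen_toppings": 0.0,
    "pork_and_chicken_broth": 21.487603,
    "pork_broth_standard": 0.0,
    "price_13": 53.719008,
    "price_15": 30.578512,
    "price_20": 0.0,
    "self_service": 11.900826,
    "table_service": 0.0,
    "small_and_standard_sizes": 0.0,
    "standard_sizes": 7.933884,
}


def total_utility(profile, partworths):
    """Additive utility: sum of the partworths of the levels in a profile."""
    return sum(partworths[level] for level in profile)


competitor_0 = ["custom_menu_ramen_toppings", "pork_and_chicken_broth",
                "price_13", "self_service", "small_and_standard_sizes"]
competitor_1 = ["custom_menu_ramen_toppings", "pork_and_chicken_broth",
                "price_13", "self_service", "standard_sizes"]

print(round(total_utility(competitor_0, partworths), 4))
print(round(total_utility(competitor_1, partworths), 4))
```

Up to rounding, these reproduce the first column of the output above: roughly 92.07 for Competitor_0 and 100 for Competitor_1, which combines respondent 0's preferred level on every attribute.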


In [208]:
final_output.to_csv('final_output.csv')


<a id='arithmetic'></a>
### 5.2 Calculate Arithmetic Share

In [286]:
def get_aritmetic_share(df):
    respondents = list(df)[1:]
    output = pd.DataFrame()
    output["Competitor"] = df["Competitor"]
    output_total = output.copy()
    
    for respondent in respondents:
        output[respondent] = df[respondent]/sum(df[respondent])
    
    output_total["Sum_Share"] = output[respondents].sum(axis=1)  # sum numeric respondent columns only
    output_total["Real_Share"] = 100*round(output_total["Sum_Share"] / sum(output_total["Sum_Share"]),5)
        
    return output, output_total
        
Aritmetic_Shares, Aritmetic_Shares_total  = get_aritmetic_share(final_output)
Aritmetic_Shares
Aritmetic_Shares_total.sort_values(by = "Real_Share", ascending = False)

Unnamed: 0,Competitor,Sum_Share,Real_Share
12,Competitor_2,1.5426,3.214
0,Competitor_0,1.526004,3.179
23,Competitor_3,1.525538,3.178
1,Competitor_1,1.508941,3.144
19,Competitor_26,1.360246,2.834
6,Competitor_14,1.348649,2.81
17,Competitor_24,1.34365,2.799
20,Competitor_27,1.343184,2.798
4,Competitor_12,1.332053,2.775
7,Competitor_15,1.331587,2.774


In [256]:
Aritmetic_Shares.to_csv('Aritmetic_Shares.csv')
Aritmetic_Shares_total.to_csv('Aritmetic_Shares_total.csv')


<a id='logit'></a>
### 5.3 Calculate Logit Share

In [287]:
def get_logit_share(df):
    respondents = list(df)[1:]
    output = pd.DataFrame()
    output["Competitor"] = df["Competitor"]
    output_total = output.copy()
    
    for respondent in respondents:
        output[respondent] = np.exp(df[respondent])    
    
    return get_aritmetic_share(output)
       
Logit_Shares, Logit_Shares_total  = get_logit_share(final_output)
Logit_Shares
Logit_Shares_total.sort_values(by = "Real_Share", ascending = False)

Unnamed: 0,Competitor,Sum_Share,Real_Share
1,Competitor_1,16.68958,34.77
12,Competitor_2,5.644008,11.758
44,Competitor_6,4.919488,10.249
0,Competitor_0,2.981136,6.211
10,Competitor_18,2.25557,4.699
18,Competitor_25,1.936534,4.034
24,Competitor_30,1.767948,3.683
23,Competitor_3,1.593919,3.321
4,Competitor_12,1.218971,2.54
33,Competitor_39,1.124993,2.344


In [281]:
Logit_Shares.to_csv('Logit_Shares.csv')
Logit_Shares_total.to_csv('Logit_Shares_total.csv')

<a id='choice'></a>
### 5.4 Calculate Choice Share

In [349]:
def get_prob_share(df):
    respondents = list(df)[1:]
    output = df.copy()
    
    for respondent in respondents:
        output.loc[(output[respondent] == max(output[respondent])), str(respondent)] = 1
        output.loc[(output[respondent] != max(output[respondent])), str(respondent)] = 0
        
        if sum(output[str(respondent)])>1:
            output[str(respondent)] = output[str(respondent)]/sum(output[str(respondent)])
        
    output = output[list(output)[len(list(df)):]]  # keep only the newly added indicator columns
    output["Competitor"] = df["Competitor"]
    cols = list(output)[-1:] + list(output)[:-1]
    output = output[cols]

    return get_aritmetic_share(output)

Choice_Shares, Choice_Shares_total  = get_prob_share(final_output)
Choice_Shares
Choice_Shares_total.sort_values(by = "Real_Share", ascending = False)

Unnamed: 0,Competitor,Sum_Share,Real_Share
1,Competitor_1,17.0,35.417
12,Competitor_2,5.625,11.719
44,Competitor_6,5.0,10.417
0,Competitor_0,3.0,6.25
10,Competitor_18,2.25,4.688
18,Competitor_25,2.0,4.167
24,Competitor_30,1.75,3.646
23,Competitor_3,1.625,3.385
33,Competitor_39,1.125,2.344
6,Competitor_14,1.125,2.344


In [351]:
Choice_Shares.to_csv('Choice_Shares.csv')
Choice_Shares_total.to_csv('Choice_Shares_total.csv')


The "Competitor_1" combination has the highest market share in both the Logit and the Choice models. For this reason, this differentiation strategy is chosen:



<img src="selected.PNG">
