# Marketing Data Science Modeling
## Measuring and Modeling Individual Preferences

**Conjoint measurement**, a critical tool of marketing data science, focuses on buyers or the demand side of markets. Primary applications of conjoint analysis fall under the headings of new product design and pricing research.

Traditional conjoint analysis fits a linear model to preference rankings in order to show how product attributes affect purchasing decisions. Conjoint analysis is really conjoint measurement. Marketing analysts present product profiles, each defined by its attributes, to consumers. By ranking, rating, or choosing products, consumers reveal their preferences for products and for the attributes that define them. The computed attribute importance values and the part-worths associated with attribute levels are measurements obtained as a group, or jointly, hence the name conjoint analysis. The task (ranking, rating, or choosing) can take many forms.
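
The range-based importance measure that the code later in this section computes can be stated compactly. The symbols here are introduced only for illustration and do not appear in the original: $w_{jk}$ is the part-worth of level $k$ of attribute $j$, $R_j$ is the range of part-worths within attribute $j$, and $I_j$ is the relative importance of attribute $j$ in percent.

$$
R_j = \max_k w_{jk} - \min_k w_{jk}, \qquad I_j = 100 \times \frac{R_j}{\sum_{m} R_m}
$$

Under this definition the importance values sum to 100 across attributes, and an attribute whose levels all have nearly the same part-worth receives an importance near zero.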

When doing conjoint analysis, we utilize sum contrasts, so that the sum of the fitted regression coefficients across the levels of each attribute is zero. The fitted regression coefficients represent conjoint measures of utility called part-worths. Part-worths reflect the strength of individual consumer preferences for each level of each attribute in the study. Positive part-worths add to a product’s value in the mind of the consumer. Negative part-worths subtract from that value. When we sum across the part-worths of a product, we obtain a measure of the utility or benefit to the consumer.
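
To make the zero-sum property concrete, here is a minimal sketch of sum (effects) coding done by hand with NumPy. The three-level attribute, its level names, the ratings, and the helper `sum_coded_columns` are all hypothetical and chosen only for illustration; the analysis in this section relies on the formula interface instead.

```python
import numpy as np

def sum_coded_columns(levels, observed):
    """Effects-code one attribute: one column per level except the last;
    rows observed at the last level get -1 in every column."""
    columns = np.zeros((len(observed), len(levels) - 1))
    for row, value in enumerate(observed):
        position = levels.index(value)
        if position == len(levels) - 1:
            columns[row, :] = -1.0
        else:
            columns[row, position] = 1.0
    return columns

# hypothetical toy data: one three-level attribute, two ratings per level
levels = ["low", "mid", "high"]
observed = ["low", "low", "mid", "mid", "high", "high"]
rating = np.array([8.0, 10.0, 5.0, 7.0, 2.0, 4.0])

# design matrix: intercept column plus the sum-coded columns
X = np.column_stack([np.ones(len(observed)), sum_coded_columns(levels, observed)])
coef, *_ = np.linalg.lstsq(X, rating, rcond=None)

# the omitted last level's part-worth is minus the sum of the others
part_worths = list(coef[1:]) + [-coef[1:].sum()]
part_worth_by_level = dict(zip(levels, part_worths))

print(part_worth_by_level)
print("sum of part-worths:", round(sum(part_worths), 10))
```

In this balanced toy example the intercept is the grand mean of the ratings, and each part-worth is the deviation of a level's mean rating from that grand mean, so the part-worths sum to zero by construction.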

### Libraries

In [1]:
# prepare for Python version 3x features and functions
from __future__ import division, print_function

# import packages for analysis and modeling
import pandas as pd                    # data frame operations
import numpy as np                     # arrays and math functions
import statsmodels.api as sm           # statistical models (including regression)
import statsmodels.formula.api as smf  # R-like model specification
from patsy.contrasts import Sum        # sum (effects) contrasts used in the model formula

import warnings
warnings.filterwarnings("ignore")

### Read Data

In [2]:
conjoint_data_frame = pd.read_csv('data/mobile_services_ranking.csv')
conjoint_data_frame.head()

           brand  startup  monthly   service        retail        apple        samsung       google  ranking
0         "AT&T"   "$100"   "$100"   "4G NO"   "Retail NO"   "Apple NO"   "Samsung NO"   "Nexus NO"       11
1      "Verizon"   "$300"   "$100"   "4G NO"  "Retail YES"  "Apple YES"  "Samsung YES"   "Nexus NO"       12
2  "US Cellular"   "$400"   "$200"   "4G NO"   "Retail NO"   "Apple NO"  "Samsung YES"   "Nexus NO"        9
3      "Verizon"   "$400"   "$400"  "4G YES"  "Retail YES"   "Apple NO"   "Samsung NO"   "Nexus NO"        2
4      "Verizon"   "$200"   "$300"   "4G NO"   "Retail NO"   "Apple NO"  "Samsung YES"  "Nexus YES"        8


In [3]:
conjoint_data_frame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 9 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   brand    16 non-null     object
 1   startup  16 non-null     object
 2   monthly  16 non-null     object
 3   service  16 non-null     object
 4   retail   16 non-null     object
 5   apple    16 non-null     object
 6   samsung  16 non-null     object
 7   google   16 non-null     object
 8   ranking  16 non-null     int64 
dtypes: int64(1), object(8)
memory usage: 1.2+ KB


### Fit the Model

In [4]:
# Set up sum contrasts for effects coding as needed for conjoint analysis using C(effect, Sum) 
# notation within main effects model specification
main_effects_model = 'ranking ~ C(brand, Sum) + C(startup, Sum) +  \
    C(monthly, Sum) + C(service, Sum) + C(retail, Sum) + C(apple, Sum) + \
    C(samsung, Sum) + C(google, Sum)'

# Fit linear regression model using main effects only (no interaction terms)
main_effects_model_fit = \
    smf.ols(main_effects_model, data = conjoint_data_frame).fit()

print(main_effects_model_fit.summary()) 

                            OLS Regression Results                            
Dep. Variable:                ranking   R-squared:                       0.999
Model:                            OLS   Adj. R-squared:                  0.989
Method:                 Least Squares   F-statistic:                     97.07
Date:                Mon, 11 Jul 2022   Prob (F-statistic):             0.0794
Time:                        18:56:16   Log-Likelihood:                 10.568
No. Observations:                  16   AIC:                             8.864
Df Residuals:                       1   BIC:                             20.45
Df Model:                          14                                         
Covariance Type:            nonrobust                                         
                                      coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------------------------
Intercept 
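
The summary header reports 14 model degrees of freedom and a single residual degree of freedom. A quick count shows where those numbers come from. This sketch assumes the level counts visible in the data shown in this section: four levels each for brand, start-up cost, and monthly cost, and two levels for each remaining attribute.

```python
# level counts as they appear in the data shown in this section (assumed here)
levels_per_attribute = {
    "brand": 4, "startup": 4, "monthly": 4,
    "service": 2, "retail": 2, "apple": 2, "samsung": 2, "google": 2,
}
n_observations = 16  # number of ranked product profiles

# under sum contrasts each attribute contributes (levels - 1) free coefficients
df_model = sum(n - 1 for n in levels_per_attribute.values())
df_residual = n_observations - df_model - 1  # one more is used by the intercept

print("Df Model:", df_model)
print("Df Residuals:", df_residual)
```

With only one residual degree of freedom the model is nearly saturated, so the very high R-squared mostly reflects the number of parameters relative to the number of profiles rather than predictive accuracy.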

### Build Part-Worth Information by Attribute

In [5]:
conjoint_attributes = ['brand', 'startup', 'monthly', 'service', 'retail', 'apple', 'samsung', 'google']

# build part-worth information one attribute at a time
level_name = []
part_worth = []
part_worth_range = []
end = 1  # index of the first non-intercept coefficient in params
for item in conjoint_attributes:
    # patsy sorts the levels of a string-valued factor, so sort here as well
    # to keep each level name paired with its own coefficient
    levels = sorted(conjoint_data_frame[item].unique())
    nlevels = len(levels)
    level_name.append(levels)
    begin = end
    end = begin + nlevels - 1  # also the starting index for the next attribute
    new_part_worth = list(main_effects_model_fit.params.iloc[begin:end])
    # sum contrasts omit the last level; its part-worth is minus the sum of the rest
    new_part_worth.append((-1) * sum(new_part_worth))
    part_worth_range.append(max(new_part_worth) - min(new_part_worth))
    part_worth.append(new_part_worth)

# compute attribute relative importance values from ranges
attribute_importance = []
for item in part_worth_range:
    attribute_importance.append(round(100 * (item / sum(part_worth_range)), 2))

# user-defined dictionary for printing descriptive attribute names
effect_name_dict = {'brand' : 'Mobile Service Provider',
    'startup' : 'Start-up Cost', 'monthly' : 'Monthly Cost',
    'service' : 'Offers 4G Service', 'retail' : 'Has Nearby Retail Store',
    'apple' : 'Sells Apple Products', 'samsung' : 'Sells Samsung Products',
    'google' : 'Sells Google/Nexus Products'}

# report conjoint measures to console
index = 0  # initialize for use in for-loop
for item in conjoint_attributes:
    print('\nAttribute:', effect_name_dict[item])
    print('    Importance:', attribute_importance[index])
    print('    Level Part-Worths')
    for level in range(len(level_name[index])):
        print('       ', level_name[index][level], round(part_worth[index][level], 2))
    index = index + 1


Attribute: Mobile Service Provider
    Importance: 2.38
    Level Part-Worths
        "AT&T" -0.0
        "T-Mobile" -0.25
        "US Cellular" 0.0
        "Verizon" 0.25

Attribute: Start-up Cost
    Importance: 7.14
    Level Part-Worths
        "$100" 0.75
        "$200" -0.0
        "$300" -0.0
        "$400" -0.75

Attribute: Monthly Cost
    Importance: 51.19
    Level Part-Worths
        "$100" 5.0
        "$200" 2.0
        "$300" -1.25
        "$400" -5.75

Attribute: Offers 4G Service
    Importance: 16.67
    Level Part-Worths
        "4G NO" -1.75
        "4G YES" 1.75

Attribute: Has Nearby Retail Store
    Importance: 2.38
    Level Part-Worths
        "Retail NO" 0.25
        "Retail YES" -0.25

Attribute: Sells Apple Products
    Importance: 2.38
    Level Part-Worths
        "Apple NO" 0.25
        "Apple YES" -0.25

Attribute: Sells Samsung Products
    Importance: 10.71
    Level Part-Worths
        "Samsung NO" -1.12
        "Samsung YES" 1.12

Attribute: Sells Go