# Conjoint Analysis

<p>The analysis has five stages:</p>
<ol>
<li> defining the profiles - decide which characteristics (features) you want to compare and which levels each feature can take.
<li> survey - use a survey tool, e.g. Google Forms, to ask your customers to rank or rate the listed profiles (attention: not all theoretically possible profiles need to be listed; include only those that are reasonable).
<li> data transformation - use a data analysis tool (Excel in our case) to transform the data into an "analysis-friendly" form (0s and 1s in our case).
<li> <b>estimation - use the transformed data to estimate utilities (using Python and the statsmodels library in our case).</b>
<li> calculation of importance - use the estimated utilities to calculate the importance of each feature.
</ol>
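The data transformation in stage 3 is done in Excel in this notebook, but the same 0/1 coding can be sketched in Python. The example below is a minimal illustration, assuming three hypothetical features `A`, `B` and `C` with 3, 2 and 3 levels (chosen to match the column names `A1` ... `C3` used later); it is not the actual survey data.

```python
from itertools import product

import pandas as pd

# Hypothetical feature levels; the names only mirror the columns used later in the notebook.
levels = {"A": [1, 2, 3], "B": [1, 2], "C": [1, 2, 3]}

# Full factorial design: every combination of levels (3 * 2 * 3 = 18 profiles).
profiles = pd.DataFrame(list(product(*levels.values())), columns=list(levels))

# One 0/1 indicator column per level; prefix_sep="" yields names such as "A1" or "C3".
dummies = pd.get_dummies(profiles, columns=list(levels), prefix_sep="", dtype=int)

print(dummies.head())
```

Each row of `dummies` contains exactly one 1 per feature, which is the "analysis-friendly" form the estimation step expects. In practice only the profiles shown to respondents would be kept.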

The estimation can be conducted using different econometric/data science algorithms (Linear Regression, PLS, ordered logit, ANOVA etc.). This notebook shows how to perform the analysis using **Linear Regression**, as it is one of the simplest methods, yet still one of the most popular.
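As a sketch of what is being estimated, and assuming the dummy-coded columns used later in this notebook (A1 to A3, B1 to B2, C1 to C3), the linear model for profile $i$ can be written as:

$$
\text{Rank}_i = \beta_0 + \sum_{j=1}^{3} \beta_{A_j} A_{j,i} + \sum_{k=1}^{2} \beta_{B_k} B_{k,i} + \sum_{l=1}^{3} \beta_{C_l} C_{l,i} + \varepsilon_i
$$

Each dummy equals 1 when profile $i$ has that level and 0 otherwise, and the estimated $\beta$ coefficients are read as the utilities (part-worths) of the levels. The symbols are notation introduced here for illustration only.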

# Linear Regression

In [1]:
from statsmodels.formula.api import ols

In [3]:
import pandas as pd
Conjoint_data = pd.read_excel("conjoint_data.xlsx", sheet_name="Inputs")
Conjoint_data.head()


Unnamed: 0,Rank,A1,A2,A3,B1,B2,C1,C2,C3
0,2,1,0,0,1,0,1,0,0
1,3,1,0,0,1,0,0,1,0
2,1,1,0,0,1,0,0,0,1
3,5,1,0,0,0,1,1,0,0
4,6,1,0,0,0,1,0,1,0


In [4]:
# now, let's specify the linear model using the imported ols() function
# the function takes two arguments: the model specification and the data used
model_ols = ols(formula="Rank ~ A1 + A2 + A3 + B1 + B2 + C1 + C2 + C3", data=Conjoint_data)

In [7]:
# let's fit the model specified above to our data
our_results = model_ols.fit()

In [8]:
# the model is fitted, so we can now view the summary of the results
our_results.summary()

Dep. Variable:,Rank,R-squared:,1.0
Model:,OLS,Adj. R-squared:,1.0
Method:,Least Squares,F-statistic:,3.524e+30
Date:,"Fri, 14 Sep 2018",Prob (F-statistic):,1.46e-180
Time:,21:24:12,Log-Likelihood:,569.98
No. Observations:,18,AIC:,-1128.0
Df Residuals:,12,BIC:,-1123.0
Df Model:,5,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,4.3846,5.7e-16,7.69e+15,0.000,4.385,4.385
A1,-4.5385,1.76e-15,-2.58e+15,0.000,-4.538,-4.538
A2,1.4615,1.76e-15,8.31e+14,0.000,1.462,1.462
A3,7.4615,1.76e-15,4.24e+15,0.000,7.462,7.462
B1,0.6923,1.27e-15,5.46e+14,0.000,0.692,0.692
B2,3.6923,1.27e-15,2.91e+15,0.000,3.692,3.692
C1,1.4615,1.76e-15,8.31e+14,0.000,1.462,1.462
C2,2.4615,1.76e-15,1.4e+15,0.000,2.462,2.462
C3,0.4615,1.76e-15,2.62e+14,0.000,0.462,0.462

Omnibus:,3.315,Durbin-Watson:,0.609
Prob(Omnibus):,0.191,Jarque-Bera (JB):,1.43
Skew:,0.29,Prob(JB):,0.489
Kurtosis:,1.747,Cond. No.,9.92e+16
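The summary above reports `Df Model: 5` although eight dummies enter the formula, together with a condition number near 1e17. This is what one would expect if the dummies of each feature sum to 1 in every row, making them collinear with the intercept. A common alternative is to leave out one reference level per feature. The sketch below illustrates this on a hypothetical stand-in dataset with made-up ranks (not the contents of `conjoint_data.xlsx`); the choice of `A1`, `B1` and `C1` as reference levels is an assumption.

```python
from itertools import product

import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Hypothetical stand-in data: a full 3 x 2 x 3 design coded as 0/1 dummies.
design = pd.DataFrame(list(product([1, 2, 3], [1, 2], [1, 2, 3])), columns=["A", "B", "C"])
data = pd.get_dummies(design, columns=["A", "B", "C"], prefix_sep="", dtype=int)

# Made-up ranks 1..18 in shuffled order, for illustration only.
rng = np.random.default_rng(0)
data["Rank"] = rng.permutation(np.arange(1, 19))

# All eight dummies plus the intercept, as in the notebook's formula.
full = ols("Rank ~ A1 + A2 + A3 + B1 + B2 + C1 + C2 + C3", data=data).fit()

# A1, B1 and C1 are omitted and act as reference levels.
reduced = ols("Rank ~ A2 + A3 + B2 + C2 + C3", data=data).fit()

print(full.df_model, reduced.df_model)
print(reduced.params)
```

Both specifications produce the same fitted values; in the reduced model each coefficient is read as the difference from the omitted reference level of its feature.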


In [9]:
# for conjoint analysis what we are interested in most are the estimated coefficients/parameters
# so let's get the parameters and save them as a new variable
coef = our_results.params

In [10]:
# our_results.params is a pandas Series, not a DataFrame, so let's convert it
coef_DF = pd.DataFrame(coef)

In [11]:
# once it is a DataFrame, we can save it and continue the analysis in Excel
coef_DF.to_excel("coef.xlsx")
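The last stage, calculating the importance of each feature, is left to Excel here. One widely used approach, shown below as a hedged sketch rather than as this notebook's own method, takes the range of the estimated utilities within each feature and expresses it as a share of the sum of all ranges. The coefficient values are copied from the regression summary above, and the grouping by the first letter of the label is an assumption based on the column names.

```python
import pandas as pd

# Estimated coefficients copied from the regression summary (intercept excluded).
coef = pd.Series({
    "A1": -4.5385, "A2": 1.4615, "A3": 7.4615,
    "B1": 0.6923, "B2": 3.6923,
    "C1": 1.4615, "C2": 2.4615, "C3": 0.4615,
})

# The first character of each label identifies the feature (A, B or C).
feature = coef.index.str[0]

# Range of utilities within each feature: largest minus smallest coefficient.
ranges = coef.groupby(feature).max() - coef.groupby(feature).min()

# Relative importance: each feature's range as a percentage of the total range.
importance = (100 * ranges / ranges.sum()).round(1)
print(importance)
```

Because only the range within each feature is used, the result does not depend on whether a low rank denotes the most or the least preferred profile.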