In [23]:
import pandas as pd
from simpleconjoint import cbc

## Reading and transforming dataframe

The example data (`cbc_data.xlsx`) comes from a conjoint study about TVs. It was collected in Chile, so attributes and levels are in Spanish.
It also includes a no-choice alternative (column `None`).

<hr>

### Conjoint data:

Attributes: Levels
- Marca: `LG`, `Sony`, `Samsung`
- Sistema Operativo: `Apple TV`, `Android TV`, `Propio de la marca`
- Tecnología: `Iluminación Inteligente`, `Camara Frontal`
- Resolución: `QLED 4k Ultra HD (3840x2160)`, `Full HD (1920x1080)`, `NanoCell 4k Ultra HD (3840x2160)`, `8K Full Ultra HD (7680x4320)`, `OLED 4k Ultra HD (3840x2160)`
- Precio: `499999.0`, `599999.0`, `699999.0`, `799999.0`

<hr>

For version `0.0.1`, the cbc functions expect the following structure for the column names of attribute levels (covariates):
- If the attribute has more than one level: Attribute + `_` + Level -> `Attribute_Level`
- If the column is just the attribute/level itself: `Attribute`

For example, the level `LG` from the attribute `Marca` would be -> `Marca_LG`
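
As a minimal sketch of the naming convention above, the hypothetical helper `level_column` (not part of simpleconjoint) builds the expected column names in plain Python. `pd.get_dummies` with its default `prefix_sep="_"` produces names of the same shape.

```python
def level_column(attribute, level=None):
    """Build a column name following the `Attribute_Level` convention.

    Hypothetical helper for illustration only; it is not part of simpleconjoint.
    """
    return attribute if level is None else f"{attribute}_{level}"


# A subset of the attributes and levels listed above.
levels = {
    "Marca": ["LG", "Sony", "Samsung"],
    "Tecnología": ["Iluminación Inteligente", "Camara Frontal"],
}

columns = [level_column(attr, lvl) for attr, lvls in levels.items() for lvl in lvls]
# The no-choice column is just the attribute/level itself.
columns.append(level_column("None"))
print(columns)
```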

In [24]:
df = pd.read_excel("cbc_data.xlsx")
df

Unnamed: 0,Num Encuestado,Tarea,Alt,Marca,Sistema Operativo,Tecnología,Resolución,Precio,None,Seleccionada
0,2,1,1,LG,Propio de la marca,Iluminación Inteligente,QLED 4k Ultra HD (3840x2160),799999.0,0,0
1,2,1,2,Sony,Apple TV,Iluminación Inteligente,Full HD (1920x1080),499999.0,0,0
2,2,1,3,Samsung,Android TV,Camara Frontal,NanoCell 4k Ultra HD (3840x2160),499999.0,0,0
3,2,1,4,Samsung,Apple TV,Camara Frontal,8K Full Ultra HD (7680x4320),699999.0,0,0
4,2,1,5,,,,,,1,1
...,...,...,...,...,...,...,...,...,...,...
14995,324,10,1,Sony,Apple TV,Camara Frontal,Full HD (1920x1080),499999.0,0,0
14996,324,10,2,Samsung,Apple TV,Camara Frontal,QLED 4k Ultra HD (3840x2160),599999.0,0,0
14997,324,10,3,Samsung,Android TV,Iluminación Inteligente,OLED 4k Ultra HD (3840x2160),699999.0,0,1
14998,324,10,4,LG,Propio de la marca,Iluminación Inteligente,QLED 4k Ultra HD (3840x2160),799999.0,0,0


In [25]:
# Getting dummy variables (could be effect-coding/sum-to-zero too for hmnl)
df_dum = pd.get_dummies(df, columns=["Marca", "Sistema Operativo", "Tecnología", "Resolución", "Precio"])

# In this data the no-choice alternative has NaN in the attribute columns.
# No attribute level is present for that alternative, so those values are set to 0.
df_dum = df_dum.fillna(0)
df_dum

Unnamed: 0,Num Encuestado,Tarea,Alt,None,Seleccionada,Marca_LG,Marca_Samsung,Marca_Sony,Sistema Operativo_Android TV,Sistema Operativo_Apple TV,...,Tecnología_Iluminación Inteligente,Resolución_8K Full Ultra HD (7680x4320),Resolución_Full HD (1920x1080),Resolución_NanoCell 4k Ultra HD (3840x2160),Resolución_OLED 4k Ultra HD (3840x2160),Resolución_QLED 4k Ultra HD (3840x2160),Precio_499999.0,Precio_599999.0,Precio_699999.0,Precio_799999.0
0,2,1,1,0,0,1,0,0,0,0,...,1,0,0,0,0,1,0,0,0,1
1,2,1,2,0,0,0,0,1,0,1,...,1,0,1,0,0,0,1,0,0,0
2,2,1,3,0,0,0,1,0,1,0,...,0,0,0,1,0,0,1,0,0,0
3,2,1,4,0,0,0,1,0,0,1,...,0,1,0,0,0,0,0,0,1,0
4,2,1,5,1,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14995,324,10,1,0,0,0,0,1,0,1,...,0,0,1,0,0,0,1,0,0,0
14996,324,10,2,0,0,0,1,0,0,1,...,0,0,0,0,0,1,0,1,0,0
14997,324,10,3,0,1,0,1,0,1,0,...,1,0,0,0,1,0,0,0,1,0
14998,324,10,4,0,0,1,0,0,0,0,...,1,0,0,0,0,1,0,0,0,1


## Count analysis

The `count()` function performs a simple analysis: it counts how many times each attribute level (or combination of attribute levels) was selected, relative to the total number of times it appeared. It returns two dictionaries of dataframes:
- The first one is for the attributes passed in (keyed by attribute name).
- The second one is for the attribute combinations (keyed by the `repr()` of each combination list).

In choice-based conjoint, an alternative is coded as `1` when it was chosen/selected and `0` when it was not. The function therefore needs to know which dataframe column holds that indicator: this is the `col_chosen` parameter. It defaults to the column `Chosen`; in this example data the column is `Seleccionada`.

The `attributes` parameter expects a list of attribute names; in this example, `Marca`, `Tecnología` and `Resolución`. It's an optional parameter.

The `attribute_combinations` parameter expects a list of lists, each inner list being one desired attribute combination, for example: `[["Marca", "Precio"], ["Resolución", "Tecnología"]]`. It's an optional parameter.

You can also pass `to_excel=True` to save everything to a single Excel file named `Count_` + datetime + `.xlsx`, with one sheet per attribute or attribute combination.

In [53]:
# Example variables mentioned before
col_chosen="Seleccionada"
attributes = ["Marca", "Tecnología", "Resolución"]
marca_tecnologia = ["Marca", "Tecnología"]
resolucion_tecnologia = ["Resolución", "Tecnología"]
attribute_combinations = [marca_tecnologia, resolucion_tecnologia]

# Calling the count analysis
count_attributes, count_combinations = cbc.count(
    df=df_dum,
    col_chosen=col_chosen,
    attributes=attributes,
    attribute_combinations=attribute_combinations
)

### Count Analysis Results

As mentioned before, the count analysis returns two dictionaries of dataframes.
- The `count_attributes` dictionary has the attribute names as keys, e.g. `count_attributes["Marca"]`
- The `count_combinations` dictionary has the `repr()` of each combination list as keys, e.g. `count_combinations[repr(marca_tecnologia)]`

`Count Result` is equal to `Times Chosen` / `Total Appearances`.
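
The ratio above can be sketched in plain Python. This is only an illustration of the arithmetic on made-up toy rows, not the code `cbc.count()` runs; the function name `count_levels` and the toy data are assumptions.

```python
from collections import defaultdict

# Toy rows (made up): (attribute level shown in the alternative, chosen flag 0/1).
rows = [
    ("Marca_LG", 1), ("Marca_Sony", 0), ("Marca_Samsung", 0),
    ("Marca_LG", 0), ("Marca_Sony", 0), ("Marca_Samsung", 1),
    ("Marca_LG", 0), ("Marca_Samsung", 1),
]


def count_levels(rows):
    """Hypothetical helper: tally appearances and choices per level."""
    chosen = defaultdict(int)
    appearances = defaultdict(int)
    for level, flag in rows:
        appearances[level] += 1
        chosen[level] += flag
    return {
        level: {
            "Times Chosen": chosen[level],
            "Times Not Chosen": appearances[level] - chosen[level],
            "Total Appearances": appearances[level],
            "Count Result": chosen[level] / appearances[level],
        }
        for level in appearances
    }


toy_counts = count_levels(rows)
for level, stats in toy_counts.items():
    print(level, stats)
```

The same arithmetic applies to the real results in the tables that follow: for `Marca_LG`, 841 / 3996 ≈ 0.21046.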

In [48]:
count_attributes["Marca"]

Unnamed: 0,Attribute Level,Count Result,Times Chosen,Times Not Chosen,Total Appearances
0,Marca_LG,0.21046,841.0,3155.0,3996.0
1,Marca_Samsung,0.25999,1041.0,2963.0,4004.0
2,Marca_Sony,0.196,784.0,3216.0,4000.0


In [49]:
count_attributes["Tecnología"]

Unnamed: 0,Attribute Level,Count Result,Times Chosen,Times Not Chosen,Total Appearances
0,Tecnología_Camara Frontal,0.219557,1318.0,4685.0,6003.0
1,Tecnología_Iluminación Inteligente,0.224779,1348.0,4649.0,5997.0


In [55]:
count_combinations[repr(marca_tecnologia)]

Unnamed: 0,Count Result,Marca,Tecnología,Times Chosen,Times Not Chosen,Total Appearances
0,0.212958,LG,Camara Frontal,424.0,1567.0,1991.0
1,0.20798,LG,Iluminación Inteligente,417.0,1588.0,2005.0
2,0.259831,Samsung,Camara Frontal,522.0,1487.0,2009.0
3,0.26015,Samsung,Iluminación Inteligente,519.0,1476.0,1995.0
4,0.185721,Sony,Camara Frontal,372.0,1631.0,2003.0
5,0.206309,Sony,Iluminación Inteligente,412.0,1585.0,1997.0


## Hierarchical models

Preferences differ between individuals, so we can't expect every respondent to behave the same way or choose the same product.

Non-hierarchical models assume that customer choices are probabilistic: each alternative has a probability of being chosen, and respondents' actions are conditioned on those probabilities.

Hierarchical models, on the other hand, handle this heterogeneity by estimating a vector of part-worths per individual. Each individual is treated as unique, and their choice probabilities are derived from their own previous answers.

This is normally implemented with Bayesian inference, using Markov Chain Monte Carlo (MCMC) to sample from the posterior distribution defined by the priors and the likelihood.
One of the most widely used hierarchical models is HB (`Hierarchical Bayes`), as implemented by Sawtooth Software, or the `Hierarchical Bayes Multinomial Logit` in ChoiceModelR.

### CBC Hierarchical Multinomial Logit model in simpleconjoint

The `hmnl()` function in simpleconjoint uses the `pystan` package and the model proposed in this tutorial for the R Stan package: https://github.com/ksvanhorn/ART-Forum-2017-Stan-Tutorial

The model is the following:

```
data {
  int<lower=2> C; // # of alternatives (choices) in each scenario
  int<lower=1> K; // # of covariates of alternatives
  int<lower=1> R; // # of respondents
  int<lower=1> S; // # of scenarios per respondent
  int<lower=0> G; // # of respondent covariates 
  int<lower=1,upper=C> Y[R, S]; // observed choices
  matrix[C, K] X[R, S]; // matrix of attributes for each obs
  matrix[G, R] Z; // vector of covariates for each respondent
}

parameters {
  matrix[K, R] Beta;
  matrix[K, G] Theta;
  corr_matrix[K] Omega;
  vector<lower=0>[K] tau;
}
transformed parameters {
  cov_matrix[K] Sigma = quad_form_diag(Omega, tau);
}
model {
  //priors
  to_vector(Theta) ~ normal(0, 10);
  tau ~ cauchy(0, 2.5); 
  Omega ~ lkj_corr(2);
  //likelihood
  for (r in 1:R) {
    Beta[,r] ~ multi_normal(Theta*Z[,r], Sigma);	
    for (s in 1:S)
      Y[r,s] ~ categorical_logit(X[r,s]*Beta[,r]);
  }
}
```
How this model works:
- `Beta` is the `K x R` matrix of individual-level part-worths / utilities to be estimated. Each respondent's column follows a multivariate normal distribution: https://en.wikipedia.org/wiki/Multivariate_normal_distribution
- `Theta` is the `K x G` matrix that maps the respondent covariates `Z` to the population means of the part-worths (`Theta*Z[,r]`); its entries have a `normal(0, 10)` prior.
- `tau` is the vector of part-worth scales (standard deviations, not variances). It is constrained to be non-negative, so its `cauchy(0, 2.5)` prior acts as a half-Cauchy.
- `Sigma` is the covariance matrix, built in this model as `quad_form_diag(Omega, tau)`: https://mc-stan.org/docs/2_18/stan-users-guide/multivariate-hierarchical-priors-section.html
> The function `quad_form_diag` is defined so that `quad_form_diag(Sigma, tau)` is equivalent to `diag_matrix(tau) * Sigma * diag_matrix(tau)`, where `diag_matrix(tau)` returns the matrix with tau on the diagonal and zeroes off diagonal.
- `Omega` is the correlation matrix (with an `lkj_corr(2)` prior: https://mc-stan.org/docs/2_18/functions-reference/lkj-correlation.html)
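
To make the `Sigma` construction concrete, here is a small plain-Python sketch of what `quad_form_diag` computes according to the quoted definition. The Python function below is a stand-in written for illustration (Stan provides the real one), and the numbers are made up.

```python
def quad_form_diag(omega, tau):
    """Return diag(tau) * omega * diag(tau) for a square matrix given as nested lists.

    Illustrative re-implementation of the Stan function's definition.
    """
    k = len(tau)
    return [[tau[i] * omega[i][j] * tau[j] for j in range(k)] for i in range(k)]


# Made-up 2x2 example: correlation 0.5, scales 2 and 3.
omega = [[1.0, 0.5],
         [0.5, 1.0]]
tau = [2.0, 3.0]

sigma = quad_form_diag(omega, tau)
print(sigma)  # [[4.0, 3.0], [3.0, 9.0]]
```

The diagonal of the result holds `tau` squared (the variances), which is why `tau` itself is on the standard-deviation scale.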

![HMNL Bayesian Inference](https://raw.githubusercontent.com/ksvanhorn/ART-Forum-2017-Stan-Tutorial/master/3_hmnl/images/Bayesian_Inference.png)
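
Read off the Stan code above, the model can be written compactly as follows. The notation is an assumption made for this summary: $\beta_r$ is column $r$ of `Beta`, $z_r$ is column $r$ of `Z`, $X_{rs}$ is the attribute matrix of scenario $s$ for respondent $r$, and, as in Stan, the second argument of $\mathcal{N}$ is a standard deviation.

$$
\begin{aligned}
\Theta_{kg} &\sim \mathcal{N}(0, 10) \\
\tau_k &\sim \text{Half-Cauchy}(0, 2.5) \\
\Omega &\sim \text{LKJ}(2) \\
\Sigma &= \operatorname{diag}(\tau)\,\Omega\,\operatorname{diag}(\tau) \\
\beta_r &\sim \text{MultiNormal}(\Theta z_r,\ \Sigma), \qquad r = 1, \dots, R \\
y_{rs} &\sim \text{Categorical}\big(\operatorname{softmax}(X_{rs}\,\beta_r)\big), \qquad s = 1, \dots, S
\end{aligned}
$$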

### The hmnl() function

The hmnl() function expects a dataframe with the cbc data plus extra configuration arguments (check the function docstring for the full description) and uses the `pystan` library (which does most of the work) with the Hierarchical Multinomial Logit model described above.

Choice-based conjoint models compare the alternatives in each task against the choices the respondents made. To perform the cbc analysis we therefore need to know which alternatives were present and what each respondent (by ID) chose in each task, so the dataframe must contain those columns too. These parameters identify the columns:
- `col_resp_id` for the Respondent IDs column name
- `col_alt` for the Alternatives column name
- `col_task` for the Task column name
- `col_chosen` for the Chosen column name

It also includes a parameter for the no-choice column name, `col_none`. In this model the no-choice alternative is estimated along with the other attribute levels; there are other approaches and models that estimate it separately.

As mentioned before, hierarchical models work with MCMC (Markov Chain Monte Carlo), so the function needs the number of sampling iterations (the length of the chain) to use for the Stan model (the hmnl model). The number of samples needed to reach convergence or sufficient precision depends on the complexity of the data and the model. Along with the iterations parameter, there are others that are used by the `pystan` library functions/classes:
- `iterations`: Integer. Number of iterations, i.e. the length of the chain. 2000 by default.
- `warmups`: Integer. Number of burn-in (discarded) iterations. `iterations//2` by default.
- `algorithm`: String. One of the algorithms implemented in the Stan library: "HMC", "NUTS" or "Fixed_param". "NUTS" by default.
- `seed`: Positive integer used as the seed for random number generation. By default, the seed is `random.randint(0, MAX_UINT)`.
- `verbose`: Bool. Indicates whether intermediate output should be piped to the console. False by default.
- `control`: Dict, optional. A dictionary of parameters to control the sampler's behavior.
- `n_jobs`: Integer. Sample in parallel. If -1, all CPUs are used. -1 by default.

For further explanation on the parameters read the pystan docs on the StanModel: https://pystan2.readthedocs.io/en/latest/_modules/pystan/model.html#StanModel 

Note: `n_jobs` should be <= `chains`; it is used to run multiple chains on separate CPUs at the same time. Multithreading is an experimental feature in pystan 2.18+, so it's not enabled in the `hmnl()` function.
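
As a quick worked example of how `iterations` and `warmups` interact, the sketch below computes the number of draws kept after warm-up. The helper `post_warmup_draws` is hypothetical, and `chains=4` is an assumption taken from the fit summary shown later in this notebook ("4 chains, each with iter=10; warmup=5").

```python
def post_warmup_draws(iterations=2000, warmups=None, chains=4):
    """Hypothetical helper: draws kept per chain and in total after warm-up.

    `chains=4` is an assumption based on the summary printed by the fit.
    """
    if warmups is None:
        warmups = iterations // 2  # default described in the parameter list
    per_chain = iterations - warmups
    return per_chain, per_chain * chains


print(post_warmup_draws(iterations=10))  # (5, 20)
print(post_warmup_draws())               # (1000, 4000)
```

With only 20 retained draws, as in the `iterations=10` example, diagnostics such as `n_eff` and `Rhat` are not meaningful, which is why a much longer chain is needed in practice.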

#### HMNL_Result object

The hmnl() function returns an initialized `HMNL_Result` object with the following attributes:
- `stan_fit`: fit result of the hmnl model.
- `attributes`: conjoint attributes list 
- `covariates`: conjoint covariates list

To see the full result, print the `stan_fit` attribute. For example, if you saved the return value in a variable named `result`, `print(result.stan_fit)` shows the full summary.

To get the stan summary dict -> `result.summary`

To get the dataframe of the individual utilities -> `result.individual_utilities` or `result.get_individual_utilities()`

To get the dataframe of the individual importances -> `result.individual_importances` or `result.get_individual_importances()`

In [58]:
# Using the hmnl function with the example data:
# The number of iterations should be much higher in practice; 10 is used here only to keep the example fast.

result = cbc.hmnl(
    df=df_dum,
    col_resp_id="Num Encuestado",
    col_task="Tarea",
    col_none="None",
    col_chosen="Seleccionada",
    iterations=10,
)

INFO:pystan:COMPILING THE C++ CODE FOR MODEL HMNL_39a01ae177bc9856402f5ad061504c36 NOW.
To run all diagnostics call pystan.check_hmc_diagnostics(fit)


In [59]:
# getting full summary
print(result.stan_fit)

Inference for Stan model: HMNL_39a01ae177bc9856402f5ad061504c36.
4 chains, each with iter=10; warmup=5; thin=1; 
post-warmup draws per chain=5, total post-warmup draws=20.

                mean se_mean      sd    2.5%     25%      50%     75%   97.5%  n_eff   Rhat
Beta[1,1]       0.42     nan    1.35   -1.81   -0.66      1.0    1.39     1.5    nan    inf
Beta[2,1]       0.14     nan    1.48   -1.33    -1.3     0.13    1.59    1.63    nan    inf
Beta[3,1]      -0.38     nan    1.38    -1.9   -1.59    -0.64    0.88    1.66    nan    inf
Beta[4,1]       0.54     nan    1.37   -1.65   -0.56     0.91    1.56    1.98    nan    inf
Beta[5,1]       -0.8     nan    1.26   -1.96    -1.9    -1.16    0.37    1.08    nan    inf
Beta[6,1]       -0.3     nan    0.82   -1.37    -1.1    -0.21    0.48    0.59    nan    inf
Beta[7,1]      -0.47     nan    1.36   -1.96   -1.73    -0.64    0.84    1.38    nan    inf
Beta[8,1]      -0.51     nan    1.32   -1.63   -1.41    -1.05    0.49    1.68    nan    inf

In [60]:
# The pystan fit also contains the numpy array of betas, thetas, omegas, taus and sigmas that each chain got.
betas = result.stan_fit["Beta"] # same with "Theta", "tau", "Omega" and "Sigma"
betas

array([[[ 1.26017184e+00, -1.96306764e+00,  1.73001052e+00, ...,
         -7.01132433e-01,  1.26336426e+00,  1.26208597e+00],
        [-1.32777983e+00,  1.73275914e+00,  5.55774180e-01, ...,
          9.16169710e-01,  4.40954578e-01,  1.58767590e+00],
        [ 1.65978877e+00, -3.54798136e-01,  1.54142037e+00, ...,
          1.31346479e+00, -1.39332934e+00,  3.62797441e-01],
        ...,
        [-1.56477435e+00, -3.67752336e-01, -7.25228552e-02, ...,
          1.91733020e+00,  4.71255132e-01,  1.68406574e+00],
        [ 3.43949100e-01,  5.92202309e-01, -9.01347241e-01, ...,
          1.80116993e+00, -1.38028072e+00,  6.88160449e-01],
        [-1.17177119e+00, -1.89023858e+00, -8.13925552e-01, ...,
         -9.94932827e-01, -1.50763189e+00, -4.49725862e-01]],

       [[ 1.26017184e+00, -1.96306764e+00,  1.73001052e+00, ...,
         -7.01132433e-01,  1.26336426e+00,  1.26208597e+00],
        [-1.32777983e+00,  1.73275914e+00,  5.55774180e-01, ...,
          9.16169710e-01,  4.40954578e

In [62]:
# pystan summary dictionaries (contains multiple dictionaries with the info shown using print(result.stan_fit))
result.summary

OrderedDict([('summary',
              array([[ 4.22529355e-01,             nan,  1.35108654e+00, ...,
                       1.49570979e+00,             nan,             inf],
                     [ 1.41811189e-01,             nan,  1.47705370e+00, ...,
                       1.63426690e+00,             nan,             inf],
                     [-3.79845498e-01,             nan,  1.38285866e+00, ...,
                       1.65978877e+00,             nan,             inf],
                     ...,
                     [-1.84191465e-01,             nan,  4.03391735e-01, ...,
                       1.73299486e-01,             nan,             inf],
                     [ 3.85426563e+00,             nan,  6.44558367e+00, ...,
                       1.47267437e+01,             nan,             inf],
                     [-7.78931677e+15,             nan,  1.06846916e+16, ...,
                      -5.06624735e+13,             nan,             inf]])),
             ('c_summary',
       

In [63]:
# Getting individual utilities dataframe:
result.individual_utilities

Unnamed: 0,None,Marca_LG,Marca_Samsung,Marca_Sony,Sistema Operativo_Android TV,Sistema Operativo_Apple TV,Sistema Operativo_Propio de la marca,Tecnología_Camara Frontal,Tecnología_Iluminación Inteligente,Resolución_8K Full Ultra HD (7680x4320),Resolución_Full HD (1920x1080),Resolución_NanoCell 4k Ultra HD (3840x2160),Resolución_OLED 4k Ultra HD (3840x2160),Resolución_QLED 4k Ultra HD (3840x2160),Precio_499999.0,Precio_599999.0,Precio_699999.0,Precio_799999.0
0,0.422529,0.141811,-0.379845,0.536891,-0.799371,-0.299745,-0.465877,-0.510306,-1.029486,-0.507437,-0.245029,0.598097,0.284648,0.437688,0.424588,0.476773,-0.685411,-1.416025
1,-0.480981,0.397049,-0.097990,-0.346576,0.915995,-0.916916,0.368397,-0.041270,0.328475,0.932995,0.109756,-0.831217,-0.486698,0.026650,0.192681,0.469961,-0.078010,0.094423
2,0.043986,0.288800,-0.578635,-0.047913,-1.164502,0.183762,1.477070,-1.095643,-0.480285,0.194812,0.165785,-0.391633,0.522304,0.547222,0.586576,-0.166498,-0.146695,-0.906856
3,0.578662,-0.597212,-0.869828,0.667883,-0.412220,0.133132,-0.476374,-0.349108,-0.318632,-0.740738,-0.031433,0.010013,-0.252923,-0.443277,0.144176,0.133447,-0.846633,-0.879293
4,0.217012,-0.157841,0.719173,-0.797021,-0.122807,0.202380,0.439612,-0.199732,0.424435,-0.113506,0.820740,0.411707,-0.126022,-0.160087,-0.194163,0.558653,0.592204,-0.463932
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
295,-0.587993,-0.646514,-0.790249,0.113764,-0.198945,-0.260935,0.522722,-0.247145,0.260422,-0.958200,0.230458,-0.169935,-0.148945,-1.095610,-0.008054,0.711424,0.530655,0.805634
296,-0.272748,-0.943018,0.934242,-0.133874,0.382917,-1.122665,0.505663,0.078866,-1.299630,0.299137,1.266096,0.563219,0.230685,-0.303885,0.111611,0.525466,0.748631,-0.643779
297,0.812540,0.518097,0.830598,-0.059918,0.282810,-0.328784,0.250701,-0.337058,-0.416269,-0.509843,0.674339,-0.409828,0.271333,-0.503234,-0.112221,0.270008,-0.085453,-0.245279
298,0.375071,0.337506,-0.835599,0.524931,-0.288430,0.907016,0.524134,0.543544,-0.014001,0.575580,1.112010,0.817586,1.224597,0.199555,0.305399,0.975868,-0.403326,-0.091846


In [64]:
# Getting individual importances dataframe:
result.individual_importances

Unnamed: 0,Resolución,Sistema Operativo,Marca,Precio,Tecnología
0,0.224070,0.101264,0.185805,0.383633,0.105228
1,0.335499,0.348564,0.141415,0.104207,0.070314
2,0.143191,0.402884,0.132299,0.227774,0.093852
3,0.189971,0.154231,0.389105,0.258981,0.007712
4,0.206937,0.118660,0.319890,0.222826,0.131688
...,...,...,...,...,...
295,0.305899,0.180775,0.208538,0.187702,0.117086
296,0.200087,0.207524,0.239249,0.177457,0.175683
297,0.360944,0.186416,0.271433,0.157062,0.024144
298,0.185771,0.216654,0.246573,0.249956,0.101046


In [65]:
# The attribute importances are the mean of the individual importances
result.individual_importances.mean()

Resolución           0.267038
Sistema Operativo    0.192515
Marca                0.193088
Precio               0.224339
Tecnología           0.123020
dtype: float64
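
The notebook does not state how the importances are derived from the utilities. The sketch below assumes the common range-based definition (each attribute's utility range divided by the sum of all ranges, leaving out the `None` column) and applies it to the first row of `individual_utilities` shown above. The function name is hypothetical and this is not necessarily how simpleconjoint computes it internally.

```python
# Utilities of the first respondent (row 0 of individual_utilities), grouped by attribute.
utilities_by_attribute = {
    "Marca": [0.141811, -0.379845, 0.536891],
    "Sistema Operativo": [-0.799371, -0.299745, -0.465877],
    "Tecnología": [-0.510306, -1.029486],
    "Resolución": [-0.507437, -0.245029, 0.598097, 0.284648, 0.437688],
    "Precio": [0.424588, 0.476773, -0.685411, -1.416025],
}


def range_importances(utilities_by_attribute):
    """Assumed definition: utility range per attribute, normalized to sum to 1."""
    ranges = {
        attribute: max(values) - min(values)
        for attribute, values in utilities_by_attribute.items()
    }
    total = sum(ranges.values())
    return {attribute: r / total for attribute, r in ranges.items()}


importances = range_importances(utilities_by_attribute)
for attribute, value in importances.items():
    print(f"{attribute}: {value:.6f}")
```

Under this assumption the values agree with row 0 of `individual_importances` to the displayed precision (for example, `Precio` ≈ 0.3836 and `Marca` ≈ 0.1858).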

### Simulating scenario share of preference with individual utilities for HMNL_Result

You can also simulate the share of preference for a scenario with the method `simulate_share_of_preference(scenario)`, where `scenario` is the simulated scenario dataframe. The simulation computes the probability of each alternative for each individual and then averages those probabilities across individuals.

For each individual the probability of an alternative is calculated with the multinomial logit formula:
$$P_i = \frac{e^{U_i}}{\sum_{j} e^{U_j}}$$

Where `U_i` is the total utility of alternative `i` (the sum of the utilities of the levels present in that alternative), and the sum in the denominator runs over all alternatives `j` in the scenario.
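
The procedure described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name, the data layout (one dict of level utilities per respondent, one list of level names per alternative) and the toy numbers are all made up.

```python
import math


def share_of_preference(individual_utilities, scenario):
    """Average, over respondents, of the multinomial logit probabilities.

    individual_utilities: list of dicts mapping level name -> utility (one per respondent).
    scenario: list of alternatives, each a list of the level names it contains.
    """
    totals = [0.0] * len(scenario)
    for utilities in individual_utilities:
        # Total utility of an alternative = sum of the utilities of its levels.
        exp_u = [math.exp(sum(utilities[level] for level in alt)) for alt in scenario]
        denom = sum(exp_u)
        for i, value in enumerate(exp_u):
            totals[i] += value / denom
    return [t / len(individual_utilities) for t in totals]


# Toy data: two respondents, two single-level alternatives.
toy_utilities = [
    {"A": 0.0, "B": 0.0},            # indifferent: probabilities 0.5 / 0.5
    {"A": math.log(3.0), "B": 0.0},  # prefers A: probabilities 0.75 / 0.25
]
toy_scenario = [["A"], ["B"]]

shares = share_of_preference(toy_utilities, toy_scenario)
print(shares)  # approximately [0.625, 0.375]
```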

In [70]:
# Estimating the share of preference of a scenario
# For example, let's take the first 3 alternatives of the first participant

scenario = df_dum.iloc[:3]
scenario = scenario.drop(columns=["Num Encuestado", "Tarea", "Alt", "Seleccionada"])
scenario

Unnamed: 0,None,Marca_LG,Marca_Samsung,Marca_Sony,Sistema Operativo_Android TV,Sistema Operativo_Apple TV,Sistema Operativo_Propio de la marca,Tecnología_Camara Frontal,Tecnología_Iluminación Inteligente,Resolución_8K Full Ultra HD (7680x4320),Resolución_Full HD (1920x1080),Resolución_NanoCell 4k Ultra HD (3840x2160),Resolución_OLED 4k Ultra HD (3840x2160),Resolución_QLED 4k Ultra HD (3840x2160),Precio_499999.0,Precio_599999.0,Precio_699999.0,Precio_799999.0
0,0,1,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,1
1,0,0,0,1,0,1,0,0,1,0,1,0,0,0,1,0,0,0
2,0,0,1,0,1,0,0,1,0,0,0,1,0,0,1,0,0,0


In [74]:
scenario_simulation = result.simulate_share_of_preference(scenario=scenario)
scenario_simulation

Unnamed: 0,None,Marca_LG,Marca_Samsung,Marca_Sony,Sistema Operativo_Android TV,Sistema Operativo_Apple TV,Sistema Operativo_Propio de la marca,Tecnología_Camara Frontal,Tecnología_Iluminación Inteligente,Resolución_8K Full Ultra HD (7680x4320),Resolución_Full HD (1920x1080),Resolución_NanoCell 4k Ultra HD (3840x2160),Resolución_OLED 4k Ultra HD (3840x2160),Resolución_QLED 4k Ultra HD (3840x2160),Precio_499999.0,Precio_599999.0,Precio_699999.0,Precio_799999.0,Exp(Utility),ShareOP
0,0,1,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,1,99.31335,0.331045
1,0,0,0,1,0,1,0,0,1,0,1,0,0,0,1,0,0,0,96.581255,0.321938
2,0,0,1,0,1,0,0,1,0,0,0,1,0,0,1,0,0,0,104.105395,0.347018


In [75]:
scenario_simulation["ShareOP"]

0    0.331045
1    0.321938
2    0.347018
Name: ShareOP, dtype: float64