## Experimental design and partial factorials

### Generate a partial factorial design using information from the response table shown.

The smallest appropriate partial factorial has 8 trials. The 8-trial solution has a perfect D-efficiency score of 1, shows no correlation between the variables (the correlation matrix is the identity, so its determinant is 1), and is balanced: `trial` and `gift` each occur in 4 trials, `speed` and `power` each occur in 4 trials, and `USD150`, `USD160`, `USD170`, and `USD180` each occur in 2 trials.

In [2]:
%reload_ext rpy2.ipython

In [3]:
%%R
result <- radiant.design::doe(
  factors = c(
    "price; USD150; USD160; USD170; USD180", 
    "message; speed; power", 
    "promotion; trial; gift"
  ),
  seed = 1234
)
summary(result, eff = TRUE, part = FALSE, full = FALSE)

Experimental design
# trials for partial factorial: 8 
# trials for full factorial   : 16 
Random seed                   : 1234 

Attributes and levels:
price: USD150, USD160, USD170, USD180 
message: speed, power 
promotion: trial, gift 

Design efficiency:
 Trials D-efficiency Balanced
      6        0.135    FALSE
      7        0.082    FALSE
      8        1.000     TRUE

Partial factorial design correlations:
** Note: Variables are assumed to be ordinal **
          price message promotion
price         1       0         0
message       0       1         0
promotion     0       0         1

Estimable effects from partial factorial design:

  price|USD160
  price|USD170
  price|USD180
  message|power
  promotion|gift
  price|USD160:message|power
  price|USD170:message|power 


We can now regenerate the design, requesting exactly 8 trials (profiles).

In [4]:
%%R
result <- radiant.design::doe(
  factors = c(
    "price; USD150; USD160; USD170; USD180",
    "message; speed; power",
    "promotion; trial; gift"
  ),
  trials = 8,
  seed = 1234
)
summary(result, eff = TRUE, part = TRUE, full = TRUE)
# Save the partial factorial; readr >= 1.4.0 uses `file` (the older `path` argument is deprecated)
readr::write_csv(result$part, file = "bizware-partial-factorial.csv")

Experimental design
# trials for partial factorial: 8 
# trials for full factorial   : 16 
Random seed                   : 1234 

Attributes and levels:
price: USD150, USD160, USD170, USD180 
message: speed, power 
promotion: trial, gift 

Design efficiency:
 Trials D-efficiency Balanced
      8        1.000     TRUE

Partial factorial design correlations:
** Note: Variables are assumed to be ordinal **
          price message promotion
price         1       0         0
message       0       1         0
promotion     0       0         1

Partial factorial design:
 trial  price message promotion
     1 USD150   speed     trial
     4 USD150   power      gift
     5 USD160   speed     trial
     8 USD160   power      gift
    10 USD170   speed      gift
    11 USD170   power     trial
    14 USD180   speed      gift
    15 USD180   power     trial

Estimable effects from partial factorial design:

  price|USD160
  price|USD170
  price|USD180
  message|power
  promotion|gift
  price|USD16

Recall that multiple partial factorials may exist that "solve" the experimental design problem.
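
The balance and zero-correlation properties reported in the summary can be checked by hand from the eight profiles listed in the partial factorial design. The sketch below is plain Python and does not reproduce what `radiant.design` does internally. It assumes the same ordinal coding that the output note mentions, with levels numbered in the order they were specified.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# The eight profiles from the partial factorial design printed above
design = [
    ("USD150", "speed", "trial"),
    ("USD150", "power", "gift"),
    ("USD160", "speed", "trial"),
    ("USD160", "power", "gift"),
    ("USD170", "speed", "gift"),
    ("USD170", "power", "trial"),
    ("USD180", "speed", "gift"),
    ("USD180", "power", "trial"),
]
factors = ["price", "message", "promotion"]
levels = {
    "price": ["USD150", "USD160", "USD170", "USD180"],
    "message": ["speed", "power"],
    "promotion": ["trial", "gift"],
}


def pearson(x, y):
    """Pearson correlation between two equally long numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Balance: within each factor, every level appears equally often
counts = {f: Counter(row[i] for row in design) for i, f in enumerate(factors)}
balanced = all(len(set(c.values())) == 1 for c in counts.values())

# Ordinal coding (assumption): each level is replaced by its position in `levels`
coded = {f: [levels[f].index(row[i]) for row in design] for i, f in enumerate(factors)}
correlations = {
    (a, b): pearson(coded[a], coded[b]) for a, b in combinations(factors, 2)
}

print(balanced)
print(correlations)
```

Any other 8-trial design that passes both checks would be an equally valid solution under these two criteria.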

### Estimate a logistic regression based on the response table shown and predict response for all profiles

Open the `bizware.xls` file. After loading the data into Jupyter, we first need to create _positive response_ and _negative response_ variables. Then we can `melt` the data and estimate the logistic regression using (frequency) weights. As stated in the Excel file, assume the sample size for each cell was 2,000.
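
To make the reshaping step concrete, here is a minimal sketch in plain Python of what the long ("melted") format looks like. The two rows, the `yes` column name, and the response counts are made up for illustration and are not taken from `bizware.xls`. Only the cell size of 2,000 comes from the text, and the names `resp_yes` and `freq` follow the model code used later in this notebook.

```python
CELL_SIZE = 2000  # sample size per cell, as stated in the Excel file

# Hypothetical wide-format rows: one row per profile with a made-up
# count of positive responses (illustrative values only)
wide = [
    {"price": "USD150", "message": "speed", "promotion": "trial", "yes": 120},
    {"price": "USD150", "message": "power", "promotion": "gift", "yes": 90},
]


def to_long(rows, cell_size):
    """Split each profile into a positive-response and a negative-response row."""
    long_rows = []
    for row in rows:
        profile = {k: v for k, v in row.items() if k != "yes"}
        # Positive responses: resp_yes = 1, weighted by the number of "yes" answers
        long_rows.append({**profile, "resp_yes": 1, "freq": row["yes"]})
        # Negative responses: resp_yes = 0, weighted by the remaining cell members
        long_rows.append({**profile, "resp_yes": 0, "freq": cell_size - row["yes"]})
    return long_rows


long_rows = to_long(wide, CELL_SIZE)
for r in long_rows:
    print(r)
```

Each profile ends up in two rows whose `freq` values sum to the cell size, which is what lets the regression treat `freq` as a frequency weight.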


Load information from **data/bizware.xls**:

In [None]:
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.genmod.families import Binomial
from statsmodels.genmod.families.links import logit
import pyrsm as rsm

bizware = pd.read_excel("data/bizware.xls", nrows=8)
bizware.dtypes

Create a dataset from the bizware data that can be used for estimation.

In [None]:
bizware_melt = 

The results from the logistic regression model are given below.

In [None]:
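# `evar` must hold the names of the explanatory variables in bizware_melt
form = "resp_yes ~ " + " + ".join(evar)
# Binomial GLM with a logit link; each row is weighted by its cell frequency
lr = smf.glm(
    formula=form,
    family=Binomial(link=logit()),
    freq_weights=bizware_melt.freq,
    data=bizware_melt,
).fit()
# Odds ratios with confidence intervals
rsm.or_ci(lr)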
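
When reading the odds-ratio table, it helps to recall how an odds ratio relates to a logistic regression coefficient. The sketch below uses a hypothetical coefficient and standard error (not results from the bizware model) and the usual normal-approximation interval. It is not a description of how `rsm.or_ci` is implemented.

```python
import math

# Hypothetical values for one dummy variable (illustration only)
coef = 0.25  # log-odds coefficient
se = 0.05    # standard error of the coefficient
z = 1.96     # approximate 97.5th percentile of the standard normal

# The odds ratio is the exponentiated coefficient
odds_ratio = math.exp(coef)

# Build the interval on the log-odds scale, then exponentiate the bounds
ci_low = math.exp(coef - z * se)
ci_high = math.exp(coef + z * se)

# Percentage change in the odds relative to the baseline level
pct_change = (odds_ratio - 1) * 100

print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3), round(pct_change, 1))
```

An odds ratio above 1 means the level raises the odds of a positive response relative to the baseline level. An interval that excludes 1 corresponds to a coefficient that differs significantly from 0 at roughly the 5% level.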

The prediction below only produces the 'partial' results, i.e., predictions for the profiles that are part of the partial factorial design. Compare these predictions with the predictions for all profiles generated further below.

In [None]:
bizware_melt["prediction_partial"] = lr.predict(bizware_melt)
bizware_melt

The easiest way to generate predictions for all possible profiles (trials) is to use the `levels_list` and `expand_grid` functions in pyrsm. Pass the newly created dataset to the logistic regression object estimated above to generate the desired predictions.
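
As a rough illustration of the idea, the sketch below builds all 16 profiles with `itertools.product` and scores them with a main-effects logit. It does not use pyrsm, and the coefficients are hypothetical placeholders, not estimates from the bizware data. The baseline levels (`USD150`, `speed`, `trial`) match the estimable effects listed earlier.

```python
import math
from itertools import product

levels = {
    "price": ["USD150", "USD160", "USD170", "USD180"],
    "message": ["speed", "power"],
    "promotion": ["trial", "gift"],
}

# Every combination of levels: 4 x 2 x 2 = 16 profiles
all_profiles = [dict(zip(levels, combo)) for combo in product(*levels.values())]

# Hypothetical main-effects coefficients on the log-odds scale (illustration only);
# baseline levels carry an implicit coefficient of 0
intercept = -3.0
effects = {
    ("price", "USD160"): -0.1,
    ("price", "USD170"): -0.2,
    ("price", "USD180"): -0.3,
    ("message", "power"): 0.2,
    ("promotion", "gift"): 0.1,
}


def predict(profile):
    """Predicted response probability for one profile."""
    xb = intercept + sum(effects.get(item, 0.0) for item in profile.items())
    return 1 / (1 + math.exp(-xb))


predictions = [{**p, "prediction": predict(p)} for p in all_profiles]
for row in predictions:
    print(row)
```

Because the model contains main effects only, it can score the 8 profiles that were never tested as well as the 8 that were.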

See also the following tutorial video https://youtu.be/lk3ufN2igOo