A typical model is described by the following equation:
$$\text{outcome}_t=A+B\times\text{Predictor}_t+C\times\text{Controls}_t$$
<br>
Examples:
<ul>
<li>Predicting revenue per customer from online presence; controls are age and income.</li>
<li>Predicting the response to a car offer; controls are demographics.</li>
</ul>
<br>
<b>Assumption: The impact of the predictor does not depend on the control variables. Is it reasonable?</b>
<br><br>
We will learn how to relax this assumption.


<b>Let’s consider the determinants of executive pay as an example for interpreting interaction effects</b>

Questions:<br>
<ol>
<li>What is the relationship between years of work experience and executive pay?</li>
<li>Does the executive pay for MBAs and non-MBAs differ?</li>
<li>Is the relationship between experience and executive pay different for MBAs and non-MBAs?
    <ul><li>is there an interaction between experience and having an MBA?</li></ul>
</li>
</ol>
To Answer:
<ul>
    <li>Sample of 1000 executives</li>
<li>salary:                 Salary of executive</li>
<li>experience:         Years of work experience</li>
<li>MBA:                   MBA degree of executive (0 if no MBA, 1 if MBA)</li>
</ul>


In [1]:
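try:
    # Mount Google Drive when running on Colab; skip this step elsewhere.
    from google.colab import drive
    drive.mount('/content/drive',force_remount=False)
    import os
    os.chdir("/content/drive/MyDrive/Teaching/2022-2023/Python/shared")
except ModuleNotFoundError:
    pass  # not on Colab: the data/ folder is assumed to be in the working directory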

In [None]:
import pandas
import mba263
import matplotlib.pyplot as plt

In [None]:
data = pandas.read_csv('data/salary.csv')

In [None]:
data.head(10)

<b>What is the relationship between years of work experience and executive pay?</b>
<br><br>
$$\text{salary}_t=A+B\times\text{experience}_t$$
<br>

In [None]:
mba263.regress(data['salary'],data['experience']).summary()

In [None]:
result1=mba263.regress(data['salary'],data['experience'])
data['prediction1']=result1.predict()
data.plot.scatter(x='experience',y='salary')
datasorted=data.sort_values('experience')
plt.plot(datasorted['experience'],datasorted['prediction1'],'r')

<b>Experience is related to education: if more experienced executives also tend to have more education,
the experience coefficient partly reflects education. Let's hold education fixed by controlling for the MBA dummy.</b>
<br><br>
$$\text{salary}_t=A+B\times\text{experience}_t+C\times\text{MBA}_t$$
<br>

In [None]:
mba263.regress(data['salary'],data[ ['experience','mba'] ]).summary()

Now we can plot this model.

First we will save the output in <code> result2</code>, and store the fitted/predicted values in the original data frame.

To visualize this, first we plot the raw data: x axis is experience, y axis is salary.

Next we create 3 new data frames: one that sorts on experience, then two that have MBA or no MBA.

These new data frames allow us to plot the predictions by experience for the MBA and non-MBA groups as two separate lines.

Why do we sort the data? matplotlib connects the points in the order they appear in the data, so if we don't sort by experience first, the line zigzags back and forth instead of showing a clean fitted line.

In [None]:
result2=mba263.regress(data['salary'],data[ ['experience','mba'] ])
data['prediction2']=result2.predict()
data.plot.scatter(x='experience',y='salary')
datasorted=data.sort_values('experience')
datasorted_mba=datasorted[datasorted['mba']==1]
datasorted_nonmba=datasorted[datasorted['mba']==0]
plt.plot(datasorted_mba['experience'],datasorted_mba['prediction2'],'r')
plt.plot(datasorted_nonmba['experience'],datasorted_nonmba['prediction2'],'g')

In [None]:
data['mba_experience']=data['mba']*data['experience']

<b>Let's allow each year of experience to have a different effect on salary for MBAs and non-MBAs.</b>
<br><br>
$$\text{salary}_t=A+B\times\text{experience}_t+C\times\text{MBA}_t+D\times\text{MBA}_t\times\text{experience}_t$$
<br>

Above, we manually create this <i> interaction variable </i> by taking the product of <code> data['mba'] </code> and <code> data['experience'] </code>

Then we can look at the regression result:

In [None]:
mba263.regress(data['salary'],data[ ['experience','mba','mba_experience'] ]).summary()

We'll repeat the plotting approach from above. This time, note that the predictions for the two groups have different slopes.

In [None]:
result3=mba263.regress(data['salary'],data[ ['experience','mba','mba_experience'] ])
data['prediction3']=result3.predict()
data.plot.scatter(x='experience',y='salary')
datasorted=data.sort_values('experience')
datasorted_mba=datasorted[datasorted['mba']==1]
datasorted_nonmba=datasorted[datasorted['mba']==0]
plt.plot(datasorted_mba['experience'],datasorted_mba['prediction3'],'r')
plt.plot(datasorted_nonmba['experience'],datasorted_nonmba['prediction3'],'g')

In [None]:
result3.summary()

What values do interacted terms actually take?

In [None]:
data[['experience']].describe()

In [None]:
data.groupby('mba')['mba_experience'].describe()

$$\text{salary}_t=A+B\times\text{experience}_t+C\times\text{MBA}_t+D\times\text{MBA}_t\times\text{experience}_t$$
<br><br>

How do we get fitted values?

non-MBA -> Salary = A + C * 0+ B * Experience + D * 0 = A + B * Experience<br>
MBA -> Salary = A + C * 1+ B * Experience+ D * 1 * Experience = (A+C) + (B + D) * Experience<br>
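
The algebra above can be checked numerically. The sketch below uses synthetic data with made-up coefficients (not the course's `salary.csv`) and plain NumPy least squares in place of `mba263.regress`. Under those assumptions, it shows that the interaction model implies one line per group, and that these lines match separate regressions run on each subsample.

```python
import numpy as np

# Synthetic data for illustration only; the "true" coefficients are made up.
rng = np.random.default_rng(0)
n = 1000
experience = rng.uniform(0, 30, n)
mba = rng.integers(0, 2, n)  # 0 = no MBA, 1 = MBA
salary = (50 + 2 * experience + 10 * mba + 1.5 * mba * experience
          + rng.normal(0, 5, n))

# Interaction model: constant, experience, MBA, MBA x experience.
X = np.column_stack([np.ones(n), experience, mba, mba * experience])
A, B, C, D = np.linalg.lstsq(X, salary, rcond=None)[0]

# Group-specific (intercept, slope) implied by the interaction model.
nonmba_line = (A, B)
mba_line = (A + C, B + D)


def fit_line(x, y):
    """Return (intercept, slope) from a least-squares fit of y on x."""
    design = np.column_stack([np.ones(len(x)), x])
    return tuple(np.linalg.lstsq(design, y, rcond=None)[0])


# Separate regressions on each subsample.
sep_nonmba = fit_line(experience[mba == 0], salary[mba == 0])
sep_mba = fit_line(experience[mba == 1], salary[mba == 1])

print("non-MBA (interaction model):", nonmba_line)
print("non-MBA (separate fit):     ", sep_nonmba)
print("MBA (interaction model):    ", mba_line)
print("MBA (separate fit):         ", sep_mba)
```

The advantage of the single interaction regression over two separate fits is that the coefficient D directly measures the difference in slopes, so its standard error tells us whether that difference is statistically distinguishable from zero.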


We might also consider interactions with gender - average salary might be lower for women, conditional on experience.

In [None]:
mba263.regress(data['salary'],data['female']).summary()

In [None]:
data['mba_female']=data['female']*data['mba']

In [None]:
mba263.regress(data['salary'],data[ ['female','mba','mba_female'] ]).summary()

male, non-MBA -> Salary = a + f * 0 + mba * 0 + mba_female * 0 = a <br>
male, MBA -> Salary = a + f * 0 + mba * 1 + mba_female * 0 = a + mba <br>
female, non-MBA -> Salary = a + f * 1 + mba * 0 + mba_female * 0 = a + f <br>
female, MBA -> Salary = a + f * 1 + mba * 1 + mba_female * 1 = a + f + mba + mba_female <br>
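
With two dummies and their interaction, the model has one parameter per group, so the four fitted values are simply the four group means. The sketch below illustrates this with synthetic data and NumPy least squares; the data, the coefficient values, and the names `b_mba` and `b_mba_female` are assumptions for illustration, not output from the course data.

```python
import numpy as np

# Synthetic data for illustration only; the "true" coefficients are made up.
rng = np.random.default_rng(1)
n = 2000
female = rng.integers(0, 2, n)
mba = rng.integers(0, 2, n)
salary = (100 - 8 * female + 20 * mba - 5 * female * mba
          + rng.normal(0, 10, n))

# Regression with both dummies and their interaction.
X = np.column_stack([np.ones(n), female, mba, female * mba])
a, f, b_mba, b_mba_female = np.linalg.lstsq(X, salary, rcond=None)[0]

# Fitted value for each of the four groups, following the formulas above.
fitted = {
    ("male", "non-MBA"): a,
    ("male", "MBA"): a + b_mba,
    ("female", "non-MBA"): a + f,
    ("female", "MBA"): a + f + b_mba + b_mba_female,
}

# Raw group means computed directly from the data.
cell_means = {
    ("male", "non-MBA"): salary[(female == 0) & (mba == 0)].mean(),
    ("male", "MBA"): salary[(female == 0) & (mba == 1)].mean(),
    ("female", "non-MBA"): salary[(female == 1) & (mba == 0)].mean(),
    ("female", "MBA"): salary[(female == 1) & (mba == 1)].mean(),
}

for group in fitted:
    print(group, round(fitted[group], 3), round(cell_means[group], 3))
```

In this setup the interaction coefficient is the difference between the MBA premium for women and the MBA premium for men.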


<ol>
<li>Interaction between a continuous variable and a dummy:
<ul><li>Is experience associated with salary differently for MBAs and non-MBAs? Is there an interaction between experience (a continuous variable) and having an MBA (a dummy variable)?</li></ul>
</li>
<li>Interaction between two dummies:
<ul><li>Is the association between having an MBA and salary different for men and women? Is there an interaction between having an MBA (a dummy variable) and gender (a dummy variable)?</li></ul>
</li>
<li>Interaction between two continuous variables:
<ul><li>Do wealthy people respond to price changes the same way as everyone else? Is there an interaction between price (a continuous variable) and wealth (a continuous variable)?</li></ul>
</li>
</ol>



### Example: House price prediction

We'll start with running the regression:

$$\text{price}_i=A+B\times\text{rooms}_i+C\times\text{crime}_i+D\times\text{stratio}_i$$
<br><br>

Housing prices depend on the number of rooms in the house, the local crime rate, and the student-teacher ratio (quality of schools).

In [None]:
prices=pandas.read_csv('data/house_prices.csv')
prices.describe()

In [None]:
mba263.regress(prices['price'],prices[['rooms','crime','stratio']]).summary()

What do we see from these results? House prices are higher when the house is larger, and lower when crime is higher or class sizes are larger in local schools.

We might think that the quality of schools affects the premium of a larger house. For example, larger families may place extra value on good schools.


In such a case, we'd run a regression like

$$\text{price}_i=A+B\times\text{rooms}_i+C\times\text{crime}_i+D\times\text{stratio}_i + E\times \text{rooms}_i \times \text{stratio}_i$$

Then, we can think of the effect of an additional room on prices as

$$\frac{\Delta \text{price}}{\Delta \text{rooms}} = B + E\times \text{stratio}_i $$

In [None]:
prices['rooms_ratio']=prices['rooms']*prices['stratio']

In [None]:
price_res=mba263.regress(prices['price'],prices[['rooms','crime','stratio','rooms_ratio']])
price_res.summary()

# We need to be careful interpreting the coefficients on rooms, stratio, and their interaction!

Now the coefficient on stratio alone measures the effect of stratio only when rooms = 0. Does that ever occur in the data?

In [None]:
plt.hist(prices['rooms'],bins=10)
plt.xlabel('Number of Rooms in House')
plt.ylabel('Number of Observations')
plt.show()

If we want to know the average effect of an additional student per class, we can say:

$$\text{Avg}(\frac{\Delta \text{price}}{\Delta \text{stratio}}) = D + E\times \text{Avg(rooms)} $$


In [None]:
price_res.params[3] + price_res.params[4]*prices['rooms'].mean()

Similarly, the average effect of the number of rooms on prices is not measured by B alone: it also includes the interaction coefficient E multiplied by the average student-teacher ratio.

$$\text{Avg}(\frac{\Delta \text{price}}{\Delta \text{rooms}}) = B + E\times \text{Avg(stratio)} $$


In [None]:
price_res.params[1] + price_res.params[4]*prices['stratio'].mean()

We might also compute these marginal effects of one variable at a particular level of the interacted variable (rather than at the mean).
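
As a rough sketch of evaluating the effect at particular levels, the block below fits the interaction model on synthetic data (made-up coefficients, not `house_prices.csv`) with NumPy least squares, then evaluates the effect of one more room at the 25th, 50th, and 75th percentiles of the student-teacher ratio. The helper names `rooms_effect` and `predict` are hypothetical.

```python
import numpy as np

# Synthetic data for illustration only; the "true" coefficients are made up.
rng = np.random.default_rng(2)
n = 500
rooms = rng.uniform(3, 9, n)
crime = rng.uniform(0, 10, n)
stratio = rng.uniform(12, 22, n)
price = (10 + 9 * rooms - 0.2 * crime + 1.0 * stratio
         - 0.4 * rooms * stratio + rng.normal(0, 2, n))

# Columns: constant, rooms, crime, stratio, rooms x stratio.
X = np.column_stack([np.ones(n), rooms, crime, stratio, rooms * stratio])
A, B, C, D, E = np.linalg.lstsq(X, price, rcond=None)[0]


def rooms_effect(stratio_level):
    """Predicted change in price from one more room at a given stratio."""
    return B + E * stratio_level


def predict(r, c, s):
    """Fitted price for a house with r rooms, crime rate c, and stratio s."""
    return A + B * r + C * c + D * s + E * r * s


# Evaluate the effect of one more room at three levels of stratio.
levels = np.percentile(stratio, [25, 50, 75])
for s in levels:
    print(f"stratio = {s:5.2f}: effect of one more room = {rooms_effect(s):.3f}")
```

Because the model is linear in rooms for a fixed stratio, `rooms_effect(s)` equals the difference in fitted prices between two otherwise identical houses that differ by one room.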

### Let's try this with purchase data!

In [None]:
df=pandas.read_csv('data/logit_interactions.csv')
df.describe()

We have 3 explanatory variables: Price (randomly assigned in this example), Snow (whether the customer lives somewhere with snowy winters), and Ad (whether the customer saw an ad). The dependent variable is a binary (dummy) variable for purchase. The product is targeted at drivers in areas with snowy winters.


In [None]:
res_logit=mba263.logit(df['purch'],df['price'])
df['predicted']=res_logit.predict()
res_logit.summary()

Let's see if the purchase frequency is the same in different winter environments and between ad and no-ad customers

In [None]:
df.groupby(['snow'])['purch'].mean()

In [None]:
df.groupby(['ad'])['purch'].mean()

Maybe we should control for whether the customer lives somewhere snowy and whether they saw advertising...

In [None]:
res_logit2=mba263.logit(df['purch'],df[['price','snow','ad']])
res_logit2.summary()

In [None]:
mba263.odds_ratios(res_logit2)

We might also think that price sensitivity differs in snowy areas and/or when customers are exposed to ads,
so we need to construct these interactions...

In [None]:
df.groupby(['snow'])['ad'].value_counts()

The firm runs ads only in snowy areas, so the ad effect is already the interaction effect!
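
To see why no separate snow × ad term is needed, here is a tiny sketch with a made-up toy sample in which ads appear only where `snow == 1`. In that case the product of the two dummies is identical to the ad dummy itself, so adding it to the regression would create a perfectly collinear column.

```python
import numpy as np

# Toy sample (made up): ads are shown only to customers in snowy areas.
snow = np.array([0, 0, 0, 1, 1, 1, 1])
ad = np.array([0, 0, 0, 0, 1, 1, 0])

# The would-be interaction term between ad and snow.
ad_snow = ad * snow

# The interaction column is an exact copy of the ad column.
print("interaction equals ad dummy:", np.array_equal(ad_snow, ad))

# A design matrix that includes both has fewer independent columns than columns.
X = np.column_stack([np.ones(len(snow)), snow, ad, ad_snow])
print("columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
```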

In [None]:
df['snow_price']=df['price']*df['snow']
df['ad_price']=df['price']*df['ad']


In [None]:
res_logit3=mba263.logit(df['purch'],df[['price','snow_price','ad_price','snow','ad']])
res_logit3.summary()
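
One way to read a model like this is to add up the price terms for each group. The sketch below uses hypothetical coefficient values (made up for illustration, not the output of `res_logit3`) and assumes the coefficients are on the log-odds scale. The helper names `price_slope` and `purchase_prob` are also assumptions.

```python
import numpy as np

# Hypothetical coefficients on the log-odds scale, made up for illustration.
# Replace them with the fitted values from the regression output.
coef = {
    "const": 1.0,
    "price": -0.50,
    "snow_price": 0.20,
    "ad_price": 0.10,
    "snow": 0.5,
    "ad": 0.3,
}


def price_slope(snow, ad):
    """Change in the log-odds of purchase per unit of price for a group."""
    return coef["price"] + coef["snow_price"] * snow + coef["ad_price"] * ad


def purchase_prob(price, snow, ad):
    """Predicted purchase probability from the logistic function."""
    log_odds = (coef["const"] + coef["snow"] * snow + coef["ad"] * ad
                + price_slope(snow, ad) * price)
    return 1.0 / (1.0 + np.exp(-log_odds))


# Only three groups exist in the data, since ads run only in snowy areas.
for snow, ad in [(0, 0), (1, 0), (1, 1)]:
    print(f"snow={snow}, ad={ad}: price slope = {price_slope(snow, ad):.2f}, "
          f"P(purchase | price=2) = {purchase_prob(2.0, snow, ad):.3f}")
```

With these made-up numbers, customers in snowy areas who saw an ad have the least negative price slope, which would mean they are the least price sensitive.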

### Let's return to our donation example from class 8:

Now, we have experimental treatments. Households were approached for donation with three types of offers:

- Voluntary contribution

- Small Gift

- Large Gift

First, let's run the simple regression of donation amount on treatment type.

In [None]:
df=pandas.read_csv('data/donor.csv')
df.head()

In [None]:
result_0=mba263.regress(df['Donation'],df[['Small_gift','Large_gift']])
result_0.summary()

How to interpret?

Relative to the voluntary contribution baseline (the omitted category), offering a small gift does not enhance donation volume, but offering a large gift does.

#### Interaction with being an existing customer

We saw before that households who had been approached before and donated (the "Warm List" of existing customers) donated at higher rates. Let's try to understand the interaction between prior donation status and the effect of different solicitation mechanisms.

First we'll need to create these interaction terms:

In [None]:
df['Warm_VCM']=df['Warm_List']*df['VCM']
df['Warm_small']=df['Warm_List']*df['Small_gift']
df['Warm_large']=df['Warm_List']*df['Large_gift']

We also created a third interaction term, `Warm_VCM`, which is the interaction between the voluntary contribution mechanism (no gift) and having donated in the past.

In [None]:
result_1=mba263.regress(df['Donation'],df[['Small_gift','Large_gift','Warm_VCM','Warm_small','Warm_large']])
result_1.summary()

The constant term is the mean donation for cold-list households under the voluntary contribution mechanism. We could see this without a regression:

In [None]:
df.loc[(df['Warm_List']==0)&(df['Small_gift']==0)&(df['Large_gift']==0), 'Donation'].mean()

How can we interpret the rest of the coefficients?

$$\text{donation}_i=A+B\times\text{Small Gift}_i+C\times\text{Large Gift}_i+
D\times\text{Warm List}_i \times \text{VCM}_i + \\ E\times\text{Warm List}_i \times \text{Small Gift}_i +
F\times\text{Warm List}_i \times \text{Large Gift}_i$$

<br><br>

How do we get fitted values?

Cold List -> Donation = A + B * Small Gift + C * Large Gift + D * 0  + E * 0 + F * 0 = A + B * Small Gift + C* Large Gift<br>
Warm List -> Donation = A + B * Small Gift + C * Large Gift + D * VCM  + E * Small Gift + F * Large Gift  = A + (B+E) * Small Gift + (C+F)* Large Gift + D * VCM<br>


<br>
Or, for example,

Cold List, Small Gift -> Donation = A + B <br>
Warm List, Small Gift -> Donation = A + B + E
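
The same bookkeeping works for all six groups. The sketch below uses synthetic data with made-up effects (not `donor.csv`) and NumPy least squares. Under those assumptions, it shows that each combination of coefficients reproduces the mean donation in the corresponding group. The helper `cell_mean` is hypothetical.

```python
import numpy as np

# Synthetic data for illustration only; the "true" effects are made up.
rng = np.random.default_rng(3)
n = 3000
treatment = rng.integers(0, 3, n)  # 0 = VCM, 1 = small gift, 2 = large gift
warm = rng.integers(0, 2, n)       # 1 = warm list, 0 = cold list
vcm = (treatment == 0).astype(float)
small = (treatment == 1).astype(float)
large = (treatment == 2).astype(float)
donation = (1.0 + 0.2 * small + 0.8 * large
            + warm * (1.5 * vcm + 2.0 * small + 1.0 * large)
            + rng.normal(0, 1, n))

# Same structure as the regression above: two treatment dummies
# plus three warm-list interactions.
X = np.column_stack([np.ones(n), small, large,
                     warm * vcm, warm * small, warm * large])
A, B, C, D, E, F = np.linalg.lstsq(X, donation, rcond=None)[0]


def cell_mean(is_warm, gift):
    """Raw mean donation for one warm/cold by treatment group."""
    return donation[(warm == is_warm) & (treatment == gift)].mean()


# Fitted value for each group, keyed by (warm, treatment).
fitted = {
    (0, 0): A,          # cold list, VCM
    (0, 1): A + B,      # cold list, small gift
    (0, 2): A + C,      # cold list, large gift
    (1, 0): A + D,      # warm list, VCM
    (1, 1): A + B + E,  # warm list, small gift
    (1, 2): A + C + F,  # warm list, large gift
}

for (w, g), value in fitted.items():
    print(f"warm={w}, treatment={g}: fitted={value:.3f}, "
          f"mean={cell_mean(w, g):.3f}")
```

Read this way, D, E, and F are the warm-list premiums within each treatment, while B and C are the gift effects among cold-list households.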

Next: <b> Do Solicitor Characteristics Matter? </b>

In [None]:
result_2=mba263.regress(df['Donation'],df[['Small_gift','Large_gift','Warm_VCM','Warm_small','Warm_large', 
                                           'Assertive','Social','Efficacy','Performance','Confidence']])
result_2.summary()