## Import libraries

In [1]:
import wooldridge as woo
import pandas as pd
import numpy as np
import statsmodels.api as sm

## Load dataset

In [2]:
df = woo.data("charity")
df.head()

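|   | respond | gift | resplast | weekslast | propresp | mailsyear | giftlast | avggift |
|---|---------|------|----------|-----------|----------|-----------|----------|---------|
| 0 | 0 | 0 | 0 | 143.0 | 0.3 | 2.5 | 10 | 10.0 |
| 1 | 0 | 0 | 0 | 65.428574 | 0.3 | 2.5 | 10 | 10.0 |
| 2 | 0 | 0 | 1 | 13.142858 | 0.3 | 2.5 | 10 | 10.0 |
| 3 | 0 | 0 | 0 | 120.14286 | 0.3 | 2.5 | 10 | 10.0 |
| 4 | 1 | 10 | 0 | 103.85714 | 0.2 | 2.5 | 10 | 10.0 |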


## (i) Average donation and % of people with gift = 0

In [3]:
avg_gift = df['gift'].mean()
pct_zero = (df['gift'] == 0).mean() * 100
print(f"Average donation: {avg_gift:.2f} guilders")
print(f"Percentage of people with no donation: {pct_zero:.2f}%")

Average donation: 7.44 guilders
Percentage of people with no donation: 60.00%


## (ii) Average, minimum and maximum number of mailings per year (mailsyear)

In [4]:
avg_mails = df['mailsyear'].mean()
min_mails = df['mailsyear'].min()
max_mails = df['mailsyear'].max()
print(f"Average mailings per year: {avg_mails:.2f}")
print(f"Minimum mailings: {min_mails}, Maximum mailings: {max_mails}")

Average mailings per year: 2.05
Minimum mailings: 0.25, Maximum mailings: 3.5


## (iii) Regression: $gift = \beta_0 + \beta_1 \cdot mailsyear + u$

In [5]:
X = sm.add_constant(df['mailsyear'])
y = df['gift']
model = sm.OLS(y, X).fit()
print("Regression results:")
print(model.summary())

Regression results:
                            OLS Regression Results                            
Dep. Variable:                   gift   R-squared:                       0.014
Model:                            OLS   Adj. R-squared:                  0.014
Method:                 Least Squares   F-statistic:                     59.65
Date:                Mon, 29 Sep 2025   Prob (F-statistic):           1.40e-14
Time:                        16:34:34   Log-Likelihood:                -17602.
No. Observations:                4268   AIC:                         3.521e+04
Df Residuals:                    4266   BIC:                         3.522e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          2.0141      0.739

## (iv) Interpretation of slope coefficient

Each additional mailing per year is associated with an estimated increase of 2.65 guilders in the expected gift.

If each mailing costs 1 guilder, the expected net gain per additional mailing is $\hat{\beta}_1 - 1 = 2.65 - 1 = 1.65$ guilders.

However, this is an average effect. It does not mean *every* mailing is profitable, since some people donate nothing or less than the mailing cost.
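
The net-gain arithmetic above can be sketched as a small helper. This is only an illustration: the function name `net_gain_per_mailing` is hypothetical, the slope 2.65 is the rounded estimate quoted in the text, and the 1-guilder cost is the assumption stated above.

```python
def net_gain_per_mailing(slope: float, cost: float) -> float:
    """Expected net gain from one extra mailing: marginal gift minus mailing cost."""
    return slope - cost


slope_hat = 2.65     # rounded slope estimate quoted in the text (assumption)
mailing_cost = 1.0   # assumed cost per mailing, in guilders

gain = net_gain_per_mailing(slope_hat, mailing_cost)
print(f"Expected net gain per additional mailing: {gain:.2f} guilders")
```

A positive value here describes the average over all recipients only; it says nothing about whether any individual mailing pays for itself.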

## (v) Smallest predicted contribution

In [6]:
df['gift_pred'] = model.predict(X)
min_pred = df['gift_pred'].min()
print(f"Smallest predicted gift: {min_pred:.2f} guilders")

Smallest predicted gift: 2.68 guilders
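
As a cross-check, the smallest prediction can be reproduced by hand: with a positive slope, the fitted line is lowest at the smallest observed `mailsyear`. This sketch assumes the intercept 2.0141 from the regression output, the rounded slope 2.65 from part (iv), and the minimum of 0.25 mailings from part (ii). The variable names are illustrative.

```python
intercept_hat = 2.0141   # intercept from the regression output
slope_hat = 2.65         # rounded slope quoted in part (iv) (assumption)
min_mailsyear = 0.25     # smallest observed mailsyear from part (ii)

# With a positive slope, the lowest fitted value occurs at the smallest regressor value.
min_pred_by_hand = intercept_hat + slope_hat * min_mailsyear
print(f"Smallest predicted gift (by hand): {min_pred_by_hand:.2f} guilders")

# Value of mailsyear at which the fitted line would cross zero.
zero_at = -intercept_hat / slope_hat
print(f"Fitted gift equals zero at mailsyear = {zero_at:.2f}")
```

Because the zero crossing falls at a negative number of mailings, this simple regression never predicts a gift of zero over feasible values of `mailsyear`, even though 60% of the sample gave nothing.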
