# Bonferroni Exercise 

Consider a case where we are performing validation of an agent-based model examining the hunt for U-Boats in the Bay of Biscay during WWII. The output from the model is contained in the code cells below.

There are 2 metrics of interest: U-Boats sighted and U-Boats killed. The model was run under 2 separate scenarios to capture the ability of the model to represent the technological innovations employed by the Axis and Allied powers during this campaign. Each scenario was replicated 20 times. 

-----------------------

_(Select a cell by clicking it, then use Ctrl+Enter or Shift+Enter to run that cell. You can also use the Run button in the toolbar.)_

In [2]:
import pandas as pd
from io import StringIO
from scipy import stats

In [3]:
csv_data = """Replication,Scenario 1 Kills,Scenario 1 Sightings,Scenario 2 Kills,Scenario 2 Sightings
1,2,108,28,287
2,5,129,26,332
3,3,129,28,304
4,3,150,30,318
5,4,128,28,345
6,2,143,38,358
7,3,147,39,341
8,3,130,43,347
9,5,184,28,357
10,5,168,37,373
11,4,102,34,312
12,3,159,37,334
13,2,107,25,282
14,2,116,27,288
15,4,131,31,334
16,4,120,37,343
17,3,120,29,357
18,4,149,30,376
19,5,156,34,344
20,4,130,32,338
"""

csv_StringIO = StringIO(csv_data)
df = pd.read_csv(csv_StringIO)
df.head()

| | Replication | Scenario 1 Kills | Scenario 1 Sightings | Scenario 2 Kills | Scenario 2 Sightings |
|---|---|---|---|---|---|
| 0 | 1 | 2 | 108 | 28 | 287 |
| 1 | 2 | 5 | 129 | 26 | 332 |
| 2 | 3 | 3 | 129 | 28 | 304 |
| 3 | 4 | 3 | 150 | 30 | 318 |
| 4 | 5 | 4 | 128 | 28 | 345 |


___________

__Task:__ Brainstorm methods for validating the model, with an emphasis on statistical tests. Write your thoughts in the cell below. Indicate any additional data you might need.

Conduct a t-test to determine whether the two scenarios produce statistically significant differences in their results.


----------------

<div style="height:350px">&nbsp</div>

For this activity, we will validate the model via confidence intervals based on the t-test, using the $\alpha = 0.1$ level of significance. We'll need reference values to validate against. See the "True Values" below.

In [4]:
true_df = pd.DataFrame([[3, 135, 32, 319]], 
                       columns=['Scenario 1 Kills', 'Scenario 1 Sightings', 'Scenario 2 Kills', 'Scenario 2 Sightings'], 
                       index=['True Values'])
true_df

| | Scenario 1 Kills | Scenario 1 Sightings | Scenario 2 Kills | Scenario 2 Sightings |
|---|---|---|---|---|
| True Values | 3 | 135 | 32 | 319 |


Recall the equation for a confidence interval:

$\bar{Y} \pm t_{\alpha/2,n-1}\frac{S}{\sqrt{n}}$

Defining some terms we need for our confidence intervals...

In [5]:
alpha = 0.10
R = 20  # number of replications (a.k.a. `n`)

In [6]:
tcrit = stats.t.ppf(1 - alpha/2, R-1)  # equivalent to Excel's =T.INV.2T(alpha, R-1)

print(tcrit)

1.729132811521367


We could now start calculating the confidence intervals one at a time.

To help stay organized and consistent, we can use a function to build confidence intervals, one column of data at a time.

In [7]:
from math import sqrt

def build_confidence_interval(data, alpha):
    """Returns the (lower, upper) bounds of a 100*(1-alpha)% confidence interval, based on the t-test, and the half-width.
    
    Args:
      data (pd.Series) - the input data as a specific column from a dataframe
      alpha (float) - level of significance between 0 and 1
      
    Returns: lower bound, upper bound, interval half-width
      
    """
    R = len(data)
    tcrit = stats.t.ppf(1 - alpha/2, R-1)
    
    Y_bar = data.mean()
    S = data.std()  # pandas returns the sample st. dev. (ddof=1) by default
    
    H = tcrit * S / sqrt(R)
    
    lower, upper = Y_bar - H, Y_bar + H
    
    return lower, upper, H    

In [8]:
build_confidence_interval(df['Scenario 1 Kills'], alpha)

(3.0935134304106535, 3.9064865695893465, 0.4064865695893465)

__Task:__ Construct confidence intervals for the remaining features of the data.


In [10]:
build_confidence_interval(df['Scenario 1 Sightings'], alpha)

(127.01012237019306, 143.58987762980698, 8.289877629806956)

In [11]:
build_confidence_interval(df['Scenario 2 Kills'], alpha)

(30.104700307483252, 33.995299692516745, 1.945299692516746)

In [12]:
build_confidence_interval(df['Scenario 2 Sightings'], alpha)

(322.95855776711164, 344.04144223288836, 10.541442232888329)
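Before judging validity, it can help to check each reference value against its interval in one place. The sketch below is only an illustration: it hard-codes the interval bounds printed above (rounded to four decimals) and the "True Values" row, and the names `intervals`, `true_values`, `covers`, and `coverage` are hypothetical helpers, not part of the notebook.

```python
# Interval bounds copied (rounded) from the cell outputs above.
intervals = {
    "Scenario 1 Kills": (3.0935, 3.9065),
    "Scenario 1 Sightings": (127.0101, 143.5899),
    "Scenario 2 Kills": (30.1047, 33.9953),
    "Scenario 2 Sightings": (322.9586, 344.0414),
}

# Reference values copied from the "True Values" row of true_df.
true_values = {
    "Scenario 1 Kills": 3,
    "Scenario 1 Sightings": 135,
    "Scenario 2 Kills": 32,
    "Scenario 2 Sightings": 319,
}


def covers(interval, value):
    """Return True if value lies within the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper


# One True/False flag per metric: does the individual 90% CI contain the true value?
coverage = {name: covers(ci, true_values[name]) for name, ci in intervals.items()}

for name, inside in coverage.items():
    print(f"{name}: true value {true_values[name]} inside 90% CI? {inside}")
```

In the notebook itself, the same check could be run directly on the tuples returned by `build_confidence_interval` instead of on hard-coded bounds.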

__Task:__ What do we think about the model and its validity? At what total confidence are we making that decision?

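The true values for Scenario 1 Kills (3) and Scenario 2 Sightings (319) fall outside their corresponding confidence intervals, each below the lower bound; the true values for Scenario 1 Sightings (135) and Scenario 2 Kills (32) fall inside.
By the Bonferroni inequality, the overall confidence across the four 90% intervals is only guaranteed to be at least $1 - 4(0.10) = 60\%$.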

----------------

<div style="height:250px">&nbsp</div>

Recall our friend Bonferroni _(BCNN Eq. 12.16, p. 476)_. In order to make our model validity determination at the desired 90% confidence level, we need to change some things.
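
As a reminder of why an adjustment is needed, the Bonferroni inequality can be written in its usual textbook form as follows. The symbols $k$ (number of intervals) and $\alpha_j$ (level of significance of interval $j$) are notation assumed here, not taken from the notebook:

$$P(\text{all } k \text{ intervals contain their true values}) \ge 1 - \sum_{j=1}^{k} \alpha_j$$

One common choice, assumed here, is to split the overall $\alpha = 0.10$ equally over the $k = 4$ intervals. That gives $\alpha_j = 0.10 / 4 = 0.025$, so each interval would be built at the 97.5% level with critical value $t_{\alpha_j/2,\,n-1} = t_{0.0125,\,19}$.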

__Task:__ Construct joint confidence intervals on all four data features.
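
One possible way to approach this task is sketched below, assuming an equal split of the overall $\alpha$ across the four intervals. To keep the block self-contained, the data and reference values are copied from the cells above into plain dictionaries; `data`, `true_values`, `k`, `alpha_j`, `joint_interval`, and `joint` are hypothetical names introduced for illustration.

```python
from math import sqrt
from statistics import mean, stdev

from scipy import stats

# Model output copied from the CSV cell above (20 replications per column).
data = {
    "Scenario 1 Kills": [2, 5, 3, 3, 4, 2, 3, 3, 5, 5,
                         4, 3, 2, 2, 4, 4, 3, 4, 5, 4],
    "Scenario 1 Sightings": [108, 129, 129, 150, 128, 143, 147, 130, 184, 168,
                             102, 159, 107, 116, 131, 120, 120, 149, 156, 130],
    "Scenario 2 Kills": [28, 26, 28, 30, 28, 38, 39, 43, 28, 37,
                         34, 37, 25, 27, 31, 37, 29, 30, 34, 32],
    "Scenario 2 Sightings": [287, 332, 304, 318, 345, 358, 341, 347, 357, 373,
                             312, 334, 282, 288, 334, 343, 357, 376, 344, 338],
}

# Reference values copied from the "True Values" row of true_df.
true_values = {
    "Scenario 1 Kills": 3,
    "Scenario 1 Sightings": 135,
    "Scenario 2 Kills": 32,
    "Scenario 2 Sightings": 319,
}

alpha = 0.10          # desired overall level of significance
k = len(data)         # number of simultaneous intervals
alpha_j = alpha / k   # assumption: equal Bonferroni split across intervals


def joint_interval(values, alpha_j):
    """Return (lower, upper) for a 100*(1-alpha_j)% t-based confidence interval."""
    R = len(values)
    tcrit = stats.t.ppf(1 - alpha_j / 2, R - 1)
    H = tcrit * stdev(values) / sqrt(R)  # stdev is the sample st. dev.
    Y_bar = mean(values)
    return Y_bar - H, Y_bar + H


joint = {name: joint_interval(values, alpha_j) for name, values in data.items()}

for name, (lower, upper) in joint.items():
    inside = lower <= true_values[name] <= upper
    print(f"{name}: ({lower:.2f}, {upper:.2f})  "
          f"true value {true_values[name]} inside? {inside}")
```

In the notebook, the same intervals should also be obtainable by calling `build_confidence_interval(df[column], alpha / 4)` for each column. With this split the critical value grows from about 1.729 to about 2.433, so every joint interval is wider than its individual 90% counterpart.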

__Task:__ Now, what do we think about the model and its validity? At what total confidence are we making that decision?

How does your interpretation of these results differ from the initial (non-Bonferroni) results?

_your thoughts here (double click the cell to start editing)_
