---
author:
 - Elizabeth Czarniak (CZARNIA_ELIZ@bentley.edu)
 - Nathan Carter (ncarter@bentley.edu)
---

We'll work through an example using R's `EuStockMarkets` dataset, which records
the daily closing prices of four major European stock indices. We're going to
compare the variability of the closing prices of Germany's DAX and France's CAC.

Let's load the dataset.  (See how to quickly load some sample data.)
If using your own data, place it into the `sample1` and `sample2` variables
instead of using the code below.

In [4]:
from rdatasets import data
import pandas as pd

# Load in the EuStockMarkets data and place it in a pandas DataFrame
EuStockMarkets = data('EuStockMarkets')
df = pd.DataFrame(EuStockMarkets[['DAX', 'CAC']])

# Choose the two columns we want to analyze
# (You can replace the two lines below with your actual data.)
sample1 = df['DAX']
sample2 = df['CAC']

For all tests below, we will use $\alpha=0.05$ as our Type I error rate, but any
value strictly between 0 and 1 can be used.

### Two-tailed test

We can use a two-tailed test to test whether the two population variances are
equal.  Specifically, the null hypothesis will be:

$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$$

In [12]:
from scipy import stats
sample1_df = len(sample1) - 1                   # degrees of freedom
sample2_df = len(sample2) - 1                   # degrees of freedom
test_statistic = sample1.var() / sample2.var()  # test statistic
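# Two-tailed p-value: double the smaller tail, so this also works if the ratio is below 1
left_tail = stats.f.cdf(test_statistic, dfn = sample1_df, dfd = sample2_df)
right_tail = stats.f.sf(test_statistic, dfn = sample1_df, dfd = sample2_df)
2 * min(left_tail, right_tail)                  # p-value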

7.729079251495416e-151

Our $p$-value is smaller than our chosen alpha, so we have sufficient evidence
to reject the null hypothesis. The ratio of the variances of the closing prices
on Germany's DAX and France's CAC is significantly different from 1, so we
conclude that the variances are not equal.
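
If you run this test often, the steps above can be wrapped in a small function. The sketch below is one possible way to do it, not part of SciPy: the name `f_test_two_tailed` is hypothetical, and the two samples are synthetic normal data generated only for illustration. It doubles the smaller of the two tail areas, so the result does not depend on which sample is listed first.

```python
import numpy as np
from scipy import stats

def f_test_two_tailed(x, y):
    """Hypothetical helper: two-tailed F-test for equality of two variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dfn, dfd = len(x) - 1, len(y) - 1
    # ddof=1 gives the sample variance, matching the default of pandas' Series.var()
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    # Double the smaller tail area; cap at 1 because doubling can exceed it
    left_tail = stats.f.cdf(f_stat, dfn, dfd)
    right_tail = stats.f.sf(f_stat, dfn, dfd)
    p_value = min(2 * min(left_tail, right_tail), 1.0)
    return f_stat, p_value

# Synthetic data (an assumption for illustration, not the stock data)
rng = np.random.default_rng(0)
low_spread = rng.normal(loc=0.0, scale=1.0, size=200)
high_spread = rng.normal(loc=0.0, scale=3.0, size=200)

alpha = 0.05
f_stat, p_value = f_test_two_tailed(low_spread, high_spread)
reject_null = p_value < alpha
print(f_stat, p_value, reject_null)
```

Here the sample with the smaller variance is in the numerator, so the test statistic is below 1 and the left tail is the smaller one. Doubling only the right tail would give the wrong answer in that case.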

### Right-tailed test

In a right-tailed test, the null hypothesis is that the ratio is less than or
equal to 1.  This is equivalent to asking if $\sigma_1^2 \le \sigma_2^2$.

$$H_0: \frac{\sigma_1^2}{\sigma_2^2} \le 1$$

We repeat below some of the code above to make each example easy to copy and paste.

In [13]:
from scipy import stats
sample1_df = len(sample1) - 1                   # degrees of freedom
sample2_df = len(sample2) - 1                   # degrees of freedom
test_statistic = sample1.var() / sample2.var()  # test statistic
stats.f.sf(test_statistic, dfn = sample1_df, dfd = sample2_df)  # p-value

3.864539625747708e-151

Our $p$-value is smaller than our chosen alpha, so we have sufficient evidence
to reject the null hypothesis. The ratio of the variance of the closing prices
on Germany's DAX and France's CAC is significantly greater than 1, so the
variance of closing prices on Germany's DAX is greater than that of closing
prices on France's CAC.
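
The same right-tailed decision can also be made by comparing the test statistic to a critical value instead of comparing the $p$-value to $\alpha$. The sketch below illustrates this on synthetic normal data (the samples and variable names are assumptions for illustration, not the stock data). It uses `stats.f.isf`, the inverse of the `stats.f.sf` function used above.

```python
import numpy as np
from scipy import stats

# Synthetic data (an assumption for illustration, not the stock data)
rng = np.random.default_rng(1)
sample_a = rng.normal(scale=2.0, size=150)   # larger true variance
sample_b = rng.normal(scale=1.0, size=120)

alpha = 0.05
dfn, dfd = len(sample_a) - 1, len(sample_b) - 1
f_stat = np.var(sample_a, ddof=1) / np.var(sample_b, ddof=1)

# Upper-tail critical value: the point with area alpha to its right
critical_value = stats.f.isf(alpha, dfn, dfd)
p_value = stats.f.sf(f_stat, dfn, dfd)

# The two decision rules agree
reject_by_critical_value = f_stat > critical_value
reject_by_p_value = p_value < alpha
print(f_stat, critical_value, p_value)
print(reject_by_critical_value, reject_by_p_value)
```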

To run a left-tailed test, in which the null hypothesis is
$\sigma_1^2 \ge \sigma_2^2$, simply swap the roles of the two data columns in
the above code.
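
The column swap described above can be sketched as follows, again on synthetic normal data (the samples and variable names are assumptions for illustration). An alternative that avoids the swap is to keep the original ratio and take the lower tail with `stats.f.cdf`. The two approaches give the same $p$-value.

```python
import numpy as np
from scipy import stats

# Synthetic data (an assumption for illustration, not the stock data)
rng = np.random.default_rng(2)
sample_a = rng.normal(scale=1.0, size=100)   # smaller true variance
sample_b = rng.normal(scale=2.0, size=80)

df_a, df_b = len(sample_a) - 1, len(sample_b) - 1
var_a = np.var(sample_a, ddof=1)
var_b = np.var(sample_b, ddof=1)

# Option 1: swap the roles of the samples, then run a right-tailed test
p_swapped = stats.f.sf(var_b / var_a, df_b, df_a)

# Option 2: keep the original order and use the lower tail instead
p_lower_tail = stats.f.cdf(var_a / var_b, df_a, df_b)

print(p_swapped, p_lower_tail)
```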