## Problem Statement

**A survey was done, after the draft Education Policy 2020 was published in a country, with 578 college teachers. Each of them was asked whether they voted for the ruling party in 2019 or not and whether they are in favour of or against the NEP. The following table shows the result. Does it show evidence that favouring NEP is independent of voting for the ruling party?**

|  | Favours NEP | Against NEP | Total |
| --- | --- | --- | --- |
| Voted for ruling party | 205 | 30 | 235 |
| Did not vote for ruling party | 64 | 279 | 343 |
| Total | 269 | 309 | 578 |


The data show that most of the teachers who voted for the ruling party (205 out of 235, about 87%) are in favour of NEP, whereas most of those who did not vote for the ruling party (279 out of 343, about 81%) are against it. Let's perform a hypothesis test to see whether there is enough statistical evidence to support this observation.
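The row-wise shares quoted above can be reproduced with a short sketch. The counts are typed in from the table in the problem statement; the names `table` and `row_props` are illustrative and do not appear elsewhere in this notebook.

```python
import pandas as pd

# Counts copied from the table in the problem statement.
table = pd.DataFrame(
    {"Favours NEP": [205, 64], "Against NEP": [30, 279]},
    index=["Voted for ruling party", "Did not vote for ruling party"],
)

# Share of each voting group that favours / opposes the NEP (each row sums to 1).
row_props = table.div(table.sum(axis=1), axis=0)
print(row_props.round(3))
```

If opinion and voting were independent, the two rows of `row_props` would be roughly equal; here they differ sharply.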

## Step 1: Define null and alternative hypotheses

$H_0:$ Opinion on NEP is independent of voting for the ruling party

$H_a:$ Opinion on NEP is NOT independent of voting for the ruling party

## Step 2: Select the appropriate test

This calls for a Chi-square test of independence, since it concerns the relationship between two categorical variables: opinion on NEP (in favour of/against the policy) and voting preference (voted/did not vote for the ruling party).
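To make the test concrete, the sketch below computes by hand what a Chi-square test of independence compares: the observed counts against the counts expected if the two variables were independent, where each expected count is (row total × column total) / grand total. It uses the plain Pearson statistic without a continuity correction, so its value can differ slightly from what `chi2_contingency()` reports by default for a 2x2 table. The variable names are illustrative.

```python
import numpy as np

# Observed counts from the problem statement (rows: voted / did not vote).
observed = np.array([[205, 30],
                     [64, 279]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
n = observed.sum()

# Expected counts under independence: row total * column total / grand total.
expected = row_totals @ col_totals / n

# Pearson chi-square statistic (no continuity correction) and degrees of freedom.
chi2_stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected.round(2))
print("chi-square statistic:", chi2_stat, "degrees of freedom:", dof)
```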

## Step 3: Decide the significance level

Here, we select $\alpha = 0.05$.
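As an aside, a significance level can also be expressed as a critical value of the test statistic. The sketch below assumes one degree of freedom, which is what a 2x2 table gives; this notebook itself uses the p-value approach instead.

```python
from scipy.stats import chi2

alpha = 0.05  # significance level chosen above
dof = 1       # (rows - 1) * (columns - 1) for a 2x2 table

# Value the chi-square statistic must exceed to reject H0 at level alpha.
critical_value = chi2.ppf(1 - alpha, dof)
print("critical value:", round(critical_value, 3))
```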

## Step 4: Collect and prepare data

## Import the necessary libraries

In [31]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency  # for the Chi-square test of independence

## Reading the data into the DataFrame

In [32]:
df = pd.read_csv('NEP.csv')
df

Unnamed: 0,-,Favours NEP,Against NEP
0,Voted for ruling party,205,30
1,Did not vote for ruling party,64,279


In [33]:
# prepare the data by keeping only the two count columns;
# label/index columns must not be passed to chi2_contingency()
data = df[['Favours NEP', 'Against NEP']]
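Before running the test, it can be worth checking that the prepared counts match the published totals. The sketch below uses a hypothetical inline stand-in for `NEP.csv` (the name `data_check` is not part of the original notebook), typed in from the problem statement.

```python
import pandas as pd

# Hypothetical stand-in for the contents of NEP.csv, typed in from the problem statement.
data_check = pd.DataFrame(
    {"Favours NEP": [205, 64], "Against NEP": [30, 279]},
    index=["Voted for ruling party", "Did not vote for ruling party"],
)

# Row totals, column totals and the grand total should match the published table.
assert data_check.sum(axis=1).tolist() == [235, 343]
assert data_check.sum(axis=0).tolist() == [269, 309]
assert int(data_check.to_numpy().sum()) == 578
print("All totals match the problem statement.")
```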

## Step 5: Calculate the p-value

In [34]:
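# chi2_contingency() returns the test statistic, the p-value, the degrees of freedom and the expected frequencies
# (for a 2x2 table, SciPy applies Yates' continuity correction by default)
chi2, pval, dof, exp_freq = chi2_contingency(data)
# print the p-value
print('The p-value is', pval)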

The p-value is 1.1307328231776248e-58
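The other values returned by `chi2_contingency()` are also worth a look. The sketch below rebuilds the 2x2 table inline (an assumption, so it runs without `NEP.csv`) and compares the default call, which applies Yates' continuity correction to 2x2 tables, with `correction=False`.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the problem statement.
observed = np.array([[205, 30],
                     [64, 279]])

# Default behaviour: Yates' continuity correction is applied for a 2x2 table.
chi2_yates, p_yates, dof, expected = chi2_contingency(observed)

# Plain Pearson chi-square, without the continuity correction.
chi2_raw, p_raw, _, _ = chi2_contingency(observed, correction=False)

print("degrees of freedom:", dof)
print("expected frequencies:\n", expected.round(2))
print("with correction:   ", chi2_yates, p_yates)
print("without correction:", chi2_raw, p_raw)
```

With counts this large the correction changes the statistic only slightly, and the conclusion is the same either way.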


## Step 6: Compare the p-value with $\alpha$

In [36]:
# print the conclusion based on the p-value
alpha = 0.05  # significance level chosen in Step 3
if pval < alpha:
    print(f'As the p-value {pval} is less than the level of significance, we reject the null hypothesis.')
else:
    print(f'As the p-value {pval} is not less than the level of significance, we fail to reject the null hypothesis.')

As the p-value 1.1307328231776248e-58 is less than the level of significance, we reject the null hypothesis.


## Step 7: Draw inference

Since the p-value is less than 0.05, we reject the null hypothesis. Hence, we have enough statistical evidence to conclude that opinion on NEP is NOT independent of voting for the ruling party.
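A very small p-value says the association is unlikely to be due to chance, but not how strong it is. As an optional supplement that is not part of the original analysis, the sketch below computes the phi coefficient, $\sqrt{\chi^2 / n}$, from the uncorrected chi-square statistic. For a 2x2 table its magnitude ranges from 0 (no association) to 1 (perfect association).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the problem statement.
observed = np.array([[205, 30],
                     [64, 279]])

# Uncorrected Pearson chi-square, which is what the phi coefficient is based on.
chi2_raw, _, _, _ = chi2_contingency(observed, correction=False)

n = observed.sum()
phi = np.sqrt(chi2_raw / n)
print(f"phi = {phi:.3f}")
```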

## Insight

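Opinion on NEP is NOT independent of voting for the ruling party: about 87% of the teachers who voted for the ruling party favour the NEP (205 of 235), compared with about 19% of those who did not (64 of 343).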