# Objective

First, interpret the outputs of the previous milestone and draw an inference about the hypothesis. Then apply the Bonferroni correction and interpret the results again using the adjusted significance level.

# Instructions
- Print out the p-values for each day along with the averages of the engagement rates.
- Assume a significance level of 0.05. Compare Friday's average tweet engagement rate with that of each other day at this significance level.
- Work out how the Bonferroni correction applies in this case and use it to obtain an adjusted significance level.
- Interpret the p-values again, this time using the adjusted significance level.

In [1]:
import json
import pandas as pd

### Load data

In [2]:
df = pd.read_csv('./engagement_per_week_per_day.csv')

In [3]:
with open('./p_values.json', 'r') as f:
    p_values = json.load(f)

In [4]:
# Row 1: mean engagement rate per weekday; row 2: p-value of each day vs. Friday
mean_engagement = (
    df.drop(columns=['week_number']).mean().to_frame('engagement_rate').T
)
p_values_and_engagement_rates = pd.concat(
    [mean_engagement, pd.DataFrame(p_values, index=['p_value'])]
)[['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']]

p_values_and_engagement_rates

| | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday |
|---|---|---|---|---|---|---|---|
| engagement_rate | 0.161692 | 0.207954 | 0.16821 | 0.112925 | 0.177519 | 0.1152 | 0.097094 |
| p_value | 0.5096 | 0.355433 | 0.7225 | 0.009367 | NaN | 0.0129 | 0.002667 |


In [5]:
other_days = [c for c in p_values_and_engagement_rates.columns if c != 'Friday']

### Interpret the results

#### Using significance level of 0.05 without any correction

In [6]:
SIGNIFICANCE_LEVEL = 0.05
print('Comparing the engagement rates of Friday with the other days...')
print(f'''
'Friday':
  - Engagement rate: {p_values_and_engagement_rates['Friday']['engagement_rate']:.2f}
''')

for d in other_days:
    eng_rate = p_values_and_engagement_rates[d]['engagement_rate']
    p_value = p_values_and_engagement_rates[d]['p_value']
    significant = p_value <= SIGNIFICANCE_LEVEL
    emphasis = '*' if significant else ''
    print(f'''
{d}:
  - Engagement rate: {eng_rate:.2f}
  - p-value:         {p_value:.3f}
  - {emphasis}The difference is {'NOT ' if not significant else ''}significant{emphasis}''')

Comparing the engagement rates of Friday with the other days...

'Friday':
  - Engagement rate: 0.18


Monday:
  - Engagement rate: 0.16
  - p-value:         0.510
  - The difference is NOT significant

Tuesday:
  - Engagement rate: 0.21
  - p-value:         0.355
  - The difference is NOT significant

Wednesday:
  - Engagement rate: 0.17
  - p-value:         0.723
  - The difference is NOT significant

Thursday:
  - Engagement rate: 0.11
  - p-value:         0.009
  - *The difference is significant*

Saturday:
  - Engagement rate: 0.12
  - p-value:         0.013
  - *The difference is significant*

Sunday:
  - Engagement rate: 0.10
  - p-value:         0.003
  - *The difference is significant*
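
To see why a correction may be needed before trusting these three results, the sketch below estimates how likely at least one false positive is when six tests are each run at 0.05. It assumes the tests are independent, which is a simplification here because every comparison shares the Friday data; the variable names are illustrative and not part of the notebook.

```python
# Probability of at least one false positive across several tests,
# ASSUMING the tests are independent (a simplification for this data).
alpha = 0.05   # per-test significance level used above
n_tests = 6    # Friday is compared with six other days

family_wise_error_rate = 1 - (1 - alpha) ** n_tests
print(f'Chance of at least one false positive: {family_wise_error_rate:.3f}')
```

Under that assumption the chance is roughly one in four, well above the nominal 0.05.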


#### Using Bonferroni correction for multiple testing
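
As a brief sketch of the reasoning: Friday is compared with $m = 6$ other days, so six hypothesis tests are run. The Bonferroni correction divides the significance level by the number of tests, which by the union bound keeps the probability of at least one false positive across all tests at or below $\alpha$. The symbols $m$, $\alpha$, and $\alpha_{\text{adj}}$ are notation introduced here for illustration.

$$
\alpha_{\text{adj}} = \frac{\alpha}{m} = \frac{0.05}{6} \approx 0.0083,
\qquad
P(\text{at least one false positive}) \le m \cdot \alpha_{\text{adj}} = \alpha .
$$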

In [7]:
corrected_significance_level = SIGNIFICANCE_LEVEL / len(other_days)
print(f'After applying the Bonferroni correction, the new significance level is {corrected_significance_level:.3f}')
print()

After applying the Bonferroni correction, the new significance level is 0.008
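
An equivalent way to apply the correction is to leave the significance level at 0.05 and multiply each p-value by the number of tests, capping the result at 1. The sketch below does this with the p-values shown in the table above, hard-coded so that it runs on its own; the dictionary and variable names are illustrative rather than part of the notebook.

```python
# p-values for Friday vs. each other day, copied from the table above
p_values_vs_friday = {
    'Monday': 0.5096,
    'Tuesday': 0.355433,
    'Wednesday': 0.7225,
    'Thursday': 0.009367,
    'Saturday': 0.0129,
    'Sunday': 0.002667,
}
alpha = 0.05
n_tests = len(p_values_vs_friday)  # six comparisons

# Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1
adjusted_p_values = {
    day: min(p * n_tests, 1.0) for day, p in p_values_vs_friday.items()
}

# A day is significant if its adjusted p-value is still at or below alpha
significant_days = [day for day, p in adjusted_p_values.items() if p <= alpha]

for day, p in adjusted_p_values.items():
    print(f'{day:<10} adjusted p-value: {p:.3f}')
print('Significant after correction:', significant_days)
```

Comparing an adjusted p-value with 0.05 gives the same decision as comparing the raw p-value with the adjusted threshold of 0.05 / 6.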



In [8]:
print('Comparing the engagement rates of Friday with the other days...')
print(f'''
'Friday':
  - Engagement rate: {p_values_and_engagement_rates['Friday']['engagement_rate']:.2f}
''')

for d in other_days:
    eng_rate = p_values_and_engagement_rates[d]['engagement_rate']
    p_value = p_values_and_engagement_rates[d]['p_value']
    significant = p_value <= corrected_significance_level
    emphasis = '*' if significant else ''
    print(f'''
{d}:
  - Engagement rate: {eng_rate:.2f}
  - p-value:         {p_value:.3f}
  - {emphasis}The difference is {'NOT ' if not significant else ''}significant{emphasis}''')

Comparing the engagement rates of Friday with the other days...

'Friday':
  - Engagement rate: 0.18


Monday:
  - Engagement rate: 0.16
  - p-value:         0.510
  - The difference is NOT significant

Tuesday:
  - Engagement rate: 0.21
  - p-value:         0.355
  - The difference is NOT significant

Wednesday:
  - Engagement rate: 0.17
  - p-value:         0.723
  - The difference is NOT significant

Thursday:
  - Engagement rate: 0.11
  - p-value:         0.009
  - The difference is NOT significant

Saturday:
  - Engagement rate: 0.12
  - p-value:         0.013
  - The difference is NOT significant

Sunday:
  - Engagement rate: 0.10
  - p-value:         0.003
  - *The difference is significant*


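With the Bonferroni-corrected significance level (0.05 / 6 ≈ 0.0083), only the difference between Friday and Sunday remains significant (p ≈ 0.003). Thursday (p ≈ 0.009) and Saturday (p ≈ 0.013) were significant at 0.05 but no longer pass the stricter threshold, so the only claim we can support is that Friday's average engagement rate is higher than Sunday's.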