**Hypothesis Testing**

The dataset under examination consists of code review comments, each tagged with binary labels indicating whether the comment is considered "social" or "anti-social," and whether it was labeled "toxic" or "non-toxic" in the source paper.

The primary research question is whether the distribution of these social categorizations differs significantly between toxic and non-toxic comments. In statistical terms, we test the null hypothesis that the two categorical variables, "Social" and "NonToxic," are independent, that is, that there is no association between them. The alternative hypothesis is that the distribution of social categorizations differs between non-toxic and toxic comments.

In [2]:
import pandas as pd

df = pd.read_csv('/content/drive/MyDrive/codeReview/hypothesisTesting/mergedToxicAntisocial.csv')
df.head()

Unnamed: 0,description,Personal attacks,Threats or intimidation,Mockery,Lack of specificity,Discouragement without guide,Disregard for other time or boundaries,Unconscious bias,Dismissive attitude,Excessive control,Social,Toxic,AntiSocial,NonToxic
0,/It may happen that a service die/A service ma...,0,0,0,1,0,0,0,0,0,0,0,1,1
1,"@zhiyan, thanks for helping explanation. Overa...",0,0,0,0,0,0,0,0,0,1,0,0,1
2,all the code you have inline below should be r...,0,0,0,0,0,0,0,0,0,1,0,0,1
3,All you do in the interrupt handler is call wa...,1,0,1,1,0,1,0,0,0,0,0,1,1
4,Are you sure this leads to a color that makes ...,0,0,0,0,0,0,0,0,0,1,0,0,1




1.   Define Hypotheses:

  Null Hypothesis (H0): The distribution of "social" and "anti-social" comments is the same in the non-toxic and toxic groups; that is, the two labels are independent.
  Alternative Hypothesis (H1): The distribution of "social" and "anti-social" comments differs between the non-toxic and toxic groups; that is, the two labels are associated.

2. Select a Significance Level (α):

  We use α = 0.05. This is the probability of rejecting the null hypothesis when it is in fact true (a Type I error).

3. Choose a Statistical Test:

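  The Chi-square test of independence is used to determine whether there is a significant association between two categorical variables. It is a non-parametric test: it does not assume that the data follow a particular distribution such as the normal. It does assume that the observations are independent of one another and that the expected count in each cell is reasonably large (a common rule of thumb is at least 5).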

  *   Contingency Table:

    The data are organized into a contingency table, a two-dimensional table that displays the frequency (count) of each combination of the two categorical variables. For our hypothesis, the rows are the values of "Social" (0 or 1) and the columns are the values of "NonToxic" (0 or 1), giving a 2×2 table of counts.
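
As a rough illustration of that layout, the sketch below builds a contingency table from a small, made-up data frame. The six rows are hypothetical and are not taken from the study's dataset; only the column names "Social" and "NonToxic" match the notebook.

```python
import pandas as pd

# Hypothetical toy labels (NOT the study's data), one row per comment.
toy = pd.DataFrame({
    "Social":   [1, 1, 0, 0, 1, 0],
    "NonToxic": [1, 1, 0, 1, 1, 0],
})

# Rows: Social (0/1); columns: NonToxic (0/1); margins=True adds the totals.
table = pd.crosstab(toy["Social"], toy["NonToxic"], margins=True)
print(table)
```

Each inner cell counts the comments with that combination of labels, and the "All" row and column hold the marginal totals that the Chi-square test uses to derive expected counts.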
    
    



In [None]:
from scipy.stats import chi2_contingency

def chiSquareHypothesis(data):
    # Create a contingency table
    contingency_table = pd.crosstab(data['Social'], data['NonToxic'])
    # Perform Chi-square test
    chi2_stat, p_value, dof, expected = chi2_contingency(contingency_table)
    return p_value
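
To make the test less of a black box, the following sketch recomputes the Chi-square statistic by hand on a hypothetical 2×2 table and compares it with `chi2_contingency`. The counts are invented for illustration. Note that `chi2_contingency` applies Yates' continuity correction by default when the table has one degree of freedom, so the function above reports the corrected p-value; the sketch passes `correction=False` so that the library result matches the textbook formula.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Hypothetical counts (rows: Social = 0/1, columns: NonToxic = 0/1).
observed = np.array([[30, 20],
                     [10, 40]])

# Expected count per cell under independence: row total * column total / N.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()
expected = row_totals @ col_totals / n

# Pearson statistic: sum over cells of (observed - expected)^2 / expected.
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = chi2.sf(chi2_manual, dof)  # upper-tail probability

# Library result without the continuity correction, for comparison.
chi2_lib, p_lib, dof_lib, expected_lib = chi2_contingency(observed, correction=False)

print(chi2_manual, chi2_lib)
print(p_manual, p_lib)
```

With the default `correction=True`, the statistic for a 2×2 table is somewhat smaller and the p-value somewhat larger than the uncorrected values computed here.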

In [None]:
chiSquarePValue = chiSquareHypothesis(df)
'{:.5f}'.format(chiSquarePValue)

'0.00002'

In [None]:
# Set significance level
alpha = 0.05

if chiSquarePValue < alpha:
    print(
        "Reject the null hypothesis. There is a significant difference in the distribution of 'social' and 'anti-social' comments between non-toxic and toxic groups.")
else:
    print(
        "Fail to reject the null hypothesis. The data do not show a significant difference in the distribution of 'social' and 'anti-social' comments between non-toxic and toxic groups.")


Reject the null hypothesis. There is a significant difference in the distribution of 'social' and 'anti-social' comments between non-toxic and toxic groups.


In [5]:
from statsmodels.stats.proportion import proportions_ztest

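# Count toxic comments and anti-social comments over the whole dataset
toxic_count = df['Toxic'].sum()
antisocial_count = df['AntiSocial'].sum()
print(toxic_count)
print(antisocial_count)
# Number of observations (comments); not used by the test below
n_observations = df.shape[0]

# Proportion of toxic comments among anti-social comments under the null hypothesis
p_null = 1.0

# Perform one-sample proportion z-test.
# Caution: `count` is the number of all toxic comments, not the number of
# anti-social comments that are also toxic, so it can exceed `nobs`. The sample
# proportion is then greater than 1, its variance is negative, and the
# statistic comes out as NaN.
z_stat, p_value = proportions_ztest(count=toxic_count, nobs=antisocial_count, value=p_null, alternative='smaller')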

# Output the results
print(f'Z-statistic: {z_stat}')
print(f'P-value: {p_value}')

# Check the significance at a 0.05 level
alpha = 0.05
if p_value < alpha:
    print('Reject the null hypothesis: Not all anti-social comments are toxic')
else:
    print('Fail to reject the null hypothesis: All anti-social comments are toxic')

199
171
Z-statistic: nan
P-value: nan
Fail to reject the null hypothesis: All anti-social comments are toxic


  std_diff = np.sqrt(var_)
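
The NaN result above follows from the inputs: 199 toxic comments were passed as the success count against only 171 anti-social comments, so the sample proportion exceeds 1. Because `nan < alpha` evaluates to False, the "fail to reject" branch runs regardless of the data, and its message should not be read as evidence. A more direct check is sketched below, under the assumption that the quantity of interest is the share of anti-social comments that are also toxic. The helper name `toxic_share_among_antisocial` and the toy data are hypothetical. Since a null value of exactly 1.0 sits on the boundary of the parameter space, where the normal approximation behind a z-test is unreliable, the sketch simply counts counterexamples: one anti-social comment that is not toxic is already enough to contradict "all anti-social comments are toxic".

```python
import pandas as pd

def toxic_share_among_antisocial(data: pd.DataFrame) -> dict:
    """Summarise how many anti-social comments are also toxic (hypothetical helper)."""
    antisocial = data[data["AntiSocial"] == 1]
    n_antisocial = len(antisocial)
    n_both = int((antisocial["Toxic"] == 1).sum())
    return {
        "n_antisocial": n_antisocial,
        "n_antisocial_and_toxic": n_both,
        # Anti-social comments that are NOT toxic.
        "n_counterexamples": n_antisocial - n_both,
        "share": n_both / n_antisocial if n_antisocial else float("nan"),
    }

# Hypothetical toy data (NOT the study's dataset). Toxic comments outnumber
# anti-social ones here, mirroring the situation that produced the NaN.
toy = pd.DataFrame({
    "AntiSocial": [1, 1, 0, 0, 0, 0],
    "Toxic":      [1, 0, 1, 1, 0, 0],
})

summary = toxic_share_among_antisocial(toy)
print(summary)
```

Calling the helper on the notebook's `df` would give the joint count, which can never exceed the number of anti-social comments.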
