**Aman Dubal T076**

Practical 5:
ANOVA (Analysis of Variance)
1. Perform one-way ANOVA to compare means across multiple groups.
2. Conduct post-hoc tests to identify significant differences between group means.

# Practical 5

In [None]:
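# Import libraries
import pandas as pd                      # data loading and manipulation
from scipy.stats import f_oneway         # one-way ANOVA
from textblob import TextBlob            # sentiment polarity scoring
import statsmodels.api as sm             # ANOVA table (anova_lm)
from statsmodels.formula.api import ols  # formula-based linear model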

In [None]:
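# Load the dataset: one text sample and its emotion label per row
df = pd.read_csv('/content/final_dataset.csv')
# Score each text with TextBlob polarity, which ranges from -1 (negative) to 1 (positive)
df['sentiment'] = df['text'].apply(lambda x: TextBlob(x).sentiment.polarity)
print(df)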

                                                     text     emotion  \
0       i feel rather funny ending with so many dupes ...         fun   
1                          i feel surprised by the result    surprise   
2                         i am officially feeling festive     neutral   
3       i suddenly found myself standing before this w...    surprise   
4       i look at the meager pile of food i purchased ...  enthusiasm   
...                                                   ...         ...   
106350  i used to feel strongly about how much i hated...        hate   
106351  i feel like i just got a spirit booster this r...    surprise   
106352  i could come up with is that i was really feel...       anger   
106353  i find it really it helps to have an outfit of...      relief   
106354  i can t help feeling surprised by his sudden call    surprise   

        sentiment  
0        0.375000  
1        0.100000  
2        0.000000  
3       -0.095238  
4       -0.175000  
...

One-Way ANOVA

In [None]:
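# Collect the sentiment scores for each emotion label as a separate sample
groups = [df[df['emotion'] == e]['sentiment'] for e in df['emotion'].unique()]

# One-way ANOVA: tests whether mean sentiment is equal across all emotion groups
f_stat, p_value = f_oneway(*groups)

print("F-Statistic:", f_stat)
print("P-Value:", p_value)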

F-Statistic: 3455.581342151677
P-Value: 0.0
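
For intuition about what `f_oneway` reports, the F-statistic is the between-group mean square divided by the within-group mean square. The sketch below recomputes it by hand on a small made-up dataset (the three toy groups and the helper name `one_way_anova` are hypothetical, not taken from `final_dataset.csv`). It also derives eta squared as an effect size, which can be more informative than a p-value that is reported as 0.0 on a sample this large.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, eta_squared) for a list of numeric samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Between-group variation: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: spread of observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared

# Hypothetical toy data, for illustration only
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, eta_sq = one_way_anova(toy_groups)
print("F-Statistic:", f_stat)
print("Eta squared:", eta_sq)
```

For these toy groups the F-statistic works out to about 13 and eta squared to about 0.81, meaning most of the variation lies between groups rather than within them.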


Two-Way ANOVA

In [None]:
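# Bin the continuous sentiment score into three categories to use as a second factor.
# Caveat: this factor is derived from the dependent variable itself, so its
# effect on sentiment is large by construction.
def sentiment_group(score):
    if score < 0:
        return "Negative"
    elif score <= 0.2:
        return "Neutral"
    else:
        return "Positive"

df['sentiment_group'] = df['sentiment'].apply(sentiment_group)

print(df.head())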

                                                text     emotion  sentiment  \
0  i feel rather funny ending with so many dupes ...         fun   0.375000   
1                     i feel surprised by the result    surprise   0.100000   
2                    i am officially feeling festive     neutral   0.000000   
3  i suddenly found myself standing before this w...    surprise  -0.095238   
4  i look at the meager pile of food i purchased ...  enthusiasm  -0.175000   

  sentiment_group  
0        Positive  
1         Neutral  
2         Neutral  
3        Negative  
4        Negative  
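
The binning rule places a score of exactly 0 and a score of exactly 0.2 in the Neutral group. A quick boundary check, sketched here with the same thresholds and a handful of scores (most taken from the rows printed above, with 0.2 added as an edge case), makes that explicit:

```python
def sentiment_group(score):
    # Same thresholds as the notebook cell above
    if score < 0:
        return "Negative"
    elif score <= 0.2:
        return "Neutral"
    else:
        return "Positive"

# Scores from the printed rows, plus 0.2 to probe the upper Neutral boundary
for score in (-0.095238, 0.0, 0.1, 0.2, 0.375):
    print(score, "->", sentiment_group(score))
```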


In [None]:

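# Two-way ANOVA with interaction: sentiment explained by emotion,
# sentiment_group, and the emotion x sentiment_group interaction
model = ols('sentiment ~ C(emotion) + C(sentiment_group) + C(emotion):C(sentiment_group)', data=df).fit()
# typ=2 requests Type II sums of squares
anova_table = sm.stats.anova_lm(model, typ=2)

print(anova_table)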

                                    sum_sq        df              F  PR(>F)
C(emotion)                      252.740256      10.0     852.410176     0.0
C(sentiment_group)             7192.058734       2.0  121282.302638     0.0
C(emotion):C(sentiment_group)   177.907197      20.0     300.011378     0.0
Residual                       3152.455272  106322.0            NaN     NaN
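
With more than 100,000 rows every p-value in this table is reported as 0.0, so effect sizes say more about which factor matters. As a rough sketch, the snippet below copies the sums of squares and degrees of freedom from the table, recomputes each F value as a consistency check, and derives partial eta squared as `sum_sq / (sum_sq + residual sum_sq)`. The dictionary and variable names are illustrative only.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table above
anova_rows = {
    "C(emotion)": (252.740256, 10.0),
    "C(sentiment_group)": (7192.058734, 2.0),
    "C(emotion):C(sentiment_group)": (177.907197, 20.0),
}
ss_resid, df_resid = 3152.455272, 106322.0
ms_resid = ss_resid / df_resid  # residual mean square

results = {}
for term, (ss, df_term) in anova_rows.items():
    # F = mean square of the term / residual mean square
    f_value = (ss / df_term) / ms_resid
    # Partial eta squared: share of (term + residual) variation due to the term
    partial_eta_sq = ss / (ss + ss_resid)
    results[term] = (f_value, partial_eta_sq)
    print(f"{term:32s} F={f_value:12.2f}  partial eta^2={partial_eta_sq:.4f}")
```

The recomputed F values should agree with the `F` column, and the partial eta squared for `C(sentiment_group)` comes out far larger than for the other two terms.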


Post-Hoc for Emotion

In [None]:
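from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Tukey HSD: compare mean sentiment for every pair of emotions while
# holding the family-wise error rate at alpha = 0.05
tukey_emotion = pairwise_tukeyhsd(endog=df['sentiment'], groups=df['emotion'], alpha=0.05)
print("\nPost-hoc Results for Emotion:\n")
print(tukey_emotion)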



Post-hoc Results for Emotion:

    Multiple Comparison of Means - Tukey HSD, FWER=0.05     
  group1     group2   meandiff p-adj   lower   upper  reject
------------------------------------------------------------
     anger      empty   0.1072    0.0   0.091  0.1234   True
     anger enthusiasm   0.3516    0.0  0.3373  0.3659   True
     anger        fun   0.3087    0.0  0.2944   0.323   True
     anger  happiness   0.3638    0.0  0.3494  0.3781   True
     anger       hate   -0.196    0.0 -0.2103 -0.1816   True
     anger       love   0.4526    0.0  0.4383   0.467   True
     anger    neutral    0.184    0.0  0.1697  0.1983   True
     anger     relief   0.2378    0.0  0.2234  0.2521   True
     anger    sadness   0.0918    0.0  0.0774  0.1061   True
     anger   surprise   0.1561    0.0  0.1417  0.1704   True
     empty enthusiasm   0.2444    0.0  0.2282  0.2607   True
     empty        fun   0.2015    0.0  0.1852  0.2177   True
     empty  happiness   0.2566    0.0  0.2403  0.2728
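
To see why a family-wise correction such as Tukey HSD is used here, it helps to count the comparisons. The sketch below lists every pair among the eleven emotion labels that appear in the output and estimates the chance of at least one false positive if each pair were tested at 0.05 with no correction. That estimate assumes independent tests, which is a simplification, and the Bonferroni threshold is shown only as a simpler, typically more conservative point of comparison, not as what `pairwise_tukeyhsd` computes.

```python
from itertools import combinations
from math import comb

# The eleven emotion labels that appear in the Tukey output above
emotions = ["anger", "empty", "enthusiasm", "fun", "happiness", "hate",
            "love", "neutral", "relief", "sadness", "surprise"]

pairs = list(combinations(emotions, 2))
n_pairs = len(pairs)
assert n_pairs == comb(len(emotions), 2)

alpha = 0.05
# Chance of at least one false positive with no correction
# (simplifying assumption: the pairwise tests are independent)
fwer_uncorrected = 1 - (1 - alpha) ** n_pairs
# Bonferroni per-test threshold that keeps the family-wise rate at alpha
bonferroni_alpha = alpha / n_pairs

print("Pairwise comparisons:", n_pairs)
print("Uncorrected family-wise error rate:", round(fwer_uncorrected, 3))
print("Bonferroni per-test alpha:", bonferroni_alpha)
```

With 55 pairs, the uncorrected family-wise error rate under this assumption is roughly 0.94, which is why the post-hoc test adjusts the p-values.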

Post-Hoc for Sentiment Group

In [None]:
# Tukey HSD across the three sentiment groups (Negative, Neutral, Positive)
tukey_sentiment_group = pairwise_tukeyhsd(endog=df['sentiment'], groups=df['sentiment_group'], alpha=0.05)
print("\nPost-hoc Results for Sentiment Group:\n")
print(tukey_sentiment_group)



Post-hoc Results for Sentiment Group:

 Multiple Comparison of Means - Tukey HSD, FWER=0.05 
 group1   group2  meandiff p-adj lower  upper  reject
-----------------------------------------------------
Negative  Neutral   0.3969   0.0 0.3937 0.4002   True
Negative Positive   0.7631   0.0 0.7599 0.7663   True
 Neutral Positive   0.3661   0.0 0.3629 0.3694   True
-----------------------------------------------------
