In [1]:
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf
import statsmodels.api as sm

In [3]:
df = pd.read_csv("/content/teachers_rating_data.csv")
df.head()

Unnamed: 0,prof,Gender,Age,Tenure,Evaluation,Students,Beauty,CourseLevel
0,Prof52,Male,59,No,4.38,69,3.1,Lower
1,Prof93,Male,59,No,3.12,71,4.2,Lower
2,Prof15,Male,51,No,4.83,38,4.0,Lower
3,Prof72,Male,36,Yes,3.88,79,3.4,Lower
4,Prof61,Female,47,No,3.48,56,2.8,Upper


# Q1. Regression with T-test: Using the teachers' rating data set, does gender affect teaching evaluation ratings?

In [5]:
model = smf.ols('Evaluation ~ Gender', data=df).fit()
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:             Evaluation   R-squared:                       0.043
Model:                            OLS   Adj. R-squared:                  0.035
Method:                 Least Squares   F-statistic:                     5.271
Date:                Mon, 27 Oct 2025   Prob (F-statistic):             0.0235
Time:                        13:11:26   Log-Likelihood:                -103.31
No. Observations:                 120   AIC:                             210.6
Df Residuals:                     118   BIC:                             216.2
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                     coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------
Intercept          3.9025      0.081     48.

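Based on the regression results (F(1,118) = 5.27, p = 0.0235), there is a statistically significant difference in mean teaching evaluation ratings between male and female instructors. Because the model has a single two-level predictor, this F-test is equivalent to an equal-variance two-sample t-test (t² = F, so |t| ≈ 2.30 on 118 degrees of freedom).

On average, male instructors scored about 0.24 points higher than female instructors on teaching evaluations. Because the p-value is below 0.05, a difference this large would be unlikely if the two groups had the same mean rating. Gender explains only about 4.3% of the variance in evaluations (R² = 0.043), and because the data are observational, the result shows an association with gender rather than a proven causal effect.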
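
The link between the regression above and a classical t-test can be checked directly. The following is a minimal sketch, not the notebook's own analysis: it uses a small synthetic data frame (the values are made up, and the name `demo` is hypothetical) so that it runs without the CSV file. It compares the OLS slope test with `scipy.stats.ttest_ind` under the equal-variance assumption.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Synthetic stand-in for the teachers' rating data (made-up values, illustration only).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Gender": ["Male"] * 60 + ["Female"] * 60,
    "Evaluation": np.concatenate([rng.normal(4.1, 0.6, 60),
                                  rng.normal(3.9, 0.6, 60)]),
})

# Regression with a single two-level predictor (Female is the reference level).
ols_fit = smf.ols("Evaluation ~ Gender", data=demo).fit()
slope_name = "Gender[T.Male]"

# Classical two-sample t-test with pooled (equal) variance.
male = demo.loc[demo["Gender"] == "Male", "Evaluation"]
female = demo.loc[demo["Gender"] == "Female", "Evaluation"]
t_stat, t_p = stats.ttest_ind(male, female, equal_var=True)

print("slope t vs. t-test t:", ols_fit.tvalues[slope_name], t_stat)
print("model F vs. t squared:", ols_fit.fvalue, t_stat ** 2)
print("slope p vs. t-test p:", ols_fit.pvalues[slope_name], t_p)
```

On the real data, replacing `demo` with `df` should give the same agreement between the two approaches, provided there are no missing values in `Gender` or `Evaluation`.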

# Q2. Regression with ANOVA: Using the teachers' rating data set, does beauty score for instructors differ by age?

In [6]:
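# Bin Age into decades; pd.cut intervals are right-closed, so '20s' is (20, 30], '30s' is (30, 40], etc.
df['age_group'] = pd.cut(df['Age'], bins=[20, 30, 40, 50, 60, 70],
                         labels=['20s', '30s', '40s', '50s', '60s'])
# One-way ANOVA of Beauty across age groups via an OLS model (Type II sums of squares).
model = smf.ols('Beauty ~ C(age_group)', data=df).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)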


                 sum_sq     df         F    PR(>F)
C(age_group)   1.254207    4.0  0.678339  0.567023
Residual      53.619261  116.0       NaN       NaN




A one-way ANOVA was conducted to determine whether mean beauty scores differ by age group.
The result was not statistically significant, F(4,116) = 0.68, p = 0.567.
There is therefore no evidence that instructors’ mean beauty scores differ across age groups; a non-significant result does not by itself show that the group means are equal.

# Q3. Correlation: Using the teachers' rating data set, is teaching evaluation score correlated with beauty score?

In [7]:
# Simple linear regression of Evaluation on Beauty (sm was imported in the first cell).
X = sm.add_constant(df['Beauty'])  # add the intercept column, named 'const'
y = df['Evaluation']

model = sm.OLS(y, X).fit()
print(model.summary())


                            OLS Regression Results                            
Dep. Variable:             Evaluation   R-squared:                       0.033
Model:                            OLS   Adj. R-squared:                  0.025
Method:                 Least Squares   F-statistic:                     4.089
Date:                Mon, 27 Oct 2025   Prob (F-statistic):             0.0454
Time:                        13:18:10   Log-Likelihood:                -103.89
No. Observations:                 120   AIC:                             211.8
Df Residuals:                     118   BIC:                             217.4
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          4.6174      0.289     15.988      0.0

[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

The model was statistically significant, F(1,118) = 4.09, p = 0.045, with an R² of 0.033. In a simple regression R² is the squared Pearson correlation, so evaluation and beauty scores are weakly negatively correlated (r = -√0.033 ≈ -0.18).
Beauty score was a significant negative predictor of teaching evaluations (coefficient = -0.159, p = 0.045): each one-point increase in an instructor's beauty score is associated with a decrease of about 0.16 points in evaluation rating.
However, since the model explains only about 3.3% of the variance in evaluation scores, the practical impact of beauty on teaching evaluations appears to be very small.
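
Since the question asks about correlation, the Pearson coefficient can also be computed directly with `scipy.stats.pearsonr` rather than read off the regression. The sketch below is illustrative only: it uses synthetic data (the intercept, slope, and noise level are made up, and `demo` is a hypothetical name). It shows that the squared correlation equals the regression R² and that both tests give the same p-value.

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

# Synthetic stand-in data with a weak negative relationship (made-up parameters).
rng = np.random.default_rng(2)
beauty = rng.normal(3.5, 0.7, 120)
evaluation = 4.6 - 0.16 * beauty + rng.normal(0, 0.55, 120)
demo = pd.DataFrame({"Beauty": beauty, "Evaluation": evaluation})

# Pearson correlation and its two-sided p-value.
r, p = stats.pearsonr(demo["Beauty"], demo["Evaluation"])

# Simple linear regression on the same two variables.
fit = smf.ols("Evaluation ~ Beauty", data=demo).fit()

print("r:", r)
print("r squared vs. R squared:", r ** 2, fit.rsquared)
print("correlation p vs. slope p:", p, fit.pvalues["Beauty"])
```

On the real data the call would be `stats.pearsonr(df['Beauty'], df['Evaluation'])`, using the `stats` module already imported in the first cell.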