**Toxic comment classification - Verify Model Performance against Debias Method**

- Convert the lemmatized text of both datasets (the original one and the one with sensitive words marked) into TF-IDF features.
- Split each dataset into training and test sets (70/30).
- Train a RandomForestClassifier on each dataset and compare the classification reports.

In [7]:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.ensemble import RandomForestClassifier

In [8]:
# Load the datasets
df_train1 = pd.read_csv('processed_train_data.csv')  # Original data without marking sensitive words
df_train2 = pd.read_csv('ready_train.csv')  # Data with sensitive words marking

# Label columns shared by both datasets
label_cols = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']

# Vectorize lemmas for dataset 1
# Note: TF-IDF is fit on the full dataset before the split, so the IDF weights also reflect test rows
vectorizer = TfidfVectorizer()
X1 = vectorizer.fit_transform(df_train1['lemmas'])
y1 = df_train1[label_cols]

# Split data into train and test sets for dataset 1 (70/30)
X_train1, X_test1, y_train1, y_test1 = train_test_split(X1, y1, test_size=0.3, random_state=42)

# Vectorize lemmas for dataset 2 (fit_transform refits the vectorizer on dataset 2's vocabulary)
X2 = vectorizer.fit_transform(df_train2['lemmas'])
y2 = df_train2[label_cols]

# Split data into train and test sets for dataset 2 (70/30)
X_train2, X_test2, y_train2, y_test2 = train_test_split(X2, y2, test_size=0.3, random_state=42)

# Create the classifier; fit() retrains it from scratch on each dataset in the cells below
clf = RandomForestClassifier()
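
The cell above fits the TF-IDF vocabulary and IDF weights on all rows before splitting. A possible variant, shown here only as a sketch, splits the raw rows first and fits the vectorizer on the training rows alone. The helper name `split_then_vectorize` and the toy DataFrame are hypothetical and stand in for `df_train1` / `df_train2`; this is not the method used to produce the reports in this notebook.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']


def split_then_vectorize(df, text_col='lemmas', test_size=0.3, seed=42):
    """Split raw rows first, then fit TF-IDF on the training rows only."""
    train_df, test_df = train_test_split(df, test_size=test_size, random_state=seed)
    vec = TfidfVectorizer()
    X_train = vec.fit_transform(train_df[text_col])  # learns vocabulary and IDF from train rows
    X_test = vec.transform(test_df[text_col])        # reuses them for the held-out rows
    return X_train, X_test, train_df[LABELS], test_df[LABELS], vec


# Toy data (hypothetical) with the same column layout as the notebook's CSV files
toy = pd.DataFrame({'lemmas': [
    "you be great", "thanks for the edit", "stupid idiot", "nice work",
    "will hurt you", "please stop vandalism", "idiot go away",
    "good article", "hate this page", "see talk page",
]})
for label in LABELS:
    toy[label] = 0
toy.loc[[2, 6], ['toxic', 'insult']] = 1
toy.loc[4, 'threat'] = 1

X_tr, X_te, y_tr, y_te, vec = split_then_vectorize(toy)
print(X_tr.shape, X_te.shape)
```

Both matrices share the same columns because the test rows are only transformed, never used to fit the vectorizer.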

In [9]:
# Train and evaluate model for dataset 1
clf.fit(X_train1, y_train1)
y_pred1 = clf.predict(X_test1)
print("Classification report for dataset 1:")
print(classification_report(y_test1, y_pred1))

Classification report for dataset 1:
              precision    recall  f1-score   support

           0       0.91      0.58      0.71      4582
           1       0.46      0.07      0.12       486
           2       0.91      0.62      0.73      2556
           3       0.53      0.06      0.11       136
           4       0.81      0.46      0.59      2389
           5       0.72      0.06      0.11       432

   micro avg       0.88      0.51      0.65     10581
   macro avg       0.72      0.31      0.39     10581
weighted avg       0.85      0.51      0.63     10581
 samples avg       0.05      0.04      0.05     10581
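
Rows 0 to 5 of the report correspond to the label columns in the order they were selected for `y1` (toxic, severe_toxic, obscene, threat, insult, identity_hate). As a small sketch of how the report could be made easier to read, `classification_report` accepts `target_names` for the row labels and `zero_division` to set the fallback value when precision or recall is undefined. The arrays below are toy stand-ins for `y_test1` and `y_pred1`, not data from this notebook.

```python
import numpy as np
from sklearn.metrics import classification_report

LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']

# Toy multilabel indicator arrays (hypothetical), one column per label
y_true = np.array([
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
])
y_pred = np.array([
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
])

# target_names replaces the 0..5 row labels; zero_division=0 reports 0.0
# for undefined metrics instead of emitting a warning
report = classification_report(y_true, y_pred, target_names=LABELS, zero_division=0)
print(report)
```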





In [10]:
# Train and evaluate model for dataset 2
clf.fit(X_train2, y_train2)
y_pred2 = clf.predict(X_test2)
print("\nClassification report for dataset 2:")
print(classification_report(y_test2, y_pred2))


Classification report for dataset 2:
              precision    recall  f1-score   support

           0       0.90      0.58      0.71      4582
           1       0.45      0.06      0.11       486
           2       0.90      0.62      0.74      2556
           3       0.50      0.07      0.13       136
           4       0.79      0.46      0.58      2389
           5       0.58      0.03      0.06       432

   micro avg       0.87      0.51      0.64     10581
   macro avg       0.69      0.31      0.39     10581
weighted avg       0.84      0.51      0.62     10581
 samples avg       0.05      0.04      0.05     10581
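
To make the two reports easier to compare, the per-label F1 scores can be placed side by side. The sketch below copies the F1 values from the two reports above and maps rows 0 to 5 to the label columns in the order used for `y1` and `y2`. Because `RandomForestClassifier()` is created without a fixed `random_state`, differences of this size may partly reflect run-to-run variation rather than the effect of marking sensitive words.

```python
import pandas as pd

LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']

# Per-label F1 scores copied from the two classification reports
f1_original = [0.71, 0.12, 0.73, 0.11, 0.59, 0.11]  # dataset 1: no marking
f1_marked = [0.71, 0.11, 0.74, 0.13, 0.58, 0.06]    # dataset 2: sensitive words marked

comparison = pd.DataFrame(
    {'f1_original': f1_original, 'f1_marked': f1_marked},
    index=LABELS,
)
# Positive delta: the marked dataset scored higher on that label
comparison['delta'] = (comparison['f1_marked'] - comparison['f1_original']).round(2)
print(comparison)
```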



