In [1]:
import pandas as pd

df = pd.read_csv('/content/cleaned_comments_df.csv')
display(df.head())
df.info()  # info() prints its report directly and returns None, so display() is not needed

Unnamed: 0,id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate,clean,char_length
0,0000997932d777bf,explanationwhy edit make username hardcore met...,0,0,0,0,0,0,True,264
1,000103f0d9cfb60f,' aww ! match background colour ' seemingly st...,0,0,0,0,0,0,True,112
2,000113f07ec002fd,"hey man , ' really try edit war . ' guy consta...",0,0,0,0,0,0,True,233
3,0001b41b1c6bb37e,""" morei ' make real suggestions improvement - ...",0,0,0,0,0,0,True,622
4,0001d958c54c6e35,", sir , hero . chance remember page ' ?",0,0,0,0,0,0,True,67


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 159571 entries, 0 to 159570
Data columns (total 10 columns):
 #   Column         Non-Null Count   Dtype 
---  ------         --------------   ----- 
 0   id             159571 non-null  object
 1   comment_text   159563 non-null  object
 2   toxic          159571 non-null  int64 
 3   severe_toxic   159571 non-null  int64 
 4   obscene        159571 non-null  int64 
 5   threat         159571 non-null  int64 
 6   insult         159571 non-null  int64 
 7   identity_hate  159571 non-null  int64 
 8   clean          159571 non-null  bool  
 9   char_length    159571 non-null  int64 
dtypes: bool(1), int64(7), object(2)
memory usage: 11.1+ MB



**Reasoning**:
Fill the 8 missing `comment_text` values (159,563 of 159,571 rows are non-null) with empty strings, convert the column to a list, and apply TF-IDF vectorization over word unigrams, bigrams, and trigrams.



In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer

df['comment_text'] = df['comment_text'].fillna('')
comment_list = df['comment_text'].tolist()

tfidf_vectorizer = TfidfVectorizer(min_df=3,
                                   max_features=None,
                                   strip_accents='unicode',
                                   analyzer='word',
                                   token_pattern=r'\w{1,}',
                                   ngram_range=(1, 3),
                                   use_idf=True,
                                   smooth_idf=True,
                                   sublinear_tf=True)

tfidf_matrix = tfidf_vectorizer.fit_transform(comment_list)

**Reasoning**:
Split the features (tfidf_matrix) and the target variables into training and testing sets using train_test_split.



In [4]:
from sklearn.model_selection import train_test_split

target_variables = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']
y = df[target_variables]

X_train, X_test, y_train, y_test = train_test_split(tfidf_matrix, y, test_size=0.2, random_state=42)

display(X_train.shape)
display(X_test.shape)
display(y_train.shape)
display(y_test.shape)

(127656, 396665)

(31915, 396665)

(127656, 6)

(31915, 6)
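
Note that the vectorizer above was fitted on all comments before the split, so the vocabulary, the `min_df` filter, and the IDF weights have also seen the test comments. A minimal sketch of the alternative, splitting the raw text first and fitting only on the training portion, could look like this. The texts, labels, and variable names are hypothetical stand-ins for `df['comment_text']` and one label column.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data, not taken from the dataset.
texts = ["good edit", "bad edit", "nice page", "awful page",
         "good page", "bad idea", "nice idea", "awful edit"]
labels = [0, 1, 0, 1, 0, 1, 0, 1]

# Split the raw text first.
texts_train, texts_test, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.25, random_state=42
)

vec = TfidfVectorizer()
X_tr = vec.fit_transform(texts_train)  # vocabulary and IDF come from training text only
X_te = vec.transform(texts_test)       # test text is only transformed, never fitted on

print(X_tr.shape, X_te.shape)
```

Whether this changes the reported metrics noticeably on the full dataset is not something the notebook measures; the sketch only shows the ordering of the steps.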

## Model selection

### Subtask:
Choose a suitable classification model that is likely to outperform logistic regression, such as a Naive Bayes classifier, Support Vector Machine (SVM), or a deep learning model like a recurrent neural network (RNN) or transformer.


In [5]:
print("Two classification model types that are likely to outperform logistic regression on this text classification task are:")
print("\n1. Multinomial Naive Bayes:")
print("   Justification: Naive Bayes classifiers, particularly Multinomial Naive Bayes, are well-suited for text classification tasks due to their simplicity and efficiency. They work well with high-dimensional sparse data like TF-IDF matrices. While based on a naive assumption of independence, they often perform surprisingly well in practice and can serve as a strong baseline or even outperform more complex models in certain scenarios, making them a good candidate to potentially outperform logistic regression.")
print("\n2. Linear Support Vector Machine (SVM):")
print("   Justification: Linear SVMs are powerful linear classifiers that are effective in high-dimensional spaces. They aim to find the hyperplane that maximally separates the different classes. SVMs are known for their good performance on text classification tasks and can handle large feature sets like those generated by TF-IDF. The ability of SVMs to find an optimal separating hyperplane can lead to better generalization compared to logistic regression, making them a strong contender to achieve higher accuracy.")

Two classification model types that are likely to outperform logistic regression on this text classification task are:

1. Multinomial Naive Bayes:
   Justification: Naive Bayes classifiers, particularly Multinomial Naive Bayes, are well-suited for text classification tasks due to their simplicity and efficiency. They work well with high-dimensional sparse data like TF-IDF matrices. While based on a naive assumption of independence, they often perform surprisingly well in practice and can serve as a strong baseline or even outperform more complex models in certain scenarios, making them a good candidate to potentially outperform logistic regression.

2. Linear Support Vector Machine (SVM):
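   Justification: Linear SVMs are powerful linear classifiers that are effective in high-dimensional spaces. They aim to find the hyperplane that maximally separates the different classes. SVMs are known for their good performance on text classification tasks and can handle large feature sets like those generated by TF-IDF. The ability of SVMs to find an optimal separating hyperplane can lead to better generalization compared to logistic regression, making them a strong contender to achieve higher accuracy.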

## Train model


In [7]:
from sklearn.naive_bayes import MultinomialNB

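# Keep one fitted classifier per label; reusing a single `model` variable
# would leave only the last label's classifier available afterwards.
trained_models = {}

for label in y_train.columns:
    model = MultinomialNB()
    model.fit(X_train, y_train[label])
    trained_models[label] = model
    print(f"Model trained for label: {label}")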

Model trained for label: toxic
Model trained for label: severe_toxic
Model trained for label: obscene
Model trained for label: threat
Model trained for label: insult
Model trained for label: identity_hate
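
The one-classifier-per-label loop above can also be expressed with scikit-learn's `OneVsRestClassifier`, which fits one copy of the base estimator per column of a multi-label indicator matrix. The following is a minimal sketch on hypothetical toy data (the texts, the two label columns, and `alpha=0.1` are illustrative assumptions, not the notebook's data or settings).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy comments, not taken from the dataset.
toy_texts = ["you idiot", "nice work", "i will hurt you", "thanks a lot",
             "stupid idiot", "i will find you", "great page", "well done"]
# Columns: hypothetical 'insult' and 'threat' indicators.
toy_y = np.array([[1, 0], [0, 0], [0, 1], [0, 0],
                  [1, 0], [0, 1], [0, 0], [0, 0]])

toy_X = TfidfVectorizer().fit_transform(toy_texts)

# One MultinomialNB is fitted per label column.
ovr = OneVsRestClassifier(MultinomialNB(alpha=0.1))
ovr.fit(toy_X, toy_y)

toy_pred = ovr.predict(toy_X)
print(len(ovr.estimators_))  # one fitted estimator per label
print(toy_pred.shape)        # (n_samples, n_labels)
```

The explicit loop remains useful when each label needs its own hyperparameters, as in the tuning step later in this notebook.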


## Evaluate model

In [8]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

results = {}

for label in y_test.columns:
    print(f"Evaluating model for label: {label}")
    # Fit a fresh model for each label so that this cell runs on its own.
    # The `model` variable left over from the training loop holds only the
    # classifier for the last label, so it cannot be reused for every label here.
    model = MultinomialNB()
    model.fit(X_train, y_train[label])


    y_pred = model.predict(X_test)

    accuracy = accuracy_score(y_test[label], y_pred)
    precision = precision_score(y_test[label], y_pred, zero_division=0)
    recall = recall_score(y_test[label], y_pred, zero_division=0)
    f1 = f1_score(y_test[label], y_pred, zero_division=0)

    # For AUC, we need probability scores
    try:
        y_pred_proba = model.predict_proba(X_test)[:, 1]
        auc = roc_auc_score(y_test[label], y_pred_proba)
    except AttributeError:
        auc = "N/A (Model does not support predict_proba)"
        print(f"  AUC not calculated for {label} as model lacks predict_proba.")


    results[label] = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'auc': auc
    }

    print(f"  Accuracy: {accuracy:.4f}")
    print(f"  Precision: {precision:.4f}")
    print(f"  Recall: {recall:.4f}")
    print(f"  F1-score: {f1:.4f}")
    if auc != "N/A (Model does not support predict_proba)":
        print(f"  AUC: {auc:.4f}")
    print("-" * 30)

# Optional: average the per-label F1-scores and AUCs (an unweighted, macro-style
# average across the six labels). Every label counts equally, so rare labels with
# an F1 near zero pull the average down; a support-weighted average is an alternative.
macro_f1_sum = 0
auc_sum = 0
auc_count = 0

for label, metrics in results.items():
    macro_f1_sum += metrics['f1_score']
    if isinstance(metrics['auc'], float):
        auc_sum += metrics['auc']
        auc_count += 1

average_macro_f1 = macro_f1_sum / len(results)
average_auc = auc_sum / auc_count if auc_count > 0 else "N/A"

print(f"\nAverage Macro F1-score across all labels: {average_macro_f1:.4f}")
if average_auc != "N/A":
    print(f"Average AUC across labels (where calculable): {average_auc:.4f}")


Evaluating model for label: toxic
  Accuracy: 0.9217
  Precision: 0.9115
  Recall: 0.2022
  F1-score: 0.3310
  AUC: 0.8895
------------------------------
Evaluating model for label: severe_toxic
  Accuracy: 0.9896
  Precision: 0.2105
  Recall: 0.0125
  F1-score: 0.0235
  AUC: 0.8853
------------------------------
Evaluating model for label: obscene
  Accuracy: 0.9532
  Precision: 0.8553
  Recall: 0.1551
  F1-score: 0.2626
  AUC: 0.8887
------------------------------
Evaluating model for label: threat
  Accuracy: 0.9974
  Precision: 0.0000
  Recall: 0.0000
  F1-score: 0.0000
  AUC: 0.7973
------------------------------
Evaluating model for label: insult
  Accuracy: 0.9522
  Precision: 0.7418
  Recall: 0.0836
  F1-score: 0.1503
  AUC: 0.8777
------------------------------
Evaluating model for label: identity_hate
  Accuracy: 0.9903
  Precision: 0.0000
  Recall: 0.0000
  F1-score: 0.0000
  AUC: 0.7888
------------------------------

Average Macro F1-score across all labels: 0.1279
Average
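
The combination of very high accuracy with near-zero F1 (for example `threat`: accuracy 0.9974, F1 0.0000) is what a classifier that almost never predicts the positive class produces on imbalanced data. A minimal sketch with a hypothetical label that is positive in 3 of 1000 samples (an illustrative rate, not the dataset's actual prevalence):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, recall_score

# Hypothetical test set: 1000 comments, 3 of them positive.
y_true = np.zeros(1000, dtype=int)
y_true[:3] = 1

# A degenerate classifier that never predicts the positive class.
y_always_negative = np.zeros(1000, dtype=int)

acc = accuracy_score(y_true, y_always_negative)
rec = recall_score(y_true, y_always_negative, zero_division=0)
f1 = f1_score(y_true, y_always_negative, zero_division=0)

print(f"accuracy={acc:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Accuracy rewards the majority class, which is why precision, recall, and F1 are the more informative columns in the output above.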

## Hyperparameter tuning (optional)

In [9]:
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB

# Define the parameter grid to search
param_grid = {'alpha': [0.01, 0.1, 0.5, 1.0, 5.0, 10.0]}

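# Choose a scoring metric. For a binary target, 'f1_macro' averages the F1 of the
# negative and positive classes, so these cross-validation scores run higher than
# the positive-class F1 reported in the evaluation cells ('f1' would score the
# positive class only). Each label is tuned individually because the task is multi-label.
scoring_metric = 'f1_macro'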

tuned_models = {}
tuning_results = {}

# Tune for each label
for label in y_train.columns:
    print(f"Tuning hyperparameters for label: {label}")

    # Instantiate the Multinomial Naive Bayes model
    nb_model = MultinomialNB()

    # Instantiate GridSearchCV
    grid_search = GridSearchCV(estimator=nb_model,
                               param_grid=param_grid,
                               scoring=scoring_metric,
                               cv=5,  # 5-fold cross-validation
                               verbose=1,
                               n_jobs=-1) # Use all available cores

    # Fit GridSearchCV to the training data for the current label
    grid_search.fit(X_train, y_train[label])

    # Store the best model and best parameters
    tuned_models[label] = grid_search.best_estimator_
    tuning_results[label] = {
        'best_params': grid_search.best_params_,
        'best_score': grid_search.best_score_
    }

    print(f"Finished tuning for label: {label}")
    print(f"Best parameters: {tuning_results[label]['best_params']}")
    print(f"Best cross-validation score ({scoring_metric}): {tuning_results[label]['best_score']:.4f}")
    print("-" * 30)


Tuning hyperparameters for label: toxic
Fitting 5 folds for each of 6 candidates, totalling 30 fits
Finished tuning for label: toxic
Best parameters: {'alpha': 0.01}
Best cross-validation score (f1_macro): 0.8190
------------------------------
Tuning hyperparameters for label: severe_toxic
Fitting 5 folds for each of 6 candidates, totalling 30 fits
Finished tuning for label: severe_toxic
Best parameters: {'alpha': 0.01}
Best cross-validation score (f1_macro): 0.6850
------------------------------
Tuning hyperparameters for label: obscene
Fitting 5 folds for each of 6 candidates, totalling 30 fits
Finished tuning for label: obscene
Best parameters: {'alpha': 0.01}
Best cross-validation score (f1_macro): 0.8293
------------------------------
Tuning hyperparameters for label: threat
Fitting 5 folds for each of 6 candidates, totalling 30 fits
Finished tuning for label: threat
Best parameters: {'alpha': 0.01}
Best cross-validation score (f1_macro): 0.5746
------------------------------
Tuni
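
The cross-validation scores printed above are `f1_macro` values, while the evaluation cells report the F1 of the positive class only, so the two are not directly comparable. A minimal sketch of the difference on hypothetical predictions (100 samples, 10 positives, 4 of them detected; all numbers are illustrative):

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical predictions for one rare label: 100 samples, 10 positives.
y_true = np.array([1] * 10 + [0] * 90)
y_pred = np.array([1] * 4 + [0] * 6 + [0] * 90)  # 4 of the 10 positives are found

binary_f1 = f1_score(y_true, y_pred)                  # F1 of the positive class only
macro_f1 = f1_score(y_true, y_pred, average="macro")  # mean of both classes' F1

print(f"binary F1: {binary_f1:.4f}")
print(f"macro F1:  {macro_f1:.4f}")
```

Because the negative class is easy to get right, the macro value is pulled upward. Passing `scoring='f1'` to `GridSearchCV` would be one way to tune directly for the positive class.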

In [10]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

tuned_results_test = {}

# Evaluate the tuned models on the test set
for label, model in tuned_models.items():
    print(f"Evaluating tuned model for label: {label}")

    y_pred = model.predict(X_test)

    accuracy = accuracy_score(y_test[label], y_pred)
    precision = precision_score(y_test[label], y_pred, zero_division=0)
    recall = recall_score(y_test[label], y_pred, zero_division=0)
    f1 = f1_score(y_test[label], y_pred, zero_division=0)

    # For AUC, we need probability scores
    try:
        y_pred_proba = model.predict_proba(X_test)[:, 1]
        auc = roc_auc_score(y_test[label], y_pred_proba)
    except AttributeError:
        auc = "N/A (Model does not support predict_proba)"
        print(f"  AUC not calculated for {label} as model lacks predict_proba.")

    tuned_results_test[label] = {
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'auc': auc
    }

    print(f"  Accuracy: {accuracy:.4f}")
    print(f"  Precision: {precision:.4f}")
    print(f"  Recall: {recall:.4f}")
    print(f"  F1-score: {f1:.4f}")
    if auc != "N/A (Model does not support predict_proba)":
        print(f"  AUC: {auc:.4f}")
    print("-" * 30)

# Optional: Calculate and print average metrics (e.g., macro F1-score) for tuned models
macro_f1_sum_tuned = 0
auc_sum_tuned = 0
auc_count_tuned = 0

for label, metrics in tuned_results_test.items():
    macro_f1_sum_tuned += metrics['f1_score']
    if isinstance(metrics['auc'], float):
        auc_sum_tuned += metrics['auc']
        auc_count_tuned += 1

average_macro_f1_tuned = macro_f1_sum_tuned / len(tuned_results_test)
average_auc_tuned = auc_sum_tuned / auc_count_tuned if auc_count_tuned > 0 else "N/A"

print(f"\nAverage Macro F1-score across all labels (Tuned Models): {average_macro_f1_tuned:.4f}")
if average_auc_tuned != "N/A":
    print(f"Average AUC across labels (where calculable, Tuned Models): {average_auc_tuned:.4f}")


Evaluating tuned model for label: toxic
  Accuracy: 0.9444
  Precision: 0.7758
  Recall: 0.5900
  F1-score: 0.6703
  AUC: 0.9341
------------------------------
Evaluating tuned model for label: severe_toxic
  Accuracy: 0.9853
  Precision: 0.3303
  Recall: 0.4517
  F1-score: 0.3816
  AUC: 0.9634
------------------------------
Evaluating tuned model for label: obscene
  Accuracy: 0.9679
  Precision: 0.7403
  Recall: 0.6216
  F1-score: 0.6758
  AUC: 0.9478
------------------------------
Evaluating tuned model for label: threat
  Accuracy: 0.9938
  Precision: 0.0699
  Recall: 0.1351
  F1-score: 0.0922
  AUC: 0.9510
------------------------------
Evaluating tuned model for label: insult
  Accuracy: 0.9638
  Precision: 0.6648
  Recall: 0.5737
  F1-score: 0.6159
  AUC: 0.9430
------------------------------
Evaluating tuned model for label: identity_hate
  Accuracy: 0.9859
  Precision: 0.2459
  Recall: 0.2551
  F1-score: 0.2504
  AUC: 0.9216
------------------------------

Average Macro F1-sco
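
To compare the two evaluation runs side by side, the per-label test F1-scores printed above can be collected into one table. This is a small sketch; the values are copied by hand from the baseline and tuned outputs, and the `f1_comparison` name is an assumption introduced here.

```python
import pandas as pd

# Per-label test F1-scores copied from the two evaluation outputs above.
f1_comparison = pd.DataFrame(
    {
        "baseline_f1": [0.3310, 0.0235, 0.2626, 0.0000, 0.1503, 0.0000],
        "tuned_f1": [0.6703, 0.3816, 0.6758, 0.0922, 0.6159, 0.2504],
    },
    index=["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"],
)
f1_comparison["gain"] = f1_comparison["tuned_f1"] - f1_comparison["baseline_f1"]

label_means = f1_comparison[["baseline_f1", "tuned_f1"]].mean()

print(f1_comparison.round(4))
print(label_means.round(4))
```

In practice the same table could be built directly from the `results` and `tuned_results_test` dictionaries rather than from copied numbers.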

## Classify comments


In [11]:
# 1. Define a list of new comments
new_comments = [
    "This is a great comment!",
    "You are an idiot, this is terrible.",
    "I hate everything about this.",
    "What a wonderful day it is.",
    "You should kill yourself."
]

# 2. Use the fitted tfidf_vectorizer to transform the new comments
new_comments_tfidf = tfidf_vectorizer.transform(new_comments)

# 3. Iterate through the tuned_models dictionary and predict
classification_results = {}

for i, comment in enumerate(new_comments):
    classification_results[comment] = {}
    for label, model in tuned_models.items():
        # Predict the toxicity label for the transformed new comments
        prediction = model.predict(new_comments_tfidf[i])
        classification_results[comment][label] = prediction[0] # prediction is an array

# 4. Store or print the classification results
print("Classification Results for New Comments:")
for comment, predictions in classification_results.items():
    print(f"\nComment: '{comment}'")
    for label, prediction in predictions.items():
        print(f"  {label}: {'Positive' if prediction == 1 else 'Negative'}")


Classification Results for New Comments:

Comment: 'This is a great comment!'
  toxic: Negative
  severe_toxic: Negative
  obscene: Negative
  threat: Negative
  insult: Negative
  identity_hate: Negative

Comment: 'You are an idiot, this is terrible.'
  toxic: Positive
  severe_toxic: Negative
  obscene: Positive
  threat: Negative
  insult: Negative
  identity_hate: Negative

Comment: 'I hate everything about this.'
  toxic: Negative
  severe_toxic: Negative
  obscene: Negative
  threat: Negative
  insult: Negative
  identity_hate: Negative

Comment: 'What a wonderful day it is.'
  toxic: Negative
  severe_toxic: Negative
  obscene: Negative
  threat: Negative
  insult: Negative
  identity_hate: Negative

Comment: 'You should kill yourself.'
  toxic: Positive
  severe_toxic: Negative
  obscene: Positive
  threat: Positive
  insult: Negative
  identity_hate: Negative
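
The hard Positive/Negative labels above hide how confident each model is. A minimal sketch of reporting probabilities with `predict_proba` instead, using a hypothetical toy training set and a single hypothetical label (none of the texts or names below come from the notebook):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy training data for a single binary label.
train_texts = ["you idiot", "stupid idiot", "nice work", "great page",
               "thanks a lot", "well done"]
train_labels = [1, 1, 0, 0, 0, 0]

vec = TfidfVectorizer()
clf = MultinomialNB(alpha=0.01).fit(vec.fit_transform(train_texts), train_labels)

new_texts = ["what an idiot", "great work"]
# Column 1 of predict_proba is the probability of the positive class.
proba = clf.predict_proba(vec.transform(new_texts))[:, 1]

for text, p in zip(new_texts, proba):
    print(f"{text!r}: P(positive) = {p:.3f}")
```

Applied to `tuned_models`, the same call would give a per-label score for each new comment that can be compared against a chosen threshold.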


## Summary:

### Data Analysis Key Findings

*   The dataset contains 159,571 comments with associated toxicity labels (`toxic`, `severe_toxic`, `obscene`, `threat`, `insult`, `identity_hate`), a `clean` label, and `char_length`.
*   Text data was preprocessed and converted into numerical representations using TF-IDF vectorization with specific parameters (`min_df=3`, `ngram_range=(1, 3)`).
*   The data was split into training (80%) and testing (20%) sets.
*   Multinomial Naive Bayes was chosen as a suitable model for this multi-label text classification task, trained independently for each of the six toxicity labels.
*   Initial evaluation with the default `alpha=1.0` showed low recall on every label (at most 0.2022). Test F1 reached only 0.3310 for `toxic` and 0.2626 for `obscene`, and was 0.0000 for `threat` and `identity_hate`, giving an average F1 of 0.1279 across labels even though per-label AUCs ranged from 0.7888 to 0.8895.
*   Hyperparameter tuning for the Multinomial Naive Bayes models using `GridSearchCV` selected `alpha=0.01` across all labels. Because 0.01 is the smallest value in the tested grid, the best value may lie below the tested range.
*   Evaluation of the tuned models on the test set showed an average F1-score across labels of approximately 0.4477 (up from 0.1279) and an average AUC of approximately 0.9435.
*   The tuned models were successfully used to classify new comments, providing predictions for each toxicity label.

### Insights or Next Steps

*   Class imbalance strongly depresses the F1-score for rare toxicity types: even after tuning, `threat` reaches an F1 of only 0.0922 and `identity_hate` 0.2504. Further steps could include oversampling the minority classes or reporting metrics better suited to imbalanced data, such as a weighted F1-score.
*   AUC is high (0.92 to 0.96 per label after tuning), which indicates good ranking ability, yet F1 stays low for rare classes. This suggests the default decision threshold is poorly placed for those labels. Investigating per-label probability thresholds, or models better suited to highly imbalanced multi-label classification, could help.
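
The threshold idea in the last point can be sketched as follows: pick, on a held-out validation set rather than the test set, the probability cutoff that maximizes F1 for one label. The labels and scores below are synthetic and hypothetical, generated only to illustrate the mechanics; they are not model outputs from this notebook.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_recall_curve

# Hypothetical validation labels and predicted probabilities for one rare label.
rng = np.random.default_rng(0)
y_val = np.array([1] * 20 + [0] * 480)
scores = np.concatenate([
    rng.uniform(0.15, 0.60, size=20),   # positives score higher, but mostly below 0.5
    rng.uniform(0.00, 0.30, size=480),  # negatives
])

precision, recall, thresholds = precision_recall_curve(y_val, scores)
# precision and recall have one more entry than thresholds; drop the final point.
f1_per_threshold = 2 * precision[:-1] * recall[:-1] / (
    precision[:-1] + recall[:-1] + 1e-12
)
best_threshold = thresholds[np.argmax(f1_per_threshold)]

f1_default = f1_score(y_val, (scores >= 0.5).astype(int), zero_division=0)
f1_tuned = f1_score(y_val, (scores >= best_threshold).astype(int), zero_division=0)

print(f"threshold 0.500: F1 = {f1_default:.3f}")
print(f"threshold {best_threshold:.3f}: F1 = {f1_tuned:.3f}")
```

With real models, `scores` would come from `predict_proba(...)[:, 1]` on a validation split carved out of the training data, and the chosen threshold would then be applied once to the test set.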
