# Text toxicity level model

We need a model that predicts the toxicity level of the texts we are going to translate.

## Downloading data

It is better to train the model on the initial data, because the training set should not consist only of highly toxic text.

In [1]:
import pandas as pd
import zipfile

with zipfile.ZipFile("../data/raw/filtered_paranmt.zip", "r") as zip_ref:
    with zip_ref.open("filtered.tsv") as file:
        df = pd.read_csv(file, sep='\t')


In [2]:
tox_df = df[['reference', 'ref_tox']].copy()
tox_df = tox_df[:40000]

In [3]:
tox_df.head()

| | reference | ref_tox |
|---|---|---|
| 0 | If Alkar is flooding her with psychic waste, t... | 0.014195 |
| 1 | Now you're getting nasty. | 0.065473 |
| 2 | Well, we could spare your life, for one. | 0.213313 |
| 3 | Ah! Monkey, you've got to snap out of it. | 0.053362 |
| 4 | I've got orders to put her down. | 0.009402 |


## Data preprocessing

The reference texts are stored in the 'X' variable and the corresponding toxicity levels in the 'y' variable. We use the TfidfVectorizer from scikit-learn to transform the texts into numerical features, specifically TF-IDF vectors, which weight each word by how often it occurs in a document relative to how common it is across the entire corpus.
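To make the weighting concrete, here is a minimal pure-Python sketch of TF-IDF. It assumes the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` followed by L2 normalization, which is what TfidfVectorizer does by default. Tokenization is simplified to whitespace splitting and there is no stop-word removal, so the numbers will not match the vectorizer used in this notebook. The toy corpus is hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (smoothed IDF, L2-normalized)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency: rarer terms get larger weights.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1
           for term, count in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts in this document
        raw = {term: freq * idf[term] for term, freq in tf.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({term: w / norm for term, w in raw.items()} if norm else {})
    return vectors

# Hypothetical toy corpus, not taken from the dataset.
docs = ["you are nasty", "you are kind", "you are here"]
for vec in tfidf(docs):
    print({term: round(w, 3) for term, w in vec.items()})
```

Words that appear in every document ("you", "are") receive a lower weight than words that appear in only one ("nasty"), which is the property the model relies on.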

In [4]:
from sklearn.model_selection import train_test_split

X = tox_df['reference']  # texts
y = tox_df['ref_tox']    # toxicity levels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.125, random_state=42)


In [5]:
from sklearn.feature_extraction.text import TfidfVectorizer

tfidf_vectorizer = TfidfVectorizer(max_features=5000, stop_words='english')
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)
X_test_tfidf = tfidf_vectorizer.transform(X_test)
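# Cast both matrices to float32 so train and test features share one dtype
X_train_tfidf = X_train_tfidf.astype('float32')
X_test_tfidf = X_test_tfidf.astype('float32')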



## Model training

The model is a feedforward neural network with three dense layers, using ReLU activations in the first two layers and a sigmoid activation in the final layer. This architecture is typical for binary classification tasks: it takes TF-IDF vectors as input and outputs a value between 0 and 1. Here the targets are the continuous 'ref_tox' scores, so we read the output as a predicted toxicity level.
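As an illustration of how data flows through such a network, here is a minimal pure-Python sketch of the forward pass. The layer sizes (4, 3, 2, 1) and all weights are hypothetical toy values, far smaller than the 128- and 64-unit layers used in this notebook. It only shows how the ReLU layers and the final sigmoid combine, not how Keras implements them.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases, activation):
    """One fully connected layer: activation(W @ x + b), W given as a list of rows."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Apply the layers in order, feeding each output into the next layer."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

# Hypothetical toy weights: 4 inputs -> 3 units -> 2 units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1, 0.0],
      [0.3, 0.8, -0.5, 0.2],
      [-0.6, 0.1, 0.4, 0.7]], [0.0, 0.1, -0.1], relu),
    ([[0.2, -0.4, 0.6],
      [0.7, 0.1, -0.3]], [0.05, 0.0], relu),
    ([[1.5, -1.0]], [-0.2], sigmoid),
]

x = [0.0, 0.6, 0.0, 0.8]  # stands in for a (tiny) TF-IDF vector
score = forward(x, layers)[0]
print(f"predicted toxicity: {score:.3f}")
```

Whatever the weights are, the sigmoid keeps the final value strictly between 0 and 1, which is why it matches the range of the 'ref_tox' scores.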

In [6]:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(X_train_tfidf.shape[1],)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])


In [7]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])


In [8]:
train_dataset = tf.data.Dataset.from_tensor_slices((X_train_tfidf.toarray(), y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((X_test_tfidf.toarray(), y_test))


In [9]:
batch_size = 32
epochs = 20

model.fit(train_dataset.batch(batch_size), epochs=epochs, validation_data=test_dataset.batch(batch_size))


## Prediction and evaluation

In [10]:
y_pred = model.predict(X_test_tfidf.toarray())



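Next, we post-process the predictions: both the true toxicity levels and the predicted ones are binarized at a threshold of 0.5, and the accuracy is computed on the resulting labels.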

In [11]:
from sklearn.metrics import accuracy_score

y_test_binary = (y_test > 0.5).astype(int)
y_pred_binary = (y_pred > 0.5).astype(int)

accuracy = accuracy_score(y_test_binary, y_pred_binary)
print(f'Accuracy: {accuracy:.2f}')


Accuracy: 0.74
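Because 'ref_tox' is a continuous score, accuracy at a 0.5 threshold only says whether a prediction falls on the correct side of the cut-off, not how far off it is. One possible complement is the mean absolute error. The sketch below computes both on hypothetical toy scores; these are not results from this model.

```python
def threshold_accuracy(y_true, y_pred, threshold=0.5):
    """Share of pairs where truth and prediction fall on the same side of the threshold."""
    hits = sum((t > threshold) == (p > threshold) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between the true and the predicted score."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy scores, chosen only for illustration.
y_true = [0.01, 0.95, 0.60, 0.20]
y_pred = [0.40, 0.55, 0.45, 0.10]

print(f"Accuracy: {threshold_accuracy(y_true, y_pred):.2f}")
print(f"MAE: {mean_absolute_error(y_true, y_pred):.2f}")
```

With the notebook's variables, `sklearn.metrics.mean_absolute_error(y_test, y_pred.ravel())` should compute the same quantity.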


## Saving model

Now we can predict the toxicity level of any text. With both the true and the predicted levels binarized at 0.5, the accuracy on the test set is 0.74. We save the model to use it in the future.

In [12]:
model.save('../models/toxLevelModel.h5')
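Scoring new text later requires the fitted TF-IDF vectorizer as well as the saved network, since the model only accepts vectors produced by that vectorizer. The notebook saves only the Keras model, so the sketch below assumes the vectorizer is still in memory or has been persisted separately. The helper name `predict_toxicity` is hypothetical.

```python
def predict_toxicity(texts, vectorizer, model):
    """Score raw texts with a fitted vectorizer and a trained model.

    `vectorizer` must offer .transform(texts) returning a sparse matrix;
    `model` must offer .predict(array) returning one row per text.
    """
    features = vectorizer.transform(texts).toarray()
    predictions = model.predict(features)
    # Keras returns shape (n, 1); flatten to one float per text.
    return [float(row[0]) for row in predictions]

# Assumed usage (needs TensorFlow and the vectorizer fitted above):
#   model = tf.keras.models.load_model('../models/toxLevelModel.h5')
#   scores = predict_toxicity(["Now you're getting nasty."], tfidf_vectorizer, model)
```

Passing the vectorizer explicitly keeps the helper from silently refitting it on new text, which would change the feature columns the network was trained on.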

