In this section, I import the core libraries for the pipeline. I use Pandas and Python's re module (regular expressions) for data handling and cleaning, while Scikit-Learn handles the train-test split. Finally, I load the TensorFlow and Keras modules needed to tokenize the inputs and build the LSTM network that classifies the comments.

In [None]:
import pandas as pd
import re
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Dropout

I load the dataset from the CSV file into a dataframe. Then, I display the first few rows to verify that the data loaded correctly and to understand the structure of the columns I will be working with.

In [None]:
# load the data into a dataframe
df = pd.read_csv('cyberbullying_tweets.csv')

# show the first rows
df.head()

Unnamed: 0,tweet_text,cyberbullying_type
0,"In other words #katandandre, your food was cra...",not_cyberbullying
1,Why is #aussietv so white? #MKR #theblock #ImA...,not_cyberbullying
2,@XochitlSuckkks a classy whore? Or more red ve...,not_cyberbullying
3,"@Jason_Gio meh. :P thanks for the heads up, b...",not_cyberbullying
4,@RudhoeEnglish This is an ISIS account pretend...,not_cyberbullying


During training, I noticed the model did not have enough data to detect antisemitism. To address this gap, I used AI to generate synthetic examples of hate speech targeting this group. I now add these samples to the dataset under the existing religion label so the model learns to recognize and classify this type of harassment.

In [None]:
# define hate speech data for class balancing
new_tweets = [
    "i hate jews", "kill all jews", "jews are evil",
    "burn the synagogue", "death to israel", "dirty kike",
    "hitler was right", "gas the jews", "jews control everything",
    "nazi pride", "jews are rats", "end the jewish race",
    "filthy jews", "spit on jews", "jews destroy nations",
    "antisemitism is good", "shoot the jews", "jews are parasites",
    "ban all jews", "die jew", "jews are scum",
    "white power", "destroy zionists", "jews are cancer",
    "jews are the problem", "horrible jews", "disgusting kike",
    "inferior jews", "jews lie", "trust no jew",
    "jews are killers", "subhuman jews", "cursed jews",
    "vile jews", "sick jews", "stupid jews",
    "ugly jews", "jewish pigs", "get out jews",
    "jews ruin everything", "stop the jews", "jews are the enemy",
    "we need hitler", "pure hate for jews", "jews suck",
    "anti jew", "kill the kikes", "holocaust was fake",
    "jews should not exist", "the world would be better without jews", "jews poison society",
    "jews are a disease", "eradicate the jews", "jews deserve extinction",
    "jews are vermin", "jews hide behind money", "jews manipulate governments",
    "jews own the media", "jews spread corruption", "jews should be expelled",
    "no place for jews here", "jews are a plague", "jews rot nations from inside",
    "jews are behind every war", "jews should burn", "jews deserve pain",
    "jews must be eliminated", "jews are monsters", "jews ruin cultures",
    "jews destroy countries", "jews are traitors", "jews are filth",
    "jews infect everything", "jews are not human", "jews should disappear",
    "jews deserve hate", "jews are toxic", "jews are disgusting creatures",
    "jews are a mistake", "jews should be wiped out", "jews are pure evil",
    "jews deserve punishment", "jews must be erased", "jews are worthless",
    "jews should suffer", "jews are parasites on humanity", "jews bring decay",
    "jews should be removed", "jews are trash", "jews deserve death",
    "jews are criminals by nature", "jews corrupt society", "jews are the root of evil",
    "jews are a curse", "jews should be hunted", "jews are subhuman scum"
]


# create dataframe with religion label
extra_df = pd.DataFrame({
    'tweet_text': new_tweets,
    'cyberbullying_type': ['religion'] * len(new_tweets)
})

# merge with original data and shuffle
df = pd.concat([df, extra_df], ignore_index=True)
df = df.sample(frac=1, random_state=42).reset_index(drop=True)

print(f"added {len(new_tweets)} rows. total: {len(df)}")

added 96 rows. total: 47788


I clean the dataset by removing null values and duplicates. Then, I print the new size to see how many rows are left.

In [None]:
print('Nulls:', df.isnull().sum())
print('Duplicates:', df.duplicated().sum())

df = df.dropna() # remove null values
df = df.drop_duplicates() # remove duplicates

print('Clean data:', df.shape)

Nulls: tweet_text            0
cyberbullying_type    0
dtype: int64
Duplicates: 36
Clean data: (47752, 2)
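
One detail worth keeping in mind: called without a subset, duplicated() and drop_duplicates() compare entire rows, so the same tweet text appearing under two different labels is not treated as a duplicate. The sketch below mimics that row-wise comparison in plain Python on a few made-up rows. The rows and the helper name drop_duplicate_rows are illustrative assumptions, not taken from the dataset or from pandas.

```python
def drop_duplicate_rows(rows):
    """Keep the first occurrence of each (text, label) pair, preserving order."""
    seen = set()
    kept = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            kept.append(row)
    return kept

# made-up rows for illustration only
rows = [
    ("nice weather today", "not_cyberbullying"),
    ("nice weather today", "not_cyberbullying"),    # exact duplicate row
    ("nice weather today", "other_cyberbullying"),  # same text, different label
]

unique_rows = drop_duplicate_rows(rows)
print(len(rows) - len(unique_rows), "duplicate removed,", len(unique_rows), "rows left")
```

If identical texts with conflicting labels should also be collapsed, drop_duplicates accepts a subset argument (for example subset='tweet_text'), at the cost of keeping whichever label happens to come first.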


I define a function to clean the text by removing URLs, mentions, and special characters, while converting everything to lowercase. Then, I apply this function to the entire column and display the results to check how the cleaned text looks.

In [None]:
# function to clean raw tweet text
def limpiar_texto(text):
    text = str(text).lower() # convert everything to lowercase
    text = re.sub(r'http\S+|www\S+', '', text) # remove URLs (http\S+ also covers https)
    text = re.sub(r'@\w+', '', text) # remove user mentions
    text = re.sub(r'[^\w\s]', '', text) # remove punctuation and symbols (keeps letters, digits, underscores)
    return text.strip()

# apply the cleaning to the whole dataframe
df['text_clean'] = df['tweet_text'].apply(limpiar_texto)

# check how it looks now
df[['tweet_text', 'text_clean']].head()

Unnamed: 0,tweet_text,text_clean
0,@sschinke @1lb_cake @DiscordianKitty @Grummz S...,sam i appreciate you trying to correct people ...
1,RT @k_halvy22 Women shouldn't be football anno...,rt women shouldnt be football announcers nots...
2,[For non-religious reasons] I tried this with ...,for nonreligious reasons i tried this with a h...
3,To the girls that bullied me in middle school ...,to the girls that bullied me in middle school ...
4,Was thinking about this today. He's like the h...,was thinking about this today hes like the hig...
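
To make the effect of each regex easier to see, here is a small standalone sketch that applies the same three substitutions to a made-up tweet. The function name clean_text and the sample string are assumptions for illustration; the patterns mirror the ones in limpiar_texto.

```python
import re

def clean_text(text):
    text = str(text).lower()                     # lowercase everything
    text = re.sub(r'http\S+|www\S+', '', text)   # drop URLs
    text = re.sub(r'@\w+', '', text)             # drop @mentions
    text = re.sub(r'[^\w\s]', '', text)          # drop punctuation and symbols
    return text.strip()

# made-up sample, not from the dataset
sample = "RT @some_user Check this out: https://example.com #MKR it's GREAT!!"
cleaned = clean_text(sample)

print(repr(cleaned))
print(cleaned.split())  # ['rt', 'check', 'this', 'out', 'mkr', 'its', 'great']
```

Removing a mention or URL leaves the surrounding spaces behind, so the cleaned string can contain double spaces. That is harmless here because the later steps split on whitespace. Note also that the hashtag symbol is removed but the tag text stays, and that apostrophes disappear, turning "it's" into "its".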


I download the list of common low-information words (stopwords) using NLTK. Then, I create a function that removes these words so the text keeps only the meaningful content. Finally, I apply this filter to the dataset and check the results.

In [None]:
import nltk
from nltk.corpus import stopwords

# download the stopwords dictionary
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))

# function to filter those words
def sacar_stopwords(text):
    words = text.split() # split into words
    filtered = [w for w in words if w not in stop_words] # filter out stopwords
    return " ".join(filtered) # join back together

# apply to df
df['text_clean'] = df['text_clean'].apply(sacar_stopwords)

# see how it looks now
df[['tweet_text', 'text_clean']].head()

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Unnamed: 0,tweet_text,text_clean
0,@sschinke @1lb_cake @DiscordianKitty @Grummz S...,sam appreciate trying correct people please dr...
1,RT @k_halvy22 Women shouldn't be football anno...,rt women shouldnt football announcers notsexis...
2,[For non-religious reasons] I tried this with ...,nonreligious reasons tried hoodie highschool g...
3,To the girls that bullied me in middle school ...,girls bullied middle school high school hope f...
4,Was thinking about this today. He's like the h...,thinking today hes like high school bully surr...
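
The output above still contains "shouldnt" in row 1. A likely reason is the order of the steps: the cleaning function strips apostrophes first, so a contraction no longer matches its apostrophe form in a stopword list. The sketch below reproduces that effect with a tiny hand-written stopword set, which is only a stand-in assumption for the NLTK list.

```python
# hypothetical miniature stopword set, standing in for the NLTK list
STOP_WORDS = {"i", "you", "to", "the", "be", "shouldn't"}

def remove_stopwords(text, stop_words=STOP_WORDS):
    """Drop every whitespace-separated token that is in stop_words."""
    return " ".join(w for w in text.split() if w not in stop_words)

with_apostrophe = remove_stopwords("women shouldn't be football announcers")
without_apostrophe = remove_stopwords("women shouldnt be football announcers")

print(with_apostrophe)     # women football announcers
print(without_apostrophe)  # women shouldnt football announcers
```

Whether this matters depends on the goal: keeping negations such as "shouldnt" may even help the classifier, so this is something to be aware of rather than necessarily something to fix.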


I configure the tokenizer to turn words into numbers. I initially set the vocabulary size to 5,000, but that left too many words out of the vocabulary. After some trials, 20,000 worked best. Finally, I convert the text into sequences and pad or truncate them to 100 tokens so they all have the same length.

In [None]:
# set limits
vocab_size = 20000
max_length = 100

# create the tokenizer
tokenizer = Tokenizer(num_words=vocab_size, oov_token='<OOV>')
tokenizer.fit_on_texts(df['text_clean'])

# turn text into number sequences
sequences = tokenizer.texts_to_sequences(df['text_clean'])

# pad with zeros so all tweets have the same length
padded = pad_sequences(sequences, maxlen=max_length, padding='post', truncating='post')

# check how the first tweet looks after converting
# (.iloc[0] is positional; the index labels have gaps after drop_duplicates)
print('Original text:', df['text_clean'].iloc[0])
print('Converted to numbers:', padded[0])

Original text: sam appreciate trying correct people please drop mention
Converted to numbers: [2155 1360  153 1044    7  180 1089  969    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0]
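
As a rough mental model of what the tokenizer and padding do, the following plain-Python sketch builds a frequency-ranked word index, maps unknown or out-of-range words to the OOV id, and pads with zeros at the end. It is a simplified illustration, not the Keras implementation; the helper names build_word_index and encode are assumptions.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Rank words by frequency; id 0 is left for padding, id 1 for OOV."""
    counts = Counter(w for t in texts for w in t.split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def encode(text, word_index, num_words, max_length):
    """Map words to ids, send unknown/rare words to OOV, then pad or truncate at the end."""
    ids = []
    for w in text.split():
        idx = word_index.get(w, 1)
        ids.append(idx if idx < num_words else 1)
    ids = ids[:max_length]                       # truncating='post'
    return ids + [0] * (max_length - len(ids))   # padding='post'

texts = ["good morning friends", "good night"]
word_index = build_word_index(texts)

print(word_index)
print(encode("good evening friends", word_index, num_words=100, max_length=5))  # [2, 1, 4, 0, 0]
print(encode("good evening friends", word_index, num_words=4, max_length=5))    # [2, 1, 1, 0, 0]
```

The second call shows why the vocabulary size matters: with a smaller num_words, a word that was seen during fitting ("friends") is still collapsed into the OOV id.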


I initialize the LabelEncoder to convert the text categories into numbers that the model can understand. Then, I transform the target labels and print the classes to verify the mapping.

In [None]:
from sklearn.preprocessing import LabelEncoder

# init the encoder
encoder = LabelEncoder()

# turn text labels into numbers
labels = encoder.fit_transform(df['cyberbullying_type'])

# check the classes and their numbers
print('Classes:', encoder.classes_)
print('Numbers:', labels[:10])

Classes: ['age' 'ethnicity' 'gender' 'not_cyberbullying' 'other_cyberbullying'
 'religion']
Numbers: [4 2 0 0 0 4 1 3 1 4]
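
The mapping printed above is alphabetical: each class gets the position it has in the sorted list of unique labels. A minimal plain-Python sketch of that idea, using a handful of example labels (the variable names are illustrative, and this is not the scikit-learn implementation):

```python
labels_text = ["religion", "age", "not_cyberbullying", "age", "gender"]

classes = sorted(set(labels_text))                 # alphabetical list of unique classes
to_id = {name: i for i, name in enumerate(classes)}

encoded = [to_id[name] for name in labels_text]    # text -> number
decoded = [classes[i] for i in encoded]            # number -> text (inverse mapping)

print(classes)   # ['age', 'gender', 'not_cyberbullying', 'religion']
print(encoded)   # [3, 0, 2, 0, 1]
```

The inverse mapping is what makes it possible to turn the model's predicted class index back into a readable label later on.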


I use the train_test_split function to divide the data, setting aside 80% for training the model and 20% for testing its performance. I also use a random state to ensure the results are reproducible and print the shapes to verify the final distribution.

In [None]:
# split 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(padded, labels, test_size=0.2, random_state=42)

# check the shapes
print('Train:', X_train.shape)
print('Test:', X_test.shape)

Train: (38201, 100)
Test: (9551, 100)
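
The shapes can be checked by hand. The short calculation below assumes that the test share is rounded up and that training uses the Keras default batch size of 32, since the notebook does not set one explicitly.

```python
import math

n_samples = 47752   # rows left after removing duplicates
test_size = 0.2
batch_size = 32     # assumption: Keras default, not set in the notebook

n_test = math.ceil(test_size * n_samples)          # test share rounded up
n_train = n_samples - n_test                       # remainder goes to training
steps_per_epoch = math.ceil(n_train / batch_size)  # batches per epoch

print(n_train, n_test, steps_per_epoch)  # 38201 9551 1194
```

The last number matches the 1194 steps per epoch reported in the training log.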


First, I clear the Keras session to start fresh. I build the model using an Embedding layer and a Bidirectional LSTM that reads tweets in both directions. For training, I initially tried 5 epochs but noticed overfitting, and 3 epochs felt insufficient, so I settled on 4 as the best balance. In the run logged below, validation loss is already lowest at epoch 2 (0.4123) and rises afterwards, so some overfitting remains even at 4 epochs. Finally, I compile the model with the Adam optimizer and start the training process.

In [None]:
# clean up memory
tf.keras.backend.clear_session()

# build the model (sizes come from the variables defined earlier)
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=64, input_length=max_length))

# bidirectional: reads the tweet forwards and backwards
model.add(tf.keras.layers.Bidirectional(LSTM(64)))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(len(encoder.classes_), activation='softmax')) # one output per class (6 here)

# compile
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# train (the test split is also used as validation data)
history = model.fit(X_train, y_train, epochs=4, validation_data=(X_test, y_test))



Epoch 1/4
1194/1194 ━━━━━━━━━━━━━━━━━━━━ 163s 131ms/step - accuracy: 0.6343 - loss: 0.8497 - val_accuracy: 0.8135 - val_loss: 0.4407
Epoch 2/4
1194/1194 ━━━━━━━━━━━━━━━━━━━━ 143s 120ms/step - accuracy: 0.8496 - loss: 0.3623 - val_accuracy: 0.8306 - val_loss: 0.4123
Epoch 3/4
1194/1194 ━━━━━━━━━━━━━━━━━━━━ 201s 119ms/step - accuracy: 0.8942 - loss: 0.2713 - val_accuracy: 0.8258 - val_loss: 0.4581
Epoch 4/4
1194/1194 ━━━━━━━━━━━━━━━━━━━━ 139s 117ms/step - accuracy: 0.9112 - loss: 0.2231 - val_accuracy: 0.8122 - val_loss: 0.5550

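One way to read the log above is to look for the epoch with the lowest validation loss. The sketch below does this in plain Python with the four values copied from the log; the variable names are illustrative.

```python
# validation metrics copied from the training log, one entry per epoch
val_loss = [0.4407, 0.4123, 0.4581, 0.5550]
val_acc = [0.8135, 0.8306, 0.8258, 0.8122]

# epoch numbers start at 1, list positions at 0
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1

print("best epoch by val_loss:", best_epoch)                  # 2
print("val_accuracy at that epoch:", val_acc[best_epoch - 1])  # 0.8306
```

Keras can automate this selection with the tf.keras.callbacks.EarlyStopping callback (for example monitor='val_loss' with restore_best_weights=True), which would be an alternative to choosing the epoch count by hand. Ideally the monitored data would be a separate validation split rather than the test set.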

I implement a stemming process and a prediction function to test the model with real-world examples. I want to clarify that I distance myself from the content of these test comments as they are used strictly for scientific purposes and to check accuracy. Finally, I run several tests to see how the model classifies various phrases and to check its confidence levels.

In [None]:
from nltk.stem import SnowballStemmer

# initial setup
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))
stemmer = SnowballStemmer('english')

def sacar_stopwords(text):
    return " ".join([word for word in text.split() if word not in stop_words])

def cortar_palabras(text):
    return " ".join([stemmer.stem(word) for word in text.split()])

def predecir_odio(frase_original):
    # processing
    clean = limpiar_texto(frase_original)
    clean = sacar_stopwords(clean)
    # note: the tokenizer was fit on unstemmed text, so a stem that differs
    # from the original word may not be in the vocabulary and can map to <OOV>
    clean = cortar_palabras(clean)

    # prediction (same padding settings as in training)
    seq = tokenizer.texts_to_sequences([clean])
    padded = pad_sequences(seq, maxlen=max_length, padding='post', truncating='post')

    prediccion = model.predict(padded)
    indice = prediccion.argmax()
    etiqueta = encoder.inverse_transform([indice])[0]
    confianza = prediccion.max() * 100

    print(f"Frase: '{frase_original}'")
    print(f"Detectado: {etiqueta.upper()} (Seguridad: {confianza:.1f}%)")
    print("-" * 40)

# tests
print("test")
predecir_odio("I hate you, you are stupid and ugly")
predecir_odio("Muslims are terrorists")
predecir_odio("I love pizza and playing roblox with friends")
predecir_odio("You are a gay freak")
predecir_odio("Go back to your country fucking mexican")
predecir_odio("Podra esta ia de mierda entender comentarios en español o es estupida?")
predecir_odio("we have to kill all the jews in the world")
predecir_odio("Hitler was right, we have to kill this kind of people")
predecir_odio("Sorry for the previous comments, they have a scientific purpose :p")

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


test
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 648ms/step
Frase: 'I hate you, you are stupid and ugly'
Detectado: OTHER_CYBERBULLYING (Seguridad: 72.8%)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 77ms/step
Frase: 'Muslims are terrorists'
Detectado: RELIGION (Seguridad: 99.5%)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 76ms/step
Frase: 'I love pizza and playing roblox with friends'
Detectado: NOT_CYBERBULLYING (Seguridad: 64.6%)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step
Frase: 'You are a gay freak'
Detectado: GENDER (Seguridad: 85.2%)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step
Frase: 'Go back to your country fucking mexican'
Detectado: ETHNICITY (Seguridad: 87.6%)
----------------------------------------
1/

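The last step of the prediction function, turning the softmax output into a label and a confidence percentage, can be sketched without the model. The probabilities below are made up, and the min_confidence threshold with its 'uncertain' fallback is a hypothetical addition that the notebook does not use.

```python
CLASSES = ['age', 'ethnicity', 'gender', 'not_cyberbullying',
           'other_cyberbullying', 'religion']

def decode_prediction(probs, classes=CLASSES, min_confidence=0.5):
    """Return (label, confidence in percent) for one softmax output vector."""
    best = max(range(len(probs)), key=lambda i: probs[i])  # argmax
    confidence = probs[best]
    # hypothetical extra: flag low-confidence predictions instead of forcing a class
    label = classes[best] if confidence >= min_confidence else 'uncertain'
    return label, confidence * 100

probs = [0.02, 0.05, 0.08, 0.10, 0.70, 0.05]  # made-up softmax output
label, confidence = decode_prediction(probs)
print(f"{label} ({confidence:.1f}%)")  # other_cyberbullying (70.0%)
```

A threshold like this could be useful for inputs the model was not trained on, such as text in another language, where the top class may win with a low probability.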
I save the trained model and the preprocessing objects so I can reuse them later. I first exported the model in .h5 format, but it gave me two hours of headaches due to compatibility issues during web deployment. To fix this, I came back and exported it in the .keras format instead, which works perfectly for my auditor. Finally, I use pickle to save the tokenizer and encoder so that new text is processed exactly the same way as the training data.

In [None]:
import pickle

# model
model.save('model_bullying.keras')

# tokenizer
with open('tokenizer.pickle', 'wb') as h:
    pickle.dump(tokenizer, h, protocol=pickle.HIGHEST_PROTOCOL)

# encoder
with open('encoder.pickle', 'wb') as h:
    pickle.dump(encoder, h, protocol=pickle.HIGHEST_PROTOCOL)

print("generated files")

generated files
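
For completeness, here is a minimal sketch of the reverse step, reading a pickled object back. It uses a plain dictionary as a hypothetical stand-in for the tokenizer and a temporary directory, so it runs without the trained artifacts.

```python
import os
import pickle
import tempfile

# hypothetical stand-in for the fitted tokenizer
word_index = {"<OOV>": 1, "good": 2, "morning": 3}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tokenizer.pickle")

    # save, same pattern as in the cell above
    with open(path, "wb") as h:
        pickle.dump(word_index, h, protocol=pickle.HIGHEST_PROTOCOL)

    # load it back, as the deployed app would do
    with open(path, "rb") as h:
        restored = pickle.load(h)

print(restored == word_index)  # True
```

The model itself would be reloaded with tf.keras.models.load_model('model_bullying.keras'). Pickle files are tied to the library versions that produced them and should only be loaded from trusted sources.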
