seroetr/Spam_MultiLabel_Text_Classification

Spam Text Classification

  • Spam Text Classification using Machine Learning Models

Multi-Label Text Classification

Hate Speech Classification

  • A convolutional neural network (CNN) built with TensorFlow is used to perform hate speech classification.
  • The dataset, hate_speech_data.csv, is available at https://raw.githubusercontent.com/laxmimerit/hate_speech_dataset/master/data.csv.
  • The data is preprocessed first. Then, because the dataset is not large, a CNN with only one convolutional layer is built.
  • The Tokenizer class (from tensorflow.keras.preprocessing.text import Tokenizer) assigns an integer to every word and stores the mapping as a dictionary, word_index, so that each word has its own index.
  • Each sentence is then converted into a sequence in which every word is replaced by its index in word_index, using tokenizer.texts_to_sequences(sentences).
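
To make the tokenization step concrete, here is a minimal pure-Python sketch of the idea behind word_index and texts_to_sequences. It is a simplified stand-in, not the Keras implementation: the helper names build_word_index and texts_to_sequences, the regular expression used to split words, and the two sample sentences are all assumptions made for illustration. It follows the convention the Keras Tokenizer uses by default, where indices start at 1 and more frequent words get lower indices.

```python
import re
from collections import Counter

# Hypothetical word pattern; the real Tokenizer uses its own filter list.
WORD_PATTERN = re.compile(r"[a-z0-9']+")


def build_word_index(sentences):
    """Map each word to an integer: most frequent word first, starting at 1."""
    counts = Counter()
    for sentence in sentences:
        counts.update(WORD_PATTERN.findall(sentence.lower()))
    # most_common() sorts by descending count; ties keep first-seen order.
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_sequences(sentences, word_index):
    """Replace each known word with its index; unknown words are dropped."""
    return [
        [word_index[w] for w in WORD_PATTERN.findall(s.lower()) if w in word_index]
        for s in sentences
    ]


# Made-up sample sentences, not taken from hate_speech_data.csv.
sentences = ["I love this movie", "I hate this movie so much"]
word_index = build_word_index(sentences)
sequences = texts_to_sequences(sentences, word_index)

print(word_index)
print(sequences)
```

In the repository itself, the equivalent calls would be made on a Tokenizer instance after fitting it on the training sentences. The resulting sequences have different lengths, so they typically need to be padded to a common length before they can be fed to a CNN.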