ViSoMeCens: Vietnamese Social Media Censorship Application

This repository contains the code for our paper "Vietnamese Hate and Offensive Detection using PhoBERT-CNN and Social Media Streaming Data".

The paper is available at: https://arxiv.org/abs/2206.00524

To run the code, use the *.ipynb notebook files in the repository's folders.

References

  1. PhoBERT: Pre-trained language models for Vietnamese - https://github.com/VinAIResearch/PhoBERT
  2. Convolutional Neural Networks for Sentence Classification - https://github.com/yoonkim/CNN_sentence
  3. Apache Spark: a unified engine for big data processing - https://spark.apache.org/docs/3.1.1

Further Usage

If you use any code or data from this repository, please cite our paper:

@article{quoc2023vietnamese,
  title={Vietnamese hate and offensive detection using PhoBERT-CNN and social media streaming data},
  author={Quoc Tran, Khanh and Trong Nguyen, An and Hoang, Phu Gia and Luu, Canh Duc and Do, Trong-Hop and Van Nguyen, Kiet},
  journal={Neural Computing and Applications},
  volume={35},
  number={1},
  pages={573--594},
  year={2023},
  publisher={Springer}
}

For any questions, please contact our corresponding author: Mr. Khanh Quoc Tran at 18520908@gm.uit.edu.vn.