Offensive language, hate speech, and cyberbullying have become increasingly pervasive on social media. Individuals frequently take advantage of the perceived anonymity of social media platforms to engage in brash and disrespectful behaviour that many of them would not consider in real life. The goal of this project is to use a hierarchical model that not only identifies tweets/messages containing offensive language but also categorizes the type and the target of offensive messages on social media.
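As a rough sketch of how such a hierarchical decision flow might look, assuming three trained sub-task classifiers with sklearn-style `predict` methods; the function name, classifier objects, and label strings below are illustrative placeholders, not necessarily the exact labels used in the notebooks:

```python
def classify_tweet(tweet, clf_a, clf_b, clf_c):
    """Hierarchical inference: offensive? -> targeted? -> target type.

    clf_a/clf_b/clf_c are assumed per-sub-task classifiers; the label
    strings ("NOT", "OFF", "UNT", "TIN") are placeholders for illustration.
    """
    # Sub-task A: is the message offensive at all?
    if clf_a.predict([tweet])[0] == "NOT":
        return {"offensive": "NOT", "type": None, "target": None}

    # Sub-task B: is the offense targeted or untargeted?
    if clf_b.predict([tweet])[0] == "UNT":
        return {"offensive": "OFF", "type": "UNT", "target": None}

    # Sub-task C: who is targeted (individual, group, other)?
    return {"offensive": "OFF", "type": "TIN", "target": clf_c.predict([tweet])[0]}
```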
- Create virtual environment and install dependencies
conda create -n [ENV] python=3.7
conda activate [ENV]
pip install -r requirements.txt
wget http://nlp.stanford.edu/data/glove.twitter.27B.zip
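The commands above only download the GloVe Twitter embeddings. A minimal sketch of reading one of the vector files straight from the archive into a lookup table; the choice of the 100d file and the variable names are assumptions, so pick whichever dimension the notebooks expect:

```python
import io
import zipfile

import numpy as np

GLOVE_ZIP = "glove.twitter.27B.zip"
GLOVE_TXT = "glove.twitter.27B.100d.txt"  # the zip also ships 25d/50d/200d variants

embeddings = {}
with zipfile.ZipFile(GLOVE_ZIP) as zf:
    # Read the text file directly from the zip without extracting it.
    with io.TextIOWrapper(zf.open(GLOVE_TXT), encoding="utf-8") as f:
        for line in f:
            token, *values = line.rstrip().split(" ")
            embeddings[token] = np.asarray(values, dtype="float32")

print(len(embeddings), "vectors of dimension", len(embeddings["hello"]))
```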
Visit the following notebooks:
- EDA: Exploratory data analysis and visualizations
- Preprocessing: Data cleaning, feature engineering, and more
- NBSVM: NBSVM classifier for all 3 sub-tasks (a rough sketch of the idea follows this list)
- LSTM: LSTM classifier for all 3 sub-tasks
- CNN Text (Simplified): Simplified version of the CNN for text classification proposed by Kim, with a single input channel
- CNN Text (OG): Original multichannel architecture of Kim's CNN for text classification
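To illustrate the NBSVM idea (Naive Bayes log-count ratios used to rescale features for a linear classifier, following Wang & Manning, 2012), here is a self-contained sketch on toy data; the vectorizer settings, regularization strength, and the use of logistic regression as the linear model are assumptions and will differ from the notebook:

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy examples standing in for the tweets (1 = offensive, 0 = not offensive).
texts = ["you are awful", "have a nice day", "shut up idiot", "great work friend"]
labels = np.array([1, 0, 1, 0])

vec = TfidfVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(texts)

# Naive Bayes log-count ratio r = log((p / ||p||_1) / (q / ||q||_1)).
p = X[labels == 1].sum(axis=0) + 1.0
q = X[labels == 0].sum(axis=0) + 1.0
r = sparse.csr_matrix(np.log((p / p.sum()) / (q / q.sum())))

# Scale the features by r and fit a linear model on top (logistic regression
# here; a linear SVM is the other common choice).
clf = LogisticRegression(C=4.0, max_iter=1000).fit(X.multiply(r), labels)

print(clf.predict(vec.transform(["what an idiot"]).multiply(r)))
```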
See the full report for implementation details, results, and conclusions here.
Please reach out to arsaikia@iu.edu for feedback and suggestions.