HateSpeechDetection

Introduction

This project is part of the text analytics course at Heidelberg University. Its goal is to use text analytics methods to classify whether social media posts contain hate speech.

This repository contains all files of the project.

The documentation is located in the docs folder, which contains, among other documents, the project proposal and the project report.
The project's source code is located in the src folder, the tests in the tests folder, and the code coverage report in the htmlcov folder.
The assignments of the lecture are located in the assignments folder and are not directly connected to this project.

Project team

Christopher Klammt
Felix Hausberger
Nils Krehl

Setup Instructions

Run the project

Install Python 3.7
If the operating system is Windows, install the Microsoft Build Tools for C++ (needed for the fastText installation)
Install pipenv
```
pip install pipenv
```
Install all the dependencies defined in the Pipfile
```
pipenv install --dev
```
Enter the virtual environment of pipenv
```
pipenv shell
```
Download and add the original datasets ("Automated Hate Speech Detection and the Problem of Offensive Language" and "Hate speech dataset from a white supremacist forum"). The resulting directory structure should look like the following:
Run the program (on our computers this takes about 10 min)
```
pipenv run main
```
Run the tests
```
pipenv run test && pipenv run report
```
Leave the virtual environment of pipenv
```
exit
```
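
Before working through the steps above, it can help to confirm that the two prerequisites are in place. The following is a minimal sketch, not part of this repository: the helper name `preflight` is hypothetical, and it assumes that "Python 3.7" means exactly the 3.7 minor version and that pipenv should be reachable on the PATH.

```python
import shutil
import sys

# Assumption: the setup steps require exactly this minor version.
REQUIRED_PYTHON = (3, 7)


def preflight(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    `which` is injectable so the check can be exercised without pipenv.
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # Compare only major and minor version, ignoring the patch level.
    if tuple(version_info[:2]) != REQUIRED_PYTHON:
        problems.append(
            "Python {}.{} found, but the setup steps ask for {}.{}".format(
                version_info[0], version_info[1], *REQUIRED_PYTHON
            )
        )
    # shutil.which returns None when the executable is not on the PATH.
    if which("pipenv") is None:
        problems.append("pipenv not found on PATH (try: pip install pipenv)")
    return problems
```

Calling `preflight()` without arguments checks the running interpreter and the current PATH, and printing the returned list shows what is still missing.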

Normally all needed dependencies are downloaded automatically. If this is not the case, try the following:

sudo pipenv run python -m spacy download en (Assignment 2)
sudo pipenv run python -m nltk.downloader vader_lexicon
sudo pipenv run python -m nltk.downloader averaged_perceptron_tagger
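
The two NLTK resources named above can also be fetched from Python when the automatic download fails. The following is a minimal sketch under stated assumptions: the helper `ensure_resources` is hypothetical and not part of this repository, and the lookup paths follow NLTK's usual data layout rather than anything stated in this README. The lookup and download functions are passed in as arguments, so the helper can be tried without NLTK installed.

```python
# Assumption: lookup paths follow NLTK's usual data directory layout.
NLTK_RESOURCES = {
    "sentiment/vader_lexicon.zip": "vader_lexicon",
    "taggers/averaged_perceptron_tagger": "averaged_perceptron_tagger",
}


def ensure_resources(resources, find, download):
    """Download every resource whose lookup raises LookupError.

    `resources` maps a lookup path to a download name.
    Returns the names that had to be downloaded.
    """
    fetched = []
    for path, name in resources.items():
        try:
            find(path)  # succeeds if the resource is already present
        except LookupError:
            download(name)
            fetched.append(name)
    return fetched


# Usage inside the pipenv environment (requires nltk):
#   import nltk
#   ensure_resources(NLTK_RESOURCES, nltk.data.find, nltk.download)
```

Only missing resources are downloaded, so running the helper repeatedly does not fetch anything twice.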

Development setup for the project

Set up the git hook scripts
```
pre-commit install
```

Run the assignments

Running the assignments requires further dependencies:

pdftotext (needs additional OS-level dependencies) (Assignment 1)
