This project contains the data and scripts needed to analyze whether a text is positive or negative.
These instructions will get you a copy of the project on your local machine for development.
The project is written in Python 3 and uses pipenv to manage its dependencies.
To install all the necessary dependencies, run the following commands in the project directory.
# create virtual environment
python -m venv .venv
# activate virtual environment
.venv\Scripts\activate.bat # on windows
source .venv/bin/activate # on unix or macos
# installing necessary dependencies
pipenv install --dev
# optional: installing trained spacy pipeline
pipenv run python -m spacy download en_core_web_trf
You can have a first look at the data if you execute the following command:
python first_look.py
If you want to pre-process the raw data, you can run one of the pre-processing scripts for the tweets or the film reviews.
python preprocess_tweets.py
python preprocess_reviews.py
The preprocessed data can be analysed with the ensemble methods, and the raw data with the Hugging Face transformer models. To save the results and graphics afterwards, pass the option -s.
python analysis_ensemble.py <-s>
python analysis_transformer.py <-s>
The cache directory for the downloaded transformer models can be set with the following environment variable:
TRANSFORMERS_CACHE=../.cache/huggingface/
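If you would rather set the cache location from Python than from the shell, a minimal sketch could look like the following. The helper name `configure_cache` is hypothetical and not part of this project; the sketch also assumes the variable is set before `transformers` is imported, since the library typically reads it at import time.

```python
import os
from pathlib import Path


def configure_cache(cache_dir="../.cache/huggingface/"):
    """Hypothetical helper: point TRANSFORMERS_CACHE at cache_dir.

    Creates the directory if it does not exist yet and returns the
    value that was stored in the environment.
    """
    path = Path(cache_dir)
    # Make sure the target directory exists before any model download.
    path.mkdir(parents=True, exist_ok=True)
    # Environment variables are strings, so store the path as text.
    os.environ["TRANSFORMERS_CACHE"] = str(path)
    return os.environ["TRANSFORMERS_CACHE"]
```

Call `configure_cache()` at the top of a script, before any `import transformers` statement, so that the setting is already in place when the library loads.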