The PuReGoMe scripts repository contains the Python and bash scripts of the PuReGoMe project of the Netherlands eScience Center and the University of Utrecht. The scripts that query the data based on content are not included here; they are available in a separate repository, PuReGoMe queries.
The most important script is text-unique.py, which removes duplicate tweets from the data.
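To illustrate the general idea of removing duplicate tweets, here is a minimal sketch in Python. It is not the code of text-unique.py: the function name `unique_by_text`, the CSV layout, and the `text` and `id` column names are assumptions made for this example, and the actual script may define duplicates differently.

```python
import csv
import io


def unique_by_text(rows, text_field="text"):
    """Yield only the first row seen for each distinct tweet text.

    Hypothetical helper: the column name "text" is an assumption.
    """
    seen = set()
    for row in rows:
        key = row[text_field].strip()  # ignore leading/trailing whitespace
        if key not in seen:
            seen.add(key)
            yield row


# Hypothetical sample data standing in for a tweet file
sample = io.StringIO(
    "id,text\n"
    "1,Hello world\n"
    "2,Hello world\n"
    "3,Another tweet\n"
)

unique_rows = list(unique_by_text(csv.DictReader(sample)))
print([row["id"] for row in unique_rows])  # prints ['1', '3']
```

Keeping the first occurrence preserves the original order of the data, and the set of seen texts keeps each lookup at constant time on average.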
pip install -r requirements.txt
pip install .
pytest tests
Erik Tjong Kim Sang e.tjongkimsang@esciencecenter.nl