The development setup consists of two parts: the first is needed to run the project, and the second is needed to commit changes to it.
This project uses conda and poetry for dependency management and building.
The setup is done as follows:
- Create a new conda environment:
      conda env create -f env.yml
- Activate the environment:
      conda activate data-2022
- Tell poetry to not create a virtualenv:
      poetry config --local virtualenvs.create false
- Install all dependencies:
      poetry install
- Extract the data:
      tar -xvf data.tar.xz
- Download the pre-trained fasttext embeddings from https://drive.google.com/drive/folders/1a9llDhoM6zD-sOKiM0AdSxDYq2-15PJD and put them in data/raw
- Run the experiments:
      dvc repro
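
The steps above call four command-line tools: conda, poetry, tar, and dvc. As an optional convenience, the following is a minimal preflight sketch that reports which of them are not on the PATH before you start. It is not part of the project, and the function name `missing_tools` is hypothetical. It also assumes dvc is meant to be reachable from the shell once the environment is active, so it may be reported as missing if you run the check before `poetry install`.

```shell
#!/usr/bin/env sh
# Preflight sketch (not part of the project): report which tools used in
# the setup steps cannot be found on PATH.

# Hypothetical helper: print each named command that is not found,
# one per line. Prints nothing if all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools invoked by the setup steps.
missing=$(missing_tools conda poetry tar dvc)

if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing"
else
    printf 'All required tools found.\n'
fi
```

The script only prints a report and changes nothing, so it is safe to run at any point during the setup.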