Install the library with pip from the root of the repository. This installs the library together with its Python dependencies.
pip install .
Perform the following additional steps to complete the installation.
# On macOS
brew install poppler
# On Linux
sudo apt-get install poppler-utils
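Before going further, it can help to confirm that the Poppler command-line tools ended up on your PATH. The sketch below assumes the library relies on `pdftotext` (shipped with Poppler) to read PDF input; that dependency and the helper name `missing_tools` are assumptions for illustration, not part of the library.

```python
import shutil


def missing_tools(tools=("pdftotext",)):
    """Return the names in `tools` that cannot be found on the PATH."""
    # shutil.which returns None when an executable cannot be located.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All expected Poppler tools were found.")
```

If the script reports a missing tool, repeat the platform-specific step above and open a new terminal so the PATH is refreshed.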
The library also needs the following NLTK data:
- corpus: europarl_raw
- model: punkt
# In a bash terminal (no spaces around "=", and quote paths that contain spaces)
NLTK_DATA='/path to data dir/nltk_data'
mkdir -p "$NLTK_DATA"
python -m nltk.downloader punkt -d "$NLTK_DATA"
python -m nltk.downloader europarl_raw -d "$NLTK_DATA"
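To check that both downloads landed where NLTK will look for them, a small script along these lines may be useful. It assumes the usual NLTK layout, in which the downloader places `punkt` under `tokenizers/` and `europarl_raw` under `corpora/`; the function name `missing_nltk_resources` is hypothetical.

```python
import os
import sys

# Assumed layout: the NLTK downloader unpacks punkt under "tokenizers"
# and europarl_raw under "corpora" inside the data directory.
EXPECTED = ("tokenizers/punkt", "corpora/europarl_raw")


def missing_nltk_resources(data_dir, expected=EXPECTED):
    """Return the expected resource sub-directories absent from data_dir."""
    return [
        rel for rel in expected
        if not os.path.isdir(os.path.join(data_dir, *rel.split("/")))
    ]


if __name__ == "__main__":
    # Take the data directory from the command line, else from NLTK_DATA.
    if len(sys.argv) > 1:
        data_dir = sys.argv[1]
    else:
        data_dir = os.environ.get("NLTK_DATA", ".")
    missing = missing_nltk_resources(data_dir)
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Saved under a name of your choice (for example `check_nltk_data.py`, a hypothetical file name), it can be run as `python check_nltk_data.py "$NLTK_DATA"`.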
# Set NLTK_DATA to the download location
export NLTK_DATA='/path to the NLTK data folder'
bash '/library path/examples/bash/run_data_preparation.sh'
To build a Docker image of asrt:
docker build -t asrt .
To run run_data_preparation.py using the Docker image, where your test file is docker-example/research.txt:
docker run -d -v $PWD/docker-example:/usr/local/asrt/data asrt -i data/research.txt -o /usr/local/asrt/data
# Or with the same PDF as in the example above:
docker run -d -v $PWD/docker-example:/usr/local/asrt/data asrt -i data/Research.pdf -o /usr/local/asrt/data
The output will then be in docker-example/sentences_*.txt
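Because the container is started detached (`-d`), the output files may take a moment to appear. The sketch below lists whatever matches `sentences_*.txt` in the mounted folder and counts the lines in each file; the helper name `find_sentence_files` and the UTF-8 encoding are assumptions, not something the library documents here.

```python
import glob
import os


def find_sentence_files(output_dir="docker-example"):
    """Return the sorted paths matching sentences_*.txt in output_dir."""
    pattern = os.path.join(output_dir, "sentences_*.txt")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    files = find_sentence_files()
    if not files:
        print("No sentences_*.txt files found yet.")
    for path in files:
        # UTF-8 is assumed; undecodable bytes are replaced rather than raising.
        with open(path, encoding="utf-8", errors="replace") as handle:
            count = sum(1 for _ in handle)
        print(f"{path}: {count} lines")
```

Run it from the directory that contains `docker-example`, or pass another folder to `find_sentence_files`.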