Exploring the Pāli Canon with Machine Learning
Use of Anaconda is recommended.
- Create an Anaconda environment:
conda create --name canon python=3
- Activate the environment:
conda activate canon
- Install Anaconda packages:
conda install nltk tensorflow scikit-learn matplotlib
- Install non-Anaconda packages:
pip install bs4
Run python process.py to download the ATI archive (if needed) and process text into
Run python analyze.py to train and evaluate a word vector model on the processed sentences. This produces some examples for evaluation and saves a
tsne.png file containing the t-SNE plot of the results. Subsequent runs reuse the saved model data unless the
data/model directory is deleted.
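
The reuse behavior described above (train once, then load saved data until data/model is deleted) can be illustrated with a minimal sketch. This is not the code in analyze.py: the function names load_or_train and toy_train, the vectors.json file name, and the JSON storage format are all hypothetical, chosen only to show the pattern with the standard library.

```python
import json
import tempfile
from pathlib import Path


def load_or_train(model_dir, train_fn, filename="vectors.json"):
    """Return (vectors, trained), training only if no saved file exists.

    Hypothetical helper: the file name and JSON format are assumptions,
    not what analyze.py actually writes.
    """
    model_dir = Path(model_dir)
    cache_file = model_dir / filename
    if cache_file.exists():
        # A previous run saved a model: reuse it and skip training.
        with cache_file.open(encoding="utf-8") as f:
            return json.load(f), False
    # No saved model (first run, or the directory was deleted): train and save.
    vectors = train_fn()
    model_dir.mkdir(parents=True, exist_ok=True)
    with cache_file.open("w", encoding="utf-8") as f:
        json.dump(vectors, f)
    return vectors, True


def toy_train():
    # Stand-in for real training; returns a tiny word -> vector mapping.
    return {"dhamma": [0.1, 0.2], "sangha": [0.3, 0.4]}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "data" / "model"
        _, trained = load_or_train(model_dir, toy_train)
        print("first run trained:", trained)
        _, trained = load_or_train(model_dir, toy_train)
        print("second run trained:", trained)
```

In this sketch, deleting the model directory between calls makes the next call train again, which mirrors the effect of removing data/model before rerunning analyze.py.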