This repository belongs to our workshop publication at AI4LAC@JCDL2024. You can find a short version of our article on the workshop webpage and a technical report (long version) on arXiv. The links will be added as soon as they are available.
We cannot share the actual Narrative Service database, which is required for the recommendation process, for legal and storage-space reasons. Unfortunately, we also cannot publish the pre-processed document data (the extracted graph information). Please note that this repository depends on our Narrative Service, whose code is publicly available on GitHub. Nevertheless, we hope that our implementation sheds light on details that are not described in the paper.
- First Stages: FSConcept, FSNode and FSCore
- CoreOverlap and GraphRec (GraphRec is CoreOverlap + BM25)
- Splade
- BM25 Index Creation and BM25 First Stage and BM25 ReScoring
- Edge Scoring and Node Scoring
- Evaluation and Analysis Scripts
- Explanation Generation and Recommender App
This project must be cloned together with its submodules.
Therefore, please clone the project via:
git clone --recurse-submodules https://github.com/HermannKroll/NarrativeRecommender.git
Update the repository:
git pull --recurse-submodules
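A quick way to confirm that the submodules were actually checked out is to test whether their directories are non-empty. The following is a minimal sketch, assuming the three submodules live under `lib/` with the directory names that appear in the Python path of this README; the function name `narrec_check_submodules` is a hypothetical helper and not part of the project.

```shell
# Hypothetical helper: reports submodule directories that are missing or empty.
# Assumption: submodules are located in <root>/lib/<name>.
narrec_check_submodules() {
    narrec_root="${1%/}"   # strip one trailing slash, if any
    narrec_missing=0
    for narrec_sub in NarrativeIntelligence NarrativeAnnotation KGExtractionToolbox; do
        narrec_dir="$narrec_root/lib/$narrec_sub"
        # an uninitialised submodule shows up as an absent or empty directory
        if [ ! -d "$narrec_dir" ] || [ -z "$(ls -A "$narrec_dir" 2>/dev/null)" ]; then
            echo "submodule not checked out: $narrec_dir" >&2
            narrec_missing=1
        fi
    done
    return "$narrec_missing"
}
```

If the check fails, running `git submodule update --init --recursive` inside the repository should fetch the missing submodules.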
Create a virtual Python 3.8 environment, e.g., via conda:
conda create -n narrec python=3.8
Activate the environment:
conda activate narrec
Install all Python requirements:
pip install -r requirements.txt
The SPLADE requirements have to be installed separately via:
pip install -r requirements_splade.txt
Before you run any of our scripts, always make sure that your conda environment is activated and the Python path is set:
conda activate narrec
export PYTHONPATH="/home/USER/NarrativeRecommender/src/:/home/USER/NarrativeRecommender/lib/NarrativeIntelligence/src/:/home/USER/NarrativeRecommender/lib/NarrativeAnnotation/src/:/home/USER/NarrativeRecommender/lib/KGExtractionToolbox/src/"
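All four entries of the Python path share the repository root, so the value can also be assembled from a single location. This is a minimal sketch, assuming the repository was cloned to `$HOME/NarrativeRecommender`; the function `narrec_pythonpath` and the variable `NARREC_ROOT` are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: prints the PYTHONPATH value for a given repository root.
narrec_pythonpath() {
    narrec_pp_root="${1%/}"   # strip one trailing slash, if any
    printf '%s/src/:%s/lib/NarrativeIntelligence/src/:%s/lib/NarrativeAnnotation/src/:%s/lib/KGExtractionToolbox/src/\n' \
        "$narrec_pp_root" "$narrec_pp_root" "$narrec_pp_root" "$narrec_pp_root"
}

# Assumption: the repository lives in $HOME/NarrativeRecommender unless NARREC_ROOT is set.
NARREC_ROOT="${NARREC_ROOT:-$HOME/NarrativeRecommender}"
PYTHONPATH="$(narrec_pythonpath "$NARREC_ROOT")"
export PYTHONPATH
```

Setting `NARREC_ROOT` before sourcing the snippet lets the same lines work for clones in other locations.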
First, download the required data:
cd ~/NarrativeRecommender/lib/NarrativeAnnotation/
bash download_data.sh
Next, build the required entity translation indexes:
python ~/NarrativeRecommender/lib/NarrativeAnnotation/src/narrant/build_all_indexes.py
The project requires a connection to our database. Therefore, please forward your local PostgreSQL port to the server port via:
ssh -N -f -L localhost:5432:localhost:5432 USER@DB_SERVER_IP
If you use PyCharm for development purposes, make sure that you mark src and the src directories of the three modules as "Sources Root". Then PyCharm is able to complete and run code.
Make sure that your Python path is set and the correct environment is activated.
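Both preconditions can be checked in the shell before a server is started. The following is a minimal sketch: it relies on `CONDA_DEFAULT_ENV`, which conda sets to the name of the active environment, and assumes the environment is called `narrec` as created above. The function name `narrec_env_ok` is a hypothetical helper.

```shell
# Hypothetical pre-flight check: returns 0 only if both conditions hold.
narrec_env_ok() {
    # conda exports CONDA_DEFAULT_ENV with the name of the active environment
    if [ "${CONDA_DEFAULT_ENV:-}" != "narrec" ]; then
        echo "conda environment 'narrec' is not active" >&2
        return 1
    fi
    # the project sources must appear somewhere on the Python path
    case ":${PYTHONPATH:-}:" in
        *NarrativeRecommender/src*) ;;
        *)
            echo "PYTHONPATH does not contain NarrativeRecommender/src" >&2
            return 1
            ;;
    esac
    return 0
}
```

One possible use is to chain it in front of a command, for example `narrec_env_ok && flask --app recommender_app run`.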
Run the edge debug server via:
cd src/narrec/scoring/flask/
flask --app edge_debug run
Run the recommendation server via:
cd src/narrec/
flask --app recommender_app run
Bind the server port to your local port:
ssh -N -f -L localhost:5000:localhost:5000 USER@SERVER_IP
Then open the following URL in a browser:
http://localhost:5000/37895839
Extract the document IDs for each benchmark via the scripts extract Genomic IDs, extract RELISH IDs, and extract PM2020 IDs. These scripts extract the document IDs for Genomics 2005 and RELISH.