Benchmarking experiments of different news recommender systems using GeNeG and its corresponding news corpus.
- Content-based recommenders
  - TF-IDF
  - Word2vec
  - Transformer
- Collaborative filtering recommenders
  - Alternating Least Squares (ALS)
- Knowledge-aware recommenders
  - RippleNet
  - DKN
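As a rough illustration of the simplest content-based approach in the list, the sketch below ranks articles by the cosine similarity of their TF-IDF vectors. It is not the code used in this repository: the function names, the toy headlines, and the smoothed IDF variant are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append({
            # Smoothed IDF (an assumption of this sketch, not taken from the repository).
            term: (count / length) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(read_idx, vectors, k=1):
    """Indices of the k articles most similar to the article at read_idx."""
    candidates = [i for i in range(len(vectors)) if i != read_idx]
    candidates.sort(key=lambda i: cosine(vectors[read_idx], vectors[i]), reverse=True)
    return candidates[:k]


# Hypothetical toy corpus; real input would be the tokenized article texts.
docs = [
    "bundestag votes on climate law".split(),
    "climate law passes in bundestag".split(),
    "football club wins cup final".split(),
]
vectors = tfidf_vectors(docs)
print(recommend(0, vectors, k=1))
```

A reader of the first article is recommended the second one, because the two share several terms, while the third shares none.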
Configurations for directories, file paths, and some model parameters can be set in config.py.
Train a model
python -m src.train
Evaluate a model
python -m src.evaluate
Prepare data for RippleNet
python -m src.preprocess_ripplenet
Train and evaluate RippleNet
python -m src.run_ripplenet
Prepare data for DKN
python -m src.preprocess_dkn
Preprocess news data and train Word2vec model
python -m src.dkn_news_preprocess
Preprocess entity data and train TransX model
python -m src.prepare_data_for_transx
python -m src.transx.train_transe  # note: you can also choose other KGE methods
python -m src.dkn_kg_preprocess
Train and evaluate DKN
python -m src.run_dkn
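The DKN steps above have to be run in sequence, so they can be chained from a small driver script. The following is a minimal sketch, assuming that each module runs without extra command-line arguments and that the listed order is the required one. The helper names `build_commands` and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Module names copied from the commands above, in the order they are listed.
DKN_STEPS = [
    "src.preprocess_dkn",
    "src.dkn_news_preprocess",
    "src.prepare_data_for_transx",
    "src.transx.train_transe",  # another KGE training module could be substituted here
    "src.dkn_kg_preprocess",
    "src.run_dkn",
]


def build_commands(modules, python=sys.executable):
    """Turn module names into `python -m <module>` argument lists."""
    return [[python, "-m", module] for module in modules]


def run_pipeline(modules, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(modules):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True)


# Dry run by default: only prints the commands that would be executed.
run_pipeline(DKN_STEPS)
```

Calling `run_pipeline(DKN_STEPS, dry_run=False)` from the repository root would execute the steps one after another and stop at the first failure.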
The data necessary to run the models can be found in the data folder.
The articles' content is required to train the content-based recommender systems and to compute recommendations. A sample of the news corpus is available in the data/articles.csv file. Due to copyright policies, this sample does not contain the abstract and body of the articles.
A full version of the news corpus is available upon request.
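Before training a content-based model, it can help to check which text fields a given copy of the corpus actually provides. The sketch below lists the columns of a CSV file that hold no values at all. The column names in the toy data (`id`, `title`, `abstract`, `body`) are hypothetical, since the real schema of data/articles.csv is not described here, and the sample may omit those columns entirely rather than leave them empty.

```python
import csv
import io


def empty_columns(handle):
    """Return the names of columns that contain no non-blank value in any row."""
    reader = csv.DictReader(handle)
    fields = reader.fieldnames or []
    has_value = {name: False for name in fields}
    for row in reader:
        for name in fields:
            if (row.get(name) or "").strip():
                has_value[name] = True
    return [name for name in fields if not has_value[name]]


# Hypothetical stand-in for the sample file, with assumed column names.
sample = io.StringIO(
    "id,title,abstract,body\n"
    "1,Some headline,,\n"
    "2,Another headline,,\n"
)
missing = empty_columns(sample)
print(missing)
```

For the real file, the same function could be called on `open("data/articles.csv", newline="", encoding="utf-8")`, assuming a comma-separated, UTF-8 encoded file.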
This code is implemented in Python 3. The requirements can be installed from requirements.txt.
pip3 install -r requirements.txt
The code is licensed under the MIT License. The data and knowledge graph files are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Parts of the code were originally forked and adapted from RippleNet-TF2, DKN, and OpenKE. We owe many thanks to the authors of these projects for making their code available.