This codebase generates interesting graphs for ML experiments. We've currently focussed on the challenges of review prediction, given three types of nodes:
- Product
- Person
- Review
A review has a score, which is the goal of each experiment to determine.
There are a range of datasets you can generate:
- experiment_0
- experiment_1
- experiment_2
- experiment_3
- experiment_4
- experiment_5
- article_0 - companion to the article "An introduction to machine learning on graph databases"
- article_1 - companion to the article "Review prediction using graph data in Neo4j and an embedding in Tensorflow"
To generate a dataset (for example, article_0) run the following in this repository's root directory:
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
./generate.py --dataset article_0