Elastic Graph Recommender
Building recommenders with Elastic Graph! This app makes movie recommendations using Elastic graph based on the Movielens data set. Movielens is a well known open data set with user movie ratings.
We use this data alongside The Movie Database(TMDB). TMDB has all the movie details such as title, image URL, etc.
ETL and data prep
In the etl/ folder there are several Python scripts for importing movielens & TMDB data into Elasticsearch into two collections.
- One index,
movielensstores user view data. Each record is a user and the movielens identifiers of movies they liked. The primary key is a movielens user id. These documents hold a single field
liked_movies-- the movielens ids of the movies this user liked.
- Another index
ml_tmdbuses the mapping from movielens ids -> tmdb ids to store details about each movies (title, poster image URL, etc). The primary key is the movielens movie id.
Import Movielens ratings
prepareData.shis a shell script for downloading the latest movielens data (ml-20m) and unpacking it to the ml-20m folder.
ratingsToEs.pyis a Python 2.7 script for importing movielens data into Elasticsearch
Import TMDB movie details
It's recommended you get the prepared source data file
ml_tmdb.json from someone. But you can recreate it with the scripts below
tmdb.pycrawls the movielens TMDB movies into tmdb.json
rehashTmdbToMl.pycreates ml_tmdb.json, which is tmdb.json with the movielens as the primary identifier
indexMlTmdb.pyindexes ml_tmdb.json into Elasticsearch
app/ folder holds an angular app for querying Elasticsearch via the graph API for recommendations.
app/depends.sh shell script for bootstrapping bower and npm dependencies
Run the app
Start a dumb web server in the app/ dir,
cd app/ ./srv.sh
Tests are run via Karma, you can run
app/test.sh to run tests. When debugging, I use the following command:
node_modules/karma/bin/karma start --no-single-run --log-level debug --auto-watch --browsers Chrome
which runs Karma in Chrome, autowatching the source files.
By rubbing two sticks together to start a fire
- However you like to deploy stuff, there's a script bootstrap.sh that lists the steps taken to provision an Ubuntu box with Elastic Graph. NOTE this script is meant for development purposes, it does several non-secure things like opens up Elasticsearch to the world and has very liberal CORS permissions.
By using Docker
Start the docker images via:
docker login harbor.dev.o19s.com # ask Eric for credentials docker run -d -p 9200:9200 -p 9300:9300 --name elasticsearch harbor.dev.o19s.com/elastic-graph-recommender/elasticsearch:latest docker run -d -p 8000:8000 --name app -e ELASTICSEARCH_URL=http://localhost:9200 harbor.dev.o19s.com/elastic-graph-recommender/app:latest
If you are deploying in the cloud, remember that the
ELASTICSEARCH_URL is pointing to the public URL for the Elasticsearch node, so update accordingly!
Load the demo data via:
docker exec -it elasticsearch python /etl/rehashTmdbToMl.py docker exec -it elasticsearch python /etl/indexMlTmdb.py http://localhost:9200 /etl/ml_tmdb.json docker exec -it elasticsearch python /etl/ratingsToEs.py http://localhost:9200 /etl/ml_tmdb.json /etl/ml-20m/ratings.csv
By using a blow torch
docker login harbor.dev.o19s.com # ask Eric for credentials docker-compose up
Browse to http://localhost:8000 to try it out!
Building Docker images
Build the docker images from scratch via:
docker build -t elastic-graph-recommender/elasticsearch -f deploy/elasticsearch/Dockerfile . docker build -t elastic-graph-recommender/app -f deploy/app/Dockerfile . docker build -t elastic-graph-recommender/init -f deploy/init/Dockerfile .
Deploy to our private Docker registry http://harbor.dev.o19s.com:
docker login harbor.dev.o19s.com docker tag elastic-graph-recommender/elasticsearch harbor.dev.o19s.com/elastic-graph-recommender/elasticsearch docker tag elastic-graph-recommender/app harbor.dev.o19s.com/elastic-graph-recommender/app docker tag elastic-graph-recommender/init harbor.dev.o19s.com/elastic-graph-recommender/init docker push harbor.dev.o19s.com/elastic-graph-recommender/elasticsearch docker push harbor.dev.o19s.com/elastic-graph-recommender/app docker push harbor.dev.o19s.com/elastic-graph-recommender/init