New York Times graph with toy collaborative filtering recommender

jvani/nyt-comments

nyt-comments

This project builds a graph of New York Times articles (nodes), users (nodes), and users' comments (relationships). The data comes from Kaggle: see New York Times Comments.

The data covers comments made on articles published in the New York Times in Jan-May 2017 and Jan-April 2018. Each month's data comes in two CSV files: one for the articles that received comments and one for the comments themselves. The comment CSV files contain over 2 million comments in total with 34 features, and the article CSV files contain 16 features for more than 9,000 articles.
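To make the graph model concrete, here is a minimal sketch of users and articles as nodes, comments as relationships, and a toy collaborative filtering step on top. It is not the recommender shipped in this repository: the sample rows, the function names `build_graph` and `recommend`, and the scoring rule (count how many co-commenters also commented on an unseen article) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: one (userID, articleID) pair per comment.
comments = [
    ("u1", "a1"), ("u1", "a2"),
    ("u2", "a1"), ("u2", "a2"), ("u2", "a3"),
    ("u3", "a2"), ("u3", "a4"),
]


def build_graph(rows):
    """Index the comment relationships in both directions."""
    user_articles = defaultdict(set)   # user node -> article nodes
    article_users = defaultdict(set)   # article node -> user nodes
    for user, article in rows:
        user_articles[user].add(article)
        article_users[article].add(user)
    return user_articles, article_users


def recommend(user, user_articles, article_users, k=3):
    """Rank unseen articles by how often co-commenters commented on them."""
    seen = user_articles.get(user, set())
    scores = Counter()
    for article in seen:
        for neighbor in article_users[article]:
            if neighbor == user:
                continue
            # Each shared article gives the neighbor's other articles one vote.
            for candidate in user_articles[neighbor] - seen:
                scores[candidate] += 1
    return [article for article, _ in scores.most_common(k)]


user_articles, article_users = build_graph(comments)
print(recommend("u1", user_articles, article_users))  # ['a3', 'a4']
```

In the sample data, `u1` shares two articles with `u2` and one with `u3`, so `a3` (from `u2`) outranks `a4` (from `u3`).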

Running this project will start 2 services:

Neo4J Graph @ http://localhost:7474
Toy Recommender @ http://localhost:5000

Requirements

git
docker
docker-compose

Note: Downloading our dataset (via setup.sh) uses kaggle API credentials stored in ~/.kaggle/kaggle.json. See API Credentials from the kaggle-api docs.
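Before running setup.sh, it can help to confirm that the credentials file is in place. The sketch below is not part of this repository, and `check_kaggle_credentials` is a hypothetical helper name. It only checks that the file exists, parses as JSON, and has non-empty `username` and `key` fields; it does not contact Kaggle.

```python
import json
from pathlib import Path


def check_kaggle_credentials(path=Path.home() / ".kaggle" / "kaggle.json"):
    """Return a list of problems with the credentials file (empty if none)."""
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        creds = json.loads(path.read_text())
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(creds, dict):
        return [f"{path} does not contain a JSON object"]
    # A Kaggle API token file carries a username and an API key.
    return [f"missing key: {name}" for name in ("username", "key") if not creds.get(name)]


problems = check_kaggle_credentials()
print("credentials look usable" if not problems else "\n".join(problems))
```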

Setup

git clone https://github.com/jvani/nyt-comments.git && \
  cd nyt-comments && \
  ./setup.sh

The setup.sh script will do the following:

  1. Pull the required docker images.
  2. Download neo4j plugins (written to $PWD/plugins)
  3. Download our dataset (written to $PWD/kaggle/)
  4. Create our graph data (written to $PWD/import)
  5. Import our data into neo4j (database files will be written to $PWD/data)
  6. Start our docker-compose services: neo4j and toy recommender. NOTE: a jupyter server is included but the image is HUGE and off by default; uncomment in the docker-compose.yaml if desired.
  7. Run graph statistics on our database.
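Step 4 above turns the Kaggle CSV files into files that neo4j can import. The sketch below shows one way that split could look; it is not the code setup.sh runs. The column names `userID` and `articleID`, the relationship type `COMMENTED_ON`, and the helper names are assumptions. The header fields (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`) follow the neo4j bulk-import CSV header format.

```python
import csv
import io


def to_import_rows(comments):
    """Split comment records into user nodes, article nodes, and relationships."""
    users, articles, edges = set(), set(), []
    for row in comments:
        users.add(row["userID"])
        articles.add(row["articleID"])
        # COMMENTED_ON is a hypothetical relationship type.
        edges.append((row["userID"], row["articleID"], "COMMENTED_ON"))
    return sorted(users), sorted(articles), edges


def write_csv(header, rows):
    """Render a header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample records standing in for rows of a comments CSV file.
comments = [
    {"userID": "u1", "articleID": "a1"},
    {"userID": "u2", "articleID": "a1"},
    {"userID": "u1", "articleID": "a2"},
]

users, articles, edges = to_import_rows(comments)
users_csv = write_csv(["userID:ID(User)", ":LABEL"], [(u, "User") for u in users])
articles_csv = write_csv(["articleID:ID(Article)", ":LABEL"], [(a, "Article") for a in articles])
edges_csv = write_csv([":START_ID(User)", ":END_ID(Article)", ":TYPE"], edges)
print(edges_csv)
```

Deduplicating the node rows matters because a user or article appears once per comment in the raw data but should be a single node in the graph.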
