The task here is to build a movie recommender from a graph: movies are the nodes, and an edge connects two movies that users have rated similarly.
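
The notebook loads a prebuilt edge list and does not show how it was produced. As a rough sketch of the idea, assume an edge is drawn between two movies when at least `min_shared` users rated both of them `min_rating` or higher. The thresholds, the `ratings` triples, and the function name `build_edges` are all illustrative and not taken from the original data pipeline.

```python
from collections import defaultdict
from itertools import combinations

def build_edges(ratings, min_rating=4.0, min_shared=2):
    """Return sorted (movie_a, movie_b) pairs liked by at least min_shared users.

    ratings: iterable of (user_id, movie_id, rating) triples (hypothetical input).
    """
    # Collect the set of movies each user rated at or above the threshold.
    liked = defaultdict(set)
    for user, movie, rating in ratings:
        if rating >= min_rating:
            liked[user].add(movie)

    # Count how many users liked both movies of every co-liked pair.
    shared = defaultdict(int)
    for movies in liked.values():
        for a, b in combinations(sorted(movies), 2):
            shared[(a, b)] += 1

    return sorted(pair for pair, n in shared.items() if n >= min_shared)

# Toy ratings, made up for illustration.
ratings = [
    ("u1", "movie_1", 5.0), ("u1", "movie_2", 4.0), ("u1", "movie_3", 2.0),
    ("u2", "movie_1", 4.5), ("u2", "movie_2", 4.0),
    ("u3", "movie_2", 5.0), ("u3", "movie_3", 4.0),
]
edges = build_edges(ratings)
```

Writing each pair as a `movie_a,movie_b` line would give a file in the shape that `nx.read_edgelist(..., delimiter=',')` reads.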

In [1]:
import networkx as nx
import pandas as pd
from node2vec import Node2Vec

In [2]:
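# Node table: one row per movie (movieId, title, genres)
movie_nodes = pd.read_csv('movies.csv')
# Comma-separated edge list whose node labels are movieId values
g = nx.read_edgelist('movies_edgelist.csv', delimiter=',')
movie_nodes.head()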

Unnamed: 0.1,Unnamed: 0,movieId,title,genres
0,0,movie_1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,1,movie_2,Jumanji (1995),Adventure|Children|Fantasy
2,2,movie_3,Grumpier Old Men (1995),Comedy|Romance
3,3,movie_4,Waiting to Exhale (1995),Comedy|Drama|Romance
4,4,movie_5,Father of the Bride Part II (1995),Comedy


Relabel the nodes with movie titles so that similarity searches need fewer conversion steps.

In [3]:
# Map each movieId to its title, then relabel the graph's nodes
mapping = dict(zip(movie_nodes.movieId, movie_nodes.title))
g = nx.relabel_nodes(g, mapping)
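
`nx.relabel_nodes` leaves any node that is missing from `mapping` under its old label, and two IDs that map to the same title end up merged into one node. A small check along these lines, run before relabeling, can flag both cases. The helper name `check_mapping` and the toy data are illustrative only.

```python
from collections import Counter

def check_mapping(nodes, mapping):
    """Return (unmapped node IDs, titles shared by more than one node)."""
    nodes = list(nodes)
    # Nodes absent from the mapping would keep their raw ID as label.
    unmapped = sorted(n for n in nodes if n not in mapping)
    # Titles used by several nodes would collapse those nodes into one.
    counts = Counter(mapping[n] for n in nodes if n in mapping)
    duplicated = sorted(t for t, c in counts.items() if c > 1)
    return unmapped, duplicated

# Toy data, made up for illustration.
toy_nodes = ["m1", "m2", "m3", "m4"]
toy_mapping = {"m1": "Title A", "m2": "Title B", "m3": "Title B"}
unmapped, duplicated = check_mapping(toy_nodes, toy_mapping)
```

In the notebook this would be called as `check_mapping(g.nodes, mapping)` before the relabeling step.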

In [4]:
# nx.info() was removed in NetworkX 3.0; printing the graph gives the same summary
print(g)

Graph with 1405 nodes and 40043 edges


In [5]:
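# 10 biased walks of length 256 per node; p=0.1 favors stepping back, q=1 is neutral
nv = Node2Vec(g, dimensions=256, walk_length=256, num_walks=10, workers=8, p=0.1, q=1)
# Skip-gram (sg=1) Word2Vec over the walks; min_count=0 keeps every node
model = nv.fit(window=5, min_count=0, sg=1)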
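
For context on `p` and `q`: in node2vec, a walk that has just moved from node `t` to node `v` weights each candidate next node `x` by `1/p` if `x` is `t` (a return step), by `1` if `x` is also a neighbor of `t`, and by `1/q` otherwise. With `p=0.1`, a return step is weighted 10 times as heavily as a step to a shared neighbor, which keeps walks local. The sketch below illustrates that rule on an unweighted toy graph. It is not the `node2vec` package's internal code, and `search_bias` and the adjacency dict are illustrative names.

```python
def search_bias(adj, prev, cur, p=0.1, q=1.0):
    """Normalized second-order transition probabilities out of `cur`.

    adj: dict mapping each node to the set of its neighbors (unweighted).
    prev: the node the walk visited just before `cur`.
    """
    weights = {}
    for nxt in adj[cur]:
        if nxt == prev:            # step back to where the walk came from
            weights[nxt] = 1.0 / p
        elif nxt in adj[prev]:     # stays within distance 1 of prev
            weights[nxt] = 1.0
        else:                      # moves farther away from prev
            weights[nxt] = 1.0 / q
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Toy graph, made up for illustration.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
probs = search_bias(adj, prev="A", cur="B")
```

Here the walk arrived at `B` from `A`, so `A` gets weight 10 while `C` and `D` each get weight 1 before normalization.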


In [6]:
model.wv.most_similar('Old School (2003)')

[('Big Lebowski, The (1998)', 0.6241620779037476),
 ('Fireworks (Hana-bi) (1997)', 0.5322566628456116),
 ('Road Trip (2000)', 0.5225794315338135),
 ('Wonder Boys (2000)', 0.5018271803855896),
 ("Dude, Where's My Car? (2000)", 0.49479034543037415),
 ('Laputa: Castle in the Sky (Tenkû no shiro Rapyuta) (1986)',
  0.49173879623413086),
 ('Strange Brew (1983)', 0.4695519506931305),
 ('Your Highness (2011)', 0.4655013382434845),
 ('Napoleon Dynamite (2004)', 0.4430961310863495),
 ('Scary Movie 2 (2001)', 0.43457329273223877)]

In [7]:
model.wv.most_similar("Godfather, The (1972)")

[('Godfather: Part II, The (1974)', 0.4300858974456787),
 ('Paris Is Burning (1990)', 0.4131982624530792),
 ('Long Goodbye, The (1973)', 0.4105764925479889),
 ("Babette's Feast (Babettes gæstebud) (1987)", 0.4059564471244812),
 ('Chasing Liberty (2004)', 0.3908712565898895),
 ('Walk to Remember, A (2002)', 0.387372225522995),
 ('Magnificent Seven, The (1960)', 0.38694342970848083),
 ('Chinatown (1974)', 0.3842809796333313),
 ('Autumn Sonata (Höstsonaten) (1978)', 0.3685373365879059),
 ('Curious Case of Benjamin Button, The (2008)', 0.3593035042285919)]
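
The scores returned by `most_similar` are cosine similarities between embedding vectors. A minimal pure-Python sketch of that ranking follows, with made-up three-dimensional vectors standing in for the 256-dimensional embeddings. The names `cosine` and `top_k` and the vectors are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query, vectors, k=10):
    """Rank every key except `query` by cosine similarity to it, highest first."""
    scores = [(key, cosine(vectors[query], vec))
              for key, vec in vectors.items() if key != query]
    return sorted(scores, key=lambda kv: kv[1], reverse=True)[:k]

# Toy embeddings, made up for illustration.
vectors = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
}
ranked = top_k("a", vectors, k=2)
```

Because cosine similarity ignores vector length, a score near 1 means two movies point in almost the same direction in embedding space, and a score near 0 means they are close to orthogonal.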