Add MovieLens.load to parse and encode the movielens dataset (#947)
This adds a `load` function that returns:

- a StellarGraph containing the users (with IDs `u_...`) and movies (with IDs
  `m_...`), as well as "rating" edges, where the features on the user nodes
  have been encoded and normalised
- a pandas DataFrame containing the edges (as `user_id` and `movie_id`
  columns) and their rating label, to use for training/testing
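
As a rough illustration of the second item, the edges DataFrame could be built along the following lines. This is a minimal sketch with pandas, not the code in this commit: the helper name `load_edges` and the `rating`/`timestamp` column names are assumptions, and the input is just the first rows of `u.data` inlined as a string.

```python
import io

import pandas as pd

# First rows of u.data (tab-separated): user id, movie id, rating, timestamp.
U_DATA = "196\t242\t3\t881250949\n186\t302\t3\t891717742\n22\t377\t1\t878887116\n"


def load_edges(buffer):
    """Hypothetical helper: parse u.data-style rows into a labelled edge frame."""
    edges = pd.read_csv(
        buffer,
        sep="\t",
        header=None,
        names=["user_id", "movie_id", "rating", "timestamp"],
    )
    # Prefix the raw integer IDs so user and movie nodes cannot collide.
    edges["user_id"] = "u_" + edges["user_id"].astype(str)
    edges["movie_id"] = "m_" + edges["movie_id"].astype(str)
    # The timestamp is dropped here; only the rating is kept as the label.
    return edges[["user_id", "movie_id", "rating"]]


edges = load_edges(io.StringIO(U_DATA))
print(edges)
```

The prefixes matter because user 1 and movie 1 would otherwise share a node ID once both kinds of node live in one graph.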

Example first few rows of each file for reference:

`u.data`:
```
196	242	3	881250949
186	302	3	891717742
22	377	1	878887116
```

`u.item`:
```
1|Toy Story (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)|0|0|0|1|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0
2|GoldenEye (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?GoldenEye%20(1995)|0|1|1|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
3|Four Rooms (1995)|01-Jan-1995||http://us.imdb.com/M/title-exact?Four%20Rooms%20(1995)|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|0|0
```

`u.user`:
```
1|24|M|technician|85711
2|53|F|other|94043
3|23|M|writer|32067
```
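
For reference, one plausible way to encode and normalise the user features from rows like these is sketched below. It is an assumption-laden example rather than the implementation added here: the helper name `load_user_features`, the column names, the choice to standardise `age`, one-hot encode `gender` and `occupation`, and drop the zip code are all illustrative.

```python
import io

import pandas as pd

# First rows of u.user (pipe-separated): id, age, gender, occupation, zip code.
U_USER = "1|24|M|technician|85711\n2|53|F|other|94043\n3|23|M|writer|32067\n"


def load_user_features(buffer):
    """Hypothetical helper: turn u.user-style rows into numeric node features."""
    users = pd.read_csv(
        buffer,
        sep="|",
        header=None,
        names=["user_id", "age", "gender", "occupation", "zip_code"],
        dtype={"zip_code": str},  # zip codes are labels, not numbers
    )
    # Index by the prefixed node ID (u_1, u_2, ...).
    users["user_id"] = "u_" + users["user_id"].astype(str)
    users = users.set_index("user_id")

    # One-hot encode the categorical columns (assumed encoding).
    features = pd.get_dummies(users[["gender", "occupation"]], dtype=float)

    # Standardise age to zero mean and unit variance (assumed normalisation).
    age = users["age"].astype(float)
    features.insert(0, "age", (age - age.mean()) / age.std(ddof=0))
    return features  # zip_code is intentionally left out in this sketch


user_features = load_user_features(io.StringIO(U_USER))
print(user_features)
```

Keeping the result indexed by the `u_...` IDs lets it line up directly with the user nodes of the graph.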

See: #812
huonw committed Feb 26, 2020
1 parent ba2a696 commit 3cf4b99
Showing 5 changed files with 249 additions and 694 deletions.
35 changes: 0 additions & 35 deletions demos/link-prediction/hinsage/ml-100k-config.json

This file was deleted.