A spider and data-mining project on ErogameScape.
- [X] Find hidden tags.
- [o] Grab comments.
- Grab game POV data.
- User recommendations.
- Game recommendations.
How data is stored in Redis:
- Hash - game:$id -> title, brand_id

        HMSET game:16506 title この大空に、翼をひろげて brand_id 689
        HGETALL game:16506
        HEXISTS game:16506 title
- String - brand:$id -> brand_name

        SET brand:689 PULLTOP
- Hash - comment:$uid:$game_id -> score, playtime, date, comment, netabare

        HMSET comment:yamadayo:7062 score 65 playtime 30h date 2013年11月04日02時13分14秒 comment "個別が鈴√と来ヶ谷以外全く面白くない" netabare 1
- Set - indexes used as entry points for later mining: games, users, brands, $user:games

        SADD games "16506"
        SMEMBERS games
        SADD users "yamadayo" "christia"
        SMEMBERS users
        SADD brands 689
        SADD yamadayo:games 16506
- List - new_commented_games (LTRIM can be used to keep only the latest N elements)

        LPUSH new_commented_games 16506
        LRANGE new_commented_games 0 9
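The layout above could be written from Python roughly as follows. This is only a sketch, not the project's actual code: the helper names (`store_game`, `store_comment`) and the `RECENT_LIMIT` cap are made up for illustration, and `r` is assumed to be a redis-py style client exposing `hset`, `set`, `sadd`, `lpush` and `ltrim`. It uses `HSET` with a mapping rather than `HMSET`, since `HMSET` is deprecated in newer Redis versions.

```python
from typing import Any, Mapping

# Hypothetical cap for the "latest N" list; the README does not fix a value.
RECENT_LIMIT = 100


def game_key(game_id: Any) -> str:
    return f"game:{game_id}"


def brand_key(brand_id: Any) -> str:
    return f"brand:{brand_id}"


def comment_key(uid: str, game_id: Any) -> str:
    return f"comment:{uid}:{game_id}"


def store_game(r: Any, game_id: Any, title: str, brand_id: Any, brand_name: str) -> None:
    """Write the game hash, the brand string, and the two index sets."""
    r.hset(game_key(game_id), mapping={"title": title, "brand_id": brand_id})
    r.set(brand_key(brand_id), brand_name)
    r.sadd("games", game_id)
    r.sadd("brands", brand_id)


def store_comment(r: Any, uid: str, game_id: Any, fields: Mapping[str, Any]) -> None:
    """Write one comment hash and update the user indexes and the recent list."""
    r.hset(comment_key(uid, game_id), mapping=dict(fields))
    r.sadd("users", uid)
    r.sadd(f"{uid}:games", game_id)
    # Push to the head, then trim so only the newest RECENT_LIMIT entries remain.
    r.lpush("new_commented_games", game_id)
    r.ltrim("new_commented_games", 0, RECENT_LIMIT - 1)


# Example usage against a real server (assumes the redis package is installed):
#   import redis
#   r = redis.Redis(decode_responses=True)
#   store_game(r, 16506, "この大空に、翼をひろげて", 689, "PULLTOP")
#   store_comment(r, "yamadayo", 7062, {"score": 65, "playtime": "30h", "netabare": 1})
```

Calling LTRIM right after each LPUSH keeps the list bounded without a separate cleanup job. The same game can appear more than once in the list if it receives several comments in a row.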
- spider_comment.py - grab user comments including score, playtime, comment text, etc.
- spider_game.py - grab game POV data.
Current version: 0.0.1.