Data and source code for the paper "Gamified Incentives: A Badge Recommendation Model to Improve User Engagement in Social Networking Websites". (pdf)
Contains badge datasets for different time periods, extracted from the Stack Overflow Badges.xml data using
dataset2008: Contains train and test datasets randomly generated from the badges dataset using dataset_generation/train_test_generation.py, for badges awarded in the year 2008. The complete datasets for the years 2008 to 2010 are compressed in the datasets.zip file and must be uncompressed into the same layout as dataset2008 before they can be used.
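The uncompress step can be done with any archive tool. As a minimal sketch in Python, assuming the archive is unpacked into a directory of your choosing: the helper name extract_datasets and both path arguments are illustrative and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_datasets(archive_path, target_dir):
    """Unpack every member of the archive into target_dir.

    Returns the sorted names of the entries found at the top level of
    target_dir after extraction.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return sorted(p.name for p in target.iterdir())
```

Calling extract_datasets("datasets.zip", "data") would unpack the archive into a data directory relative to the current working directory. Check that the resulting directories sit next to dataset2008 before running the other modules.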
extract_badges.py: Extracts the badges awarded to users within the end_year time period from the Badges.xml file. It writes the output as a CSV file in the format of
train_test_generation.py: Generates train and test sets from
collaborative_filtering.py: Implementation of the item-based collaborative filtering method to recommend badges and evaluate the results.
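To illustrate the general idea of item-based collaborative filtering on badge data, here is a minimal sketch. It is not the code in collaborative_filtering.py. It assumes binary ownership data (a user either holds a badge or does not), uses cosine similarity between badges, and scores each candidate badge by its summed similarity to the badges the user already holds. The function names and the sample users are illustrative.

```python
import math
from collections import defaultdict


def item_similarities(user_badges):
    """Cosine similarity between badges, each badge seen as the set of its holders."""
    holders = defaultdict(set)
    for user, badges in user_badges.items():
        for badge in badges:
            holders[badge].add(user)
    sims = defaultdict(dict)
    names = sorted(holders)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            common = len(holders[a] & holders[b])
            if common:
                s = common / math.sqrt(len(holders[a]) * len(holders[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims


def recommend(user, user_badges, sims, k=3):
    """Rank badges the user lacks by summed similarity to badges the user holds."""
    owned = user_badges.get(user, set())
    scores = defaultdict(float)
    for badge in owned:
        for other, s in sims.get(badge, {}).items():
            if other not in owned:
                scores[other] += s
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [badge for badge, _ in ranked[:k]]


# Toy data: four hypothetical users and the badges they hold.
user_badges = {
    "u1": {"Teacher", "Student", "Editor"},
    "u2": {"Teacher", "Student"},
    "u3": {"Student", "Editor", "Supporter"},
    "u4": {"Teacher"},
}
sims = item_similarities(user_badges)
top_for_u4 = recommend("u4", user_badges, sims)  # ['Student', 'Editor']
```

In this toy data, user u4 holds only Teacher, which co-occurs most often with Student, so Student is ranked first. Supporter never co-occurs with Teacher and is therefore not recommended to u4.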
popular_badge_baseline.py: Implementation of the baseline algorithm which recommends popular badges to each user.
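A popularity baseline of this kind can be sketched in a few lines. The version below is an illustration and not the code in popular_badge_baseline.py. It assumes popularity is measured by the number of users holding a badge and that badges a user already holds are skipped. Both choices, and the function names, are assumptions.

```python
from collections import Counter


def popular_badges(user_badges):
    """Order badges by number of holders, most popular first (ties broken by name)."""
    counts = Counter(b for badges in user_badges.values() for b in badges)
    return [b for b, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def recommend_popular(user, user_badges, k=3):
    """Recommend the k most popular badges that the user does not hold yet."""
    owned = user_badges.get(user, set())
    return [b for b in popular_badges(user_badges) if b not in owned][:k]
```

Because the ranking ignores the user's own history apart from filtering, every user with the same set of held badges receives the same list, which is what makes it a useful reference point for a personalized recommender.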
datasets.py: Contains the file paths of the datasets for easier access from other modules. If you change the repository structure or move the data directory, modify _DATASET_ROOT in this module so that it points to the root directory where the datasets reside.
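As a rough sketch of how such a path module can be organized: only the name _DATASET_ROOT comes from the repository. The default value, the helper dataset_file, and the file name used in the usage note are assumptions made for illustration.

```python
from pathlib import Path

# Root directory holding the datasets. A "data" directory relative to the
# working directory is assumed here; point this elsewhere if the layout changes.
_DATASET_ROOT = Path("data")


def dataset_file(dataset_name, file_name):
    """Build the path of a file inside one dataset directory under _DATASET_ROOT."""
    return _DATASET_ROOT / dataset_name / file_name
```

With this layout, dataset_file("dataset2008", "train.csv") resolves to data/dataset2008/train.csv, so moving the datasets only requires changing the single root constant.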