Kaggle---Google-Local-visit-prediction

(I have to admit that the codes in this repo are messy due to the deadline of the task)

Kaggle---Google-Local-visit-prediction

Predict whether users will visit a business using BPR

Who should read this: People want to know how to implement BPR (Bayesian Personalized Ranking from Implicit Feedback) https://www.ismll.uni-hildesheim.de/pub/pdfs/Rendle_et_al2009-Bayesian_Personalized_Ranking.pdf
Competition description: Predict given a (user,business) pair from 'pairs Visit.txt' whether the user visited the business (really, whether it was one of the business they reviewed). Accuracy will be measured in terms of the categorization accuracy (fraction of predictions you got right).
Data description:
- businessID: The ID of the business. This is a hashed product identifier from Google.
- userID: The ID of the reviewer. This is a hashed user identifier from Google.
- rating: The star rating of the user’s review.
- reviewText: The text of the review. It should be possible to successfully complete this assignment without making use of the review data, though an effective solution to the category prediction task will presumably make use of it.
- reviewHash: Hash of the review (essentially a unique identifier for the review).
- unixReviewTime: Time of the review in seconds since 1970.
- reviewTime: Plain-text representation of the review time.
- categories: Category labels of the product being reviewed.
Basic Concept of BPR:
- Maximize the differences bewteen items that users prefer and not prefer, with a Gaussian distribution prior
- Decompose the high dimensional users vectors into low dimensional ones, and still remain such differenes.
- Regularization parameters on User, Positive item (prefered), Negative item (not prefered)
- Stochastic gradient descent
Functions in the codes:
- Sample function: each time sample a pair consists of user, positive item and negative item.
- Update function: each time calculate the gradient of the three object and update their values simultaneously.
- Train function: within given iterations, sample and update repeatedly.
- Test function: test the cost function, to see whether to change learning rate.
- To get the better result, I set the lambda to -0.1,-0.1,-0.2 for user, positive items, negative items respectively.
- For stochastic gradient descent, I stop it when it vibrates when alpha is 0.01. Actually, it can be better, but I am not that patient to wait.

( In the paper, it is said lambda should be positive, but this will lead to overflow.)

Kaggle---Google-rating-prediction

I have uploaded the Latent Factor Model for rating prediction, which is included in the file

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
Latent Factor Model		Latent Factor Model
BPR.ipynb		BPR.ipynb
README.md		README.md
train-tmp.json.zip		train-tmp.json.zip

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Kaggle---Google-Local-visit-prediction

Predict whether users will visit a business using BPR

( In the paper, it is said lambda should be positive, but this will lead to overflow.)

Kaggle---Google-rating-prediction

About

Releases

Packages

Languages

RenXiangyuan/Kaggle---Google-Local-visit-prediction

Folders and files

Latest commit

History

Repository files navigation

Kaggle---Google-Local-visit-prediction

Predict whether users will visit a business using BPR

( In the paper, it is said lambda should be positive, but this will lead to overflow.)

Kaggle---Google-rating-prediction

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages