Entry for the GitHub Contest
Manfred/github-contest
GITHUB CONTEST

My approach uses a widely publicized probabilistic version of LSA (PLSA), combined with a variant of the Hellinger distance, to generate a score for each recommendation.

CONSIDERATIONS

PLSA has a few problems, namely overfitting and the fact that it is not a very good generative model for new data (e.g. a new user). Neither disadvantage is a problem in the contest because the dataset is fixed. In the future I might take a stab at latent Dirichlet allocation and compare the results on this dataset.

The contest ranking is based on the recall of the algorithm, not its precision. I would definitely not recommend using this code in production: even though it might get a reasonable score in a synthetic environment, it might not perform very well in the real world. When creating an actual recommendation system for GitHub I would like to include user feedback on the recommendations, so that supervised learning can be used to train the models.

LICENSE

The code is released under the same conditions as NetHack. For more details about these conditions see the LICENSE file. Please contact me if you want to use the code under different conditions.

Github-contest entry © 2009, Manfred Stienstra
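As an illustration of the distance measure mentioned under GITHUB CONTEST, here is a minimal Python sketch of the standard Hellinger distance between two discrete probability distributions. It is not the variant used in this entry, which the README does not spell out. The function name `hellinger` and the idea of comparing two per-topic probability vectors (for example, one for a user and one for a repository) are assumptions made for the example.

```python
import math


def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are sequences of non-negative probabilities of equal length,
    each assumed to sum to 1. The result lies in [0, 1]: 0 for identical
    distributions, 1 for distributions with no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the probabilities.
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2.0)


# Hypothetical topic distributions, purely illustrative.
user_topics = [0.7, 0.2, 0.1]
repo_topics = [0.6, 0.3, 0.1]
distance = hellinger(user_topics, repo_topics)
```

A smaller distance means the two distributions are more alike, so in a recommender built this way a candidate with a lower distance would rank higher.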