twit-miner

twit-miner is an attempt to build a movie recommendation application. It also keeps a global score for each movie, which is incremented by 1 when a user likes the movie and decremented when a user hates it.

How does this work?

It works by searching Twitter for a list of recent movies. Tweets containing movie titles are classified as like/hate using simple word matching (no magic is going on here). When a matching tweet is found, all other tweets by that user are searched for other movie titles to build a like/hate profile of the user for the list of movies.
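The word matching described above could look roughly like the following sketch. The word lists and the function name `classify_tweet` are hypothetical and chosen for illustration; the lists the project actually uses are not shown here.

```python
import re

# Hypothetical word lists (assumption): the project's real lists may differ.
LIKE_WORDS = {"love", "loved", "great", "awesome", "amazing"}
HATE_WORDS = {"hate", "hated", "awful", "terrible", "boring"}


def classify_tweet(text, movie_titles):
    """Return (title, label) pairs for every known title mentioned in the tweet.

    label is +1 (like), -1 (hate), or 0 when the tweet contains no sentiment
    words or equally many like and hate words.
    """
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    likes = len(words & LIKE_WORDS)
    hates = len(words & HATE_WORDS)
    if likes > hates:
        label = 1
    elif hates > likes:
        label = -1
    else:
        label = 0
    # Plain substring matching on the title: simple, but it can misfire
    # on very short titles that also occur inside ordinary words.
    return [(title, label) for title in movie_titles if title.lower() in lowered]
```

The whole tweet gets a single label, so a tweet that praises one movie and pans another would be classified wrongly for one of them. That is the price of keeping the matching this simple.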

If a user has mentioned only a single movie in their Twitter history, the user isn't imported, since they don't help us give recommendations. The total global "score" of the movie is still updated, though.
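A minimal sketch of that import rule, assuming tweets have already been reduced to (title, label) pairs with +1 for like and -1 for hate. The function `import_user` and the plain dictionaries are hypothetical stand-ins for whatever storage the project really uses.

```python
from collections import defaultdict


def import_user(user, labelled_mentions, global_scores, profiles):
    """Update global scores and store the user's profile if it is useful.

    labelled_mentions: iterable of (title, label), label in {+1, -1, 0}.
    Returns True if the user was imported, False otherwise.
    """
    profile = {}
    for title, label in labelled_mentions:
        if label != 0:
            # Assumption: a later mention of the same movie overrides an earlier one.
            profile[title] = label

    # The global score is updated even for users who are not imported.
    for title, label in profile.items():
        global_scores[title] += label

    # A user with a single movie cannot link two movies, so skip the profile.
    if len(profile) > 1:
        profiles[user] = profile
        return True
    return False


scores = defaultdict(int)
profiles = {}
import_user("alice", [("Avatar", 1)], scores, profiles)
import_user("bob", [("Avatar", -1), ("Up", 1)], scores, profiles)
```

After these two calls only `bob` has a stored profile, while the like from `alice` and the hate from `bob` cancel out in the global score for "Avatar".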

The actual classification is done by a math magic trick: Latent Semantic Indexing (LSI), which is built on a singular value decomposition (SVD) of the user/movie data.
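To illustrate the idea, here is a small NumPy sketch of a truncated SVD over a user-by-movie matrix. It is not the project's implementation: the matrix layout (rows are users, columns are movies, +1 like, -1 hate, 0 unknown), the function names, and the choice of rank `k` are all assumptions made for this example.

```python
import numpy as np


def low_rank_scores(ratings, k):
    """Return the rank-k approximation of the ratings matrix via SVD."""
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def recommend(ratings, user_index, k=2):
    """Return indices of movies the user has not rated, best guess first."""
    approx = low_rank_scores(ratings, k)
    unseen = np.flatnonzero(ratings[user_index] == 0)
    # Sort the unseen movies by their approximated score, highest first.
    order = unseen[np.argsort(-approx[user_index, unseen])]
    return order.tolist()


# Toy data: 4 users (rows) and 4 movies (columns).
R = np.array([
    [ 1,  1,  0, -1],
    [ 1,  1,  1, -1],
    [-1, -1, -1,  1],
    [ 1,  0,  1, -1],
], dtype=float)

suggestions = recommend(R, user_index=0, k=1)
```

Dropping the small singular values smooths over individual gaps in the data, so a zero in a user's row is filled with a value derived from users with similar tastes.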

The SVD matrix creation and the Twitter import are handled by the launch.py script in the project root directory. You must set the DJANGO_SETTINGS_MODULE environment variable to "settings" to launch the script. The script tries to prune "unnecessary" user data in order to keep the SVD matrix small. If all data were stored, the computation could fail with a MemoryError on a smaller VPS.
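One way to make sure the environment variable is set is a small wrapper like the sketch below. The helper `launch_command` is hypothetical, and the interpreter name `python` is an assumption; use whichever interpreter has the dependencies installed.

```python
import os


def launch_command(project_root=".", python="python"):
    """Build the command and environment needed to run launch.py.

    Nothing is executed here; the caller decides how to run the command.
    """
    env = dict(os.environ)
    # The script needs this variable to find the Django settings.
    env["DJANGO_SETTINGS_MODULE"] = "settings"
    cmd = [python, os.path.join(project_root, "launch.py")]
    return cmd, env


cmd, env = launch_command()
# To actually start the import, run from the project root, for example:
#   import subprocess
#   subprocess.call(cmd, env=env, cwd=".")
```

Copying `os.environ` instead of modifying it keeps the setting local to the launched script.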

How can I run the tests?

The tests can be run with modipyd.

What are the dependencies?

There are a lot of dependencies for running twit-miner:

Django 1.0
Python >= 2.5
PyFactory
NumPy
PorterStemmer
modipyd
simplejson
multiprocessing (only needed on Python 2.5, since Python 2.6 already includes it)
BeautifulSoup
feedparser
PIL
