opinions-classifier
classifier-py
cruisecontrol
data
docs
lib
pipelines/tweets_pipeline
text-retrieval/retrieval
twitter-java
web
weka
.gitignore
README.md
env.bash

README.md

opinions-classifier

This is a small attempt to create an application that analyzes a real-time stream of data (via the Twitter Streaming API) and clusters posts according to their meaning. It is intended for grouping tweets in real time and for visualizing trends in an observed topic.

Various clustering algorithms are planned to be tested as part of this app.

The goal of this project is to produce a meaningful clustering of a textual stream of information and to visualize it accordingly.
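
To make the idea of clustering a text stream concrete, here is a minimal sketch of single-pass ("leader") clustering over bag-of-words vectors. It is not the algorithm used by this project, which has not been chosen yet; the class name `StreamClusterer`, the `threshold` value, and the tokenizer are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into word tokens (hashtags and mentions kept)."""
    return re.findall(r"[#@]?\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class StreamClusterer:
    """Hypothetical single-pass clusterer: each post is seen exactly once."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cut-off for joining a cluster
        self.centroids = []         # one Counter of summed term counts per cluster
        self.members = []           # the posts assigned to each cluster

    def add(self, post):
        """Assign a post to the most similar cluster, or open a new one."""
        vec = Counter(tokenize(post))
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            self.centroids[best].update(vec)  # fold the post into the centroid
            self.members[best].append(post)
            return best
        self.centroids.append(vec)
        self.members.append([post])
        return len(self.centroids) - 1


if __name__ == "__main__":
    clusterer = StreamClusterer()
    for post in [
        "new python release is out today",
        "python release notes published today",
        "heavy rain and flooding in the city",
        "flooding closes roads in the city",
    ]:
        print(clusterer.add(post), post)
```

Because every post is processed once and only the cluster centroids are kept, this kind of approach suits a live stream, but the result depends on arrival order and on the chosen threshold.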

See the discussion of possible algorithms on Data Science Stack Exchange: http://datascience.stackexchange.com/questions/979/algorithms-for-text-clustering

One of the approaches is to use the Lingo algorithm from Carrot2.

A web-based interface is also proposed for observing how the clusters fluctuate in real time. To simplify the presentation of clusters, the Masonry JavaScript layout library is used.

Application diagram

Application prototype

Send me a message on Twitter if you are interested in the topic or want to contribute: @MaximGalushka
