msawan2/TwitterAnalysis

Twitter Trendiness Score Computation

The objective of this project is to compute trendiness scores for specific words and phrases (pairs of consecutive words) appearing on Twitter.
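
As a rough illustration of what "words and phrases" means here, the sketch below splits a tweet into single words and two-word phrases. The function name `extract_terms` and the tokenization rule (lowercasing, keeping letters, digits, apostrophes, and a leading `#` or `@`) are assumptions made for this example, not taken from the project's code.

```python
import re

def extract_terms(text):
    """Return the words and two-word phrases found in a tweet's text."""
    # Assumed tokenization: lowercase, then keep runs of letters, digits and
    # apostrophes, optionally prefixed by '#' or '@'.
    words = re.findall(r"[#@]?[a-z0-9']+", text.lower())
    # A phrase is two consecutive words joined by a single space.
    phrases = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words, phrases

words, phrases = extract_terms("Kafka makes streaming easy")
print(words)
print(phrases)
```

A tweet with n words yields n - 1 phrases, so a one-word tweet contributes no phrases at all.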

What is a "Trend"?
A trend is a spike in the likelihood of seeing a word/phrase relative to its usual likelihood.

[Figure: Trend]

“Trendiness Score” Formula:
The trendiness of a word/phrase p at time t is computed as follows:

[Image: Formula1]

Here,

[Image: Formula2]

Approach

Tweets are obtained from the Twitter API.

Each tweet, along with its timestamp, is transformed into the required format and pushed to a Kafka queue.
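
The transformation step might look something like the sketch below, which packs a tweet's text and timestamp into a JSON message encoded as bytes. The function name `to_message`, the field names `text` and `created_at`, and the choice of JSON with UTC ISO 8601 timestamps are all assumptions for illustration; the project's actual message format is not described here.

```python
import json
from datetime import datetime, timezone

def to_message(tweet_text, created_at):
    """Serialize one tweet and its timestamp into bytes for a queue."""
    payload = {
        # Hypothetical field names, chosen for this example only.
        "text": tweet_text.strip(),
        # Store the timestamp as an ISO 8601 string in UTC.
        "created_at": created_at.astimezone(timezone.utc).isoformat(),
    }
    return json.dumps(payload).encode("utf-8")

msg = to_message(" hello world ", datetime(2023, 1, 1, 12, 30, tzinfo=timezone.utc))
print(msg)
```

The resulting bytes are what a Kafka producer would send as the message value; the producer setup itself is left out so that the example runs without a broker.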

At the consumer end, the tweets are consumed and loaded into a Tweets table in a PostgreSQL database.
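
A minimal sketch of the load step follows. To keep it runnable without a database server, Python's built-in `sqlite3` stands in for PostgreSQL, and the table columns (`text`, `created_at`), the JSON message layout, and the function name `store_message` are assumptions rather than the project's actual schema.

```python
import json
import sqlite3

def store_message(conn, raw):
    """Decode one queue message and insert it as a row in the tweets table."""
    record = json.loads(raw.decode("utf-8"))
    # Parameterized query: values are passed separately from the SQL text.
    conn.execute(
        "INSERT INTO tweets (text, created_at) VALUES (?, ?)",
        (record["text"], record["created_at"]),
    )
    conn.commit()

# sqlite3 stands in for PostgreSQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (text TEXT, created_at TEXT)")

# A hand-made message in the assumed format, in place of a real consumed one.
raw = json.dumps(
    {"text": "hello world", "created_at": "2023-01-01T12:30:00+00:00"}
).encode("utf-8")
store_message(conn, raw)
print(conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0])
```

With a PostgreSQL driver the shape would be similar, although the placeholder syntax and connection setup differ between drivers.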

To find the trendiness score of a word/phrase at a specific time, the user runs the trendiness_kafka.py script with the word/phrase as input.

The trendiness score of the word/phrase is computed using the formula shown above and displayed.
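
The formula images are not reproduced in this text, so the sketch below is only one plausible reading of "likelihood relative to usual likelihood": the base-10 log of the ratio between a term's smoothed probability in the current window and in a baseline window. The add-one smoothing, the log base, and every name and parameter here (`smoothed_probability`, `trendiness`, `vocab_size`, and so on) are assumptions, not the project's formula.

```python
import math

def smoothed_probability(count, total, vocab_size):
    """Add-one (Laplace) smoothed probability of a term within a window."""
    # Assumption: smoothing keeps the probability non-zero for unseen terms.
    return (count + 1) / (total + vocab_size)

def trendiness(cur_count, cur_total, base_count, base_total, vocab_size):
    """Log ratio of a term's current likelihood to its baseline likelihood."""
    p_now = smoothed_probability(cur_count, cur_total, vocab_size)
    p_base = smoothed_probability(base_count, base_total, vocab_size)
    # Assumption: base-10 log, so a tenfold spike scores about 1.
    return math.log10(p_now / p_base)

score = trendiness(cur_count=9, cur_total=90, base_count=0, base_total=90, vocab_size=10)
print(score)
```

Under this reading, a positive score means the term is more likely now than in the baseline window, a negative score means less likely, and zero means no change.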

This process repeats every minute until the script is forcibly stopped.
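
A once-a-minute loop of this kind can be sketched as follows. The function name `run_every_minute`, the `max_runs` limit, and the injectable `sleep` argument are hypothetical additions that make the example easy to try without waiting; the project's script may be structured differently.

```python
import time

def run_every_minute(compute_score, max_runs=None, interval=60.0, sleep=time.sleep):
    """Call compute_score repeatedly, pausing `interval` seconds between calls."""
    scores = []
    runs = 0
    try:
        # With max_runs=None the loop continues until it is interrupted.
        while max_runs is None or runs < max_runs:
            scores.append(compute_score())
            runs += 1
            # Skip the pause after the final run.
            if max_runs is None or runs < max_runs:
                sleep(interval)
    except KeyboardInterrupt:
        pass  # Ctrl+C ends the loop; the scores collected so far are kept.
    return scores

# Demo with a stand-in scorer and no real waiting.
demo = run_every_minute(lambda: 0.5, max_runs=3, sleep=lambda seconds: None)
print(demo)
```

Returning the collected scores gives the plotting step one value per minute to work with.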

Finally, the trendiness scores are plotted minute by minute.

The resulting plot is shown below:

[Figure: Trendiness]

