Tweet Analysis Sample
We receive a set of tweet files downloaded from http://tweetdownload.net and start by exploring the data.
First, we count the tweets per author in a single file. Next, we investigate the mentions in the tweets. We then refactor the code into code-behind and make it available for reuse in a registered assembly. After that, we run the analysis over all files and include more detailed information about the lineage of the data (who made the mention and which files provided the tweet).
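The first two exploration steps (counting tweets per author and collecting mentions) can be sketched outside the sample's own tooling. The Python below is only an illustration under stated assumptions: it assumes each tweet is available as an (author, text) pair and that a mention is an "@" followed by word characters. The function names, the sample rows, and the mention pattern are hypothetical and are not taken from the tweet files or from the sample's code.

```python
import re
from collections import Counter

# Assumed mention pattern: "@" followed by word characters.
# This is a simplification; it would also match e-mail-like text.
MENTION_RE = re.compile(r"@(\w+)")


def count_tweets_per_author(rows):
    """Count tweets per author; rows are (author, tweet_text) pairs."""
    return Counter(author for author, _ in rows)


def count_mentions(rows):
    """Count how often each account is mentioned across all tweets."""
    counts = Counter()
    for _, text in rows:
        # Lower-case the handle so "@Bob" and "@bob" are counted together.
        counts.update(m.lower() for m in MENTION_RE.findall(text))
    return counts


# Made-up sample rows, not taken from the actual tweet files.
sample_rows = [
    ("alice", "Great talk by @bob today"),
    ("alice", "Agree with @bob and @carol"),
    ("carol", "Thanks @Bob!"),
]

tweet_counts = count_tweets_per_author(sample_rows)
mention_counts = count_mentions(sample_rows)
```

In this made-up data, bob never appears as an author but is mentioned most often, which is the kind of pattern the later queries look for.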
Once we have decided on the schema, we make the processed data on tweet authors and their mentions available as a table and write some ad hoc analytical queries. These show that while Raghu is not a frequent tweeter, he is very influential :).