Analyzing curated tweets of opinion-shapers and newsmakers to understand the dynamics of the news in the U.S. and in Turkey.
analysis
data
docs
scrapers
.gitignore
LICENSE
README.md

Commentary Tweets of the Elites

This project analyzes curated tweets of opinion-shapers and newsmakers, as provided by the news sites nediyor.com and theplazz.com, to understand how elites respond to important events in the US and in Turkey.

Data Collection

For news stories that made the headlines, we collected about two years of curated tweet data for the United States (154,684 tweets by 1,442 commentators on 7,376 news stories between 01/14/2013 and 01/09/2015) and Turkey (190,180 tweets by 1,306 commentators on 10,044 news stories between 01/14/2013 and 01/09/2015).

  • Filenames starting with scrape- :
    • Selenium (through its Python API) is used to scrape the data from the main pages of the websites.
    • Each main page is scrolled down 1,000 times to get past the sites' lazy loading.
    • To get the individual comments, ~17,000 HTML pages were downloaded from the links scraped from the main pages with nohup sh -c "cat urls.txt | xargs -n 1 -P 10 wget " &
    • The compressed files for nediyor (190 MB) and theplazz (107 MB) are on Dropbox.
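
The scrolling step above can be sketched roughly as follows. This is a minimal illustration, not the code in the scrape- files: the function names, the pause between scrolls, the choice of Firefox, and the output path are all assumptions. Selenium is a third-party package and is imported only inside the helper that needs a real browser.

```python
import time

# JavaScript that jumps to the bottom of the page, which is what typically
# triggers a lazy-loading page to fetch its next batch of items.
SCROLL_JS = "window.scrollTo(0, document.body.scrollHeight);"


def scroll_to_load(driver, times=1000, pause=1.0):
    """Scroll to the bottom `times` times and return the resulting HTML.

    `driver` is any object with Selenium's `execute_script` method and
    `page_source` attribute. `pause` (seconds) is an assumed value that
    gives the page time to load new items between scrolls.
    """
    for _ in range(times):
        driver.execute_script(SCROLL_JS)
        time.sleep(pause)
    return driver.page_source


def scrape_main_page(url, out_path):
    """Hypothetical helper: open `url`, scroll, and save the loaded HTML."""
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Firefox()  # assumption: any supported browser works
    try:
        driver.get(url)
        html = scroll_to_load(driver)
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(html)
    finally:
        driver.quit()
```

Calling scrape_main_page with the address of one of the two sites and a local file name would save the fully scrolled main page, from which the links for urls.txt could then be extracted.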

Data Analysis

Daily Commentary Statistics

  • Aggregate-daily & container.js :
    • Counts of comments on news stories are aggregated by day.
    • The resulting time series is visualized using Highcharts JS.
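
As a rough sketch of the aggregation described above, the snippet below counts comments per day and converts the result into the [timestamp in milliseconds, value] pairs that a Highcharts datetime axis accepts. The function names and the sample timestamps are illustrative assumptions, not taken from the repository's scripts.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def daily_counts(timestamps):
    """Count comments per calendar day from an iterable of datetimes."""
    return Counter(ts.date() for ts in timestamps)


def to_highcharts_series(counts):
    """Turn {date: count} into sorted [epoch_milliseconds, count] pairs."""
    series = []
    for day in sorted(counts):
        # Use midnight UTC of each day as the x value.
        midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
        series.append([int(midnight.timestamp() * 1000), counts[day]])
    return series


# Made-up comment times, for illustration only.
sample = [
    datetime(2015, 1, 8, 9, 30),
    datetime(2015, 1, 8, 17, 5),
    datetime(2015, 1, 9, 8, 0),
]
series_json = json.dumps(to_highcharts_series(daily_counts(sample)))
```

The JSON string can be placed in the data field of a Highcharts series whose x axis has type datetime.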

Commentator Statistics

  • commentators-stats.py calculates and visualizes the following statistics:
    • Comment counts by commentator
    • Commentators grouped by profession
    • Monthly commentator performance
    • ...
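
The named statistics could be computed along the following lines. This is only a sketch under stated assumptions: the record fields (commentator, profession, date), the sample records, and the reading of "monthly performance" as comments per commentator per month are all hypothetical and may differ from what commentators-stats.py actually does.

```python
from collections import Counter, defaultdict


def comment_counts(comments):
    """Number of comments per commentator."""
    return Counter(c["commentator"] for c in comments)


def group_by_profession(comments):
    """Map each profession to the sorted list of its commentators."""
    groups = defaultdict(set)
    for c in comments:
        groups[c["profession"]].add(c["commentator"])
    return {prof: sorted(names) for prof, names in groups.items()}


def monthly_counts(comments):
    """Comments per (commentator, 'YYYY-MM') pair.

    Assumes each date is an ISO string such as '2014-03-21'.
    """
    return Counter((c["commentator"], c["date"][:7]) for c in comments)


# Made-up records, for illustration only.
comments = [
    {"commentator": "a", "profession": "journalist", "date": "2014-03-21"},
    {"commentator": "a", "profession": "journalist", "date": "2014-03-29"},
    {"commentator": "b", "profession": "politician", "date": "2014-04-02"},
]
top = comment_counts(comments).most_common(1)
by_profession = group_by_profession(comments)
per_month = monthly_counts(comments)
```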

Initial Findings

  • Daily comment count visualization is here