API for sharing data #2
Comments
Check out Facebook's Graph API: https://developers.facebook.com/docs/graph-api
@yibeichan and @fhopp discussed today that we can get very informative data via the Twitter API. For now, we are going to focus on "shares" on Twitter and return to the idea of shares on Facebook later. The unit for scraping Twitter data will still be a single URL. However, we will create two extra tables in Cassandra:
For (1), each row will be a unique URL, and the columns will hold the number of unique users who mentioned this URL plus its total retweet, like, and comment counts. For (2), each row will be a unique tweet that mentioned this URL, along with metadata for that tweet such as the text and how many likes, replies, favorites, etc. it has received. The next step for @yibeichan is to think about how we can retrieve so many URLs. @musainayatmalik will help with implementing the "twitter scraping" pipeline in PySpark.
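To make the relationship between the two tables concrete, here is a minimal sketch in plain Python of how the per-URL rows in (1) could be derived from the per-tweet rows in (2). All field and function names (`TweetRow`, `summarize_by_url`, `unique_users`, and so on) are hypothetical and not taken from the project; "comments" is assumed to correspond to tweet replies, and the real pipeline would presumably do this aggregation in PySpark against Cassandra.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TweetRow:
    """One row of the per-tweet table (2). Field names are hypothetical."""
    tweet_id: str
    url: str
    user_id: str
    text: str
    retweets: int
    likes: int
    replies: int


def summarize_by_url(tweets):
    """Aggregate per-tweet rows into one summary row per URL, as in table (1)."""
    users = defaultdict(set)
    totals = defaultdict(lambda: {"retweets": 0, "likes": 0, "replies": 0})
    for t in tweets:
        users[t.url].add(t.user_id)  # a set, so each user is counted once
        totals[t.url]["retweets"] += t.retweets
        totals[t.url]["likes"] += t.likes
        totals[t.url]["replies"] += t.replies
    return {
        url: {"unique_users": len(users[url]), **counts}
        for url, counts in totals.items()
    }


# Made-up example data: two users share the same article, one of them twice.
example_tweets = [
    TweetRow("t1", "https://example.com/a", "u1", "great read", 3, 10, 1),
    TweetRow("t2", "https://example.com/a", "u1", "sharing again", 1, 2, 0),
    TweetRow("t3", "https://example.com/a", "u2", "interesting", 0, 5, 2),
    TweetRow("t4", "https://example.com/b", "u3", "hmm", 2, 0, 0),
]
summary = summarize_by_url(example_tweets)
```

Keeping table (2) as the source of truth and deriving table (1) from it means the per-URL totals can be recomputed whenever tweet metadata is refreshed.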
Several ways to get historical Twitter data (sorted)
@yibeichan, can we close this now? We are using sharedcount.com to get Facebook data, and we will pay to get the Twitter data? Can you open an issue for the Twitter data and post the link to the company in a comment so I can get started on the application? Thanks!
We need to find a good API that lets us obtain sharing data of newspaper articles.