Add option to track multiple pages in one stream #2
erikschlegel
left a comment
The pageIds PR looks good. Perhaps we can create a separate PR to address the enhancement to bifurcate FB posts and comments.
    override def fetchFacebookResponse(after: Date): ResponseList[Post] = {
    protected def fetchFacebookResponse(after: Date): ResponseList[Post] = {
      logDebug(s"Fetching posts for $pageId since $after")
      facebook.getPosts(pageId, createReading(after))
What happens to comments that were posted since the "after" date? Keep in mind that FB comments are the most useful piece of data for trend detection in Fortis. A post can keep receiving comments for a very long time.
Posts and comments are linked in the Facebook API. As such, you can get a post for date T and then look up its linked comments, which may have been posted after T.
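To illustrate the point, here is a minimal sketch of selecting comments by their own timestamp instead of their parent post's. The `Post` and `Comment` case classes and the `commentsSince` helper are hypothetical, simplified stand-ins, not the facebook4j types or code from this PR.

```scala
import java.time.Instant

// Hypothetical, simplified stand-ins for the Facebook API types.
final case class Comment(id: String, postId: String, createdAt: Instant, message: String)
final case class Post(id: String, createdAt: Instant, comments: Seq[Comment])

object LinkedComments {
  // Returns comments created strictly after `since`,
  // regardless of how old their parent post is.
  def commentsSince(posts: Seq[Post], since: Instant): Seq[Comment] =
    posts.flatMap(_.comments).filter(_.createdAt.isAfter(since))
}
```

With this shape, a post published before T still contributes any comment written after T.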
      ): ReceiverInputDStream[FacebookPost] = {
        createPageStreams(ssc, auth, Set(pageId), fields, pollingSchedule, pollingWorkers, storageLevel)
      }
    }
We're going to need to expose a createPageCommentsStream utility that focuses on comments posted since the previously processed time. This stream should crawl facebook4j's getPostComments: http://facebook4j.github.io/javadoc/facebook4j/api/PostMethods.html#getPostComments(java.lang.String, facebook4j.Reading)
Okay. Will do in a follow-up PR, see #3 for tracking.
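For discussion, a rough sketch of the "new comments since the previously processed time" idea. Everything here is an assumption for illustration: `PageComment`, `CommentPoller`, and the injected `fetch` function are hypothetical, and `fetch` merely stands in for a real call such as facebook4j's getPostComments. It is not the implementation planned for the follow-up.

```scala
import java.time.Instant

// Hypothetical, simplified comment type (not facebook4j's Comment).
final case class PageComment(id: String, createdAt: Instant, message: String)

// `fetch` is a hypothetical stand-in for the API call; it receives the
// current watermark and may return comments older than it.
final class CommentPoller(fetch: Instant => Seq[PageComment], start: Instant) {
  private var lastProcessed: Instant = start

  // Returns only comments newer than the watermark, oldest first,
  // then advances the watermark to the newest comment seen.
  def poll(): Seq[PageComment] = {
    val fresh = fetch(lastProcessed)
      .filter(_.createdAt.isAfter(lastProcessed))
      .sortWith((a, b) => a.createdAt.isBefore(b.createdAt))
    fresh.lastOption.foreach(c => lastProcessed = c.createdAt)
    fresh
  }
}
```

Keeping the watermark inside the poller means repeated polls do not emit the same comment twice, as long as comment timestamps are distinct and non-decreasing.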
    auth: FacebookAuth,
    pageId: String,
    pageIds: Set[String],
    fields: Set[String] = Set("message", "place", "caption", "from", "name", "comments"),
Let's remove "comments" from the default fields, as it should be populated from createPageCommentsStream.
This is now implemented in #4.
Resolves #1