Add option to track multiple pages in one stream #2

Merged
merged 2 commits into from Jul 20, 2017

2 participants
@c-w
Member

commented Jul 19, 2017

Resolves #1

c-w added some commits Jul 19, 2017

@erikschlegel

left a comment

The pageIds PR looks good. Perhaps we can create a separate PR to address the enhancement to bifurcate FB posts and comments.


override def fetchFacebookResponse(after: Date): ResponseList[Post] = {
protected def fetchFacebookResponse(after: Date): ResponseList[Post] = {
logDebug(s"Fetching posts for $pageId since $after")
facebook.getPosts(pageId, createReading(after))

@erikschlegel

erikschlegel Jul 19, 2017

What happens to comments that are posted after the `after` date? Keep in mind that FB comments are the most useful piece of data for trend detection in fortis. A post can keep receiving comments for a very long time.

@c-w

c-w Jul 20, 2017

Author Member

Posts and comments are linked in the Facebook API. As such, you can fetch a post for date T and then look up its linked comments, which may have been posted after T.
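To illustrate the lookup described above, here is a minimal sketch. It does not call the Facebook API: `SketchPost`, `SketchComment`, and `FakeGraph` are hypothetical in-memory stand-ins introduced only for this example, and the real connector may structure this differently.

```scala
import java.util.Date

// Hypothetical, simplified stand-ins for the Facebook post and comment types.
final case class SketchPost(id: String, createdTime: Date, message: String)
final case class SketchComment(id: String, postId: String, createdTime: Date, message: String)

// Hypothetical in-memory substitute for the Graph API lookups.
final class FakeGraph(posts: Seq[SketchPost], comments: Seq[SketchComment]) {
  // Posts created strictly after the given date.
  def postsSince(after: Date): Seq[SketchPost] =
    posts.filter(_.createdTime.after(after))

  // All comments linked to a post, regardless of when they were written.
  def commentsFor(postId: String): Seq[SketchComment] =
    comments.filter(_.postId == postId)
}

object LinkedComments {
  // Fetch the posts since `after`, then follow the post-to-comment link.
  // The comment lookup is keyed by post id, not by date, so comments
  // written long after the post are still returned.
  def postsWithComments(graph: FakeGraph, after: Date): Map[SketchPost, Seq[SketchComment]] =
    graph.postsSince(after).map(p => p -> graph.commentsFor(p.id)).toMap
}
```

Note that this only reaches comments on posts that fall inside the polled window; comments arriving on older posts would need their own polling.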

storageLevel: StorageLevel = StorageLevel.MEMORY_ONLY
): ReceiverInputDStream[FacebookPost] = {
createPageStreams(ssc, auth, Set(pageId), fields, pollingSchedule, pollingWorkers, storageLevel)
}
}

@erikschlegel

erikschlegel Jul 19, 2017

We're going to need to expose a createPageCommentsStream utility that focuses on comments posted since the previously processed time. This stream should crawl facebook4j's getPostComments: http://facebook4j.github.io/javadoc/facebook4j/api/PostMethods.html#getPostComments(java.lang.String, facebook4j.Reading)
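A rough sketch of the "only comments since the previously processed time" idea follows. It is not the connector's implementation: `TrackedComment` and `NewCommentsTracker` are hypothetical names, and the `fetchComments` function is an assumed placeholder for whatever call actually retrieves a post's comments (for example a wrapper around getPostComments).

```scala
import java.util.Date

// Hypothetical, simplified comment record used only for this sketch.
final case class TrackedComment(id: String, postId: String, createdTime: Date)

// Remembers the creation time of the newest comment seen so far and
// returns only comments that are newer than that on each poll.
// `fetchComments` is an assumed placeholder for the real API lookup.
final class NewCommentsTracker(fetchComments: String => Seq[TrackedComment], start: Date) {
  private var lastProcessed: Date = start

  def poll(postIds: Set[String]): Seq[TrackedComment] = {
    val fresh = postIds.toSeq
      .flatMap(fetchComments)
      .filter(_.createdTime.after(lastProcessed))
      .sortBy(_.createdTime.getTime)
    // Advance the watermark to the newest comment returned in this poll.
    fresh.lastOption.foreach(c => lastProcessed = c.createdTime)
    fresh
  }
}
```

The tracker keeps a single watermark across all posts, which is simple but assumes comment timestamps are comparable across pages; a per-post watermark would be a reasonable alternative.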

@c-w

c-w Jul 20, 2017

Author Member

Okay. Will do in a follow-up PR, see #3 for tracking.

ssc: StreamingContext,
auth: FacebookAuth,
pageId: String,
pageIds: Set[String],
fields: Set[String] = Set("message", "place", "caption", "from", "name", "comments"),

@erikschlegel

erikschlegel Jul 19, 2017

Let's remove "comments" from the default fields, as it should be populated from createPageCommentsStream.

@c-w

c-w Jul 20, 2017

Author Member

This is now implemented in #4.

@c-w c-w referenced this pull request Jul 20, 2017

Closed

Move comments to separate stream #3

@c-w c-w merged commit 868f99c into master Jul 20, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@c-w c-w deleted the track-multiple-pages branch Jul 20, 2017
