A library for reading social data from Facebook using Spark Streaming.


Usage example

Run a demo via:

# set up all the requisite environment variables
# you can create a new app id and secret here: https://developers.facebook.com/quickstarts/
# you can generate a new auth token here: https://developers.facebook.com/tools/accesstoken/
export FACEBOOK_APP_ID="..."
export FACEBOOK_APP_SECRET="..."
export FACEBOOK_AUTH_TOKEN="..."

# compile scala, run tests, build fat jar
sbt assembly

# run locally
java -cp target/scala-2.11/streaming-facebook-assembly-0.0.3.jar FacebookDemo standalone

# run on spark
spark-submit --class FacebookDemo --master local[2] target/scala-2.11/streaming-facebook-assembly-0.0.3.jar spark

How does it work?

Facebook doesn't expose a firehose API so we resort to polling. The FacebookReceiver pings the Facebook API every few seconds and pushes any new posts into Spark Streaming for further processing.
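
The polling idea described above can be sketched in plain Scala. This is a minimal illustration rather than the library's actual implementation: `Post`, `NewPostPoller`, and the `fetch` function are hypothetical names, and the real FacebookReceiver may track new posts differently. The sketch remembers the timestamp of the newest post seen so far and returns only posts newer than that on each poll.

```scala
import java.time.Instant

// Hypothetical post shape, for illustration only; the library's real types may differ.
final case class Post(id: String, createdAt: Instant, message: String)

// Wraps a fetch function (assumed to call the Facebook API) and returns
// only the posts created after the newest post seen on earlier polls.
final class NewPostPoller(fetch: () => Seq[Post]) {
  // Start at the epoch so the first poll returns everything fetched.
  private var lastSeen: Instant = Instant.EPOCH

  def poll(): Seq[Post] = {
    val fresh = fetch()
      .filter(_.createdAt.isAfter(lastSeen))
      .sortBy(_.createdAt.toEpochMilli)
    // Advance the watermark; posts sharing the exact watermark timestamp
    // are skipped on later polls, which a real receiver may need to handle.
    if (fresh.nonEmpty) lastSeen = fresh.last.createdAt
    fresh
  }
}
```

In a Spark Streaming receiver, something like `poll()` would presumably run on a timer in a background thread, with each returned post handed to Spark through the receiver's `store` method.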

Currently, the following ways to read Facebook items are supported:

Release process

  1. Configure your credentials via the SONATYPE_USER and SONATYPE_PASSWORD environment variables.
  2. Update version.sbt
  3. Enter the SBT shell: sbt
  4. Run sonatypeOpen "enter staging description here"
  5. Run publishSigned
  6. Run sonatypeRelease