Complete Pipeline Training at Big Data Scala By the Bay



Pipeline Description

Dating ratings data => Akka app => Kafka => Spark Streaming => Cassandra => Dashboard
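
The flow above can be sketched in plain Scala without any of the real infrastructure. This is only an illustration under stated assumptions: the `Rating` fields, the CSV wire format, and the per-user average are hypothetical and not taken from this repository, and ordinary collections stand in for Kafka, Spark Streaming, and Cassandra.

```scala
// Hypothetical sketch: plain collections stand in for Kafka, Spark Streaming, and Cassandra.
object PipelineSketch {
  // Assumed record shape; the actual feeder message format is not shown in this README.
  final case class Rating(fromUserId: Int, toUserId: Int, rating: Int)

  // Feeder side: serialize a rating as one comma-separated line (assumed wire format).
  def encode(r: Rating): String = s"${r.fromUserId},${r.toUserId},${r.rating}"

  // Streaming side: parse a line back into a Rating, returning None for malformed input.
  def decode(line: String): Option[Rating] =
    line.split(",").map(_.trim) match {
      case Array(from, to, score) =>
        scala.util.Try(Rating(from.toInt, to.toInt, score.toInt)).toOption
      case _ => None
    }

  // Per-batch aggregation: average rating received by each user in the batch.
  def averageByUser(batch: Seq[Rating]): Map[Int, Double] =
    batch.groupBy(_.toUserId).map { case (user, rs) =>
      user -> rs.map(_.rating).sum.toDouble / rs.size
    }

  def main(args: Array[String]): Unit = {
    // Three valid messages plus one malformed line that decode drops.
    val messages =
      Seq(Rating(1, 10, 4), Rating(2, 10, 2), Rating(3, 11, 5)).map(encode) :+ "not,a,rating"
    val batch = messages.flatMap(decode)
    // A real job would write these rows to a Cassandra table instead of printing them.
    averageByUser(batch).toSeq.sortBy(_._1).foreach { case (user, avg) =>
      println(s"user $user -> average rating $avg")
    }
  }
}
```

Keeping parsing and aggregation as pure functions means the same logic could later be applied inside a streaming transformation, which is one reason to separate them from the I/O at each end.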

In addition, Spark MLlib and DataFrames will be demonstrated in a notebook interface, using a combination of real-time data from Cassandra and static Parquet data.

Follow the Wiki to continue exploring.