NashTech-Labs/Lambda-Arch-Spark

Lambda-Arch-Spark

This project analyses Twitter tweets using the Lambda architecture.


What is Lambda architecture?


Lambda architecture is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. For more details, see "Twitter's tweets analysis using Lambda Architecture".
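To make the two processing methods concrete, here is a minimal sketch in plain Scala with no Spark, Kafka, or Cassandra involved. The `Tweet` type, the `LambdaViews` object, and the hashtag-count view are hypothetical names chosen for illustration; they are not taken from this project's code.

```scala
// Hypothetical event type; the project's real tweet model is not shown in this README.
final case class Tweet(id: Long, hashtag: String)

object LambdaViews {
  // Batch method: recompute the whole view from the full master dataset.
  def batchView(master: Seq[Tweet]): Map[String, Long] =
    master.groupBy(_.hashtag).map { case (tag, tweets) => tag -> tweets.size.toLong }

  // Stream method: fold a single newly arrived event into an existing view.
  def updateRealtimeView(view: Map[String, Long], tweet: Tweet): Map[String, Long] =
    view.updated(tweet.hashtag, view.getOrElse(tweet.hashtag, 0L) + 1L)
}
```

In this sketch the batch view is rebuilt from scratch on every run, while the real-time view is updated one tweet at a time, for example with `recentTweets.foldLeft(Map.empty[String, Long])(LambdaViews.updateRealtimeView)`.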


Now Play


  • Clone the project to your local system: $ git clone git@github.com:knoldus/Lambda-Arch-Spark.git
  • Akka requires that you have Java 8 or later installed on your machine.
  • Install SBT if you do not already have it.
  • Install Kafka
  • Install Cassandra
  • Create a Twitter app to access Twitter's real-time tweets.
  • Put the Twitter app's consumerKey, consumerSecret, accessToken, and accessTokenSecret into this project's application.conf file.
  • Before starting the project, start Kafka and Cassandra.
  • Execute sbt clean compile to build the project.
  • Execute sbt run to run the project; it will show you multiple options.
  • Start TwitterStreamApp first to fetch tweets from Twitter, then start CassandraKafkaConsumer, which fetches data from Kafka and writes it to the master dataset. After that, start SparkStreamingKafkaConsumer for the real-time view and BatchProcessor for the batch view. Another app, AkkaHttpServer, acts as the serving layer: it merges the real-time and batch views for a pre-specified query and returns the result to the web client.
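The merge step performed by the serving layer can be sketched roughly as follows. This is an illustration only: `ServingLayer`, `mergeViews`, and `countFor` are hypothetical names, the views are assumed to be per-hashtag counts, and the sketch assumes the batch and real-time views cover disjoint slices of the data so that their counts can simply be added. The actual logic in AkkaHttpServer may differ.

```scala
object ServingLayer {
  // Combine per-key counts from the batch view and the real-time view.
  // Assumption: the two views cover disjoint slices of the data, so counts add up.
  def mergeViews(batch: Map[String, Long], realtime: Map[String, Long]): Map[String, Long] =
    (batch.keySet ++ realtime.keySet).map { key =>
      key -> (batch.getOrElse(key, 0L) + realtime.getOrElse(key, 0L))
    }.toMap

  // Answer one pre-specified query: the merged count for a single hashtag.
  def countFor(hashtag: String, batch: Map[String, Long], realtime: Map[String, Long]): Long =
    mergeViews(batch, realtime).getOrElse(hashtag, 0L)
}
```

A key that appears in only one of the two views keeps its count unchanged, and a hashtag absent from both yields 0.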

References