Marmik edited this page Nov 1, 2016 · 12 revisions

Welcome to the Real-Time-Big-Data-Tutorial wiki!

Here is a short summary of what each lab assignment contains.

Tutorial 1: Spark Line Count

This is an Apache Spark program, written in Scala, that counts how many times each identical line occurs in a text file. It also sorts the counts in ascending order.
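The counting and sorting logic can be sketched without a cluster. The block below is a minimal plain-Java illustration, not the tutorial's Spark/Scala code: the class name `LineCountSketch`, the method `countLines`, and the sample input are all hypothetical. In Spark, the same idea is usually expressed by mapping each line to a `(line, 1)` pair, reducing by key, and sorting by the count.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java collections stand in for a Spark RDD.
class LineCountSketch {

    // Counts how often each distinct line occurs, then returns the
    // (line, count) entries ordered by count, smallest first.
    static List<Map.Entry<String, Integer>> countLines(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // Equivalent in spirit to emitting (line, 1) and summing per key.
            counts.merge(line, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort(Map.Entry.comparingByValue()); // ascending by count
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up input; the real program reads lines from a text file.
        List<String> lines = List.of("spark", "storm", "spark", "kafka", "spark", "storm");
        for (Map.Entry<String, Integer> entry : countLines(lines)) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

Lines with equal counts keep the order in which they first appeared, because the sort is stable and the map preserves insertion order.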

Tutorial 2: TF words using Spark Transformations and Actions

This Apache Spark program, developed in Scala using transformations and actions, finds the term frequency of the words in a given text file, counts the unique words, and displays the top N most frequently used terms.
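As a rough illustration of these three steps (term frequency, unique-word count, top N), here is a plain-Java sketch that uses streams in place of Spark transformations and actions. It is not the tutorial's code: the names `TermFrequencySketch`, `termCounts`, and `topN` are hypothetical, term frequency is taken to mean the raw occurrence count, and splitting on non-word characters after lowercasing is an assumed tokenization.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: Java streams stand in for Spark transformations/actions.
class TermFrequencySketch {

    // Splits every line into lowercase words and counts each word's occurrences.
    static Map<String, Long> termCounts(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty()) // split can yield empty tokens
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Returns the n most frequent terms; ties are broken alphabetically.
    static List<Map.Entry<String, Long>> topN(Map<String, Long> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up input; the real program reads a text file.
        List<String> lines = List.of("to be or not to be", "To be");
        Map<String, Long> counts = termCounts(lines);
        System.out.println("Unique words: " + counts.size());
        for (Map.Entry<String, Long> entry : topN(counts, 2)) {
            System.out.println(entry.getKey() + " : " + entry.getValue());
        }
    }
}
```

In Spark terms, `flatMap` and `filter` correspond to transformations, while collecting the counts and taking the first N entries play the role of actions.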

Tutorial 3: Video Reconstruction

This is a Java program that uses OpenIMAJ to detect frames, key frames, matching areas, and metadata for a video file.

Tutorial 4: Video Processing (SIFT Features and Face Detection)

These are Java programs. One extracts the features of a video by sampling it into images; another detects faces and edges in a video that is either captured by a webcam or stored on disk.

Tutorial 5: Video Feature Extraction & Image Classification

This consists of two programs. The first is a Java program that uses OpenIMAJ to extract features and generate a feature vector for an input video. The second, written in Scala with Apache Spark, performs classification using the Decision Tree and Random Forest machine learning algorithms.

Tutorial 7 & 8: Feature Extraction & Count

This program is developed using OpenIMAJ, Storm, and Kafka. The SIFT features of a video are extracted using OpenIMAJ and sent to a Storm topology through the Kafka publish/subscribe system. The Storm topology implements the logic to count the SIFT features in the video.