Module 2: ICP #4

Sneha Mishra edited this page Jul 18, 2018 · 3 revisions

Team: 12
Professor: Yugyung Lee

Name: Sneha Mishra
Class ID: 11
Email: smccr@mail.umkc.edu
MyGitHub

Technical Partner:
Name: Aditya Soman
Class ID: 19
Email: aditya.soman@mail.umkc.edu
GitHub

Objective

Introduction to Apache Spark, a unified analytics engine for big data processing with built-in modules for streaming, SQL, machine learning, and graph processing.

Features

Steps:

Step 1:

Step 2:

Step 3: Implement the given Use Cases

Use Case 1: Implement Spark Transformations and Spark Actions
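
For readers unfamiliar with the distinction: in Spark, transformations (such as `map` and `filter`) are lazy and only describe a new dataset, while actions (such as `collect`, `count`, and `reduce`) trigger the actual computation. The following is a minimal pure-Python sketch of that idea. It is not the code submitted for this ICP and it does not use Spark; `MiniRDD` and its `log` attribute are hypothetical names invented here for illustration, chosen to mirror the RDD method names.

```python
from functools import reduce as _reduce


class MiniRDD:
    """Hypothetical toy stand-in for a Spark RDD (NOT the Spark API)."""

    def __init__(self, source, log=None):
        # source: zero-argument callable that returns a fresh iterator
        self._source = source
        # log: shared list recording every element that map() really processes
        self.log = log if log is not None else []

    @classmethod
    def parallelize(cls, data):
        items = list(data)
        return cls(lambda: iter(items))

    # Transformations: return a new MiniRDD, do no work yet
    def map(self, fn):
        def gen():
            for x in self._source():
                self.log.append(("map", x))  # runs only when an action iterates
                yield fn(x)
        return MiniRDD(gen, self.log)

    def filter(self, pred):
        def gen():
            for x in self._source():
                if pred(x):
                    yield x
        return MiniRDD(gen, self.log)

    # Actions: walk the whole chain and return a plain Python value
    def collect(self):
        return list(self._source())

    def count(self):
        return sum(1 for _ in self._source())

    def reduce(self, fn):
        return _reduce(fn, self._source())


numbers = MiniRDD.parallelize(range(1, 6))
squares = numbers.map(lambda x: x * x)                # transformation (lazy)
even_squares = squares.filter(lambda x: x % 2 == 0)   # transformation (lazy)

work_before_action = len(numbers.log)  # nothing has been computed so far

result = even_squares.collect()                # action: runs map and filter
total = squares.reduce(lambda a, b: a + b)     # action: runs map again
```

In this sketch each action re-runs the chain of transformations from the start, which loosely mirrors how an uncached Spark RDD is recomputed for every action.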

Transformation Code:

Output:

References:

https://docs.databricks.com/spark/latest/graph-analysis/graphframes/graph-analysis-tutorial.html