Big Data Technologies

I created this repository to share some of what I have learned about Big Data technologies, especially Apache Spark and its libraries such as GraphX.

SPARK

Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.

https://spark.apache.org/

GRAPHX

GraphX is Apache Spark's API for graphs and graph-parallel computation. GraphX unifies ETL, exploratory analysis, and iterative graph computation within a single system. You can view the same data as both graphs and collections, transform and join graphs with RDDs efficiently, and write custom iterative graph algorithms using the Pregel API.

https://spark.apache.org/graphx/
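
The Pregel API mentioned above follows a vertex-centric model: in each superstep, every vertex that received a message updates its own value and sends new messages along its outgoing edges, and the computation stops once no messages remain. The sketch below illustrates that loop in plain Python for single-source shortest paths. It is not GraphX code and does not use Spark; the function name `pregel_shortest_paths` and the sample edge list are assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def pregel_shortest_paths(edges, source):
    """Pregel-style single-source shortest paths (illustrative, not GraphX).

    edges: iterable of (src, dst, weight) tuples with non-negative weights.
    Returns a dict mapping every vertex to its distance from `source`.
    """
    out_edges = defaultdict(list)
    vertices = set()
    for src, dst, weight in edges:
        out_edges[src].append((dst, weight))
        vertices.update((src, dst))
    vertices.add(source)

    # Vertex state: best known distance so far.
    dist = {v: math.inf for v in vertices}

    # Initial message: the source learns it is at distance 0.
    inbox = {source: 0.0}

    while inbox:  # one loop iteration == one superstep
        outbox = {}
        for vertex, msg in inbox.items():
            # Vertex program: keep the smaller distance.
            if msg < dist[vertex]:
                dist[vertex] = msg
                # Send messages: offer neighbours a path through this vertex.
                for dst, weight in out_edges[vertex]:
                    candidate = msg + weight
                    # Merge messages: keep only the minimum per destination.
                    if candidate < outbox.get(dst, math.inf):
                        outbox[dst] = candidate
        inbox = outbox

    return dist


# Hypothetical sample graph, not taken from the repository.
sample_edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "d", 1.0)]
distances = pregel_shortest_paths(sample_edges, "a")
```

In GraphX itself the three roles marked in the comments (vertex program, send messages, merge messages) are supplied as functions to the Pregel operator; see the GraphX documentation for the actual signatures.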

SPARK SQL

Spark SQL is Apache Spark's module for working with structured data. Spark SQL lets you query structured data inside Spark programs, using either SQL or a familiar DataFrame API. It is usable from Java, Scala, Python and R.

https://spark.apache.org/sql/

SPARK STREAMING

Spark Streaming makes it easy to build scalable fault-tolerant streaming applications. Spark Streaming brings Apache Spark's language-integrated API to stream processing, letting you write streaming jobs the same way you write batch jobs. It supports Java, Scala and Python.

https://spark.apache.org/streaming/
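
One way to picture "write streaming jobs the same way you write batch jobs" is that the same per-record logic can run once over a complete dataset, or repeatedly over small batches of incoming records, with the results merged into running state. The plain-Python sketch below illustrates that idea with a word count. It does not use the Spark Streaming API; `count_words`, `run_micro_batches`, and the sample lines are hypothetical names and data chosen for illustration.

```python
from collections import Counter


def count_words(lines):
    """Batch logic: count words in a finite collection of lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_micro_batches(batches):
    """Streaming driver: apply the same batch logic to each micro-batch.

    Yields the running totals after every micro-batch, which mimics a
    stateful streaming word count.
    """
    totals = Counter()
    for batch in batches:
        totals.update(count_words(batch))  # reuse the batch function as-is
        yield Counter(totals)  # copy, so callers get a stable snapshot


# Hypothetical input, not taken from the repository.
lines = ["spark streaming", "spark sql", "graphx and spark"]

# Batch mode: all lines at once.
batch_result = count_words(lines)

# Streaming mode: the same lines arriving as two micro-batches.
snapshots = list(run_micro_batches([lines[:2], lines[2:]]))
```

After the last micro-batch the running totals equal the batch result, which is the property that lets one piece of logic serve both modes.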

MLLIB

MLlib is Apache Spark's scalable machine learning library. MLlib fits into Spark's APIs and interoperates with NumPy in Python (as of Spark 0.9) and R libraries (as of Spark 1.5). You can use any Hadoop data source (e.g. HDFS, HBase, or local files), making it easy to plug into Hadoop workflows.

https://spark.apache.org/mllib/

Please go through the slide decks (PPTs) and let me know if any additional information would help.

Files in this repository:

GraphX.pptx
MLLib.pptx
README.md
Spark Core Commands.txt
Spark SQL Commands.txt
Spark SQL.pptx
Spark Streaming.pptx
Spark.pptx
graphx_commands.txt
people.json
people.txt

SudhansuTaparia/BigData

Folders and files

Latest commit

History

Repository files navigation

Big Data Technologies

SPARK

GRAPHX

SPARK SQL

SPARK STREAMING

MLLIB

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages