lakshay2k/Spark_Playground

Here I play with the services offered by Apache Spark and try to learn them in more depth.

Spark

I will be sharing my journey of learning Spark using Scala and Python. Stay tuned throughout.

Wondering what Spark is?

  • Apache Spark is a unified analytics engine for large-scale data processing.
  • Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
  • Spark offers over 80 high-level operators that make it easy to build parallel apps, and you can use it interactively from the Scala, Python, R, and SQL shells.
  • Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application (see the sketch after this list).
  • You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes, and access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
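
To see how these pieces fit together, here is a minimal Scala sketch of the kind of experiment this repository is about: it builds a small DataFrame and queries it both through Spark SQL and through the DataFrame API in the same application. The sample data, column names, and the local[*] master are assumptions for illustration only.

    import org.apache.spark.sql.SparkSession

    object PlaygroundApp {
      def main(args: Array[String]): Unit = {
        // Local session for experimenting; the master can instead point at a
        // standalone, YARN, Mesos, or Kubernetes cluster.
        val spark = SparkSession.builder()
          .appName("Spark_Playground")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // A tiny, made-up dataset turned into a DataFrame.
        val people = Seq(("Alice", 34), ("Bob", 45), ("Carol", 29)).toDF("name", "age")

        // The same data is reachable from the SQL side of the engine...
        people.createOrReplaceTempView("people")
        val adultsSql = spark.sql("SELECT name, age FROM people WHERE age >= 30")

        // ...and from the DataFrame API; both paths go through the same optimizer.
        val adultsDf = people.filter($"age" >= 30)

        adultsSql.show()
        adultsDf.show()

        spark.stop()
      }
    }

In the interactive spark-shell a SparkSession named spark is already created for you, so only the body of main is needed there; for a packaged application, spark-submit is the usual entry point.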
