lakshay2k/Spark_Playground

Here I play with the services offered by Apache Spark and try to learn them in more depth.

Spark

I will be sharing my journey of learning Spark using Scala and Python. Stay tuned throughout.

Wondering what Spark is?

  • Apache Spark is a unified analytics engine for large-scale data processing.
  • Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
  • Spark offers over 80 high-level operators that make it easy to build parallel apps, and you can use it interactively from the Scala, Python, R, and SQL shells.
  • Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application (see the sketch after this list).
  • You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes, and access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.
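
To see how these pieces fit together, here is a minimal Scala sketch of the kind of experiment this repository is about: it builds a small DataFrame and queries it both through Spark SQL and through the DataFrame API in the same application. The sample data, column names, and the local[*] master are assumptions for illustration only.

    import org.apache.spark.sql.SparkSession

    object PlaygroundApp {
      def main(args: Array[String]): Unit = {
        // Local session for experimenting; the master can instead point at a
        // standalone, YARN, Mesos, or Kubernetes cluster.
        val spark = SparkSession.builder()
          .appName("Spark_Playground")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        // A tiny, made-up dataset turned into a DataFrame.
        val people = Seq(("Alice", 34), ("Bob", 45), ("Carol", 29)).toDF("name", "age")

        // The same data is reachable from the SQL side of the engine...
        people.createOrReplaceTempView("people")
        val adultsSql = spark.sql("SELECT name, age FROM people WHERE age >= 30")

        // ...and from the DataFrame API; both paths go through the same optimizer.
        val adultsDf = people.filter($"age" >= 30)

        adultsSql.show()
        adultsDf.show()

        spark.stop()
      }
    }

In the interactive spark-shell a SparkSession named spark is already created for you, so only the body of main is needed there; for a packaged application, spark-submit is the usual entry point.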
