ICP_02 : Spark Programming
acikgozmehmet edited this page Sep 4, 2020 · 3 revisions
We will focus on installation and getting familiar with Big Data Analytics and Applications programming concepts.
Spark
- Spark is an open-source cluster computing framework, similar to Hadoop, developed at the University of California, Berkeley.
- Machine Learning
- Spark Streaming
- Faster Batch
- Spark enables in-memory distributed datasets that optimize iterative workloads in addition to interactive queries.
- Spark is complementary to Hadoop and can run alongside it on top of the Hadoop Distributed File System (HDFS).
- Spark supports building large-scale, low-latency data analytics applications.
2. Create a well-commented Spark program that produces the correct results and writes them to an output file.
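
As a rough illustration of what such a program might look like, here is a minimal PySpark sketch. The page does not say which task the program should perform, so the word count, the function names `tokenize` and `run_word_count`, the application name, and the input and output paths are all assumptions made for this example, not the actual assignment solution.

```python
import re
import sys


def tokenize(line):
    """Split a line into lowercase words, dropping punctuation and empty tokens."""
    return re.findall(r"[a-z0-9']+", line.lower())


def run_word_count(input_path, output_dir):
    """Count the words in input_path and write "word<TAB>count" lines to output_dir.

    Hypothetical helper for illustration; the word-count task is an assumption.
    """
    # Imported here so tokenize() can be tried without a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("ICP2WordCount").getOrCreate()
    try:
        # Read the input as an RDD with one record per line of text.
        lines = spark.sparkContext.textFile(input_path)
        counts = (
            lines.flatMap(tokenize)                 # one record per word
            .map(lambda word: (word, 1))            # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)        # sum the counts for each word
            .sortBy(lambda pair: pair[1], ascending=False)  # most frequent first
        )
        # saveAsTextFile writes a directory of part files and fails
        # if the output directory already exists.
        counts.map(lambda pair: f"{pair[0]}\t{pair[1]}").saveAsTextFile(output_dir)
    finally:
        spark.stop()


# Runs only when both paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    run_word_count(sys.argv[1], sys.argv[2])
```

Assuming the script is saved as `word_count.py` (a hypothetical file name), it could be started with `spark-submit word_count.py input.txt output`, after which the results appear as part files inside the `output` directory.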
Please click on the link to see the recording