This repository is a code-along for the O'Reilly book Data Algorithms with Spark, which is primarily intended for people who want to analyze large amounts of data and develop distributed algorithms using the Spark engine and PySpark.

OTeeEnabor/data_algorithms_with_pyspark

Data Algorithms with Spark: Recipes and Design Patterns for Scaling Up Using PySpark

Appreciation

  1. Thank you to Marin Aglić for the straightforward and thorough explanation of how to set up a Spark cluster on Docker.
