Large-Data-Manipulation

Large dataset manipulation using Scala and PySpark! All code has been tested on major cloud platforms: Google Cloud Platform (GCP), Amazon Web Services (AWS), and Databricks.