Parsing the Common Crawl database using Scala and Spark
Updated Mar 17, 2021 - Scala
Half-baked implementation of a cluster manager for EMR.
A boilerplate for Spark projects, with Docker support for local development and scripts for EMR support.