Demonstration of Google Cloud Dataproc Workflow Templates

Google Cloud Dataproc WorkflowTemplates API Demo

Code repository for the post "Using the Google Cloud Dataproc WorkflowTemplates API to Automate Spark and Hadoop Workloads on GCP".

Files

  • template-demo-2.yaml: Non-parametrized version of workflow template with three jobs, using a managed 3-node Spark cluster
  • template-demo-3.yaml: Parametrized version of workflow template with one Python-based PySpark job, using a managed 3-node Spark cluster
  • template-demo-4.yaml: Parametrized version of workflow template with one Python-based PySpark job, using an existing 3-node Spark cluster
  • template-demo-5.yaml: Parametrized version of workflow template with one Java-based Spark job, using an existing 3-node Spark cluster
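
For orientation, here is a minimal sketch of what a parameterized template with one PySpark job on a managed 3-node cluster might look like. It is not a copy of any file in this repository. The bucket paths, step ID, cluster name, zone, machine types, and parameter name are all hypothetical placeholders, and the sketch assumes that "3-node" means one master plus two workers.

```yaml
# Hypothetical sketch only; every value below is a placeholder.
jobs:
- stepId: pyspark-job                 # assumed step ID
  pysparkJob:
    mainPythonFileUri: gs://example-bucket/scripts/job.py   # placeholder path
    args:
    - gs://example-bucket/input       # overridden by the parameter below
placement:
  managedCluster:                     # cluster is created for the workflow, then deleted
    clusterName: demo-cluster         # placeholder name
    config:
      gceClusterConfig:
        zoneUri: us-east1-b           # placeholder zone
      masterConfig:
        numInstances: 1
        machineTypeUri: n1-standard-4
      workerConfig:
        numInstances: 2               # 1 master + 2 workers = 3 nodes (assumption)
        machineTypeUri: n1-standard-4
parameters:
- name: INPUT_PATH                    # hypothetical parameter name
  description: Input location passed as the first job argument
  fields:
  - jobs['pyspark-job'].pysparkJob.args[0]
```

A template that targets an existing cluster would presumably replace the `managedCluster` block with a `clusterSelector` that matches the cluster by label. A file like this can typically be loaded with `gcloud dataproc workflow-templates import` and run with `gcloud dataproc workflow-templates instantiate`, supplying parameter values at instantiation time.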