
kmting/s8s-spark-ce-workshop

 
 


Serverless Spark CE Workshop - May 2022

This repo contains hands-on labs that cover serverless Spark on GCP, powered by Cloud Dataproc, as part of the Serverless Spark Workshop.

Audience

The intended audience is Google Customer Engineers, but anyone with access to GCP can try the lab modules as well.

Prerequisites

Run the setup in Argolis per instructions in [go/scw-tf]

Goal

(a) Just enough knowledge of serverless Spark on GCP, powered by Cloud Dataproc, to field customer conversations and questions,
(b) a completed setup in Argolis for serverless Spark,
(c) demos, and knowledge of how to run them, and
(d) awareness of the resources available for serverless Spark on GCP.

What is covered?

| # | Modules | Focus | Feature |
| --- | --- | --- | --- |
| 1 | Environment provisioning (go/scw-tf) | Environment Automation With Terraform | N/A |
| 2 | Lab 1 - Cell Tower Anomaly Detection | Data Engineering | Serverless Spark Batch from CLI & with Cloud Composer orchestration |
| 3 | Lab 2 - Wikipedia Page View Analysis | Data Analysis | Serverless Spark Batch from BigQuery UI |
| 4 | Lab 3 - Chicago Crimes Analysis | Data Analysis | Serverless Spark Interactive from Vertex AI managed notebook |
| N | Resources for Serverless Spark | | |
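
Lab 1 submits a serverless Spark batch from the CLI. As a rough illustration of what such a submission can look like, the sketch below assembles a `gcloud dataproc batches submit pyspark` command in Python. The helper name, project ID, region, bucket path and batch ID are all hypothetical placeholders rather than values from the labs, so treat the lab instructions as the authoritative reference.

```python
import shlex


def build_batch_submit_command(script_uri, project, region, batch_id):
    """Assemble a gcloud command that submits a PySpark batch to serverless Spark.

    This only builds the argument list; it does not run gcloud.
    """
    return [
        "gcloud", "dataproc", "batches", "submit", "pyspark",
        script_uri,                 # main PySpark file, typically a gs:// URI
        f"--project={project}",
        f"--region={region}",
        f"--batch={batch_id}",      # ID to assign to the batch
    ]


# All values below are hypothetical placeholders, not taken from the labs.
command = build_batch_submit_command(
    script_uri="gs://my-bucket/scripts/my_job.py",
    project="my-project-id",
    region="us-central1",
    batch_id="my-batch-001",
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each argument separate, which makes it straightforward to pass to `subprocess.run` once the placeholders are replaced with real values.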

Don't forget to

Shut down/delete resources when done

Labs developed by

Some of the labs were developed by Tek Systems for Google; others are contributions by Googlers.

Contributions welcome

Community contributions that improve the existing labs or add new ones are very much appreciated.
