Hands-On with Confluent Cloud: Apache Kafka®, Apache Flink®, and Tableflow

Confluent Cloud is a fully managed platform for Apache Kafka, designed to simplify real-time data streaming and processing. It combines Kafka for data ingestion, Flink for stream processing, and Tableflow for converting streaming data into analytics-ready Apache Iceberg tables. DuckDB, a lightweight analytical database, can query these Iceberg tables directly, which makes it a good fit for the workshop’s analytics component. The workshop is designed for developers with basic programming knowledge who may be new to Kafka, Flink, or Tableflow, and aims to provide hands-on experience within a condensed time frame.
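
As a taste of that last step, here is a minimal sketch of querying a Tableflow-materialized Iceberg table from the DuckDB CLI using its iceberg extension. The S3 path and table name are hypothetical placeholders; the Tableflow segment of the workshop shows how to find your table’s real storage location.

```bash
# Minimal sketch: query a Tableflow-materialized Iceberg table with DuckDB.
# The S3 path below is a hypothetical placeholder, and S3 credentials are
# assumed to be configured already (e.g. via a DuckDB CREATE SECRET).
duckdb -c "
  INSTALL iceberg;
  LOAD iceberg;
  SELECT COUNT(*) AS row_count
  FROM iceberg_scan('s3://my-tableflow-bucket/orders/metadata/v1.metadata.json');
"
```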

Important

🚨 CRITICAL - COST PREVENTION: After completing this workshop, immediately follow the teardown guide to prevent unexpected charges from Flink compute pools and Tableflow catalog integrations.

📋 Teardown Guide: guides/05-teardown-resources.adoc

Execute teardown steps within 15 minutes of workshop completion to avoid ongoing billing for cloud resources.
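
For orientation, teardown amounts to deleting the billable resources you created. The commands below are an illustrative Confluent CLI sketch (the resource IDs are placeholders you get from the matching `list` commands); the linked guide remains the authoritative checklist.

```bash
# Illustrative teardown sketch; follow guides/05-teardown-resources.adoc
# for the authoritative steps. Replace <pool-id> and <cluster-id> with
# the IDs reported by the corresponding `list` commands.
confluent flink compute-pool list
confluent flink compute-pool delete <pool-id>

confluent kafka cluster list
confluent kafka cluster delete <cluster-id>
```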

Workshop Overview

This 2-hour hands-on workshop introduces developers to building real-time data pipelines using Confluent Cloud. You’ll learn to stream data with Apache Kafka, process it in real time with Apache Flink, and convert it into Apache Iceberg tables using Tableflow. The workshop assumes basic familiarity with programming and provides step-by-step guidance.

What You’ll Learn

  • Set up a Kafka cluster and manage topics in Confluent Cloud.

  • Write and run a Flink job to process streaming data.

  • Use Tableflow to materialize Kafka topics as Iceberg tables and query them with DuckDB (a condensed sketch of the whole flow follows this list).
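
The sketch below strings those three steps together with the Confluent CLI. The topic name, sample record, and compute pool ID are placeholders, and the per-segment guides give the exact commands for your environment.

```bash
# Condensed sketch of the workshop flow; names and IDs are placeholders.

# 1. Kafka: create a topic, produce a record, and read it back.
confluent kafka topic create orders --partitions 3
echo '{"order_id": 1, "amount": 9.99}' | confluent kafka topic produce orders
confluent kafka topic consume orders --from-beginning

# 2. Flink: open an interactive SQL shell against a compute pool and run
#    a streaming query, e.g. SELECT order_id, amount FROM orders WHERE amount > 5;
confluent flink shell --compute-pool <pool-id>

# 3. Tableflow: after enabling Tableflow on the topic, query the resulting
#    Iceberg table from DuckDB (see the example near the top of this README).
```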

Prerequisites

Note

🖥️ Platform Compatibility: All workshop scripts are designed for Linux and macOS (with zsh shell).

💻 Windows Users: We recommend using GitHub Codespaces or VS Code Dev Containers for the best experience. The devcontainer configuration provides a consistent Linux environment with all tools pre-installed.

🚀 Quick Start Options:

  • GitHub Codespaces (Recommended for Windows): Click "Code" → "Codespaces" → "Create codespace"

  • Dev Containers: Open in VS Code → Ctrl+Shift+P → "Dev Containers: Reopen in Container"

  • Local Setup: Linux/macOS users can run scripts directly

Workshop Segments and Features Covered

| Segment | Duration | Features Covered | Objective |
|---------|----------|------------------|-----------|
| Introduction | 15 min | Kafka, Flink, Tableflow Overview | Understand event-driven architecture |
| Setting Up Confluent Cloud | 15 min | Kafka Cluster Creation | Set up a managed Kafka cluster |
| Kafka Hands-On | 30 min | Kafka Topics, Producers, Consumers | Stream data with Kafka |
| Flink Hands-On | 45 min | Flink Stream Processing | Process data in real time |
| Tableflow Hands-On | 30 min | Tableflow, Iceberg, DuckDB | Materialize and query analytics-ready data |
| Wrap-Up and Q&A | 15 min | All features | Summarize and address questions |
