Kafka + Spark examples

This repository contains four examples of how to produce and consume data from Kafka topics, covering string, JSON, Avro, and Protobuf encodings.


Requirements

  • sbt
  • Docker and Docker Compose

Prerequisites

  1. Compile the Protobuf entities:
$ sbt compile
  2. Run Kafka locally:
$ docker-compose -f docker/docker-compose-kafka.dev.yml up

Executing different encoding formats

1. String

This example simulates a topic whose content is plain text; the consumer simply prints each message to the screen.

To run it, start the producer and then, in another terminal, the consumer:

  • Run the producer
$ make string-producer
  • Run the consumer
$ make string-consumer
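
For reference, here is a minimal sketch of what the consumer side could look like with Spark Structured Streaming. The topic name (string-topic) and bootstrap server are illustrative assumptions, not the repository's actual configuration, which lives behind the Makefile targets.

```scala
import org.apache.spark.sql.SparkSession

object StringConsumerSketch extends App {
  val spark = SparkSession.builder()
    .appName("string-consumer")
    .master("local[*]")
    .getOrCreate()

  // Kafka delivers keys and values as binary; cast the value to a string.
  val messages = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // assumed address
    .option("subscribe", "string-topic")                 // assumed topic name
    .load()
    .selectExpr("CAST(value AS STRING) AS message")

  // Print each message to the console, as the example describes.
  messages.writeStream
    .format("console")
    .start()
    .awaitTermination()
}
```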

2. JSON

This example simulates a topic whose content is JSON; the consumer deserializes the data into a Spark DataFrame.

The schema represents an e-commerce product:

  • id: string
  • name: string
  • price: double

To run it, start the producer and then, in another terminal, the consumer:

  • Run the producer
$ make json-producer
  • Run the consumer
$ make json-consumer
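
The key step on the consumer side is parsing the JSON payload against the product schema. A minimal sketch, again with assumed topic and server names:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{DoubleType, StringType, StructType}

object JsonConsumerSketch extends App {
  val spark = SparkSession.builder()
    .appName("json-consumer")
    .master("local[*]")
    .getOrCreate()

  // Schema matching the product fields listed above.
  val productSchema = new StructType()
    .add("id", StringType)
    .add("name", StringType)
    .add("price", DoubleType)

  val products = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // assumed address
    .option("subscribe", "json-topic")                   // assumed topic name
    .load()
    .selectExpr("CAST(value AS STRING) AS json")
    // Parse the JSON string into a struct, then flatten it into columns.
    .select(from_json(col("json"), productSchema).as("product"))
    .select("product.*")

  products.writeStream.format("console").start().awaitTermination()
}
```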

3. Avro

This example simulates a topic whose content is Avro-encoded; the consumer deserializes the data into a Spark DataFrame.

The schema represents an e-commerce product:

  • id: string
  • name: string
  • price: double

The Avro file that defines the schema is located at ./src/main/resources/product.avsc.

To run it, start the producer and then, in another terminal, the consumer:

  • Run the producer
$ make avro-producer
  • Run the consumer
$ make avro-consumer
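
On the consumer side, the Avro payload can be decoded with spark-avro's from_avro function using the schema file above. A minimal sketch under the same assumed topic and server names; it also assumes the messages are plain Avro, without a Confluent Schema Registry wire-format header:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.col
import scala.io.Source

object AvroConsumerSketch extends App {
  val spark = SparkSession.builder()
    .appName("avro-consumer")
    .master("local[*]")
    .getOrCreate()

  // Load the Avro schema that ships with the repository.
  val schemaJson = Source.fromFile("src/main/resources/product.avsc").mkString

  val products = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // assumed address
    .option("subscribe", "avro-topic")                   // assumed topic name
    .load()
    // Decode the raw Avro bytes against the product schema.
    .select(from_avro(col("value"), schemaJson).as("product"))
    .select("product.*")

  products.writeStream.format("console").start().awaitTermination()
}
```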

4. Protobuf

This example simulates a topic whose content is Protobuf-encoded; the consumer deserializes the data into a Spark DataFrame. The message classes are generated with the ScalaPB library; for more details, check the ScalaPB documentation for SparkSQL.

The schema represents an e-commerce product:

  • id: string
  • name: string
  • price: double

The Protobuf definition can be found at ./src/main/protobuf/product.proto.

To run it, start the producer and then, in another terminal, the consumer:

  • Run the producer
$ make proto-producer
  • Run the consumer
$ make proto-consumer
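
With ScalaPB, product.proto is compiled into a Product case class during sbt compile, and the sparksql-scalapb integration supplies Dataset encoders for it. A rough sketch; the generated package name (com.example.product) and the topic and server names are assumptions, not the repository's actual values:

```scala
import org.apache.spark.sql.{Encoders, SparkSession}
import org.apache.spark.sql.functions.col
import scalapb.spark.Implicits._ // Dataset encoders for ScalaPB messages

object ProtoConsumerSketch extends App {
  val spark = SparkSession.builder()
    .appName("proto-consumer")
    .master("local[*]")
    .getOrCreate()

  val products = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // assumed address
    .option("subscribe", "proto-topic")                  // assumed topic name
    .load()
    .select(col("value"))
    .as[Array[Byte]](Encoders.BINARY)
    // Parse each binary payload with the ScalaPB-generated parser.
    .map(bytes => com.example.product.Product.parseFrom(bytes))

  products.writeStream.format("console").start().awaitTermination()
}
```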
