arezamoosavi/spark-streaming-k8s

spark-streaming-k8s

Starting minikube

minikube stop
minikube delete
minikube start --insecure-registry="192.168.0.1:5000" --memory 8192 --cpus 4

Getting the minikube IP and master address

minikube ip
kubectl cluster-info
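
Spark addresses a Kubernetes cluster with a master URL of the form `k8s://<api-server-url>`. The helper below is a minimal sketch of that conversion; `spark_master_url` is a hypothetical name that is not part of this repo, and the address in the comment is only an example.

```shell
# Hypothetical helper (not part of this repo): build the Spark master URL
# from a Kubernetes API server address such as https://192.168.49.2:8443.
spark_master_url() {
  api_server="$1"
  # Drop a trailing slash so the result is stable.
  api_server="${api_server%/}"
  printf 'k8s://%s\n' "$api_server"
}
```

One way to feed it the real address, assuming the current kubectl context points at minikube: `spark_master_url "$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"`.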

Running minikube dashboard

minikube dashboard --url &
kubectl proxy --address='0.0.0.0' --disable-filter=true --port=5885 &

App Development

Build Kafka

make kafka

Build Postgres

make pg

Building the Spark Docker image

Download jar files

make jars_dl

Build the image

make build-app

Push Location Dataset to Kafka

make push-kafka

Check the Kafka Topic

make kafkacat

App K8s Deployment

cd k8s/

Spark Image in Registry

Start the registry

make start-registery

Tag and push the image to the registry

docker tag spark-streaming-k8s_spark 192.168.0.1:5000/stream-spark:v4
docker push 192.168.0.1:5000/stream-spark:v4

Check the registry

curl -X GET http://192.168.0.1:5000/v2/stream-spark/tags/list
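
The registry check can also be scripted. The sketch below assumes the registry answers the tags request with JSON of the shape `{"name":"stream-spark","tags":["v4"]}`, as the Docker Registry HTTP API v2 does. Both function names are hypothetical, and `has_tag` is a plain text match rather than a real JSON parser.

```shell
# Hypothetical helpers (not part of this repo).

# Build the registry v2 URL that lists the tags of an image.
registry_tags_url() {
  registry="$1"   # e.g. 192.168.0.1:5000
  image="$2"      # e.g. stream-spark
  printf 'http://%s/v2/%s/tags/list\n' "$registry" "$image"
}

# Read a tags/list JSON response on stdin and succeed if the given tag
# appears inside the "tags" array. The tag is used as a regular expression,
# so characters such as "." match loosely.
has_tag() {
  grep -Eq "\"tags\":[[:space:]]*\[[^]]*\"$1\""
}
```

Example: `curl -s "$(registry_tags_url 192.168.0.1:5000 stream-spark)" | has_tag v4 && echo "v4 is in the registry"`.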

Run Spark Streaming

Download Spark

wget --no-verbose https://archive.apache.org/dist/spark/spark-3.1.1/spark-3.1.1-bin-hadoop3.2.tgz && \
tar -xzvf spark-3.1.1-bin-hadoop3.2.tgz && \
rm -rf spark-3.1.1-bin-hadoop3.2.tgz && \
cd spark-3.1.1-bin-hadoop3.2/

Create Spark Service Account

kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
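
In cluster mode the Spark driver creates executor pods under this service account, so it can help to confirm the binding before submitting. The sketch below wraps `kubectl auth can-i` with impersonation; `sa_can_i` is a hypothetical name, not part of this repo.

```shell
# Hypothetical helper (not part of this repo): ask the API server whether
# a service account may perform an action, by impersonating that account.
sa_can_i() {
  namespace="$1"   # e.g. default
  account="$2"     # e.g. spark
  verb="$3"        # e.g. create
  resource="$4"    # e.g. pods
  kubectl auth can-i "$verb" "$resource" \
    --namespace="$namespace" \
    --as="system:serviceaccount:${namespace}:${account}"
}
```

Example: `sa_can_i default spark create pods` should print `yes` once the role binding above exists.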

Spark Submit Streaming

make run-k8s-streaming
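
The `make` target hides the actual `spark-submit` call. As a rough sketch only, a cluster-mode submission to Kubernetes usually looks like the command printed below. The real Makefile may use different options. The pod name and service account are taken from the other steps in this README, while the function name `print_spark_submit` and the application path are assumptions.

```shell
# Hypothetical sketch: the real command lives in the Makefile target
# run-k8s-streaming and may differ. This function only prints a
# spark-submit invocation so it can be reviewed before running it.
print_spark_submit() {
  master="$1"   # e.g. k8s://https://<api-server>:<port>
  image="$2"    # e.g. 192.168.0.1:5000/stream-spark:v4
  app="$3"      # application path inside the image (assumed)
  cat <<EOF
./bin/spark-submit \\
  --master ${master} \\
  --deploy-mode cluster \\
  --name location-streaming-app \\
  --conf spark.kubernetes.namespace=default \\
  --conf spark.kubernetes.container.image=${image} \\
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \\
  --conf spark.kubernetes.driver.pod.name=location-streaming-app \\
  ${app}
EOF
}
```

Calling it with the master URL, the pushed image, and a `local://` path to the application inside the image prints a command that can be pasted into the unpacked Spark directory.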

Results

It Starts

Logs

kubectl logs -f pod/location-streaming-app

Spark UI

kubectl port-forward --address 0.0.0.0 pod/location-streaming-app 4040:4040

Postgres

make pg-exec
select * from locations;
