Bootiful Batch Jobs with Data Flow

These Instructions Assume

  • Java 11
  • An RDBMS (PostgreSQL / MariaDB / MySQL)
  • Kubernetes 1.18+
  • Helm 3+
  • About 45 minutes to an hour

Database Creation

The DDL for this demo's dataset is as follows:

  create table fema_disaster (
    femaDeclarationString    varchar(255) not null,
    disasterNumber           varchar(255) not null,
    state                    varchar(255) not null,
    declarationType          varchar(255) not null,
    declarationDate          varchar(255) not null,
    fyDeclared               varchar(255) not null,
    incidentType             varchar(255) not null,
    declarationTitle         varchar(255) not null,
    ihProgramDeclared        varchar(255) not null,
    iaProgramDeclared        varchar(255) not null,
    paProgramDeclared        varchar(255) not null,
    hmProgramDeclared        varchar(255) not null,
    incidentBeginDate        varchar(255) not null,
    incidentEndDate          varchar(255) not null,
    disasterCloseoutDate     varchar(255) not null,
    fipsStateCode            varchar(255) not null,
    fipsCountyCode           varchar(255) not null,
    placeCode                varchar(255) not null,
    designatedArea           varchar(255) not null,
    declarationRequestNumber varchar(255) not null,
    hash                     varchar(255) not null unique,
    lastRefresh              varchar(255) not null,
    id                       varchar(255) not null
  );

Next, add a user called 'orders' to do the work in the app:

grant all privileges on orders.* to orders@'127.0.0.1' identified by 'orders';
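
To sanity-check the schema and the grant, you can connect with the mysql client as the new user. This is only a sketch: it assumes the table lives in the orders schema and that the database is reachable on 127.0.0.1:3306, for example via the port-forward set up later in this walkthrough.

$ mysql -h 127.0.0.1 -P 3306 -u orders -porders orders -e "show tables; show grants for current_user();"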

Deploy the environment with Kubernetes

Set up a namespace using kubectl:

$ kubectl create namespace bootiful-batch 

Add the Bitnami chart repository and install bitnami/spring-cloud-dataflow:

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install bootiful-batch bitnami/spring-cloud-dataflow
$ watch kubectl get pods

Wait until all pods are in the 'Ready' state, and ensure you can reach the Spring Cloud Data Flow console.

Set Up Port-Forwarding

Consult the chart's NOTES.txt for the instructions:

$ helm get notes bootiful-batch

Data Flow server forwarding

export SERVICE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].port}" services bootiful-batch-spring-cloud-dataflow-server)
kubectl port-forward --namespace default svc/bootiful-batch-spring-cloud-dataflow-server ${SERVICE_PORT}:${SERVICE_PORT} &
echo "http://127.0.0.1:${SERVICE_PORT}/dashboard"

The last command prints the URL that you'll open in a browser tab.
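
If you want to confirm the server is responding before opening the dashboard, you can hit the Data Flow server's about endpoint over the same forwarded port:

$ curl http://127.0.0.1:${SERVICE_PORT}/about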

MariaDB server forwarding

export SERVICE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].port}" services bootiful-batch-mariadb)
kubectl port-forward --namespace default svc/bootiful-batch-mariadb ${SERVICE_PORT}:${SERVICE_PORT}
echo "jdbc:mysql://127.0.0.1:${SERVICE_PORT}/orders"

Take the output of the last command and use it as the value of the spring.datasource.url property.
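
As an illustration, running the job locally against the forwarded database would use JDBC coordinates like the ones below. This is only a sketch: the jar name is a placeholder, the credentials are the orders user created earlier, and the mysql profile and FEMA_FILE_LOCATION variable come from the launch parameters shown later.

FEMA_FILE_LOCATION=https://raw.githubusercontent.com/joshlong/fema-disaster-batch-job/master/data/fema.csv \
java -jar target/fema-disaster-batch-job-0.0.1-SNAPSHOT.jar \
  --spring.profiles.active=mysql \
  --spring.datasource.url="jdbc:mysql://127.0.0.1:${SERVICE_PORT}/orders" \
  --spring.datasource.username=orders \
  --spring.datasource.password=orders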

Build and Install into Data Flow

First Things First!

Docker

You'll need to build a Docker image for the job and make it available to Spring Cloud Data Flow on Kubernetes. If you're running Minikube, point your shell at its Docker daemon so locally built images are visible to the cluster:

eval $(minikube -p minikube docker-env)
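
The repository doesn't spell out how the image gets built. One sketch, assuming the project uses the Spring Boot Maven plugin (2.3+), is the build-image goal; the image name below is a placeholder:

$ ./mvnw spring-boot:build-image -Dspring-boot.build-image.imageName=repository/image:version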

Otherwise, tag the image and push it to a registry the cluster can pull from, such as Docker Hub:

docker tag my_task:version repository/image
docker push repository/image

Use the following when kicking off a new job: the first line is a command-line argument for the job itself, and the second is a deployer property that injects the FEMA_FILE_LOCATION environment variable into the pod:

--spring.profiles.active=mysql
deployer.batch-job-f.kubernetes.environmentVariables=FEMA_FILE_LOCATION=https://raw.githubusercontent.com/joshlong/fema-disaster-batch-job/master/data/fema.csv
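
As a sketch of doing the same from the Data Flow shell, you could register the image, create a task definition, and launch it with those parameters. The image coordinates and the task name fema-disaster are placeholders; batch-job-f matches the app name used in the deployer property above.

dataflow:> app register --name batch-job-f --type task --uri docker:repository/image:version
dataflow:> task create fema-disaster --definition "batch-job-f"
dataflow:> task launch fema-disaster --arguments "--spring.profiles.active=mysql" --properties "deployer.batch-job-f.kubernetes.environmentVariables=FEMA_FILE_LOCATION=https://raw.githubusercontent.com/joshlong/fema-disaster-batch-job/master/data/fema.csv"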

End

Current issues

Datasource coupling

Rather than forcing the job to use MariaDB, we should find a way to separate the job's database connection from the connection Data Flow uses for its own operations.

About

This batch job reads in the FEMA disaster data and loads it into a relational database.
