- Java 11
- RDBMS (PostgreSQL / MariaDB / MySQL)
- Kubernetes 1.18+
- Helm 3+
- About 45 minutes to an hour of time
The DDL for this demo's dataset is as follows:
create table fema_disaster (
    femaDeclarationString    varchar(255) not null,
    disasterNumber           varchar(255) not null,
    state                    varchar(255) not null,
    declarationType          varchar(255) not null,
    declarationDate          varchar(255) not null,
    fyDeclared               varchar(255) not null,
    incidentType             varchar(255) not null,
    declarationTitle         varchar(255) not null,
    ihProgramDeclared        varchar(255) not null,
    iaProgramDeclared        varchar(255) not null,
    paProgramDeclared        varchar(255) not null,
    hmProgramDeclared        varchar(255) not null,
    incidentBeginDate        varchar(255) not null,
    incidentEndDate          varchar(255) not null,
    disasterCloseoutDate     varchar(255) not null,
    fipsStateCode            varchar(255) not null,
    fipsCountyCode           varchar(255) not null,
    placeCode                varchar(255) not null,
    designatedArea           varchar(255) not null,
    declarationRequestNumber varchar(255) not null,
    hash                     varchar(255) not null unique,
    lastRefresh              varchar(255) not null,
    id                       varchar(255) not null
);
Next, add a user called 'orders' to do the work in the app.
grant all privileges on orders.* to orders@'127.0.0.1' identified by 'orders';
Set up a namespace using 'kubectl':
$ kubectl create namespace bootiful-batch
Add the Bitnami chart repository, and install bitnami/spring-cloud-dataflow:
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install bootiful-batch bitnami/spring-cloud-dataflow
$ watch kubectl get pods
Wait until all pods are in the 'Ready' state, and make sure you can reach the Spring Cloud Data Flow console.
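As an alternative to watching the pod list by hand, the wait can be scripted. The following is a minimal sketch, not part of the chart's instructions: `wait_for_pods` is a hypothetical helper name, the `default` namespace and the 600s timeout are assumptions, and the `KUBECTL` override exists only so the command can be previewed without a cluster.

```shell
# Hypothetical helper: block until every pod in a namespace reports Ready.
# Arguments: namespace (assumed "default"), timeout (assumed 600s).
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
wait_for_pods() {
  namespace="${1:-default}"
  timeout="${2:-600s}"
  "${KUBECTL:-kubectl}" wait --for=condition=Ready pods --all \
    --namespace "$namespace" --timeout="$timeout"
}
```

Calling `wait_for_pods default 900s` would then return once all pods are Ready or fail when the timeout expires. Pods that run to completion never become Ready, so this sketch only suits namespaces that contain long-running pods.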
The chart's NOTES.txt contains the connection instructions. To display them:
$ helm get notes bootiful-batch
export SERVICE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].port}" services bootiful-batch-spring-cloud-dataflow-server)
kubectl port-forward --namespace default svc/bootiful-batch-spring-cloud-dataflow-server ${SERVICE_PORT}:${SERVICE_PORT} &
echo "http://127.0.0.1:${SERVICE_PORT}/dashboard"
The last command prints the URI to open in a browser tab. To reach the database, forward the MariaDB service port in the same way:
export SERVICE_PORT=$(kubectl get --namespace default -o jsonpath="{.spec.ports[0].port}" services bootiful-batch-mariadb)
kubectl port-forward --namespace default svc/bootiful-batch-mariadb ${SERVICE_PORT}:${SERVICE_PORT} &
echo "jdbc:mysql://127.0.0.1:${SERVICE_PORT}/orders"
We can now take the output of the last command and use it as the value of the spring.datasource.url property.
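One way to hand that value to a locally started Spring Boot process is through an environment variable, since Spring Boot's relaxed binding maps `SPRING_DATASOURCE_URL` to `spring.datasource.url`. The sketch below assumes `SERVICE_PORT` is still exported from the step above; `jdbc_url` is a hypothetical helper, and the 3306 fallback is an assumption used only when the variable is unset.

```shell
# Hypothetical helper: build the JDBC URL for the port-forwarded database.
# Arguments: local port, database name (the walkthrough uses "orders").
jdbc_url() {
  printf 'jdbc:mysql://127.0.0.1:%s/%s\n' "$1" "$2"
}

# SERVICE_PORT comes from the earlier export; 3306 is only a fallback assumption.
export SPRING_DATASOURCE_URL="$(jdbc_url "${SERVICE_PORT:-3306}" orders)"
echo "$SPRING_DATASOURCE_URL"
```

Any application launched from the same shell afterwards picks the URL up without a change to its properties file.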
You'll need to build a Docker image of the task and make it available to SCDF on Kubernetes. With minikube, first point the Docker CLI at the cluster's Docker daemon:
eval $(minikube -p minikube docker-env)
You must then push the image to a Docker Hub repository:
docker tag my_task:version repository/image
docker push repository/image
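The tag and push steps can be wrapped so they always run together. This is a rough sketch: `push_image` is a hypothetical helper, the image names are the placeholders used above, and the `DOCKER` override is an assumption added so the commands can be previewed without contacting a registry.

```shell
# Hypothetical helper: tag a local image and push it under a repository name.
# DOCKER can be overridden (for example DOCKER=echo) to preview the commands.
push_image() {
  local_image="$1"    # e.g. my_task:version
  remote_image="$2"   # e.g. repository/image
  # Push only if tagging succeeded.
  "${DOCKER:-docker}" tag "$local_image" "$remote_image" &&
    "${DOCKER:-docker}" push "$remote_image"
}
```

Running `DOCKER=echo push_image my_task:version repository/image` prints both commands instead of executing them, which is a cheap way to check the names first.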
Use the following parameters when launching a new job:
--spring.profiles.active=mysql
deployer.batch-job-f.kubernetes.environmentVariables=FEMA_FILE_LOCATION=https://raw.githubusercontent.com/joshlong/fema-disaster-batch-job/master/data/fema.csv
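If the job later needs more than one environment variable, the property value can be assembled rather than typed by hand. The sketch below is illustrative only: `deployer_env_property` is a hypothetical helper, and it assumes the Kubernetes deployer accepts a comma-separated list of KEY=VALUE pairs in `environmentVariables`.

```shell
# Hypothetical helper: compose the deployer property for a task application.
# Arguments: application name, then one or more KEY=VALUE pairs.
deployer_env_property() {
  app="$1"
  shift
  joined=""
  for pair in "$@"; do
    # Prefix a comma only when something has already been joined.
    joined="${joined:+$joined,}$pair"
  done
  printf 'deployer.%s.kubernetes.environmentVariables=%s\n' "$app" "$joined"
}

deployer_env_property batch-job-f \
  "FEMA_FILE_LOCATION=https://raw.githubusercontent.com/joshlong/fema-disaster-batch-job/master/data/fema.csv"
```

With the single pair shown, the helper prints the same property line as above. Values that themselves contain commas would need different handling.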
Rather than forcing the job to use MariaDB, we should find a way to separate the job's database connection from the connection that Data Flow uses for its own operations.