GitHub - spring-cloud/spring-cloud-dataflow: A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes

Spring Cloud Data Flow is a microservices-based toolkit for building streaming and batch data processing pipelines in Cloud Foundry and Kubernetes.

Data processing pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks.

This makes Spring Cloud Data Flow ideal for a range of data processing use cases, from import/export to event streaming and predictive analytics.

Components

Architecture: The Spring Cloud Data Flow Server is a Spring Boot application that provides a RESTful API and REST clients (Shell, Dashboard, Java DSL). A single Spring Cloud Data Flow installation can orchestrate the deployment of streams and tasks to Local, Cloud Foundry, and Kubernetes platforms.

Familiarize yourself with the Spring Cloud Data Flow architecture and feature capabilities.

Deployer SPI: A Service Provider Interface (SPI) is defined in the Spring Cloud Deployer project. The Deployer SPI provides an abstraction layer for deploying the apps for a given streaming or batch data pipeline and managing the application lifecycle.

Spring Cloud Deployer Implementations: Local, Cloud Foundry, and Kubernetes.

Domain Model: The Spring Cloud Data Flow domain module includes the concept of a stream that is a composition of Spring Cloud Stream applications in a linear data pipeline from a source to a sink, optionally including processor application(s) in between. The domain also includes the concept of a task, which may be any process that does not run indefinitely, including Spring Batch jobs.
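To make the linear shape of a stream concrete, here is a minimal shell sketch that joins application names into a pipe-delimited definition string, the general form used by the stream DSL. The helper name `stream_definition` and the app names `http`, `transform`, and `log` are illustrative assumptions, not something defined in this repository.

```shell
# Hypothetical helper: join app names (source, optional processors, sink)
# into a pipe-delimited stream definition string.
stream_definition() {
  # A stream needs at least a source and a sink.
  if [ "$#" -lt 2 ]; then
    echo "usage: stream_definition <source> [processor...] <sink>" >&2
    return 1
  fi
  def=$1
  shift
  for app in "$@"; do
    def="$def | $app"
  done
  printf '%s\n' "$def"
}

stream_definition http transform log   # source | processor | sink
stream_definition http log             # source | sink, no processor
```

The sketch only builds the string; creating and deploying a stream from such a definition is done through the Shell, Dashboard, or REST API.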

Application Registry: The App Registry maintains the metadata of the catalog of reusable applications. For example, if relying on Maven coordinates, an application URI would be of the format: maven://<groupId>:<artifactId>:<version>.
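As an illustration of the URI format above, the following sketch composes a `maven://` URI from its three coordinates. The function name `maven_uri` and the coordinates `com.example:my-sink:1.0.0` are placeholders chosen for this example, not a real artifact.

```shell
# Hypothetical helper: build an App Registry URI from Maven coordinates,
# following the maven://<groupId>:<artifactId>:<version> format.
maven_uri() {
  if [ "$#" -ne 3 ]; then
    echo "usage: maven_uri <groupId> <artifactId> <version>" >&2
    return 1
  fi
  printf 'maven://%s:%s:%s\n' "$1" "$2" "$3"
}

# Placeholder coordinates, not a real artifact.
maven_uri com.example my-sink 1.0.0
```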

Shell/CLI: The Shell connects to the Spring Cloud Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle.

Building

Clone the repo and run:

$ ./mvnw -s .settings.xml clean install

Looking for more information? Follow this link.

Building on Windows

When using Git on Windows to check out the project, it is important to handle line endings correctly. By default, Git on Windows converts line endings to CRLF during checkout. This is not desired for Spring Cloud Data Flow, as it may lead to test failures under Windows.

Therefore, please ensure that you set Git property core.autocrlf to false, e.g. using: $ git config core.autocrlf false. For more information please refer to the Git documentation, Formatting and Whitespace.
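As a quick check, the sketch below sets the property for a single repository and reads it back. It assumes `git` is on the PATH; a throwaway repository in a temporary directory stands in for the real clone.

```shell
# Sketch: set core.autocrlf for one repository and confirm the value.
# A throwaway repository stands in for the real clone here.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config core.autocrlf false
autocrlf=$(git -C "$repo" config --get core.autocrlf)
echo "core.autocrlf=$autocrlf"
rm -rf "$repo"
```

The setting affects subsequent checkouts. Files that were already checked out keep their current line endings, so it is simplest to apply it before checking out the sources.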

Running Locally w/ Oracle

By default, the Dataflow server jar does not include the Oracle database driver dependency. If you want to use Oracle for development/testing when running locally, you can specify the local-dev-oracle Maven profile when building. The following command will include the Oracle driver dependency in the jar:

$ ./mvnw -s .settings.xml clean package -Plocal-dev-oracle

You can follow the steps in the Oracle on Mac ARM64 Wiki to run Oracle XE locally in Docker with Dataflow pointing at it.

NOTE: If you are not running on Mac ARM64, skip the steps related to Homebrew and Colima.

Running Locally w/ Microsoft SQL Server

By default, the Dataflow server jar does not include the MSSQL database driver dependency. If you want to use MSSQL for development/testing when running locally, you can specify the local-dev-mssql Maven profile when building. The following command will include the MSSQL driver dependency in the jar:

$ ./mvnw -s .settings.xml clean package -Plocal-dev-mssql

You can follow the steps in the MSSQL on Mac ARM64 Wiki to run MSSQL locally in Docker with Dataflow pointing at it.

NOTE: If you are not running on Mac ARM64, skip the steps related to Homebrew and Colima.

Running Locally w/ IBM DB2

By default, the Dataflow server jar does not include the DB2 database driver dependency. If you want to use DB2 for development/testing when running locally, you can specify the local-dev-db2 Maven profile when building. The following command will include the DB2 driver dependency in the jar:

$ ./mvnw -s .settings.xml clean package -Plocal-dev-db2

You can follow the steps in the DB2 on Mac ARM64 Wiki to run DB2 locally in Docker with Dataflow pointing at it.

NOTE: If you are not running on Mac ARM64, skip the steps related to Homebrew and Colima.
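The Oracle, MSSQL, and DB2 builds above differ only in the Maven profile name. A minimal sketch of a wrapper that maps a database name to the matching build command follows. The function name `local_dev_build_cmd` is hypothetical, and it prints the command rather than running it.

```shell
# Hypothetical helper: print the build command for a local-dev database profile.
local_dev_build_cmd() {
  case "$1" in
    oracle|mssql|db2)
      printf './mvnw -s .settings.xml clean package -Plocal-dev-%s\n' "$1"
      ;;
    *)
      echo "unsupported database: $1 (expected oracle, mssql, or db2)" >&2
      return 1
      ;;
  esac
}

local_dev_build_cmd mssql
```

Printing the command keeps the sketch safe to try outside the repository; from the repository root, the output can be run as-is.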

Contributing

We welcome contributions! See the CONTRIBUTING guide for details.

Code formatting guidelines

The directory ./src/eclipse has two files for use with code formatting: eclipse-code-formatter.xml for the majority of the code formatting rules and eclipse.importorder to order the import statements.
In Eclipse, import these files by navigating to Window > Preferences and then to Java > Code Style > Formatter and Java > Code Style > Organize Imports, respectively.
In IntelliJ, install the Eclipse Code Formatter plugin. You can find it by searching "Browse Repositories" under the plugin option within IntelliJ (once installed, you will need to restart IntelliJ for it to take effect). Then navigate to IntelliJ IDEA > Preferences and select Eclipse Code Formatter. Select the eclipse-code-formatter.xml file for the field Eclipse Java Formatter config file and the eclipse.importorder file for the field Import order. Enable the formatter by clicking Use the Eclipse code formatter, then click the OK button. NOTE: If you configure the Eclipse Code Formatter from File > Other Settings > Default Settings, the policy applies across all of your IntelliJ projects.

License

Spring Cloud Data Flow is Open Source software released under the Apache 2.0 license.
