First draft on Streaming Developer Guide #7

4 changes: 3 additions & 1 deletion .gitignore
@@ -5,4 +5,6 @@ yarn-error.log

public/
data/
content/documentation/files/ext/
content/documentation/files/ext/
.idea/
content/files/ext/
17 changes: 5 additions & 12 deletions content/documentation/pages/1-installation/1-local.md
@@ -63,32 +63,26 @@ To get started, you need to start Docker Compose. To do so:

$ export DATAFLOW_VERSION={local-server-image-tag}
$ export SKIPPER_VERSION={skipper-version}
$ export SCDF_HOST_IP=your-host-ip
$ docker-compose up

The `docker-compose.yml` file defines `DATAFLOW_VERSION`,
`SKIPPER_VERSION` and `SCDF_HOST_IP` variables, so that those values can
The `docker-compose.yml` file defines `DATAFLOW_VERSION` and
`SKIPPER_VERSION` variables, so that those values can
be easily changed. The preceding commands first set the
`DATAFLOW_VERSION`, `SKIPPER_VERSION` and `SCDF_HOST_IP` to use in the
`DATAFLOW_VERSION` and `SKIPPER_VERSION` to use in the
environment. Then `docker-compose` is started.

You can use the `ifconfig` (for linux/macos) or `ipconfig` (for Windows)
command line toolkit to obtain the IP address to set in `SCDF_HOST_IP`.
Note that `127.0.0.1` is not a valid option.

You can also use a shorthand version that exposes only the
`DATAFLOW_VERSION`, `SKIPPER_VERSION` and `SCDF_HOST_IP` variables to
`DATAFLOW_VERSION` and `SKIPPER_VERSION` variables to
the `docker-compose` process (rather than setting them in the
environment), as follows:

$ DATAFLOW_VERSION={local-server-image-tag} SKIPPER_VERSION={skipper-version} SCDF_HOST_IP=Your-Host-IP docker-compose up
$ DATAFLOW_VERSION={local-server-image-tag} SKIPPER_VERSION={skipper-version} docker-compose up

If you use Windows, environment variables are defined by using the `set`
command. To start the system on Windows, enter the following commands:

C:\ set DATAFLOW_VERSION={local-server-image-tag}
C:\ set SKIPPER_VERSION={skipper-version}
C:\ set SCDF_HOST_IP=Your-Host-IP
C:\ docker-compose up

> **Note**

@@ -440,7 +434,6 @@ database. To do so:
- metrics.prometheus.target.refresh.cron=0/20 * * * * *
- metrics.prometheus.target.discovery.url=http://localhost:9393/runtime/apps
- metrics.prometheus.target.file.path=/tmp/targets.json
- 'SCDF_HOST_IP=${SCDF_HOST_IP:?SCDF_HOST_IP is not set! Use "export SCDF_HOST_IP=<SCDF Server IP>". Note: 127.0.0.1 is not a valid option!}'
depends_on:
- dataflow-server

47 changes: 44 additions & 3 deletions content/documentation/pages/2-concepts/2-what-are-streams.md
@@ -1,11 +1,52 @@
---
path: 'concepts/what-are-streams/'
title: 'What are streams'
description: 'Lorem markdownum madefacta, circumtulit aliis, restabat'
description: 'Concepts of streaming pipelines'
---

# What are streams

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
A streaming data pipeline is typically made of independent event-driven streaming applications that connect through a `messaging middleware` or a `streaming platform`.
The streaming pipeline can be `linear` or `non-linear`, depending on how data flows through its distributed applications.
As a streaming application developer, you can focus on your streaming application's business logic while delegating the messaging plumbing to the Spring Cloud Stream framework and the underlying messaging middleware or streaming platform.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
## Spring Cloud Stream

A streaming application can produce events to, or consume events from, the messaging middleware or the streaming platform.
In Spring Cloud Stream:

- The application endpoints that produce events to the messaging middleware or the streaming platform represent the `outbound` boundary.
- The application endpoints that consume events from the messaging middleware or the streaming platform represent the `inbound` boundary.

The Spring Cloud Stream framework provides the `@Input` and `@Output` annotations, which you can use to qualify these input and output elements.
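
As an illustration, here is a minimal sketch of a custom binding interface (the interface and channel names are hypothetical); passing it to `@EnableBinding` makes Spring Cloud Stream create and bind the channels at runtime:

```
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

// Hypothetical binding interface: "orders" marks the inbound boundary,
// "shipments" marks the outbound boundary.
public interface OrderStreams {

    @Input("orders")
    SubscribableChannel orders();

    @Output("shipments")
    MessageChannel shipments();
}
```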

### Spring Cloud Stream Binding

When a Spring Cloud Stream application is deployed, the `input` and `output` elements configured through the `@Input` and `@Output` annotations are bound to destinations on the messaging middleware or the streaming platform through `binding` properties.

#### Spring Cloud Stream Binding Properties

Binding properties use the `spring.cloud.stream.bindings.<inbound/outbound-name>` prefix.

Currently, the following properties are supported (a configuration sketch follows the list):

- `destination` - the destination on the messaging middleware or the streaming platform (for example, a RabbitMQ exchange or an Apache Kafka topic)
- `group` - the consumer group name to use for the application (consumer applications only). In Spring Cloud Data Flow, this is always the `stream name`.
- `contentType` - the content type to use
- `binder` - the name of the binder to use for the binding. This property is useful for multi-binder use cases.
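
For example, a minimal `application.properties` sketch for an inbound element named `input` (all values are illustrative):

```
spring.cloud.stream.bindings.input.destination=http-ingest.http
spring.cloud.stream.bindings.input.group=http-ingest
spring.cloud.stream.bindings.input.contentType=text/plain
spring.cloud.stream.bindings.input.binder=kafka
```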

#### Spring Cloud Stream Properties for Messaging Middleware

Depending on the binder used to bind your application to the messaging middleware or the streaming platform, you can provide configuration properties for that binder.
All these properties take the `spring.cloud.stream.<binderName>.binder` prefix.

For instance, all the Apache Kafka binder-related configuration properties have the `spring.cloud.stream.kafka.binder` prefix.
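
For example, a sketch that points the Kafka binder at a broker (host and port are assumptions):

```
# Apache Kafka binder: where to find the brokers (illustrative values)
spring.cloud.stream.kafka.binder.brokers=localhost:9092
```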

### Streams in Spring Cloud Data Flow

When Spring Cloud Data Flow deploys a Spring Cloud Stream application, the following properties are implicitly assigned:

- The `spring.cloud.stream.bindings.<input/output>.destination` property is set to `<streamName>.<application/label name>`.
- The `spring.cloud.stream.bindings.<input/output>.group` property is set to the stream `name`.

You can still override these properties by setting the binding properties explicitly for each application.
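
For instance, for a hypothetical stream definition `mystream = time | log`, the implicit assignments would look roughly like this:

```
# On the time (producer) application:
spring.cloud.stream.bindings.output.destination=mystream.time

# On the log (consumer) application:
spring.cloud.stream.bindings.input.destination=mystream.time
spring.cloud.stream.bindings.input.group=mystream
```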
@@ -1,7 +1,7 @@
---
path: 'stream-developer-guides/'
title: 'Stream Developer guides '
description: 'Lorem markdownum madefacta, circumtulit aliis, restabat'
title: 'Stream Developer Guides '
description: 'Streaming Developer Guides'
summary: true
---

@@ -7,4 +7,4 @@ summary: true

# Stream Processing

Discuss at a high level what is covered in the following guides.
Introduction and getting started with simple stream pipelines using Spring Cloud Data Flow.

This file was deleted.

@@ -0,0 +1,169 @@
---
path: 'stream-developer-guides/getting-started/simple-stream'
title: 'Simple Stream'
description: 'Create and Deploy a simple streaming pipeline '
---

# Introduction

Spring Cloud Data Flow provides a set of streaming applications that you can use out of the box to address common streaming use cases.
You can also extend these out-of-the-box applications or create your own custom applications.
All the out-of-the-box streaming applications:

- Are available as Apache Maven artifacts or Docker images
- Include either the RabbitMQ or the Apache Kafka binder implementation libraries on their classpath
- Provide Micrometer support for Prometheus and InfluxDB metrics

## Registering the out-of-the-box applications

When registering the out-of-the-box streaming applications, you can choose the artifact type, `Maven` or `Docker`, depending on the target platform.
The `Local` development environment and the `CloudFoundry` platform support both `Maven` and `Docker` applications.
The `Kubernetes` platform supports only `Docker`-based applications.

You can also choose the messaging middleware or the streaming platform: either `RabbitMQ` or `Apache Kafka`.

### Registering applications using the Dashboard UI

### Registering out-of-the-box streaming applications using the Shell

If you are using `RabbitMQ` as the messaging middleware and `maven` artifacts:

```
app import --uri http://bit.ly/Einstein-SR2-stream-applications-rabbit-maven
```

If you are using `RabbitMQ` as the messaging middleware and `docker` images:

```
app import --uri http://bit.ly/Einstein-SR2-stream-applications-rabbit-docker
```

If you are using `Kafka` as the Streaming platform and `maven` artifacts:

```
app import --uri http://bit.ly/Einstein-SR2-stream-applications-kafka-maven
```

If you are using `Kafka` as the Streaming platform and `docker` images:

```
app import --uri http://bit.ly/Einstein-SR2-stream-applications-kafka-docker
```
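
If you prefer to register a single application rather than importing the whole set, the shell also provides the `app register` command. A sketch, with hypothetical Maven coordinates and version:

```
app register --name http --type source --uri maven://org.springframework.cloud.stream.app:http-source-rabbit:2.1.0.RELEASE
```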

## Create the stream

Spring Cloud Data Flow provides a Domain Specific Language (DSL) for creating a stream pipeline.
The individual applications inside the streaming pipeline are connected via a `|` symbol.
This pipe symbol is the logical representation of the messaging middleware or the streaming platform you would use to connect your applications in the streaming pipeline.
For instance, the stream DSL `time | log` represents the `time` application sending timestamp data to the messaging middleware and the `log` application receiving that timestamp data from it.

### Streaming data pipeline configuration

The streaming data pipeline can have configuration properties at:

- application level
- deployer level

The `application` properties are applied as configuration to each individual application.
The `application` properties can be set at stream `creation` or `deployment` time.
When set during the stream `deployment`, these properties need to be prefixed with `app.<application-name>`.

The `deployer` properties are specific to the target deployment platform: `Local`, `CloudFoundry`, or `Kubernetes`.
The `deployer` properties can be set only when deploying the stream.
These properties need to be prefixed with `deployer.<application-name>`. A combined example follows below.
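
For example, a hypothetical deployment command that sets one property of each kind (the `deployer.log.count` value is illustrative):

```
stream deploy http-ingest --properties "app.http.server.port=9000,deployer.log.count=2"
```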

Let's create a stream that ingests incoming HTTP events (Source) into a logging application (Sink).

The stream DSL `http --server.port=9000 | log` represents a streaming pipeline in which the `http` source application ingests HTTP events into the `log` sink application.
The `|` symbol represents the messaging middleware or the streaming platform that connects the `http` application to the `log` application.
The property `server.port` is an `http` application property set at stream creation.

## Dashboard UI

Screen shots here

## Shell

### Stream Creation

To create the stream definition:

```
stream create http-ingest --definition "http --server.port=9000 | log"
```

### Stream Deployment

To deploy the stream:

```
stream deploy http-ingest
```

You can verify the stream status with the `stream list` command.

```
stream list
```

```
╔═══════════╤═════════════════════════════╤═════════════════════════════════════════╗
║Stream Name│ Stream Definition │ Status ║
╠═══════════╪═════════════════════════════╪═════════════════════════════════════════╣
║http-ingest│http --server.port=9000 | log│The stream is being deployed ║
╚═══════════╧═════════════════════════════╧═════════════════════════════════════════╝
```

After a short while, running `stream list` again shows the updated deployment status:

```
stream list
```

```
╔═══════════╤═════════════════════════════╤═════════════════════════════════════════╗
║Stream Name│ Stream Definition │ Status ║
╠═══════════╪═════════════════════════════╪═════════════════════════════════════════╣
║http-ingest│http --server.port=9000 | log│The stream has been successfully deployed║
╚═══════════╧═════════════════════════════╧═════════════════════════════════════════╝
```

Once the stream is deployed and running, you can post some `HTTP` events:

```
http post --data "Happy streaming" --target http://localhost:9000
```

If the HTTP POST is successfully sent, you will see the following response:

```
> POST (text/plain) http://localhost:9000 Happy streaming
> 202 ACCEPTED
```

Now you can run the `runtime apps` command to see the running applications and to locate the `stdout` log file of the `log` sink application, which contains the message consumed from the `http` source application.

```
runtime apps
```

How you access the `stdout` log file of the `log` application depends on the target runtime environment.

#### Local Deployment

If the stream is deployed in the `Local` development environment, the `runtime apps` output shows where each application is running in the local environment and the location of its log files.

**NOTE** If you are running SCDF on Docker, you can access the log files of the streaming applications as follows:

`docker exec <stream-application-docker-container-id> tail -f <stream-application-log-file>`

#### Cloud Foundry

#### Kubernetes

### Verification

Once you have access to the `stdout` file of the `log` application, you will see the message posted from the `http` source application:

```
log-sink : Happy streaming
```
@@ -7,5 +7,7 @@ summary: true

# Stream Processing

Discuss at a high level what is covered in the following guides.
We start off using only Spring Cloud Stream and then introduce Spring Cloud Data Flow.
In this stream processing guide, you will see how you can:

- Design, develop, test, and deploy an individual streaming application
- Create a streaming pipeline using those individual applications in Spring Cloud Data Flow