trustedanalytics/ingestion-ws-kafka-gearpump-hbase
ingestion-ws-kafka-gearpump-hbase

GearPump is inspired by recent advances in the Akka framework and a desire to improve on existing streaming frameworks. GearPump draws from a number of existing frameworks including MillWheel, Apache Storm, Spark Streaming, Apache Samza, Apache Tez and Hadoop YARN while leveraging Akka actors throughout its architecture.

Users can submit to GearPump a computation DAG (directed acyclic graph of tasks), which contains a list of nodes and edges; each node can be parallelized into a set of tasks. GearPump then schedules and distributes the different tasks in the DAG to different machines automatically. Each task is started as an actor, which is a long-running micro-service.
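The idea of a DAG whose nodes expand into parallel task instances can be illustrated with a minimal sketch. This is plain Java with hypothetical names (not GearPump's actual API): each node declares a parallelism, and a scheduler would expand it into that many tasks before distributing them across machines.

```java
import java.util.*;

// Minimal sketch of a computation DAG, with hypothetical names
// (not GearPump's actual API). Each node declares a parallelism;
// the scheduler expands it into that many task instances.
class DagSketch {
    record Node(String name, int parallelism) {}

    // Expand each node into its task instance names, as a scheduler
    // might do before distributing tasks across machines.
    static List<String> expandTasks(List<Node> nodes) {
        List<String> tasks = new ArrayList<>();
        for (Node n : nodes) {
            for (int i = 0; i < n.parallelism(); i++) {
                tasks.add(n.name() + "-" + i);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        // A pipeline-shaped DAG: source -> processor -> sink,
        // with edges kafkaSource->reverser and reverser->hbaseSink.
        List<Node> nodes = List.of(
            new Node("kafkaSource", 2),
            new Node("reverser", 4),
            new Node("hbaseSink", 1));
        System.out.println(expandTasks(nodes)); // 2 + 4 + 1 = 7 task names
    }
}
```

In GearPump each of these task instances would run as an Akka actor, which is what makes the per-node parallelism cheap to scale.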

The example pipeline builds on the "websockets->kafka->hdfs" example.

This approach is taken to emphasize that existing ingestion pipelines can be enhanced easily by dropping in a GearPump computation DAG that can calculate, aggregate, and filter data, while also allowing for parallel execution and stream forking.

The key component of the pipeline is the GearPump application (computation DAG), which:

  1. Reads from a Kafka topic
  2. Splits processing into two streams
  3. The first stream passes the messages unchanged to another Kafka topic
  4. The second stream:
     - Processes each message (in this example the processing is a simple string reversal)
     - Persists the processed messages to HBase
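The per-message logic of the two streams can be sketched as follows. This is a minimal sketch in plain Java with hypothetical method names; in the real application these would be GearPump tasks wired into the computation DAG, reading from Kafka and writing to Kafka and HBase.

```java
// Sketch of the per-message logic of the two streams
// (hypothetical method names, not the actual task classes).
class StreamSketch {
    // Stream 1: forward the message unchanged to another Kafka topic.
    static String passThrough(String msg) {
        return msg;
    }

    // Stream 2: the example's "processing" is a simple string reversal;
    // the reversed message is then persisted to HBase (not shown here).
    static String process(String msg) {
        return new StringBuilder(msg).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(passThrough("hello")); // prints "hello"
        System.out.println(process("hello"));     // prints "olleh"
    }
}
```

Because the fork happens inside the DAG, both streams consume the same Kafka input once, and each stream can be parallelized independently.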

Processing is visualized in a dashboard that allows tracking the progress of messages, the health of the app, and metrics (message throughput, etc.).

Please visit the project's wiki for more information.
