An extensible distributed system for reliable nearline data streaming at scale
Brooklin

Brooklin Overview


Brooklin is a distributed system for streaming data between heterogeneous source and destination systems with high reliability and throughput at scale. Designed for multitenancy, Brooklin can simultaneously power hundreds of data pipelines across different systems and can easily be extended to support new sources and destinations.

Distinguishing features

  • Extensible for any source and destination

    • Brooklin offers a flexible API that can be extended to support a wide variety of source and destination systems. It is not confined to a single type of source or destination system.

    • Source and destination systems can be freely mixed and matched. They do not have to be the same.

  • Scalable

    • Brooklin supports creating an arbitrary number of data streams that are processed concurrently and independently such that errors in one stream are isolated from the rest.

    • Brooklin supports partitioned data streams throughout its core implementation and APIs.

    • Brooklin can be deployed to a cluster of machines (scale out) to support as many data streams as desired.

  • Easy to operate and manage

    • Brooklin exposes a REST endpoint for managing data streams that offers a rich set of operations in addition to CRUD (e.g. pause and resume).

    • Brooklin also exposes a diagnostics REST endpoint that enables on-demand querying of a data stream’s status.

  • Battle-tested at scale with Kafka

    • While it is not limited to any particular system, Brooklin provides capabilities for reading massive amounts of data from Kafka and writing them to Kafka with high reliability at scale. You can learn more about this in the Use cases section.

  • Supports Change Data Capture with bootstrap

    • Brooklin supports propagating Change Data Capture events from data stores such as RDBMSs and KV stores.

    • Brooklin also supports streaming a snapshot of the existing data before propagating change events.

Use cases

Mirroring Kafka clusters

  • Multitenancy

    A single Brooklin cluster can be used to mirror data across several Kafka clusters.

  • Fault isolation across topic partitions

    One bad partition will not affect an entire Kafka topic. Mirroring will continue for all the other healthy partitions.

  • Whitelisting topics using regular expressions

    Select the topics to mirror using regular expression patterns against their names.

  • Pausing and resuming individual partitions

    Through its Datastream Management Service (DMS), Brooklin exposes REST APIs that allow finer control over replication pipelines, such as pausing and resuming individual partitions of a Kafka topic.
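
To make the topic whitelisting idea above concrete, here is a minimal sketch of selecting topics by matching a regular expression against their names. It uses only `java.util.regex` from the JDK. The class `TopicWhitelist`, its `select` method, and the topic names are hypothetical and exist only for illustration; they are not Brooklin's API, and Brooklin's actual matching rules may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Brooklin.
final class TopicWhitelist {
    private final Pattern pattern;

    TopicWhitelist(String regex) {
        // Compile once so the same pattern is reused for every topic name.
        this.pattern = Pattern.compile(regex);
    }

    // Returns the topics whose entire name matches the pattern, in input order.
    List<String> select(List<String> topics) {
        List<String> selected = new ArrayList<>();
        for (String topic : topics) {
            if (pattern.matcher(topic).matches()) {
                selected.add(topic);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        TopicWhitelist whitelist = new TopicWhitelist("^tracking-.*$");
        List<String> topics = Arrays.asList("tracking-clicks", "metrics-cpu", "tracking-views");
        System.out.println(whitelist.select(topics)); // prints [tracking-clicks, tracking-views]
    }
}
```

The sketch matches the whole topic name (`matches()` rather than `find()`), so a pattern such as `tracking` alone would not select `tracking-clicks`. Whether a real deployment should use full or partial matching is a design choice this example does not settle.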

Check out Mirroring Kafka Clusters to learn more about using Brooklin to mirror Kafka clusters.

Change Data Capture

  • Brooklin supports propagating Change Data Capture events from data stores such as RDBMSs and KV stores.

  • Brooklin supports bootstrapping data from a datastore, i.e. streaming a snapshot of the existing data before any change events.

  • MySQL support is currently under development.
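
The bootstrap behavior described above (a snapshot of the existing data first, then change events) can be pictured with a small sketch. Everything in it is an assumption made for illustration: the class `BootstrapThenChanges`, the `stream` method, and the string record format are hypothetical and are not taken from Brooklin's code or API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of "snapshot first, then change events"; not Brooklin code.
final class BootstrapThenChanges {

    // Emits every snapshot record, then every change event, preserving each input's order.
    static List<String> stream(List<String> snapshot, List<String> changes) {
        List<String> out = new ArrayList<>(snapshot.size() + changes.size());
        for (String row : snapshot) {
            out.add("SNAPSHOT " + row);
        }
        for (String event : changes) {
            out.add("CHANGE " + event);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> snapshot = Arrays.asList("id=1", "id=2");
        List<String> changes = Arrays.asList("update id=1");
        for (String record : stream(snapshot, changes)) {
            System.out.println(record);
        }
    }
}
```

A consumer that starts from an empty destination sees the existing rows before any change event, which is the point of bootstrapping. The sketch does not cover changes that arrive while the snapshot is being taken.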

Stream processing bridge

Trying out Brooklin

Feel free to check out our step-by-step tutorials for running Brooklin locally in a few example scenarios.

Documentation

Community

License

Copyright (c) LinkedIn Corporation. All rights reserved. Licensed under the BSD 2-Clause License.
