Breeze

An alternative Clojure API for Apache Storm.

WARNING: THIS API IS NOT READY FOR PUBLIC CONSUMPTION. PLEASE REFRAIN FROM USING IT UNTIL RELEASED.

NOTE: I am planning on releasing an alpha version soon (Jan. - Feb. 2017), and publicly announcing this work. Stay tuned!

Usage

The Breeze-examples project reimplements many of the Storm examples using Breeze.

Goals

  1. Idiomatic Clojure API for Apache Storm
  2. Stay out of your way if you need to get "under the hood". All of Storm is accessible through Breeze.
  3. Provide flexible, data-driven topologies, so that one project can contain all Storm components and build many different topologies that share the same code.

Features

  1. Basic and Rich Bolts
  2. Windowed Bolts
  3. Stateful Bolts (including windowed version)
  4. Rich Spouts
  5. Topology construction from EDN
  6. Consistent Clojure API covering all Storm features
  7. Support for integrating existing Java components
  8. Tight spec integration, fail-fast design

Rationale

There are a few problems with the Clojure Storm API as it exists currently. Switching to another system is not a great option due to heavy investment in Storm / Clojure. Rather than continue complaining, I took it upon myself to build a better API.

Clojure appears to be on its way out of Apache Storm, with many components being replaced by Java equivalents. Many newer features, such as windowing and state, have not been added to the Clojure API. The current API is also heavily macro-based, hiding many details from the user with little gain besides syntax. This can make testing much more difficult, since setup/teardown and execution are all bundled into one big blob of code that looks like normal Clojure but is executed very differently. Using proxy is not a solution because your code is serialized before execution, so many objects are unusable. deftype and gen-class are possible workarounds, but they are not the nicest interfaces to use. Clojure has functions and maps, and we'd like to rely on those (plain old data) as much as possible. This project is inspired by Onyx.

The recommended method of programming against Storm is currently inheritance-based, with user code deriving from BaseRichBolt, BaseWindowedBolt, etc. This project defines several objects that implement Storm interfaces and allows you to inject code via maps a.k.a. data. That map can be specified in code, read from EDN, or merged and assembled in any way that you like, as long as it provides the expected keys.
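As a rough illustration of the map-injection idea, here is a minimal sketch in plain Clojure with no Storm dependency. Everything in it is an assumption made for illustration: the `Bolt` protocol is a stand-in for a Storm interface, and `MapBolt`, `map-bolt`, and the `:prepare` / `:execute` / `:cleanup` keys are hypothetical names, not Breeze's actual API.

```clojure
(require '[clojure.string :as string])

;; Hypothetical stand-in for a Storm bolt interface (NOT Storm's real one).
(defprotocol Bolt
  (prepare [this conf])
  (execute [this tuple])
  (cleanup [this]))

;; One generic component. Every lifecycle call delegates to a function
;; looked up in the `spec` map; `state` is an atom owned by the component.
(deftype MapBolt [spec state]
  Bolt
  (prepare [_ conf]
    ;; :prepare is optional and defaults to producing an empty state map.
    (reset! state ((:prepare spec (fn [_] {})) conf)))
  (execute [_ tuple]
    ((:execute spec) @state tuple))
  (cleanup [_]
    (when-let [f (:cleanup spec)]
      (f @state))))

(defn map-bolt
  "Builds a bolt from a map, failing fast if :execute is not a function."
  [spec]
  {:pre [(map? spec) (fn? (:execute spec))]}
  (->MapBolt spec (atom nil)))

;; Behaviour is plain data, so specs can be merged like any other map.
(def base-spec
  {:prepare (fn [conf] {:suffix (get conf :suffix "!")})
   :execute (fn [state tuple] (str tuple (:suffix state)))})

(def shouting-spec
  (merge base-spec
         {:execute (fn [state tuple]
                     (str (string/upper-case tuple) (:suffix state)))}))
```

Because the functions live in an ordinary map, they can be called directly in tests without constructing the component, and a variant such as `shouting-spec` overrides one key while reusing the rest.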

There are utilities for reading, validating, and visualizing topologies specified in EDN, which are constructed and handed to you as a StormTopology object. These utilities are similar in spirit to Flux, but are more Clojure-specific. This is useful because a project can hold one generic set of Storm components and reuse them across any number of topology definitions.
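To make the read-and-validate step concrete, here is a small sketch using only `clojure.edn` and `clojure.set`. The EDN layout (`:spouts`, `:bolts`, `:fn`, `:inputs`), the `my.app/...` symbols, and the function names are all hypothetical; Breeze's real schema and utilities may look quite different, and this sketch stops short of building a StormTopology.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.set :as set])

;; Hypothetical topology description; not Breeze's actual EDN schema.
(def topology-edn
  "{:spouts {:sentences {:fn my.app/sentence-spout}}
    :bolts  {:split {:fn my.app/split-bolt :inputs [:sentences]}
             :count {:fn my.app/count-bolt :inputs [:split]}}}")

(defn undefined-inputs
  "Returns the set of input ids that no spout or bolt defines."
  [{:keys [spouts bolts]}]
  (let [defined (set (concat (keys spouts) (keys bolts)))
        used    (set (mapcat :inputs (vals bolts)))]
    (set/difference used defined)))

(defn read-topology
  "Parses EDN text and fails fast if any bolt input is undefined."
  [text]
  (let [topology (edn/read-string text)
        missing  (undefined-inputs topology)]
    (when (seq missing)
      (throw (ex-info "Undefined inputs" {:missing missing})))
    topology))
```

Since the topology is just a map, a second topology can be produced by reading a different EDN file or by merging maps, while the component code it refers to stays the same.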

TODO

  • Metrics API
  • Custom / Swappable Scheduler API
  • Visualization of stream-specific topologies
  • Higher-level interface that abstracts Storm

License

Copyright © 2016-2017 Arthur Maciejewicz

Distributed under the MIT License.