
@thbar thbar released this Jan 5, 2018 · 49 commits to master since this release

New StreamingRunner engine

Kiba 2 introduces a new, opt-in engine called the StreamingRunner, which lets a class transform generate an arbitrary number of output rows for each input row. This drastically improves the reusability and composability of Kiba components (see #44 for some background).

To opt in to the StreamingRunner, use the following code:

# activate the new Kiba internal config system
extend Kiba::DSLExtensions::Config
# opt in to the new engine
config :kiba, runner: Kiba::StreamingRunner

# write a transform class able to yield an arbitrary number of rows
class MyYieldingTransform
  def process(row)
    # parentheses ensure the braces are parsed as a Hash, not as a block
    yield({key: 1})
    yield({key: 2})
    # the return value of process is also emitted as a row
    {key: 3}
  end
end
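
To make the behaviour above concrete, here is a minimal sketch of how a streaming engine could combine yielded rows with the return value of `process`. It does not depend on Kiba: the `stream_through` helper is a hypothetical stand-in written for illustration, not Kiba's actual implementation.

```ruby
# Same transform as above, repeated so the sketch is self-contained.
class MyYieldingTransform
  def process(row)
    yield({key: 1})
    yield({key: 2})
    {key: 3}
  end
end

# Hypothetical stand-in for a streaming runner (an assumption, not Kiba code).
# For each input row it emits every row yielded by the transform, then the
# return value of process unless that value is nil.
def stream_through(rows, transform)
  Enumerator.new do |out|
    rows.each do |row|
      returned = transform.process(row) { |yielded| out << yielded }
      out << returned if returned
    end
  end
end

rows = stream_through([{source: "a"}], MyYieldingTransform.new).to_a
p rows
```

With a single input row, the sketch produces three output rows: the two yielded hashes followed by the returned one.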

The improved runner is compatible with Ruby 2.0+.

⚠️ It is strongly recommended not to share data between the rows yielded this way; otherwise, any change made to one row will also affect the others. Make sure to build completely independent rows (or use an immutable Hash structure).
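
The pitfall described in this warning can be reproduced without Kiba. In the sketch below, the two transform classes and the `collect` helper are hypothetical names introduced for illustration: one transform emits the same Hash object twice, the other emits independent copies.

```ruby
# Emits the very same Hash object twice (the situation to avoid).
class SharedRowsTransform
  def process(row)
    yield(row)
    row
  end
end

# Emits independent copies. A shallow dup is enough for flat rows;
# rows holding nested structures would need a deep copy instead.
class IndependentRowsTransform
  def process(row)
    yield(row.dup)
    row.dup
  end
end

# Hypothetical helper: gathers yielded rows plus the non-nil return value.
def collect(transform, row)
  rows = []
  returned = transform.process(row) { |yielded| rows << yielded }
  rows << returned if returned
  rows
end

shared = collect(SharedRowsTransform.new, {name: "a"})
shared.first[:name] = "changed"
p shared.last[:name]       # "changed": both entries are the same object

independent = collect(IndependentRowsTransform.new, {name: "a"})
independent.first[:name] = "changed"
p independent.last[:name]  # "a": the second row is unaffected
```

Freezing each row before emitting it is another option, since any later attempt to modify a frozen Hash raises an error instead of silently altering the other rows.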

Compatibility with Kiba 1

Kiba 2 is expected to be compatible with existing Kiba 1 scripts, as long as they do not rely on internal APIs.

Internal changes include:

  • An opt-in config system inspired by Elixir's mix, currently used only to select the runner you want at job declaration time
  • Stronger isolation in the Parser, to reduce the chance that ETL scripts conflict with Kiba internal classes