Running Kiba jobs from the command line

Thibaut Barrère edited this page Jan 6, 2018 · 2 revisions

To run your Kiba job, use the provided command-line tool:

bundle exec kiba my-data-processing-script.etl

To pass parameters to the kiba command line, check out this StackOverflow answer. If you need more flexibility, consider writing your jobs as Kiba.parse blocks instead.
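Because an .etl script is plain Ruby, one option that needs no Kiba-specific support is to read parameters from environment variables. The sketch below is only an illustration of that idea: the helper name job_param and the INPUT_FILE variable are hypothetical and are not part of Kiba.

```ruby
# Hypothetical helper (not part of Kiba): fetch a required job parameter
# from the environment and fail early with a clear message if it is missing.
# The env argument defaults to ENV and exists only to make testing easier.
def job_param(name, env = ENV)
  value = env[name]
  if value.nil? || value.strip.empty?
    raise ArgumentError, "Missing required parameter #{name}"
  end
  value
end

# Assumed usage inside an .etl script, evaluated at parsing time:
#
#   source MyCsvSource, job_param('INPUT_FILE')
#
# and on the command line:
#
#   INPUT_FILE=data.csv bundle exec kiba my-data-processing-script.etl
```

Failing at parsing time, before any row is read, keeps a missing parameter from surfacing as a confusing error halfway through the job.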

This command essentially starts a two-step process:

script_content = IO.read(filename)
# pass the filename to get line numbers on errors
job_definition = Kiba.parse(script_content, filename)
Kiba.run(job_definition)

Kiba.parse evaluates your ETL Ruby code to register sources, transforms, destinations and post-processors in a job definition. Because the script is ordinary Ruby, you can use Ruby logic at DSL parsing time. For example, the following code works, provided the CSV files are available at parsing time:

Dir['to_be_processed/*.csv'].each do |file|
  source MyCsvSource, file
end

Once the job definition is loaded, Kiba.run uses that information to do the actual processing. Processing is currently simple: row by row, single-threaded, and it stops at the first error encountered.
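To make the two phases concrete, here is a toy model in plain Ruby. It is not Kiba's implementation: ToyDefinition, toy_parse and toy_run are hypothetical names, and the model only handles in-memory sources and block transforms. It shows that the DSL calls merely record work during parsing, and that the run phase then pushes rows through the transforms one at a time.

```ruby
# Toy model of a parse-then-run ETL job. NOT Kiba's code; all names here
# are hypothetical and exist only to illustrate the two phases.
class ToyDefinition
  attr_reader :sources, :transforms

  def initialize
    @sources = []
    @transforms = []
  end

  # DSL methods only register what to do; no row is processed yet.
  def source(rows)
    @sources << rows
  end

  def transform(&block)
    @transforms << block
  end
end

# Phase 1: evaluate the DSL block to build a definition.
def toy_parse(&block)
  definition = ToyDefinition.new
  definition.instance_eval(&block)
  definition
end

# Phase 2: process rows one by one on a single thread. Nothing is rescued,
# so the first exception raised by a transform stops the whole run.
def toy_run(definition)
  output = []
  definition.sources.each do |rows|
    rows.each do |row|
      output << definition.transforms.reduce(row) { |current, t| t.call(current) }
    end
  end
  output
end

job = toy_parse do
  # Ordinary Ruby runs here at parsing time, like the Dir[...] loop above.
  [[1, 2], [3, 4]].each { |batch| source batch }
  transform { |row| row * 10 }
end

toy_run(job) # => [10, 20, 30, 40]
```

In this model, a transform that raises on a given row ends toy_run at that row, and later rows are never read. That mirrors the stop-at-first-error behaviour described above.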
