Kiba enhancement for Ruby ETL.
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.

README.md

kiba-plus

Kiba enhancement for Ruby ETL. It connects to various data sources including relational, non-relational, and flat file, cloud services and HTTP resources. It has flexible load strategies including insert, bulk load and upsert.

Usage

# /tmp/customer_mysql_to_pg.etl

require 'kiba/plus'

require 'kiba/plus/source/mysql'
require 'kiba/plus/destination/pg_bulk2'

SOURCE_URL = 'mysql://root@localhost/shopperplus'

DEST_URL   = 'postgresql://hooopo@localhost:5432/crm2_dev'

source Kiba::Plus::Source::Mysql, { :connect_url => SOURCE_URL,
                           :query => %Q{SELECT id, email, 'hooopo' AS first_name, 'Wang' AS last_name FROM customers}
                         }

destination Kiba::Plus::Destination::PgBulk2, { :connect_url => DEST_URL,
                                :table_name => "customers",
                                :truncate => true,
                                :columns => [:id, :email, :first_name, :last_name],
                                :incremental => false
                              }

post_process do
  result = PG.connect(DEST_URL).query("SELECT COUNT(*) AS num FROM customers")
  puts "Insert total: #{result.first['num']}"
end

Execute in shell:

$ bundle exec kiba /tmp/customer_mysql_to_pg.etl

# Output:
# I, [2016-05-16T01:53:36.832565 #87909]  INFO -- : TRUNCATE TABLE customers;
# I, [2016-05-16T01:53:36.841770 #87909]  INFO -- : COPY customers (id, email, first_name, last_name) FROM STDIN WITH DELIMITER ',' NULL '\N' CSV
# Insert total: 428972

Execute in ruby script:

require 'kiba'

job_definition = Kiba.parse(IO.read('/tmp/customer_mysql_to_pg.etl'), '/tmp/customer_mysql_to_pg.etl')
Kiba.run(job_definition)

Examples

Main Feature

  • Csv Source

  • MySQL Source

  • Postgresql Source

  • Citus Source

  • Greenplus Source

  • MongoDB Source (TODO)

  • Elastic Source (TODO)

  • Redshift Source (TODO)

  • Csv Destination

  • MySQL Destination

  • Postgresql Destination

  • Citus Destination

  • Greenplus Destination

  • MongoDB Destination (TODO)

  • Elastic Destination (TODO)

  • Redshift Destination (TODO)

  • Bulk Load for large dataset

  • Upsert for MySQL & Postgresql

  • Incremental Update

Installation

Add this line to your application's Gemfile:

gem 'kiba'
gem 'kiba-plus'

And then execute:

$ bundle

Or install it yourself as:

$ gem install kiba-plus

Development

First of all, Please run the following code in shell.

$ mysql -e 'create database kiba_plus_src_test;'

$ mysql -e 'create database kiba_plus_dest_test;'

$ psql -c 'create database kiba_plus_src_test;' -U postgres

$ psql -c 'create database kiba_plus_dest_test;' -U postgres

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/hooopo/kiba-plus.