
nosqoop4u

A sqoop-like jruby/jdbc query application that does not run via map/reduce.
It supports direct output to HDFS and unix filesystems as well as STDOUT.
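
For illustration only, here is a minimal Ruby sketch of how the three
kinds of output target mentioned above could be told apart.  The helper
name output_kind and the rule that a bare path means a local file are
assumptions made for this example; they are not taken from the
nosqoop4u source.

```ruby
require 'uri'

# Hypothetical helper (not part of nosqoop4u): classify an --output value.
# Returns :stdout, :hdfs, or :file.
def output_kind(target)
  # "-" is the documented way to ask for STDOUT.
  return :stdout if target == '-'

  scheme = URI.parse(target).scheme
  case scheme
  when 'hdfs' then :hdfs
  # Assumption: a path with no scheme is treated as a local file.
  when 'file', nil then :file
  else
    raise ArgumentError, "unsupported output scheme: #{scheme.inspect}"
  end
end
```

A real implementation would go on to open the matching writer: the
Hadoop FileSystem API for hdfs://, a plain File for file://, and
$stdout for "-".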

Requires jruby 1.6+; 1.6.4+ is HIGHLY recommended for top performance.

# Why write this?  What's wrong with sqoop?

Nothing is wrong with sqoop.  It just doesn't do quite what I need, and
it's not so straightforward to debug when something goes wrong.  In
addition, my Hadoop cluster does not route beyond an access layer, which
means I need to create individual db-specific firewall rules in order to
use sqoop.

Also, I wanted a nice opportunity to play with jruby's java interop,
and here I get to use both the JDBC and Hadoop HDFS APIs.  On a side
note, I'm blown away by jruby 1.6.2: stellar performance and an
elegant, seamless integration with java.

# installation

jgem install --no-wrapper nosqoop4u

Or simply copy nosqoop4u.rb and nosqoop4u into your path.

# usage

usage: nosqoop4u options
  -o, --output      # output file (hdfs://, file://, - for stdout)
  -c, --connect url # jdbc connection url (env NS4U_URL)
  -u, --user        # db username         (env NS4U_USER)
  -p, --pass        # db password         (env NS4U_PASS)
  -d, --driver      # JDBC driver class   (env NS4U_DRIVER)
  -e, --query       # sql query to run
  -F, --delim       # delimiter (default: ^A)
  -f, --fetch       # fetch size
  -v, --version
  -h, --help
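
The listing above pairs four flags with NS4U_* environment variables.
Here is a rough sketch of how that fallback could work, assuming a
command-line flag overrides the matching variable (the README does not
say which wins).  The function name parse_settings and the hash keys
are made up for this example, and this is not the actual nosqoop4u
option code.

```ruby
require 'optparse'

# Hypothetical sketch (not the nosqoop4u source): build a settings hash
# from the documented flags, falling back to the NS4U_* variables.
# Assumption: a flag given on the command line overrides the environment.
def parse_settings(argv, env = ENV)
  settings = {
    url:    env['NS4U_URL'],
    user:   env['NS4U_USER'],
    pass:   env['NS4U_PASS'],
    driver: env['NS4U_DRIVER'],
    delim:  "\x01" # ^A, the documented default delimiter
  }

  OptionParser.new do |o|
    o.on('-o', '--output FILE')      { |v| settings[:output] = v }
    o.on('-c', '--connect URL')      { |v| settings[:url] = v }
    o.on('-u', '--user USER')        { |v| settings[:user] = v }
    o.on('-p', '--pass PASS')        { |v| settings[:pass] = v }
    o.on('-d', '--driver CLASS')     { |v| settings[:driver] = v }
    o.on('-e', '--query SQL')        { |v| settings[:query] = v }
    o.on('-F', '--delim DELIM')      { |v| settings[:delim] = v }
    o.on('-f', '--fetch N', Integer) { |v| settings[:fetch] = v }
  end.parse(argv) # parse (not parse!) leaves argv untouched

  settings
end
```

With NS4U_URL and NS4U_USER exported in the shell, only the query and
output target would then need to be passed on the command line.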

Please let me know if you find this software useful!

--frank