JDBC plugin feeder mode as an alternative to the deprecated Elasticsearch River API

Jörg Prante edited this page Mar 31, 2015 · 2 revisions

How to set up a JDBC feeder script for Elasticsearch

Now that rivers are deprecated, I want to show how to set up an alternative method for indexing data from a JDBC source into Elasticsearch.

The setup is very easy.

In this example, I assume you are on Linux and want to connect to a MySQL DB.

  1. Install Elasticsearch as usual

  2. Install JDBC plugin with the ./bin/plugin tool and add a JDBC driver jar as usual. This means, you must install JDBC plugin on all nodes of the cluster and you must restart all nodes to activate the plugin cluster-wide.

  3. Now we decide not to use the river API, but instead the feeder mechanism, which works by starting a separate JVM.

  4. Set a bash variable ES_HOME to the home directory of Elasticsearch (if not already exists)

  5. Create a JDBC feed script directory

    mkdir -p $ES_HOME/bin/feeder/
    
  6. Create JDBC feed script $ES_HOME/bin/feeder/mysql.sh

    #!/bin/bash
    
    # ES_HOME required to detect elasticsearch jars
    
    export ES_HOME=~es/elasticsearch-1.4.4
    
    echo '
    {
    "elasticsearch" : {
         "cluster" : "elasticsearch",
         "host" : "localhost",
         "port" : 9300
    },
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "",
        "password" : "",
        "sql" :  "select *, order_id as _id from orders"
      }
    }
    ' | java \
    -cp "${ES_HOME}/lib/*:${ES_HOME}/plugins/jdbc/*" \
    org.xbib.elasticsearch.plugin.jdbc.feeder.Runner \  
    org.xbib.elasticsearch.plugin.jdbc.feeder.JDBCFeeder
    
  7. Set execution attribute with

    chmod +x $ES_HOME/bin/feeder/mysql.sh
    
  8. Create a log4j.properties file and add it to the classpath, e.g. $ES_HOME/lib/log4j.properties

    log4j.rootLogger=DEBUG, out
    log4j.appender.out=org.apache.log4j.ConsoleAppender
    log4j.appender.out.layout=org.apache.log4j.PatternLayout
    log4j.appender.out.layout.ConversionPattern=[%d{ABSOLUTE}][%-5p][%-25c][%t] %m%n
    
  9. Start feeder with

    $ES_HOME/bin/feeder/mysql.sh
    

and now you will see debug messages how the feeder starts up and tries to connect to the specified JDBC source and Elasticsearch node, performing the heavy work.

Voilà!