
Azure DocumentDB output plugin for Logstash

logstash-output-documentdb is a Logstash plugin that outputs events to Azure DocumentDB. Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite destinations. Azure DocumentDB is a managed NoSQL database service provided by Microsoft Azure. It is schemaless, natively supports JSON, and is designed to be easy to use, fast, highly reliable, and quick to deploy.

Installation

You can install this plugin using the Logstash "plugin" or "logstash-plugin" (for newer versions of Logstash) command:

# Older versions of Logstash
bin/plugin install logstash-output-documentdb
# Newer versions of Logstash
bin/logstash-plugin install logstash-output-documentdb

Please see the Logstash reference for more information.

Configuration

output {
    documentdb {
        docdb_endpoint => "https://<YOUR ACCOUNT>.documents.azure.com:443/"
        docdb_account_key => "<ACCOUNT KEY>"
        docdb_database => "<DATABASE NAME>"
        docdb_collection => "<COLLECTION NAME>"
        auto_create_database => true|false
        auto_create_collection => true|false
        partitioned_collection => true|false
        partition_key =>  "<PARTITIONED KEY NAME>"
        offer_throughput => <THROUGHPUT NUM>
    }
}
  • docdb_endpoint (required) - Azure DocumentDB Account endpoint URI
  • docdb_account_key (required) - Azure DocumentDB Account key (master key). You must NOT set a read-only key
  • docdb_database (required) - DocumentDB database name
  • docdb_collection (required) - DocumentDB collection name
  • auto_create_database (optional) - Default:true. By default, the DocumentDB database named docdb_database is created automatically if it does not exist
  • auto_create_collection (optional) - Default:true. By default, the DocumentDB collection named docdb_collection is created automatically if it does not exist
  • partitioned_collection (optional) - Default:false. Set to true if you want to create and/or store records in a partitioned collection. Set to false for a single-partition collection
  • partition_key (optional) - Default:nil. A partition key must be specified for a partitioned collection (partitioned_collection set to true)
  • offer_throughput (optional) - Default:10100. Throughput for the collection, expressed in units of 100 request units per second. This only takes effect when a partitioned collection is newly created (i.e. both auto_create_collection and partitioned_collection are set to true)
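
The rules in the option list above can be checked before Logstash is started. The following Ruby sketch is only an illustration and is not the plugin's own code: the helper names check_documentdb_options and offer_throughput_effective? are hypothetical, and the defaults are copied from the list above. It reports missing required options, flags a missing partition_key for a partitioned collection, and tells whether offer_throughput would have any effect.

```ruby
# Hypothetical helpers, not part of logstash-output-documentdb.
# They encode the option rules from the list above.
REQUIRED_OPTIONS = %w[
  docdb_endpoint docdb_account_key docdb_database docdb_collection
].freeze

# Defaults as documented in the option list.
DEFAULT_OPTIONS = {
  'auto_create_database'   => true,
  'auto_create_collection' => true,
  'partitioned_collection' => false,
  'partition_key'          => nil,
  'offer_throughput'       => 10_100
}.freeze

# Returns a list of human-readable problems; empty when the options look valid.
def check_documentdb_options(options)
  settings = DEFAULT_OPTIONS.merge(options)
  problems = []

  REQUIRED_OPTIONS.each do |name|
    problems << "#{name} is required" if settings[name].to_s.empty?
  end

  if settings['partitioned_collection'] == true && settings['partition_key'].to_s.empty?
    problems << 'partition_key is required when partitioned_collection is true'
  end

  problems
end

# offer_throughput only matters when a partitioned collection is auto-created.
def offer_throughput_effective?(options)
  settings = DEFAULT_OPTIONS.merge(options)
  settings['auto_create_collection'] == true && settings['partitioned_collection'] == true
end

problems = check_documentdb_options(
  'docdb_endpoint'         => 'https://example.documents.azure.com:443/',
  'docdb_account_key'      => 'dummy-key',
  'docdb_database'         => 'testdb',
  'docdb_collection'       => 'apache_access',
  'partitioned_collection' => true
)
puts problems
# prints: partition_key is required when partitioned_collection is true
```

The sketch cannot tell a master key from a read-only key, so that requirement still has to be verified by hand.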

Tests

logstash-output-documentdb automatically adds an id attribute (UUID format) to incoming events and sends them to DocumentDB. Here is an example configuration in which the Logstash event source is an Apache2 access log and the destination is DocumentDB.

Example Configuration

input {
    file {
        path => "/var/log/apache2/access.log"
        start_position => "beginning"
    }
}

filter {
    if [path] =~ "access" {
        mutate { replace => { "type" => "apache_access" } }
        grok {
            match => { "message" => "%{COMBINEDAPACHELOG}" }
        }
    }
    date {
        match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
}

output {
    documentdb {
        docdb_endpoint => "https://yoichikademo.documents.azure.com:443/"
        docdb_account_key => "EMwUa3EzsAtJ1qYfzwo9nQxxydofsXNm3xLh1SLffKkUHMFl80OZRZIVu4lxdKRKxkgVAj0c2mv9BZSyMN7tdg==(dummy)"
        docdb_database => "testdb"
        docdb_collection => "apache_access"
        auto_create_database => true
        auto_create_collection => true
    }
    # for debug
    stdout { codec => rubydebug }
}

You can find example configuration files in logstash-output-documentdb/examples.
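
To make the date filter in the example configuration more concrete, here is a rough Ruby sketch of the conversion it performs on the timestamp field extracted by grok. It is an illustration only, not Logstash code: the helper name apache_timestamp_to_iso8601 is hypothetical, and the strptime format string is an assumed equivalent of the pattern dd/MMM/yyyy:HH:mm:ss Z used in the filter.

```ruby
require 'time'

# Assumed strptime equivalent of the date filter pattern
# "dd/MMM/yyyy:HH:mm:ss Z" from the example configuration.
APACHE_TIME_FORMAT = '%d/%b/%Y:%H:%M:%S %z'

# Hypothetical helper: parses an Apache access log timestamp and returns
# it as a UTC ISO 8601 string with millisecond precision.
def apache_timestamp_to_iso8601(raw)
  Time.strptime(raw, APACHE_TIME_FORMAT).utc.iso8601(3)
end

puts apache_timestamp_to_iso8601('27/Dec/2016:02:12:28 +0000')
# prints: 2016-12-27T02:12:28.000Z
```

Because the result is normalized to UTC, a log line written with a different offset (for example +0900) maps to the same instant rather than to the local wall-clock time.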

Run the plugin with the example configuration

Now run Logstash with the example configuration like this:

# Test your Logstash configuration before actually running Logstash
# (on Logstash 5.0 and later, use --config.test_and_exit instead of --configtest)
bin/logstash -f logstash-apache2-to-documentdb.conf --configtest
# Run Logstash
bin/logstash -f logstash-apache2-to-documentdb.conf

Here is the expected output for a sample input (an Apache2 access log line):

Apache2 access log

124.211.152.166 - - [27/Dec/2016:02:12:28 +0000] "GET /test.html HTTP/1.1" 200 316 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"

Output (rubydebug)

{
        "message" => "124.211.152.166 - - [27/Dec/2016:02:12:28 +0000] \"GET /test.html HTTP/1.1\" 200 316 \"-\" \"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36\"",
       "@version" => "1",
     "@timestamp" => "2016-12-27T02:12:28.000Z",
           "path" => "/var/log/apache2/access.log",
           "host" => "yoichitest01",
           "type" => "apache_access",
       "clientip" => "124.211.152.166",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "27/Dec/2016:02:12:28 +0000",
           "verb" => "GET",
        "request" => "/test.html",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "316",
       "referrer" => "\"-\"",
          "agent" => "\"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36\"",
             "id" => "0cae1966-b7ab-4f32-8893-b4fabc7800ae"
}
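
The id field at the end of the output above is the UUID that the plugin adds to each event. As a minimal sketch of that idea, assuming an event can be treated as a plain Ruby hash, the following code attaches a random UUID and serializes the result to JSON. The helper name with_document_id is hypothetical and this is not the plugin's actual implementation. In particular, the sketch overwrites any existing id value, which may differ from what the plugin does.

```ruby
require 'securerandom'
require 'json'

# Hypothetical helper, not the plugin's code: returns a copy of the event
# with a random UUID stored under "id". The input hash is left unchanged.
def with_document_id(event)
  event.merge('id' => SecureRandom.uuid)
end

event = {
  'clientip' => '124.211.152.166',
  'verb'     => 'GET',
  'request'  => '/test.html'
}

document = with_document_id(event)

# The id differs on every run, so no fixed output is shown here.
puts JSON.generate(document)
```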

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yokawasa/logstash-output-documentdb.