Skip to content
master
Go to file
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
img
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Azure DocumentDB output plugin for Fluentd

fluent-plugin-documentdb is a fluent plugin to output to Azure DocumentDB

fluent-plugin-documentdb overview

[NEWS] From fluent-plugin-documentdb-0.2.0, it supports partitioned collections, not only single-partition collections (See Partitioning and scaling in Azure DocumentDB for partitioned collections and single-partition collection ).

Requirements

fluent-plugin-documentdb fluentd ruby
>= 0.3.0 >= v0.14.15 >= 2.1
< 0.3.0 >= v0.12.0 >= 1.9

Installation

$ gem install fluent-plugin-documentdb

Configuration

DocumentDB

To use Microsoft Azure DocumentDB, you must create a DocumentDB database account using either the Azure portal, Azure Resource Manager templates, or Azure command-line interface (CLI). In addition, you must have a database and a collection to which fluent-plugin-documentdb writes event-stream out. Here are instructions:

Fluentd - fluent.conf

<match documentdb.*>
    @type documentdb
    @log_level info
    docdb_endpoint  DOCUMENTDB_ACCOUNT_ENDPOINT
    docdb_account_key DOCUMENTDB_ACCOUNT_KEY
    docdb_database  mydb
    docdb_collection mycollection
    auto_create_database true
    auto_create_collection true
    partitioned_collection true 
    partition_key PARTITION_EKY
    offer_throughput 10100
    time_format %s
    localtime false
    add_time_field true
    time_field_name time
    add_tag_field true
    tag_field_name time
</match>
  • docdb_endpoint (required) - Azure DocumentDB Account endpoint URI
  • docdb_account_key (required) - Azure DocumentDB Account key (master key). You must NOT set a read-only key
  • docdb_database (required) - DocumentDB database nameb
  • docdb_collection (required) - DocumentDB collection name
  • auto_create_database (optional) - Default:true. By default, DocumentDB database named docdb_database will be automatically created if it does not exist
  • auto_create_collection (optional) - Default:true. By default, DocumentDB collection named docdb_collection will be automatically created if it does not exist
  • partitioned_collection (optional) - Default:false. Set true if you want to create and/or store records to partitioned collection. Set false for single-partition collection
  • partition_key (optional) - Default:nil. Partition key must be specified for paritioned collection (partitioned_collection set to be true)
  • offer_throughput (optional) - Default:10100. Throughput for the collection expressed in units of 100 request units per second. This is only effective when you newly create a partitioned collection (ie. Both auto_create_collection and partitioned_collection are set to be true )
  • localtime (optional) - Default:false. By default, time record is inserted with UTC (Coordinated Universal Time). This option allows to use local time if you set localtime true
  • time_format (optional) - Default:%s. Time format for a time field to be inserted. Default format is %s, that is unix epoch time. If you want it to be more human readable, set this %Y%m%d-%H:%M:%S, for example.
  • add_time_field (optional) - Default:true. This option allows to insert a time field to record
  • time_field_name (optional) - Default:time. Time field name to be inserted
  • add_tag_field (optional) - Default:true. This option allows to insert a tag field to record
  • tag_field_name (optional) - Default:tag. Tag field name to be inserted

[note] @log_level is a fluentd built-in parameter (optional) that controls verbosity of logging: fatal|error|warn|info|debug|trace (See also Logging of Fluentd)

Configuration examples

fluent-plugin-documentdb will add id attribute which is UUID format and any other attributes of record automatically. In addition, it will add time and tag attributes if add_time_field and add_tag_field are true respectively. Please see 2 types of the plugin configurations example below - single-parition collection and partitioned collection. Source for fluentd to read is apache access log.

(1) Single-Partition Collection Case

fluent.conf

<source>
    @type tail                          # input plugin
    path /var/log/apache2/access.log   # monitoring file
    pos_file /tmp/fluentd_pos_file     # position file
    format apache                      # format
    tag documentdb.access              # tag
</source>

<match documentdb.*>
    @type documentdb
    docdb_endpoint https://yoichikademo.documents.azure.com:443/
    docdb_account_key Tl1xykQxnExUisJ+BXwbbaC8NtUqYVE9kUDXCNust5aYBduhui29Xtxz3DLP88PayjtgtnARc1PW+2wlA6jCJw==
    docdb_database mydb
    docdb_collection my-single-partition-collection
    auto_create_database true
    auto_create_collection true
    partitioned_collection true 
    localtime true
    time_format %Y%m%d-%H:%M:%S
    add_time_field true
    time_field_name time
    add_tag_field true
    tag_field_name tag
</match>

(2) Partitioned Collection Case

fluent.conf

<source>
    @type tail                          # input plugin
    path /var/log/apache2/access.log   # monitoring file
    pos_file /tmp/fluentd_pos_file     # position file
    format apache                      # format
    tag documentdb.access              # tag
</source>

<match documentdb.*>
    @type documentdb
    docdb_endpoint https://yoichikademo.documents.azure.com:443/
    docdb_account_key Tl1xykQxnExUisJ+BXwbbaC8NtUqYVE9kUDXCNust5aYBduhui29Xtxz3DLP88PayjtgtnARc1PW+2wlA6jCJw==
    docdb_database mydb
    docdb_collection my-partitioned-collection
    auto_create_database true
    auto_create_collection true
    partitioned_collection true 
    partition_key host
    offer_throughput 10100
    localtime true
    time_format %Y%m%d-%H:%M:%S
    add_time_field true
    time_field_name time
    add_tag_field true
    tag_field_name tag
</match>

Sample inputs and expected records

An expected output record for sample input will be like this:

Sample Input (apache access log)

125.212.152.166 - - [17/Jan/2016:05:03:25 +0000] "GET /foo/bar/test.html HTTP/1.1" 304 179 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"

Output Record

{
    id :  d2b2ece8-b948-41ae-a894-0ed1266e242a,
    host :  125.211.152.166,
    user :  -,
    method :  GET,
    path :  /foo/bar/test.html,
    code :  304,
    size :  179,
    referer :  -,
    agent :  Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36,
    time :  20160117-05:03:25,
    tag :  documentdb.access
}  

Tests

Running test code

$ git clone https://github.com/yokawasa/fluent-plugin-documentdb.git
$ cd fluent-plugin-documentdb

# edit CONFIG params of test/plugin/test_documentdb.rb 
$ vi test/plugin/test_documentdb.rb

# run test 
$ rake test

Creating package, running and testing locally

$ rake build
$ rake install:local
 
# running fluentd with your fluent.conf
$ fluentd -c fluent.conf -vv &
 
# send test apache requests for testing plugin ( only in the case that input source is apache access log )
$ ab -n 5 -c 2 http://localhost/foo/bar/test.html

TODOs

Change log

Links

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/yokawasa/fluent-plugin-documentdb.

Copyright

CopyrightCopyright (c) 2016- Yoichi Kawasaki
LicenseApache License, Version 2.0
You can’t perform that action at this time.