fix to include ConfigPlaceholders mix-in, and mark as v0.1.0 release

1 parent 39d59e5, commit bbbbaa63057ad9ec0acd6aafffa9c875ab1754a2, tagomoris committed Aug 20, 2012
Showing with 40 additions and 3 deletions.
  1. +25 −2 README.md
  2. +3 −1 fluent-plugin-webhdfs.gemspec
  3. +3 −0 lib/fluent/plugin/out_webhdfs.rb
  4. +9 −0 test/plugin/test_out_webhdfs.rb
@@ -63,10 +63,33 @@ Store data as TSV (TAB separated values) of specified keys, without time, with t
If a message doesn't have a specified attribute, fluent-plugin-webhdfs outputs 'NULL' in place of its value.
+### Performance notes
kzk (Owner) commented on Aug 20, 2012: Good note!
+
+Writing to a single HDFS file from 2 or more fluentd nodes creates many bad blocks in HDFS. If you want to run 2 or more fluentd nodes with fluent-plugin-webhdfs, you should configure a distinct 'path' for each node.
+You can use the '${hostname}' or '${uuid:random}' placeholders in the configuration for this purpose.
+
+For hostname:
+
+ <match access.**>
+ type webhdfs
+ host namenode.your.cluster.local
+ port 50070
+ path /log/access/%Y%m%d/${hostname}.log
+ </match>
+
+Or with a random filename (only to avoid duplicate file names):
+
+ <match access.**>
+ type webhdfs
+ host namenode.your.cluster.local
+ port 50070
+ path /log/access/%Y%m%d/${uuid:random}.log
+ </match>
+
+With the configurations above, you can handle all files matching '/log/access/20120820/*' as the access logs for the specified time slice.
+
## TODO
-* long run test
- * over webhdfs and httpfs
* patches welcome!
## Copyright
@@ -1,7 +1,7 @@
# -*- encoding: utf-8 -*-
Gem::Specification.new do |gem|
gem.name = "fluent-plugin-webhdfs"
- gem.version = "0.0.5"
+ gem.version = "0.1.0"
gem.authors = ["TAGOMORI Satoshi"]
gem.email = ["tagomoris@gmail.com"]
gem.summary = %q{Fluentd plugin to write data on HDFS over WebHDFS, with flexible formatting}
@@ -16,8 +16,10 @@ Gem::Specification.new do |gem|
gem.add_development_dependency "rake"
gem.add_development_dependency "fluentd"
gem.add_development_dependency "fluent-mixin-plaintextformatter"
+ gem.add_development_dependency "fluent-mixin-config-placeholders"
gem.add_development_dependency "webhdfs", '>= 0.5.1'
gem.add_runtime_dependency "fluentd"
gem.add_runtime_dependency "fluent-mixin-plaintextformatter"
+ gem.add_runtime_dependency "fluent-mixin-config-placeholders"
gem.add_runtime_dependency "webhdfs", '>= 0.5.1'
end
@@ -1,5 +1,6 @@
# -*- coding: utf-8 -*-
+require 'fluent/mixin/config_placeholders'
require 'fluent/mixin/plaintextformatter'
class Fluent::WebHDFSOutput < Fluent::TimeSlicedOutput
@@ -12,6 +13,8 @@ class Fluent::WebHDFSOutput < Fluent::TimeSlicedOutput
config_param :port, :integer, :default => 50070
config_param :namenode, :string, :default => nil # host:port
+ include Fluent::Mixin::ConfigPlaceholders
+
config_param :path, :string
config_param :username, :string, :default => nil
@@ -41,6 +41,15 @@ def test_configure
assert_equal 'hdfs_user', d.instance.username
end
+ def test_configure_placeholders
+ d = create_driver %[
+hostname testing.node.local
+namenode server.local:50070
+path /hdfs/${hostname}/file.%Y%m%d%H.log
+]
+ assert_equal '/hdfs/testing.node.local/file.%Y%m%d%H.log', d.instance.path
+ end
+
def test_path_format
d = create_driver
assert_equal '/hdfs/path/file.%Y%m%d.log', d.instance.path
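
To make the behaviour checked by `test_configure_placeholders` easier to picture, here is a minimal Ruby sketch of placeholder expansion followed by time-slice formatting. It is not the code of `fluent-mixin-config-placeholders`; the helper name `expand_placeholders` and its `hostname:` keyword are hypothetical, introduced only for illustration. It assumes the placeholders are replaced once at configure time and that the `%Y%m%d%H` part is left in the path to be formatted later per time slice, which is what the test's expected value suggests.

```ruby
require 'socket'
require 'securerandom'

# Hypothetical helper (an assumption, not the mixin's real API):
# replaces '${hostname}' and '${uuid:random}' in a config value and
# leaves strftime-style specifiers such as %Y%m%d untouched.
def expand_placeholders(value, hostname: Socket.gethostname)
  value
    .gsub('${hostname}') { hostname }               # literal string match, not a regex
    .gsub('${uuid:random}') { SecureRandom.uuid }   # fresh UUID for each occurrence
end

# Same input as test_configure_placeholders in the diff above.
path = expand_placeholders('/hdfs/${hostname}/file.%Y%m%d%H.log',
                           hostname: 'testing.node.local')

# The time-slice part is resolved afterwards, here for an example timestamp.
sliced = Time.utc(2012, 8, 20, 9).strftime(path)
puts path
puts sliced
```

Because each node resolves `${hostname}` to its own name (or `${uuid:random}` to its own UUID), two nodes sharing the same configuration end up writing to different HDFS files within the same time-slice directory.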
