Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make alternative of FileUtils #1285

Open
tagomoris opened this issue Oct 19, 2016 · 3 comments
Open

Make alternative of FileUtils #1285

tagomoris opened this issue Oct 19, 2016 · 3 comments

Comments

@tagomoris
Copy link
Member

tagomoris commented Oct 19, 2016

Updated: StringUtil.match_regexp is now easy to replace with Fluent::Plugin::Base#string_safe_encoding.
There are still Fluent::FileUtil.writable? and #writable_p?.

Fluent::Plugin::StringUtil.match_regexp (implementation was moved to Fluent::Compat::StringUtil now) has some implementation problem:

  • it provides encode-safety about regular expression matching, but method name doesn't show it
  • it uses $log, but should use plugin logger
  • scrubbing strings is very universal process: not only for regular expression matching

My idea is to add a method like this:

def ensure_safe_encoding(str)
  begin
    yield str
  rescue ArgumentError => e
    raise e unless e.message.include?("invalid byte sequence in".freeze)
    log.info "invalid byte sequence is replaced in `#{string}`"
    str = str.scrub('?')
    retry
  end
end

We can use this method like below:

m = ensure_safe_encoding(line){|s| s =~ /my pattern/}
# or other methods or ...
chars  = ensure_safe_encoding(line){|s| s.split }

Blockers for implementation are:

  • Naming problem of the method
  • Namespace? lib/fluent? lib/fluent/util? or...
  • Should it be a method of Fluent::Plugin::Base? plugin helper? (If true, we'll have so many helpers...) or just Mixin? (We're eliminating mixins, but...)
@tagomoris
Copy link
Member Author

match_regexp is used in:

  • lib/fluent/plugin/parser.rb
  • lib/fluent/plugin/filter_grep.rb

@nurse
Copy link
Contributor

nurse commented Oct 19, 2016

You can write just like following. Note that String#valid_encoding? uses a cache in VALUE if the string has already checked the validity.

unless string.valid_encoding?
  log.info "invalid byte sequence is replaced in `#{string}`"
  string = string.scrub('?')
end

@tagomoris
Copy link
Member Author

There's a same issue about FileUtil.

  • FileUtil.writable? does different check from File.writable?... we should use another method name

@tagomoris tagomoris changed the title Make alternative of StringUtil.match_regexp Make alternative of FileUtils Nov 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants