filterConfObj.foreach iteration order cannot be controlled #29

Closed
RickyHuo opened this issue Aug 28, 2017 · 5 comments

Comments

@RickyHuo
Contributor

filterConfObj.foreach does not iterate over the plugins in the top-to-bottom order in which they appear in the configuration file.

@garyelephant
Contributor

garyelephant commented Aug 29, 2017

@RickyHuo please evaluate the following 3 options:


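Option 1:
typesafe config does not preserve the order of entries when it loads the configuration, so our only choice is to change the format in application.conf from a key-value map to a list: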

# After the change, the configuration would look roughly like this
filter {
    {
        type = split
        ...
    }
    {
        type = grok
        ...
    }
}

With this change, how do we support conditional configuration such as if/else? One way to support it would be:

# After the change, the configuration would look roughly like this; the entries below implement an if/else:
filter {
    {
        condition = "status >= 400"
        type = split
        ...
    }
    {
        condition = "status < 400"
        type = grok
        ...
    }
}

Overall, this style of configuration is not as intuitive or simple to understand, but it is relatively easy to implement.
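To make the trade-off in Option 1 concrete, here is a minimal Python sketch of how an ordered list of entries with an optional `condition` could be walked from top to bottom. It is only an illustration, not the project's implementation: the helper names (`matches`, `select_plugins`), the in-memory `FILTERS` list, and the restriction of conditions to the form `field op integer` are all assumptions made for this example.

```python
import operator
import re

# Comparison operators accepted in a condition string (an assumption of this sketch).
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Matches conditions of the form "<field> <op> <integer>", e.g. "status >= 400".
CONDITION_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+)\s*$")


def matches(condition, record):
    """Return True if the record satisfies a 'field op integer' condition."""
    if condition is None:  # entries without a condition always apply
        return True
    m = CONDITION_RE.match(condition)
    if m is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    field, op, literal = m.groups()
    return OPS[op](record[field], int(literal))


def select_plugins(filters, record):
    """Walk the list top to bottom and return the types of the entries that apply."""
    return [e["type"] for e in filters if matches(e.get("condition"), record)]


# Hypothetical in-memory form of the list-style `filter` section shown above.
FILTERS = [
    {"condition": "status >= 400", "type": "split"},
    {"condition": "status < 400", "type": "grok"},
]

print(select_plugins(FILTERS, {"status": 404}))  # ['split']
print(select_plugins(FILTERS, {"status": 200}))  # ['grok']
```

Because the entries live in a list, the iteration order is simply the order in which they were written. The cost is that an if/else has to be spelled out as two complementary conditions.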


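Option 2: look for another configuration parsing library as a replacement. So far only the following one has been found, and whether it meets the requirements is still being confirmed: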
http://www.cfg4j.org/


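Option 3: use antlr4 to define our own parsing rules for the configuration file. This satisfies both the requirement that the configuration look intuitive and simple and the requirement to support conditionals (if else).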

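This has a learning cost and is harder to implement, but it meets the requirements best (key-value form, plugins keep their order, conditionals, Field references).
References: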
https://ivanyu.me/blog/2014/09/13/creating-a-simple-parser-with-antlr/
http://progur.com/2016/09/how-to-create-language-using-antlr4.html
https://blog.knoldus.com/2016/05/04/creating-a-dsl-domain-specific-language-using-antlr-part-ii-writing-the-grammar-file/

@garyelephant garyelephant added this to the M1 milestone Aug 29, 2017
@garyelephant
Contributor

garyelephant commented Sep 7, 2017

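One more option is to modify the typesafe config source code and change the data structure it uses to store the configuration to a LinkedHashMap, which would preserve the order.

Note: this still does not meet the requirement for conditionals.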

@garyelephant
Contributor

garyelephant commented Sep 10, 2017

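Requirements for the configuration file parser:

  • Key-value form

  • Plugins keep their order

  • Conditionals

  • Field references

  • Substitution of environment variables / user-defined variables in the configuration (including how user-defined variables are referenced)

  • Global configuration

  • Error messages and error location for configuration files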

@garyelephant
Contributor

garyelephant commented Sep 10, 2017

The antlr4 grammar files are here:

https://github.com/InterestingLab/waterdrop/blob/garyelephant.fea.configparser/src/main/scala/org/interestinglab/waterdrop/configparser/Config.g4

https://github.com/InterestingLab/waterdrop/blob/garyelephant.fea.configparser/src/main/scala/org/interestinglab/waterdrop/configparser/BoolExpr.g4

Example configuration file:

input {
  kafka {
    brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
    topic = "accesslog"
  }
}

filter {
  split {
    # default delimiter is whitespace
    fields = ["time", "url", "http_status", "response_time", "refer", "body_size"]
  }

  if ${http_status} >= 500 {
    field {
      action = "add"
      field_name = "internal_error"
      value = 1
    }
  }

  # user defined plugin
  org.apache.mycompany.filters.pagerank {
  }
}

output {

  elasticsearch {
    hosts = ["10.11.110.45:9000", "10.11.110.46:9000"]
    index = "waterdrop-accesslog-${time}"
  }

  if ${http_status} >= 400 AND ${http_status} < 500 {
    kafka {
      brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
      topic = "user_error"
    }
  }
}

After antlr4 parses this file, it produces the following AST:

[image: AST generated by antlr4 for the example configuration]

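Walking this AST with the listener/visitor code that antlr4 generates is enough to implement configuration file parsing. The parsed configuration is then converted into a typesafe config for each plugin to use.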
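The parse-then-walk idea can be sketched without antlr4 for the condition part alone. The Python example below uses a small hand-written recursive-descent parser as a stand-in for the generated parser: it turns a condition such as the one in the `output` section into a nested-tuple AST and then evaluates that AST against a record. It does not reproduce the rules in BoolExpr.g4. The token set, the assumption that `AND` binds tighter than `OR`, and every function name here are hypothetical.

```python
import operator
import re

# Tokens of the sketch: ${field}, integers, comparison operators, AND / OR.
TOKEN_RE = re.compile(r"\s*(\$\{\w+\}|-?\d+|>=|<=|==|!=|>|<|AND|OR)")

COMPARE = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
           "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def tokenize(text):
    """Split the condition text into a flat list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_operand(token):
    """operand := '${' name '}' | integer"""
    if token.startswith("${"):
        return ("FIELD", token[2:-1])
    return ("INT", int(token))


def parse_comparison(tokens):
    """comparison := operand op operand"""
    if len(tokens) < 3 or tokens[1] not in COMPARE:
        raise ValueError(f"expected a comparison at {tokens!r}")
    left, op, right = tokens[:3]
    return (op, parse_operand(left), parse_operand(right)), tokens[3:]


def parse_term(tokens):
    """term := comparison ('AND' comparison)*"""
    node, rest = parse_comparison(tokens)
    while rest and rest[0] == "AND":
        right, rest = parse_comparison(rest[1:])
        node = ("AND", node, right)
    return node, rest


def parse_expr(tokens):
    """expr := term ('OR' term)*"""
    node, rest = parse_term(tokens)
    while rest and rest[0] == "OR":
        right, rest = parse_term(rest[1:])
        node = ("OR", node, right)
    return node, rest


def compile_condition(text):
    """Parse the whole condition and return its AST as nested tuples."""
    ast, rest = parse_expr(tokenize(text))
    if rest:
        raise ValueError(f"trailing tokens: {rest!r}")
    return ast


def evaluate(node, record):
    """Walk the AST and evaluate it against one record (a dict of fields)."""
    kind = node[0]
    if kind == "FIELD":
        return record[node[1]]
    if kind == "INT":
        return node[1]
    if kind == "AND":
        return evaluate(node[1], record) and evaluate(node[2], record)
    if kind == "OR":
        return evaluate(node[1], record) or evaluate(node[2], record)
    return COMPARE[kind](evaluate(node[1], record), evaluate(node[2], record))


ast = compile_condition("${http_status} >= 400 AND ${http_status} < 500")
print(evaluate(ast, {"http_status": 404}))  # True
print(evaluate(ast, {"http_status": 502}))  # False
```

Keeping parsing and evaluation separate mirrors the split described above: the condition is parsed once when the configuration is loaded, and the resulting tree can then be evaluated for every record.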

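Status of each requirement:

  • [Supported] Key-value form

  • [Supported] Plugins keep their order

  • [Supported] Conditionals

  • [Supported] Field references

  • [Supported, not yet implemented] Substitution of environment variables / user-defined variables in the configuration (including how user-defined variables are referenced)

  • [Supported, not yet implemented] Global configuration

  • [Needs investigation] Error messages and error location for configuration files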
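As an illustration of the Field reference item in the list above, the following Python sketch resolves `${name}` references inside a string value such as the `index` setting of the example configuration. It is a sketch under assumptions rather than the project's behavior: the function name `substitute`, the choice to raise `KeyError` for unknown fields, and the sample value of `time` are all hypothetical.

```python
import re

# A reference looks like ${name}, where name is made of word characters.
REFERENCE_RE = re.compile(r"\$\{(\w+)\}")


def substitute(template, values):
    """Replace each ${name} in template with str(values[name]).

    Unknown names raise KeyError, so a misspelled reference fails loudly.
    """
    def replace(match):
        return str(values[match.group(1)])

    return REFERENCE_RE.sub(replace, template)


# The field value below is made up for the example.
print(substitute("waterdrop-accesslog-${time}", {"time": "2017.09.10"}))
# waterdrop-accesslog-2017.09.10
```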

@garyelephant
Contributor

garyelephant commented Sep 10, 2017

@RickyHuo the antlr4 option
