Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dots in field names and make filters pluggable #1623

Merged
merged 6 commits into from May 19, 2016

Conversation

@monicasarbu
Copy link
Contributor

@monicasarbu monicasarbu commented May 12, 2016

This PR includes the following changes:

  • Each filtering action is a plugin, so you can easily add a new action to the filtering
  • Add drop_event action
  • Be able to pass fields that contain . in the condition:

    equals:
    process.pid: 34443
  • Add a sample example in libbeat.yml
  • Add system tests for Topbeat, Packetbeat, Filebeat, Metricbeat

I will update the documentation in a separate PR as well as system tests for Metricbeat.

@monicasarbu monicasarbu mentioned this pull request May 12, 2016
15 of 17 tasks complete
@monicasarbu monicasarbu force-pushed the monicasarbu:use_dots_field_in_condition branch from 3ce4f07 to cf3bc65 May 12, 2016
@monicasarbu monicasarbu force-pushed the monicasarbu:use_dots_field_in_condition branch 2 times, most recently from a684b5d to 006a567 May 18, 2016
# exported event:
# event -> filter1 -> event1 -> filter2 ->event2 ...
# Supported actions: drop_fields, drop_event, include_fields
# filters:

This comment has been minimized.

@ruflin

ruflin May 18, 2016
Collaborator

remove the space in front of the config options which should be uncommented. makes it easy to see difference between config and docs

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

Good spot. Thanks

Int uint64
Str string
}

type Condition struct {

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

having multiple conditions how are they 'combined'? 'Condition' acts like an AND ?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

yes, like an AND

equals map[string]EqualsValue
contains map[string]string
regexp map[string]*regexp.Regexp
rangexp map[string]RangeValue

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

can I use the same key with different kind of expressions, but not with same kind of expression?

e.g.

condition:
  equals:
    http.code: 20
  range:
    http.code: ...

How about switching to:

type Condition struct {
    checks []FieldCheck
}

type FieldCheck struct {
    field string
    check Check
}

type Check interface {
    func CheckValue(v interface{}) (bool, error)
}

type IntEq struct { i int64 }
func (i IntEq) CheckValue(v interface{}) (bool, error) { ... }

type FloatEq struct { f float64 }
func (f FloatEq) CheckValue(v interface{}) (bool, error) { ... }

type StringEq struct { s string }
func (s StringEq) CheckValue(v interface{}) (bool, error) { ... }

func (c *Condition) Check(event common.MapStr) (bool, error) {
    for _, fieldCheck := range c.checks {
      val := extractValue(event, fieldCheck.field)
      if b, err := fieldCheck.check.Check(val); !b || err != nil {
          return b, err
      }
    }
    return true, nil
}

If one key is used multiple times, the terms should be or combined. For example:

condition:
  equals:
    - http.code: 200
    - http.code: 302

Will act like http.code == 200 || http.code == 320

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

The reason why the struct Condition looks like this is that in the current version you don't need to add an extra "condition" tag. For example:

- drop_fields:
    equals:
       http.code: 200
       status: OK
    fields: ["http.request_headers", "http.response_headers"]

In the current implementation you cannot use the same key in the condition as it's a map. I am not sure if it's better to have a list instead of a map. Currently I am doing AND between them, not OR.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

In the future I am planning to add or, and and not. For example, your case becomes:

or:
  equals:
    http.code: 200
  equals:
   http.code: 302

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

this won't work either, as the condition types are also map[string].... At some point we need to introduce a list for this use-case.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

Yes, you are right. We need to put an array for all the conditions combined with or and and.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

So, it becomes:

or:
  - equals:
     http.code: 200
  - equals:
      http.code: 302 

i, err := strconv.Atoi(value)
for field, value := range cfg.fields {
uint_value, err := extractInt(value)

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

new code should use camel case

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

Good spot, thanks!

for field, value := range cfg.fields {
switch v := value.(type) {
case string:
c.regexp[field] = regexp.MustCompile(v)

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

be carefull, regexp.MustCompile will panic on invalid regex. Better use regexp.Compile. I'd even recommend compiling the regex in Unpack method and return error on failure. ucfg will print additional context (which config failed and why) if Unpack fails.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

Good spot, thanks!

default:
logp.Warn("unexpected type %T in equals condition as it accepts only integers and strings. ", value)
return false
int_value, err := extractInt(value)

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

camelCase

Equals *ConditionFilter `config:"equals"`
Contains *ConditionFilter `config:"contains"`
Regexp *ConditionFilter `config:"regexp"`
Range *ConditionFilter `config:"range"`

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

some (not so nice) trick to use types in config structs:

type ConditionConfig struct {
    Regexp *regexpConfig `config:"regexp"
}

type regexpConfig struct {
    regexp map[string]*regexp.Regexp
}

func (r *regexpConfig) Unpack(v interface{}) error {
    cfg, err := flatten(v)
    if err != nil {
        return err
    }
    return cfg.Unpack(&r.regexp)
}

// internally use ucfg.Config in order to supress the "." path separator
func flatten(v interface{}) (*ucfg.Config, error) {
  m, ok := to.(map[string]interface{})
  if !ok {
      return fmt.Errorf("wrong type, expect map")
  }

  fields = map[string]interface{}{}
  expand := func(key string, value interface{}) {
    switch v := value.(type) {
    case map[string]interface{}:
      for k, val := range v {
        expand(fmt.Sprintf("%v.%v", key, k), val)
      }
    case []interface{}:
      for i, _ := range v {
        expand(fmt.Sprintf("%v.%v", key, i), v[i])
      }
    default:
      f.fields[key] = value
    }
  }

  for k, val := range m {
    expand(k, val)
  }
  return ucfg.NewFrom(fields)
}

type FilterRule interface {
Filter(event common.MapStr) (common.MapStr, error)
CheckConfig(c common.Config) error

This comment has been minimized.

@urso

urso May 18, 2016
Collaborator

we really need CheckConfig in the interface itself?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

not really, but I added it there, just that we don't forget about it to implement. Shall I remove it?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

You are right, I am removing it as it should be the decision of the filter plugin if it wants to do any checking.

@@ -250,3 +250,15 @@ logging.files:

# Number of rotated log files to keep. Oldest files will be deleted first.
#keepfiles: 7

############################# Filters #########################################

This comment has been minimized.

@tsg

tsg May 18, 2016
Collaborator

We changed the convention for these headers to use something like

#============================= Filters ========================

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016
Author Contributor

done, thanks!

@monicasarbu monicasarbu force-pushed the monicasarbu:use_dots_field_in_condition branch 2 times, most recently from c6730f2 to 3c34957 May 18, 2016
@urso
Copy link
Collaborator

@urso urso commented May 18, 2016

LGTM

@monicasarbu
Copy link
Contributor Author

@monicasarbu monicasarbu commented May 18, 2016

@ruflin I have updated the filters implementation at the Metricbeat module level with the latest changes.

# Supported actions: drop_fields, drop_event, include_fields
#filters:
#- drop_fields:
# equals:

This comment has been minimized.

@ruflin

ruflin May 18, 2016
Collaborator

Not sure if the indentation will be correct, if someone removes all 4 # here? I think one more space for the last 3 is needed?

@monicasarbu monicasarbu force-pushed the monicasarbu:use_dots_field_in_condition branch from b0fb996 to 9779a43 May 19, 2016
@tsg tsg merged commit 234c177 into elastic:master May 19, 2016
4 checks passed
4 checks passed
@karmi
CLA Commit author is a member of Elasticsearch
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@elasticsearch-release
default Build finished.
Details
medcl added a commit to medcl/beats that referenced this pull request Aug 17, 2016
Changes:

* Each filtering action is a plugin, so you can easily add a new action to the filtering
*  Add drop_event action
*  Be able to pass fields that contain . in the condition:  equals: process.pid: 34443 
*  Add a sample example in libbeat.yml
*  Add system tests for Topbeat, Packetbeat, Filebeat, Metricbeat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants