New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dots in field names and make filters pluggable #1623

Merged
merged 6 commits into from May 19, 2016

Conversation

Projects
None yet
4 participants
@monicasarbu
Contributor

monicasarbu commented May 12, 2016

This PR includes the following changes:

  • Each filtering action is a plugin, so you can easily add a new action to the filtering
  • Add drop_event action
  • Be able to pass fields that contain . in the condition:

    equals:
    process.pid: 34443
  • Add a sample example in libbeat.yml
  • Add system tests for Topbeat, Packetbeat, Filebeat, Metricbeat

I will update the documentation in a separate PR as well as system tests for Metricbeat.

@monicasarbu monicasarbu referenced this pull request May 12, 2016

Closed

Add condition to a filtering rule #1447

15 of 17 tasks complete

@monicasarbu monicasarbu added the review label May 18, 2016

# exported event:
# event -> filter1 -> event1 -> filter2 ->event2 ...
# Supported actions: drop_fields, drop_event, include_fields
# filters:

This comment has been minimized.

@ruflin

ruflin May 18, 2016

Collaborator

remove the space in front of the config options which should be uncommented. makes it easy to see difference between config and docs

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

Good spot. Thanks

Int uint64
Str string
}
type Condition struct {

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

having multiple conditions how are they 'combined'? 'Condition' acts like an AND ?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

yes, like an AND

equals map[string]EqualsValue
contains map[string]string
regexp map[string]*regexp.Regexp
rangexp map[string]RangeValue

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

can I use the same key with different kind of expressions, but not with same kind of expression?

e.g.

condition:
  equals:
    http.code: 20
  range:
    http.code: ...

How about switching to:

type Condition struct {
    checks []FieldCheck
}

type FieldCheck struct {
    field string
    check Check
}

type Check interface {
    func CheckValue(v interface{}) (bool, error)
}

type IntEq struct { i int64 }
func (i IntEq) CheckValue(v interface{}) (bool, error) { ... }

type FloatEq struct { f float64 }
func (f FloatEq) CheckValue(v interface{}) (bool, error) { ... }

type StringEq struct { s string }
func (s StringEq) CheckValue(v interface{}) (bool, error) { ... }

func (c *Condition) Check(event common.MapStr) (bool, error) {
    for _, fieldCheck := range c.checks {
      val := extractValue(event, fieldCheck.field)
      if b, err := fieldCheck.check.Check(val); !b || err != nil {
          return b, err
      }
    }
    return true, nil
}

If one key is used multiple times, the terms should be or combined. For example:

condition:
  equals:
    - http.code: 200
    - http.code: 302

Will act like http.code == 200 || http.code == 320

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

The reason why the struct Condition looks like this is that in the current version you don't need to add an extra "condition" tag. For example:

- drop_fields:
    equals:
       http.code: 200
       status: OK
    fields: ["http.request_headers", "http.response_headers"]

In the current implementation you cannot use the same key in the condition as it's a map. I am not sure if it's better to have a list instead of a map. Currently I am doing AND between them, not OR.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

In the future I am planning to add or, and and not. For example, your case becomes:

or:
  equals:
    http.code: 200
  equals:
   http.code: 302

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

this won't work either, as the condition types are also map[string].... At some point we need to introduce a list for this use-case.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

Yes, you are right. We need to put an array for all the conditions combined with or and and.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

So, it becomes:

or:
  - equals:
     http.code: 200
  - equals:
      http.code: 302 
i, err := strconv.Atoi(value)
for field, value := range cfg.fields {
uint_value, err := extractInt(value)

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

new code should use camel case

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

Good spot, thanks!

for field, value := range cfg.fields {
switch v := value.(type) {
case string:
c.regexp[field] = regexp.MustCompile(v)

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

be carefull, regexp.MustCompile will panic on invalid regex. Better use regexp.Compile. I'd even recommend compiling the regex in Unpack method and return error on failure. ucfg will print additional context (which config failed and why) if Unpack fails.

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

Good spot, thanks!

default:
logp.Warn("unexpected type %T in equals condition as it accepts only integers and strings. ", value)
return false
int_value, err := extractInt(value)

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

camelCase

Equals *ConditionFilter `config:"equals"`
Contains *ConditionFilter `config:"contains"`
Regexp *ConditionFilter `config:"regexp"`
Range *ConditionFilter `config:"range"`

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

some (not so nice) trick to use types in config structs:

type ConditionConfig struct {
    Regexp *regexpConfig `config:"regexp"
}

type regexpConfig struct {
    regexp map[string]*regexp.Regexp
}

func (r *regexpConfig) Unpack(v interface{}) error {
    cfg, err := flatten(v)
    if err != nil {
        return err
    }
    return cfg.Unpack(&r.regexp)
}

// internally use ucfg.Config in order to supress the "." path separator
func flatten(v interface{}) (*ucfg.Config, error) {
  m, ok := to.(map[string]interface{})
  if !ok {
      return fmt.Errorf("wrong type, expect map")
  }

  fields = map[string]interface{}{}
  expand := func(key string, value interface{}) {
    switch v := value.(type) {
    case map[string]interface{}:
      for k, val := range v {
        expand(fmt.Sprintf("%v.%v", key, k), val)
      }
    case []interface{}:
      for i, _ := range v {
        expand(fmt.Sprintf("%v.%v", key, i), v[i])
      }
    default:
      f.fields[key] = value
    }
  }

  for k, val := range m {
    expand(k, val)
  }
  return ucfg.NewFrom(fields)
}
type FilterRule interface {
Filter(event common.MapStr) (common.MapStr, error)
CheckConfig(c common.Config) error

This comment has been minimized.

@urso

urso May 18, 2016

Collaborator

we really need CheckConfig in the interface itself?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

not really, but I added it there, just that we don't forget about it to implement. Shall I remove it?

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

You are right, I am removing it as it should be the decision of the filter plugin if it wants to do any checking.

@@ -250,3 +250,15 @@ logging.files:
# Number of rotated log files to keep. Oldest files will be deleted first.
#keepfiles: 7
############################# Filters #########################################

This comment has been minimized.

@tsg

tsg May 18, 2016

Collaborator

We changed the convention for these headers to use something like

#============================= Filters ========================

This comment has been minimized.

@monicasarbu

monicasarbu May 18, 2016

Contributor

done, thanks!

@urso

This comment has been minimized.

Collaborator

urso commented May 18, 2016

LGTM

@monicasarbu

This comment has been minimized.

Contributor

monicasarbu commented May 18, 2016

@ruflin I have updated the filters implementation at the Metricbeat module level with the latest changes.

# Supported actions: drop_fields, drop_event, include_fields
#filters:
#- drop_fields:
# equals:

This comment has been minimized.

@ruflin

ruflin May 18, 2016

Collaborator

Not sure if the indentation will be correct, if someone removes all 4 # here? I think one more space for the last 3 is needed?

@tsg tsg merged commit 234c177 into elastic:master May 19, 2016

4 checks passed

CLA Commit author is a member of Elasticsearch
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
default Build finished.
Details

medcl added a commit to medcl/beats that referenced this pull request Aug 17, 2016

Use dots in field names and make filters pluggable (elastic#1623)
Changes:

* Each filtering action is a plugin, so you can easily add a new action to the filtering
*  Add drop_event action
*  Be able to pass fields that contain . in the condition:  equals: process.pid: 34443 
*  Add a sample example in libbeat.yml
*  Add system tests for Topbeat, Packetbeat, Filebeat, Metricbeat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment