Add "remove_field" option #218

Closed
maximpashuk opened this issue Aug 4, 2015 · 11 comments

Comments

@maximpashuk

Many Logstash plugins have this option.

The option could strip document fields before indexing.
For example, people could strip the "type" field to avoid having both "_type" and "type" fields in the document.

@untergeek
Contributor

The remove_field option only exists in the filter plugins. However, the functionality you seek exists already.

You can eliminate the type field, or move it to the @metadata object and use the document_type setting in your elasticsearch output block:

filter {
  mutate {
    add_field => { "[@metadata][type]" => "%{type}" }
    remove_field => "type"
  }
}

output {
  elasticsearch {
    ...
    document_type => "%{[@metadata][type]}"
  }
}

This example would remove the type field from your elasticsearch output—but still allow you to define _type accordingly—because none of the contents of @metadata go into the output.

The reason type has persisted is that the elasticsearch output plugin assigns _type at index time with the value of document_type, and document_type gets the value of the logstash event field type if it exists, otherwise it defaults to "logs".

This behavior predates the addition of the @metadata functionality, so this new method is fairly recent.

@maximpashuk
Author

@untergeek I already use the [@metadata][type] trick to remove the type field from the Elasticsearch document.

To support Logstash events that arrive without a type, I have to use this code:

  if [type] {
    mutate {
      add_field => { "[@metadata][document_type]" => "%{type}" }
      remove_field => [ "type" ]
    }
  } else {
    mutate {
      add_field => { "[@metadata][document_type]" => "logs" }
    }
  }

Otherwise, documents without an explicitly provided type get the human-unfriendly Elasticsearch _type "[@metadata][document_type]".

For me, a better way would be something like:

  elasticsearch {
    remove_field => ["type"]

    host => ["localhost"]
    cluster => "elasticsearch"
  }

@untergeek
Contributor

We are unlikely to add a remove_field flag, as this functionality is already covered by mutate in the filters.

We might be persuaded to add some functionality to drop the type field after its value has been passed. It would get in considerably sooner if someone in the community were to add the code, and tests to validate said code.

Otherwise, the workaround (thank you for completing it with the conditional) is the expected way to remove a field.

@magnusbaeck
Contributor

Removing fields in a filter is okay if all outputs should receive the same fields, but that isn't necessarily true. This popped up in a recent discuss.elastic.co thread. Having a general output feature for filtering what fields to consider for that particular output seems reasonable IMHO.

@andrewvc
Contributor

I agree with @magnusbaeck here. That being said, this is not an urgent priority for us at the moment.

@andrewvc
Contributor

I'm wondering if there's a way to add this to LogStash::Outputs::Base, but I can't think of a way to do that without some perf implications.

@jordansissel
Contributor

To me, allowing removing (or selecting) certain fields at each output is basically a subset of a "send a custom object structure" functionality, which we could probably work on. In my opinion, this kind of behavior belongs in a codec, not an output, because choosing what-and-how to send is a codec's job, in my mind. Further, because some codecs (line, plain, etc) are not structured, maybe this functionality only belongs in some codecs (json, edn, msgpack, etc) that actually send arbitrary-structured data? Let's open a ticket on elastic/logstash to discuss how, if at all, this could be implemented.

Just FYI, you could implement your own codec to do this for most outputs (probably except for Elasticsearch output, which doesn't support codecs)

@jordansissel
Contributor

One alternative for now is to use the clone filter to make a copy of your event, tag it a special way, restructure it as you please, and route that to only one output.

@alesnav

alesnav commented Dec 30, 2016

Hi there!

I think that this would be a nice feature for my particular case, and maybe for more people.

Right now, I have to deploy two different clusters of Logstash indexers, reading data from the same kafka cluster:

1.- The first one is used to write raw data (containing only message field) with a fingerprint to cover legal reasons, using file output plugin.
2.- The second one is used to parse all fields, remove message field and send it to elasticsearch using this plugin.

If this enhancement was supported, we would be able to use just one cluster because we would be able to remove message field before sending data to elasticsearch and write this same field to raw data files.

Thanks,
Best regards

@magnusbaeck
Contributor

> If this enhancement was supported, we would be able to use just one cluster because we would be able to remove message field before sending data to elasticsearch and write this same field to raw data files.

@alesnav, as @jordansissel noted in a previous comment, you can already use a clone filter to fork the event stream and send variations of the same source event to different outputs, which removes the need for two sets of Logstash instances.
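
For illustration, a minimal sketch of that clone-based approach might look like the following. It is not taken from this thread: the clone name "es_copy" and the file path are hypothetical, and it assumes the clone filter sets the copy's type field to the clone name. Depending on the plugin version and its ECS compatibility setting, the clone name may be added to tags instead, so check the clone filter documentation for your version.

```
filter {
  # Fork each event: the original continues unchanged and one copy is created.
  # Assumption: the copy's "type" field is set to the clone name.
  clone {
    clones => ["es_copy"]   # hypothetical clone name
  }

  # Restructure only the copy destined for Elasticsearch.
  if [type] == "es_copy" {
    mutate {
      remove_field => ["message"]
    }
  }
}

output {
  if [type] == "es_copy" {
    # Parsed copy without the message field; connection settings omitted.
    elasticsearch { }
  } else {
    # Original event, still carrying the raw message field.
    file {
      path => "/path/to/raw.log"   # hypothetical path
    }
  }
}
```

Note that under this assumption the copy's original type value is overwritten by the clone name, so if you rely on type downstream you would need to save it (for example into @metadata) before the clone filter runs.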

@jsvd
Member

jsvd commented May 16, 2018

Closing due to inactivity. Also, this functionality would likely be implemented outside of this plugin, so feel free to open a new issue in elastic/logstash if there's continued interest.

@jsvd closed this as completed May 16, 2018