
Key used to store Netflow v9 templates is not guaranteed to be unique per template #9

Open
wrigby opened this issue May 5, 2015 · 25 comments


@wrigby

wrigby commented May 5, 2015

This is a bit of an obscure one, and I don't know that it's actually causing anyone problems in production. I think it's very minor, but wanted to get it recorded somewhere and start a discussion in case it needs changing.

When NetFlow v9 template records come in, they're stored in a hash, with the key being a concatenation of the NetFlow Source ID field and the Template ID. (see https://github.com/logstash-plugins/logstash-codec-netflow/blob/master/lib/logstash/codecs/netflow.rb#L144-L146)

However, the NetFlow spec states that NetFlow collectors should use the combination of the Source ID field and the IP address of the flow exporter:

Source ID
A 32-bit value that identifies the Exporter Observation Domain.
NetFlow Collectors SHOULD use the combination of the source IP
address and the Source ID field to separate different export
streams originating from the same Exporter.

(from RFC 3954)

The Source ID field is a 32-bit value that is used to guarantee uniqueness for all flows exported from a particular device. (The Source ID field is the equivalent of the engine type and engine ID fields found in the NetFlow Version 5 and Version 8 headers). The format of this field is vendor specific. In the Cisco implementation, the first two bytes are reserved for future expansion, and will always be zero. Byte 3 provides uniqueness with respect to the routing engine on the exporting device. Byte 4 provides uniqueness with respect to the particular line card or Versatile Interface Processor on the exporting device. Collector devices should use the combination of the source IP address plus the Source ID field to associate an incoming NetFlow export packet with a unique instance of NetFlow on a particular device.

(from Cisco's NetFlow v9 documentation) (emphasis added for clarity)
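Cisco's stated byte layout can be sketched like this (illustrative only: the format is vendor-specific, and the example value and variable names are made up for this sketch):

```ruby
# Illustrative unpacking of Cisco's stated Source ID layout
# (vendor-specific; names and example value are made up):
#   bytes 1-2: reserved for future expansion, always zero
#   byte  3:   uniqueness w.r.t. the routing engine
#   byte  4:   uniqueness w.r.t. the line card / VIP
source_id = 0x00000102            # example: engine 1, line card 2

reserved  = (source_id >> 16) & 0xFFFF
engine    = (source_id >> 8)  & 0xFF
line_card = source_id & 0xFF

puts "reserved=#{reserved} engine=#{engine} line_card=#{line_card}"
```

Note that nothing in this 32-bit value distinguishes two separate devices of the same model, which is why the RFC says to combine it with the exporter's source IP address.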

@wrigby wrigby changed the title Key used to store Netflow v9 templates is not guaranteed to be unique per-template Key used to store Netflow v9 templates is not guaranteed to be unique per template May 5, 2015
@bodgit
Contributor

bodgit commented May 12, 2015

I'm fairly sure I did that when I initially wrote this filter^Wcodec.

I think it's fairly easy to get a collision just by having two or more identical devices, as they will probably use the same source ID. Also, in my experience, devices tend to start with a template ID of 256 and count up.

However, if they're identical hardware, chances are the templates will be identical too, so you'll just play last-man-wins as and when they refresh their templates. But I agree: the source IP should really be used as part of the hash key.
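The collision scenario described above can be sketched as follows (hypothetical, simplified; the key format is illustrative, not the codec's exact code):

```ruby
# Two identical devices often share Source ID 0 and both start their
# template IDs at 256, so a key built only from those two fields collides.
current_key = ->(source_id, template_id) { "#{source_id}|#{template_id}" }

exporter_a = { ip: "10.0.0.1", source_id: 0, template_id: 256 }
exporter_b = { ip: "10.0.0.2", source_id: 0, template_id: 256 }

key_a = current_key.call(exporter_a[:source_id], exporter_a[:template_id])
key_b = current_key.call(exporter_b[:source_id], exporter_b[:template_id])
# key_a == key_b: exporter B's template overwrites exporter A's slot

# RFC 3954's recommendation: include the exporter's source IP in the key.
rfc_key = ->(e) { "#{e[:ip]}|#{e[:source_id]}|#{e[:template_id]}" }
# rfc_key.call(exporter_a) != rfc_key.call(exporter_b) -- no collision
```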

@wrigby
Author

wrigby commented May 12, 2015

@bodgit that's what I've seen in practice - all of our devices are exporting the same templates anyway, so it's a non-issue right now for my deployment. It's something I noticed when hacking on support for MPLS labels (#5), so I figured I would record it here for future reference in case it does end up biting someone.

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

@bodgit
Contributor

bodgit commented May 13, 2015

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

Of course, when this was originally a filter it had access to the original event. That explains things.

@wrigby
Author

wrigby commented May 13, 2015

Ah, it makes much more sense now. I had a suspicion that was the case, given the comments that are in there. Thanks for the clarification!

@jorritfolmer
Contributor

What were the reasons for changing this plugin from a filter to a codec? I couldn't find any through Google.

Not having the source IP information available makes this codec ignore a "SHOULD" recommendation in the RFC. Currently, if you're unlucky enough to find yourself in a heterogeneous NetFlow environment, you can't simply stand up one Logstash instance; you're forced to run multiple separate Logstash instances to avoid NetFlow template collisions. I don't think we should settle for that, unless there are good reasons of course.

@wrigby
Author

wrigby commented Aug 22, 2015

I think it makes sense for it to be a codec, by the idealized definition of a codec and a filter (a codec being the component that deals with serializing/deserializing data to/from a wire protocol), but it unfortunately gets us into this ambiguous template key situation.

From my naive point of view, I think the best solution may be to provide a mechanism in the logstash core for inputs to provide metadata to the codecs that are decoding their payloads.

@jorritfolmer
Contributor

Ah yes ok that makes sense, thanks for your explanation.
Perhaps it is time to open an issue for this in the main logstash repository?

@jorritfolmer
Contributor

For easy reference: current issues (both open/closed) already in logstash repo that touch on codecs and/or metadata:

elastic/logstash#3725: Split Codecs into Encoders and Decoders
elastic/logstash#3694: Use of @metadata in 2.0
elastic/logstash#3607: Codecs should support 'target' option
elastic/logstash#3218: @metadata not shown in rubydebug

@jorritfolmer
Contributor

I've added commit jorritfolmer@fca329f7a2b4845020d26c925540851093ba61f1 to address this, as experienced in issue #21.

Besides the above commit, it also requires a small patch to logstash-input-udp, to send metadata to the codec, as per #21 (comment):

--- a/lib/logstash/inputs/udp.rb
+++ b/lib/logstash/inputs/udp.rb
@@ -30,6 +30,9 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
   # before packets will start dropping.
   config :queue_size, :validate => :number, :default => 2000

+  # Should the event's metadata be included?
+  config :metadata, :validate => :boolean, :default => false
+
   public
   def initialize(params)
     super
@@ -92,10 +95,21 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
       while true
         payload, client = @input_to_worker.pop

-        @codec.decode(payload) do |event|
-          decorate(event)
-          event["host"] ||= client[3]
-          @output_queue.push(event)
+        if @metadata
+          metadata_to_codec = {}
+          metadata_to_codec["port"] = client[1]
+          metadata_to_codec["host"] = client[3]
+          @codec.decode(payload, metadata_to_codec) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
+        else
+          @codec.decode(payload) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
         end
       end
     rescue => e

and an extra metadata => true parameter to the logstash config like this:

input {
    udp {
      port => 2055
      metadata => true
      codec => netflow {
        versions => [5,9,10]
      }
    }
}

Feedback on this is more than welcome!

@wrigby
Author

wrigby commented Dec 17, 2015

Not sure how I missed all the work you put into this @jorritfolmer - looks great; I'll try to test it out in the next week. I came across elastic/logstash#4289, which proposes a similar fix.

@jorritfolmer
Contributor

There has been an update to elastic/logstash#4289. With regard to the netflow codec needing packet metadata, I quote Guy Boertje from that thread:

If this is true, you are better off creating a netflow input that includes the current codec logic directly in the input - you then have access to any context directly. It can be based on the udp input.

@jorritfolmer
Contributor

Following up on @guyboertje's suggestion, I've started the transition from codec to logstash-input-netflow in the logstash-input-netflow branch, based on the udp input plugin.

It works!

There is still work to be done in the rspec department, however, and we'll probably need to support more than just UDP.

@guyboertje

@jorritfolmer - good news indeed. Ping me if you need any help.

@wrigby
Author

wrigby commented Jul 27, 2016

@jorritfolmer @guyboertje - Let me know if you guys need help, too. I'm a Ruby noob, but I definitely wouldn't mind helping out. Unfortunately, I'm at a different company now (outside of the telco space), so I don't have any test data anymore :(

@jorritfolmer
Contributor

Small status update:

I've finished work on logstash-input-netflow, which is waiting as a PR here

However, @jordansissel pointed out that this move to an input plugin isn't necessary given the availability of something called the Identity Map Codec, which is meant to support a use case like ours while keeping the netflow plugin a codec. I haven't had time to dig further into the Identity Map Codec though. It will also require an update to logstash-input-udp.

As I understand it, we should be passing src_ip+port as "identity" to the Identity Map Codec, which will clone a codec for each src_ip+port combination. This will take care of keeping Netflow templates separate and prevent template corruption and overwriting as is the case with e.g. issue #21.

It seems we could copy-paste identity_map_codec.rb from logstash-codec-multiline, as it isn't available in the main Logstash repo.
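The idea can be sketched with a toy stand-in (this is NOT the real IdentityMapCodec API; class and method names here are made up): keep one codec clone, and therefore one private template cache, per identity string such as src_ip:port.

```ruby
# Toy stand-in for the identity-map idea: each distinct identity
# (src_ip:port) lazily gets its own codec clone, so templates from
# different exporters can never overwrite each other.
class ToyNetflowCodec
  attr_reader :templates
  def initialize
    @templates = {}                  # this clone's private template cache
  end
  def decode(payload)
    yield({ "payload" => payload })  # a real codec would parse NetFlow here
  end
end

class ToyIdentityMap
  def initialize(&codec_factory)
    @codec_factory = codec_factory
    @map = {}                        # identity string => codec clone
  end
  def decode(payload, identity, &block)
    codec = (@map[identity] ||= @codec_factory.call)
    codec.decode(payload, &block)
  end
  def clones
    @map.size
  end
end

mapper = ToyIdentityMap.new { ToyNetflowCodec.new }
events = []
mapper.decode("pkt1", "10.0.0.1:2055") { |e| events << e }
mapper.decode("pkt2", "10.0.0.2:2055") { |e| events << e }
mapper.decode("pkt3", "10.0.0.1:2055") { |e| events << e }
# Two identities seen -> two independent codec clones, three events out.
```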

@jorritfolmer
Contributor

Great! This issue and issue #21 are gone with identity map codec. It doesn't require modifications to our codec, just a couple of lines in logstash-input-udp. I'll create a PR there next week or so.

@jorritfolmer
Contributor

Update: This will be addressed in the migration to logstash-plugins/logstash-input-netflow#1.
It will allow us to own the network socket directly and use the necessary IP addresses as part of the key.
Probably 2017 though.

@ssmlt

ssmlt commented Jan 17, 2018

Hello, is there any update regarding this issue and any plans to release logstash-input-netflow?
Thanks.

@jorritfolmer
Contributor

The issue is still there.
logstash-input-netflow is still waiting for my initial PR to be merged.

@joriws

joriws commented Apr 18, 2018

I see udp.rb (v6.2) has changed, so the metadata patch above is no longer directly applicable. It worked with 5.5, so the question is what the status is with 6.2.

Do we need the metadata?
I'm using the latest Elasticsearch and it decodes a single stream fine; I have not tested with multiple sources yet.
The template cache save file does not include the IP addresses of the template sender, so I doubt it works.

Please update the front page or README.md with information on how to make templates from multiple IPs work.

@humblemumble

Hi, could you please provide the metadata UDP input patch for version 6.2 as well?

@evilelf666

evilelf666 commented Nov 9, 2019

> I've added commit jorritfolmer@fca329f7a2b4845020d26c925540851093ba61f1 to address this, as experienced in issue #21. [full patch and config quoted from @jorritfolmer's comment above]

I found your patch for the more recent Logstash udp.rb code and applied it to my platform (processing 2500 flows/sec), and it fixed all the bandwidth/pps and invalid protocol/incoherent data issues (mostly a Cisco environment, same models). I'm running the latest GitHub code in Docker; this clearly resolved the reporting from the 7609s and ASRs in my environment.

Thanks a lot for this metadata patch, man. This saved me from going bonkers.

@robcowart
Contributor

This issue will be addressed once the following PRs are merged and released:

Logstash UDP Input: logstash-plugins/logstash-input-udp#46
Logstash Netflow Codec: #187

@evilelf666

Thank you @robcowart

As always, your work is very appreciated.

@ChrisDeh

ChrisDeh commented Apr 3, 2020

Are there any News on this? Both referenced PRs are closed but not merged.
