
Key used to store Netflow v9 templates is not guaranteed to be unique per template #9

Open
wrigby opened this issue May 5, 2015 · 25 comments


@wrigby

wrigby commented May 5, 2015

This is a bit of an obscure one, and I don't know that it's actually causing anyone problems in production. I think it's very minor, but wanted to get it recorded somewhere and start a discussion in case it needs changing.

When NetFlow v9 template records come in, they're stored in a hash, with the key being a concatenation of the NetFlow Source ID field and the Template ID. (see https://github.com/logstash-plugins/logstash-codec-netflow/blob/master/lib/logstash/codecs/netflow.rb#L144-L146)

However, the NetFlow spec states that NetFlow collectors should use the combination of the Source ID field and the IP address of the flow exporter:

Source ID
A 32-bit value that identifies the Exporter Observation Domain.
NetFlow Collectors SHOULD use the combination of the source IP
address and the Source ID field to separate different export
streams originating from the same Exporter.

(from RFC 3954)

The Source ID field is a 32-bit value that is used to guarantee uniqueness for all flows exported from a particular device. (The Source ID field is the equivalent of the engine type and engine ID fields found in the NetFlow Version 5 and Version 8 headers). The format of this field is vendor specific. In the Cisco implementation, the first two bytes are reserved for future expansion, and will always be zero. Byte 3 provides uniqueness with respect to the routing engine on the exporting device. Byte 4 provides uniqueness with respect to the particular line card or Versatile Interface Processor on the exporting device. Collector devices should use the combination of the source IP address plus the Source ID field to associate an incoming NetFlow export packet with a unique instance of NetFlow on a particular device.

(from Cisco's NetFlow v9 documentation) (emphasis added for clarity)
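Cisco's stated byte layout can be sketched like this (illustrative only: the format is vendor-specific, and the example value and variable names are made up for this sketch):

```ruby
# Illustrative unpacking of Cisco's stated Source ID layout
# (vendor-specific; names and example value are made up):
#   bytes 1-2: reserved for future expansion, always zero
#   byte  3:   uniqueness w.r.t. the routing engine
#   byte  4:   uniqueness w.r.t. the line card / VIP
source_id = 0x00000102            # example: engine 1, line card 2

reserved  = (source_id >> 16) & 0xFFFF
engine    = (source_id >> 8)  & 0xFF
line_card = source_id & 0xFF

puts "reserved=#{reserved} engine=#{engine} line_card=#{line_card}"
```

Note that nothing in this 32-bit value distinguishes two separate devices of the same model, which is why the RFC says to combine it with the exporter's source IP address.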

@wrigby wrigby changed the title Key used to store Netflow v9 templates is not guaranteed to be unique per-template Key used to store Netflow v9 templates is not guaranteed to be unique per template May 5, 2015
@bodgit
Contributor

bodgit commented May 12, 2015

I'm fairly sure I did that when I initially wrote this filter^Wcodec.

I think it's fairly easy to get a collision just by having two or more identical devices, as they will probably use the same source ID. Also, in my experience, devices tend to start with a template ID of 256 and count up.

However, if they're identical hardware, chances are the templates will be identical too, so you'll just play last-man-wins as and when they refresh their templates. But I agree: the source IP should really be used as part of the hash key.
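The collision scenario described above can be sketched as follows (hypothetical, simplified; the key format is illustrative, not the codec's exact code):

```ruby
# Two identical devices often share Source ID 0 and both start their
# template IDs at 256, so a key built only from those two fields collides.
current_key = ->(source_id, template_id) { "#{source_id}|#{template_id}" }

exporter_a = { ip: "10.0.0.1", source_id: 0, template_id: 256 }
exporter_b = { ip: "10.0.0.2", source_id: 0, template_id: 256 }

key_a = current_key.call(exporter_a[:source_id], exporter_a[:template_id])
key_b = current_key.call(exporter_b[:source_id], exporter_b[:template_id])
# key_a == key_b: exporter B's template overwrites exporter A's slot

# RFC 3954's recommendation: include the exporter's source IP in the key.
rfc_key = ->(e) { "#{e[:ip]}|#{e[:source_id]}|#{e[:template_id]}" }
# rfc_key.call(exporter_a) != rfc_key.call(exporter_b) -- no collision
```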

@wrigby
Author

wrigby commented May 12, 2015

@bodgit that's what I've seen in practice - all of our devices are exporting the same templates anyway, so it's a non-issue right now for my deployment. It's something I noticed when hacking on support for MPLS labels (#5), so I figured I would record it here for future reference in case it does end up biting someone.

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

@bodgit
Contributor

bodgit commented May 13, 2015

I think it may prompt some discussion about the right way for an input to pass metadata to a codec.

Of course, when this was originally a filter it had access to the original event. That explains things.

@wrigby
Author

wrigby commented May 13, 2015

Ah, it makes much more sense now. I had a suspicion that was the case, given the comments that are in there. Thanks for the clarification!

@jorritfolmer
Contributor

What were the reasons for changing this plugin from a filter to a codec? I couldn't find any through Google.

Not having the source IP information available makes this codec ignore a "SHOULD" recommendation in the RFC. Currently, if you're unlucky enough to find yourself in a heterogeneous NetFlow environment, you can't simply stand up one Logstash instance; you're forced to run multiple separate Logstash instances to avoid NetFlow template collisions. I don't think we should settle for that, unless there are good reasons of course.

@wrigby
Author

wrigby commented Aug 22, 2015

I think it makes sense for it to be a codec, by the idealized definition of a codec and a filter (a codec being the component that deals with serializing/deserializing data to/from a wire protocol), but it unfortunately gets us into this ambiguous template key situation.

From my naive point of view, I think the best solution may be to provide a mechanism in the logstash core for inputs to provide metadata to the codecs that are decoding their payloads.

@jorritfolmer
Contributor

Ah yes ok that makes sense, thanks for your explanation.
Perhaps it is time to open an issue for this in the main logstash repository?

@jorritfolmer
Contributor

For easy reference: current issues (both open/closed) already in logstash repo that touch on codecs and/or metadata:

elastic/logstash#3725: Split Codecs into Encoders and Decoders
elastic/logstash#3694: Use of @metadata in 2.0
elastic/logstash#3607: Codecs should support 'target' option
elastic/logstash#3218: @metadata not shown in rubydebug

@jorritfolmer
Contributor

I've added commit jorritfolmer@fca329f7a2b4845020d26c925540851093ba61f1 to address this, as experienced in issue #21.

Besides the above commit, it also requires a small patch to logstash-input-udp, to send metadata to the codec, as per #21 (comment):

--- a/lib/logstash/inputs/udp.rb
+++ b/lib/logstash/inputs/udp.rb
@@ -30,6 +30,9 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
   # before packets will start dropping.
   config :queue_size, :validate => :number, :default => 2000

+  # Should the event's metadata be included?
+  config :metadata, :validate => :boolean, :default => false
+
   public
   def initialize(params)
     super
@@ -92,10 +95,21 @@ class LogStash::Inputs::Udp < LogStash::Inputs::Base
       while true
         payload, client = @input_to_worker.pop

-        @codec.decode(payload) do |event|
-          decorate(event)
-          event["host"] ||= client[3]
-          @output_queue.push(event)
+        if @metadata
+          metadata_to_codec = {}
+          metadata_to_codec["port"] = client[1]
+          metadata_to_codec["host"] = client[3]
+          @codec.decode(payload, metadata_to_codec) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
+        else
+          @codec.decode(payload) do |event|
+            decorate(event)
+            event["host"] ||= client[3]
+            @output_queue.push(event)
+          end
         end
       end
     rescue => e

and an extra metadata => true parameter to the logstash config like this:

input {
    udp {
      port => 2055
      metadata => true
      codec => netflow {
        versions => [5,9,10]
      }
    }
}

Feedback on this is more than welcome!

@wrigby
Author

wrigby commented Dec 17, 2015

Not sure how I missed all the work you put into this @jorritfolmer - looks great; I'll try to test it out in the next week. I came across elastic/logstash#4289, which proposes a similar fix.

@jorritfolmer
Contributor

There has been an update to elastic/logstash#4289. With regard to the netflow codec needing packet metadata, I quote Guy Boertje from that thread:

If this is true, you are better off creating a netflow input that includes the current codec logic directly in the input - you then have access to any context directly. It can be based on the udp input.

@jorritfolmer
Contributor

Following up on @guyboertje's suggestion, I've started the transition from codec to logstash-input-netflow in the logstash-input-netflow branch, based on the udp input plugin.

It works!

There is still work to be done in the rspec department, however, and we'll probably need to support more than just UDP.

@guyboertje

@jorritfolmer - good news indeed. Ping me if you need any help.

@wrigby
Author

wrigby commented Jul 27, 2016

@jorritfolmer @guyboertje - Let me know if you guys need help, too. I'm a Ruby noob, but I definitely wouldn't mind helping out. Unfortunately, I'm at a different company now (outside of the telco space), so I don't have any test data anymore :(

@jorritfolmer
Contributor

Small status update:

I've finished work on logstash-input-netflow, which is waiting as a PR here

However, @jordansissel pointed out that this move to an input plugin isn't necessary given the availability of something called the Identity Map Codec, which is meant to support a use case like ours while keeping the netflow plugin a codec. I haven't had time to dig further into the Identity Map Codec though. It will also require an update to logstash-input-udp.

As I understand it, we should be passing src_ip+port as "identity" to the Identity Map Codec, which will clone a codec for each src_ip+port combination. This will take care of keeping Netflow templates separate and prevent template corruption and overwriting as is the case with e.g. issue #21.

It seems we could copy-paste identity_map_codec.rb from logstash-codec-multiline, as it isn't available in the main Logstash repo.
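The idea can be sketched with a toy stand-in (this is NOT the real IdentityMapCodec API; class and method names here are made up): keep one codec clone, and therefore one private template cache, per identity string such as src_ip:port.

```ruby
# Toy stand-in for the identity-map idea: each distinct identity
# (src_ip:port) lazily gets its own codec clone, so templates from
# different exporters can never overwrite each other.
class ToyNetflowCodec
  attr_reader :templates
  def initialize
    @templates = {}                  # this clone's private template cache
  end
  def decode(payload)
    yield({ "payload" => payload })  # a real codec would parse NetFlow here
  end
end

class ToyIdentityMap
  def initialize(&codec_factory)
    @codec_factory = codec_factory
    @map = {}                        # identity string => codec clone
  end
  def decode(payload, identity, &block)
    codec = (@map[identity] ||= @codec_factory.call)
    codec.decode(payload, &block)
  end
  def clones
    @map.size
  end
end

mapper = ToyIdentityMap.new { ToyNetflowCodec.new }
events = []
mapper.decode("pkt1", "10.0.0.1:2055") { |e| events << e }
mapper.decode("pkt2", "10.0.0.2:2055") { |e| events << e }
mapper.decode("pkt3", "10.0.0.1:2055") { |e| events << e }
# Two identities seen -> two independent codec clones, three events out.
```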

@jorritfolmer
Contributor

Great! This issue and issue #21 are gone with identity map codec. It doesn't require modifications to our codec, just a couple of lines in logstash-input-udp. I'll create a PR there next week or so.

@jorritfolmer
Contributor

Update: This will be addressed in the migration to logstash-plugins/logstash-input-netflow#1.
It will allow us to own the network socket directly and use the necessary IP addresses as part of the key.
Probably 2017 though.

@ssmlt

ssmlt commented Jan 17, 2018

Hello, is there any update regarding this issue and any plans to release logstash-input-netflow?
Thanks.

@jorritfolmer
Contributor

The issue is still there.
logstash-input-netflow is still waiting for my initial PR to be merged.

@joriws

joriws commented Apr 18, 2018

I see udp.rb (v6.2) has changed, so the metadata patch above is no longer directly applicable. It worked with 5.5, so the question is what the status is with 6.2.

Do we need the metadata?
I'm using the latest Elasticsearch and it decodes a single stream fine; I have not tested with multiple sources yet.
The template cache save file does not include the IP addresses of the template sender, so I doubt it works.

Please update the front page or README.md with information on how to make templates from multiple IPs work.

@humblemumble

Hi, could you please provide the metadata UDP input patch for version 6.2 as well?

@evilelf666

evilelf666 commented Nov 9, 2019

> I've added commit jorritfolmer@fca329f7a2b4845020d26c925540851093ba61f1 to address this, as experienced in issue #21. [full patch and config quoted from @jorritfolmer's comment above]

I found your patch for the more recent Logstash udp.rb code and applied it to my platform (processing 2500 flows/sec), and it fixed all the bandwidth/pps and invalid protocol/incoherent data issues (mostly a Cisco environment, same models). I'm running the latest GitHub code in Docker; this clearly resolved the reporting from the 7609s and ASRs in my environment.

Thanks a lot for this metadata patch, man. This saved me from going bonkers.

@robcowart
Contributor

This issue will be addressed once the following PRs are merged and released:

Logstash UDP Input: logstash-plugins/logstash-input-udp#46
Logstash Netflow Codec: #187

@evilelf666

Thank you @robcowart

As always, your work is very appreciated.

@ChrisDeh

ChrisDeh commented Apr 3, 2020

Are there any News on this? Both referenced PRs are closed but not merged.
