Template length exceeds flowset length, skipping #205
Most likely you have devices that are sending the same template ID (in this case 260), but the content of the flow records is different. The codec tries to decode the flows from one device with the template from the other. This is an issue with the codec: logstash-plugins/logstash-codec-netflow#9 |
I think the subtext in the issues you cite suggests a workaround could be to use a different dport on the exporter. Am I reading into that correctly, or no? |
Hi Rob, so it seems the only workaround is to stand up multiple LS instances. To that end, I've duplicated the elastiflow directories and added them as additional pipelines in logstash.yml per this link: https://www.elastic.co/guide/en/logstash/6.2/multiple-pipelines.html I changed the port in each instance. The first time I tried to restart, I was getting Java out-of-memory errors, even though I've tripled the amount of memory allocated to the VM from 12GB to 32GB, so I then increased the heaps as well.
Now, Logstash seems to be taking an hour to start, and it's still not done. The last entry in the log was from an hour ago:
Java is running:
I'm going to assume you've had to take similar measures in some of your other deployments for folks who have been running a homogeneous network with similar types of exports, similar SW versions, etc. Have you seen anything like this before, or do you just spin up entirely separate Logstash servers? Any pointers? |
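For anyone following along, the multiple-pipelines workaround described above might look roughly like this in pipelines.yml (a sketch only; the pipeline IDs, paths, and the idea that each copy listens on its own UDP port are assumptions, not taken from this thread):

```
# pipelines.yml -- one duplicated ElastiFlow pipeline per flow source.
# IDs and paths below are illustrative placeholders.
- pipeline.id: elastiflow-a
  path.config: "/etc/logstash/elastiflow-a/conf.d/*.conf"
- pipeline.id: elastiflow-b
  path.config: "/etc/logstash/elastiflow-b/conf.d/*.conf"
```

Each duplicated directory's input config would need a distinct UDP port, so that every pipeline gets its own netflow codec instance (and therefore its own template cache).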
The way I would handle this is to break the pipeline into two parts: a "collector", which is basically the UDP input w/codec and a redis output, and a "processor", which uses a redis input and does all of the post-decoder work and sending to Elasticsearch. You can then have multiple instances of the simple "collector" pipeline, each listening on a different port, and a single instance of the "processor". It would look like this...
Redis is really easy to set up in this use-case and can actually help reduce packet loss when receiving flow data via UDP (that is another article to write). I have often thought about adding redis as a requirement for ElastiFlow, as it brings a lot of benefits. You could also try the new pipeline input and output instead of using redis. It should also work for your use-case, but I haven't tested it fully yet. Redis, on the other hand, is a proven technology for this scenario. Either way, you will need to do a little restitching of the pipelines. It will however be worth it, as you will not have to run multiple instances of the heavy processing logic. |
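The collector/processor split described above could be sketched like this (a rough outline, not ElastiFlow's actual configuration; the ports, redis key name, and Elasticsearch host are assumed placeholders):

```
# collector pipeline -- run one instance per flow source, each on its own port
input {
  udp {
    port  => 2055        # placeholder; each collector uses a different port
    codec => netflow     # decoding happens here, with a per-instance template cache
  }
}
output {
  redis {
    host      => ["127.0.0.1"]
    data_type => "list"
    key       => "elastiflow"   # assumed queue name
  }
}

# processor pipeline -- a single instance
input {
  redis {
    host      => "127.0.0.1"
    data_type => "list"
    key       => "elastiflow"
  }
}
filter {
  # ... all of the post-decode ElastiFlow processing logic ...
}
output {
  elasticsearch { hosts => ["127.0.0.1:9200"] }
}
```

Because each collector runs its own copy of the netflow codec, templates from different devices can no longer clobber one another, while the heavy filter logic still runs only once.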
Hi Rob, while using the following simple Logstash configuration (input & output) I get an output: input {
}
How do I get my ElastiFlow graphs to work? Note: |
If all that you have is one source of flows, and you are seeing this error, then something else is wrong. It is possible that the template the vendor is sending doesn't match the flow records. I would have to look at a PCAP which contains both the template and a flow record in order to investigate. |
elastiflow.zip |
You have two different devices sending flows. Both send a template with ID 256, but they contain different fields. This will cause issues, as one template will be used to decode both sources, making at least one of them always wrong. The only workaround for this at the moment is to customize ElastiFlow for multiple collectors, with a common processing instance and some form of messaging queue in between. |
Thanks, Rob |
Hi Rob, [2019-01-16T09:57:02,462][WARN ][logstash.codecs.netflow ] Can't (yet) decode flowset id 256 from source id 0, because no template to decode it with has been received. This message will usually go away after 1 minute. |
I'm using ElastiFlow v2.1.0 |
How long has it been running? Some devices are really slow at sending templates. Fortinet in particular can take 15-30 minutes.
Over an hour. |
Issue was resolved. Thanks. |
@robcowart Do you have any guides on how to implement this? |
Out of interest, does the elasticsearch elastiflow netflow plugin also have the same problem with templates overlapping? |
The root cause is related to where input codecs (which decode the raw flow data) are located. So it is a Logstash-specific issue. Any solution based on Logstash will have the same challenge. |
@robcowart looks like there is a patch for this: logstash-plugins/logstash-codec-netflow#9 (comment). But with Logstash 7.4.2 being broken with the netflow codec (#427), I had to roll back to 7.3.2 (for some reason 7.4.2 works with collectd over UDP). I'm wondering if I should start looking at something else for processing netflow... Filebeat also has its problems, especially around packets seemingly not being processed; I'll need to look at this tomorrow... I'm assuming that if I get entries like this in my Logstash netflow codec logs, then my netflow data isn't being processed properly?
|
This issue will be addressed once the following PRs are merged and released for the... Logstash UDP Input: logstash-plugins/logstash-input-udp#46 |
I am wondering the same. |
Unfortunately the Elastic team declined to merge the UDP input changes (see... logstash-plugins/logstash-input-udp#46). This leaves no option other than to continue recommending the workaround of running multiple instances of the ElastiFlow pipeline. |
This is no longer a problem with the new ElastiFlow Unified Flow Collector. In particular, the new collector properly handles templates across devices and observation domains. More details are available... HERE. |
Hi there,
I don't think this is an EF issue, but I'd like to poke the crowd to see if anyone has any ideas before I bug the LS folks.
I started seeing my ElastiFlow graphs jump up across all metrics from, for example, tens of Gbps to hundreds of Tbps. Obviously bad data.
I correlated it with these logs starting to appear:
I've seen posts in various forums about bugs in netflow codecs, but those bugs are from months ago, all of which seem to have been fixed(?). I just verified that I'm running the latest codec:
Does this look familiar to anyone?
Thanks!