
Cannot decode string of bytes sent from pdns #15

Open
jake2184 opened this issue Nov 1, 2017 · 58 comments

@jake2184

jake2184 commented Nov 1, 2017

I'm using pdns_recursor to send protobuf messages to logstash.

However, the output of some fields is pretty incomprehensible.

The proto file is found here, and was converted using protocol-buffers.

Of particular annoyance are the 'from' and 'to' fields of the protobuf message. In the proto file they are sent as 'bytes', and looking at the pdns source code it looks like what is sent is a C++ string.

When decoded and printed using logstash stdout or passed into elasticsearch, instead of a readable representation of the IP address (even hex without '.' delimiters would do), I get:

"from" => "\n\x16\x02b" - which corresponds to 10.22.2.98

Changing the pb.rb file to attempt to decode as a string rather than bytes gives:

"from" => "\n\u0016\u0002b"

What I am trying to get is some form of consistent representation that I can work with to manipulate, with the end-goal of having "10.22.2.98" as the output.

@IngaFeick IngaFeick self-assigned this Nov 4, 2017
@IngaFeick
Collaborator

Hi @jake2184 , is this still relevant? I'm currently working on a new version and you could test-drive it in a couple of days, maybe it solves the issue.

@jake2184
Author

I came to a rather hacky workaround, involving unpacking the data:

# 'from'/'to' are raw IPv4 address bytes; 'messageId' is a 16-byte UUID
event.set("from", event.get("from").unpack("C4").join("."))
event.set("to", event.get("to").unpack("C4").join("."))
event.set("messageId", event.get("messageId").unpack("H*").join)

And for the rData response, which may be IPv6 or IPv4:

response = event.get("response")
if response and response["rrs"] and response["rrs"][0] and response["rrs"][0]["rData"]
  rdata = response["rrs"][0]["rData"]
  hex_value = rdata.unpack("H*").join        # whole payload as hex
  ip_value  = rdata.unpack("C4").join(".")   # dotted quad (only valid for 4-byte rData)
  length_rdata = rdata.unpack("L*").length   # 1 for 4-byte IPv4, 4 for 16-byte IPv6
  if length_rdata >= 4
    event.set("[response][decoded_rdata]", hex_value)  # IPv6: keep as hex
  else
    event.set("[response][decoded_rdata]", ip_value)   # IPv4: dotted quad
  end
end

This is not great, but mostly works for the time being.
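A less brittle variant of the same idea (a sketch, not from the thread): Ruby's stdlib IPAddr can render both 4-byte IPv4 and 16-byte IPv6 payloads in network byte order, which removes the need for the length heuristic. The helper name bytes_to_ip is hypothetical.

```ruby
require "ipaddr"

# Sketch only: converts a raw network-byte-order address (4 bytes for
# IPv4, 16 for IPv6) into its printable form; returns the input
# unchanged if the length is unexpected. 'bytes_to_ip' is a
# hypothetical helper, not part of the codec.
def bytes_to_ip(raw)
  return raw unless [4, 16].include?(raw.bytesize)
  IPAddr.new_ntoh(raw).to_s
end

puts bytes_to_ip("\x0A\x16\x02\x62".b)      # -> 10.22.2.98
puts bytes_to_ip(("\x00" * 15 + "\x01").b)  # -> ::1
```

In a logstash ruby filter this could replace both the from/to conversion and the rData branch above.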

I look forward to a working solution and am happy to test, but cannot guarantee having time to do so (depending on work projects).

@MangeshBhamre

Hi jake2184 ,

I am trying to do a similar thing with pdns_recursor and logstash. In which file did you add above code?

Regards,
Mangesh

@IngaFeick
Collaborator

@MangeshBhamre are you using protobuf 2 or 3?

@MangeshBhamre

@IngaFeick protobuf 3. Latest.

In addition to the issue highlighted by @jake2184 above, I get the errors below. This worked fine while I was on logstash 5.2.x earlier; now I am on logstash 6.2.x.

[2018-02-18T09:52:18,226][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<NoMethodError: undefined method `<<' for false:FalseClass>.
[2018-02-18T09:52:19,205][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:19,209][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 13, expected 2 but got 7
#<ProtocolBuffers::Field::MessageField:0x1b358717 @otype=:optional, @name=:response, @opts={}, @tag=13, @proxy_class=PBDNSMessage::DNSResponse>>.
[2018-02-18T09:52:19,414][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 6
#<ProtocolBuffers::Field::Uint32Field:0x25d8638b @otype=:optional, @name=:timeSec, @opts={}, @tag=9>>.
[2018-02-18T09:52:20,208][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:20,477][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 1
#<ProtocolBuffers::Field::Uint32Field:0x25d8638b @otype=:optional, @name=:timeSec, @opts={}, @tag=9>>.
[2018-02-18T09:52:20,901][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 7
#<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.
[2018-02-18T09:52:21,117][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 13, expected 2 but got 0
#<ProtocolBuffers::Field::MessageField:0x1b358717 @otype=:optional, @name=:response, @opts={}, @tag=13, @proxy_class=PBDNSMessage::DNSResponse>>.
[2018-02-18T09:52:21,209][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:21,633][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:22,211][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:22,266][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:22,850][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:23,212][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:23,279][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 5
#<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.
[2018-02-18T09:52:23,280][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: invalid message>.
[2018-02-18T09:52:23,686][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 8, expected 0 but got 6
#<ProtocolBuffers::Field::Uint64Field:0x68544106 @otype=:optional, @name=:inBytes, @opts={}, @tag=8>>.
[2018-02-18T09:52:23,902][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<ProtocolBuffers::DecodeError: incorrect wire type for tag: 9, expected 0 but got 3

@IngaFeick
Collaborator

@MangeshBhamre the current version of the protobuf codec does not support protobuf 3 yet. No worries though, I'm about to write a PR for version 3 support. Can I interest you in testing the new version so that I can pick up your feedback before I send the PR?

@MangeshBhamre

@IngaFeick Sure. I can test out.

How can I fix above errors?

@IngaFeick
Collaborator

@MangeshBhamre with 1.0.4, the latest official version, you cannot fix the errors because, as I said, it doesn't support protobuf 3 yet. Please install this gem:
https://rubygems.org/gems/logstash-codec-protobuf/versions/1.0.5
and in your logstash input section, use the codec like so:

    {
      class_name => "Module.AnotherModule.Classname"
      include_path => ['file1_pb.rb','file2_pb.rb','file3_pb.rb']
      protobuf_version_3 => true
    }

The important change is the last line, which activates the protobuf 3 library.

@MangeshBhamre

@IngaFeick
I installed the protobuf codec using "logstash-plugin install logstash-codec-protobuf"

I didn't specify a URL or version. How can I install the one you mentioned? Sorry, I am a bit new here with logstash/protobuf.

@IngaFeick
Collaborator

Hi @MangeshBhamre, no problem, let me quickly walk you through this.
The command that you used will pull the latest official version. What you need to do instead is to download the gem file for the new version that I'm currently testing, from here:
https://rubygems.org/downloads/logstash-codec-protobuf-1.0.5.gem
Put it somewhere locally and then run

logstash-plugin install /path/to/downloaded/logstash-codec-protobuf-1.0.5.gem

You just need to change the last part of the command to the file that you pulled. Let me know if it worked, will you?

@MangeshBhamre

Hi @IngaFeick

I successfully installed your gem. I also updated the logstash config:

codec => protobuf {
  class_name => "PBDNSMessage"
  include_path => ['/home/bitnami/dnsmessage.pb.rb']
  protobuf_version_3 => true
}

However, logstash is not starting...
[2018-02-18T10:18:31,448][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"NoMethodError", :message=>"undefined method `msgclass' for nil:NilClass", :backtrace=>["/opt/bitnami/logstash/vendor/local_gems/eb8a11f0/logstash-codec-protobuf-1.0.5/lib/logstash/codecs/protobuf.rb:80:in `register'", "/opt/bitnami/logstash/logstash-core/lib/logstash/codecs/base.rb:20:in `initialize'", "/opt/bitnami/logstash/logstash-core/lib/logstash/plugins/plugin_factory.rb:90:in `plugin'"

@IngaFeick
Collaborator

@MangeshBhamre could be a problem with the class name. Can you show me the protobuf ruby file, dnsmessage.pb.rb please?

@MangeshBhamre

Here it is on pastebin: https://pastebin.com/1vbCKYVk

@IngaFeick
Collaborator

@MangeshBhamre you're using protobuf 2, not 3. If you want to switch to proto3, which I recommend for performance reasons, then please use the official protobuf compiler to generate proto3 ruby files. Afterwards please send me a pastebin link to the new ruby class. You should get something like dnsmessage_pb.rb instead of dnsmessage.pb.rb.
You will probably also have to change the class name in the config section like so:

codec => protobuf {
  class_name => "PBDNSMessage.DNSResponse.DNSRR"
  include_path => ['/home/bitnami/dnsmessage.pb.rb']
  protobuf_version_3 => true
}

but I can confirm that only after you generated actual proto3 ruby files.

@MangeshBhamre

@IngaFeick For me to use proto3, does that mean pdns also needs to send me protobuf in the v3 format? (I guess so.)
If you look at the .proto file from (https://doc.powerdns.com/recursor/lua-config/protobuf.html), it has
syntax = "proto2";
I tried compiling this file with protoc (3.5.1) and it complains about "syntax=proto2". This indicates to me that client and server both need to be on the same version. Please correct me here!

If I wish to use proto2 on logstash and resolve the errors I am getting, how do I go about it? As I said earlier, this worked fine with logstash 5.2.x, but for some reason we need to use logstash 6.2.x.

Thanks for all your help here.

@IngaFeick
Collaborator

@MangeshBhamre

does that mean pdns also need to send me protobuf in ver3 format?

Not necessarily. You can read protobuf2 data with a protobuf3 definition. It's not an elegant thing to do though.

with logstash 5.2.x this worked fine, however for some reason we need to use logstash 6.2.x.

I haven't tested with logstash 6 yet. I will try to do that asap but it might take a couple of days, high workload currently. What you could test in the meantime is what jake did. As for your question

In which file did you add above code?

You'd put that in the filter section of your logstash config. There's a plugin called "ruby" that allows you to execute custom code like the one that he mentioned. The documentation is here:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-ruby.html
Wanna give that a try?

@MangeshBhamre

MangeshBhamre commented Feb 18, 2018

@IngaFeick I downloaded the gems for all versions and installed them one by one. 1.0.0 did not give me any errors; 1.0.2 and above throw all the errors. Either 1.0.0 is swallowing the errors internally, or it's working correctly.

Is it OK to use 1.0.0, or are there bugs and you recommend using 1.0.2 and above?

I will try the ruby filter plugin you suggested. Looking forward to 1.0.5+ working for me without any errors and speeding up my logstash.

@IngaFeick
Collaborator

@MangeshBhamre interesting observation. That's a helpful hint actually. I will look into it asap, thanks!

@IngaFeick
Collaborator

@MangeshBhamre it should be safe to use 1.0.0, yes. The versions afterwards were related to speed improvement and code quality, afair, and not bugfixes.

@MangeshBhamre

@IngaFeick I ran 1.0.0 overnight and found it's too slow and missed almost 90% of the protobuf messages. I am using a later version so as not to miss events. It would be great if a newer version became available that addresses my list of errors.

@IngaFeick
Collaborator

Hi @MangeshBhamre!
Sorry for the late reply. We have upgraded the codec to 1.1.0 with a fresher version of the google protobuf lib under the hood. Maybe that tackles your problems? It's not officially released yet, but you can download the version here for testing and then install it by running

logstash-plugin install  /your_downloads/logstash-codec-protobuf-1.1.0.gem

Please note that the config has changed: you now need to set
protobuf_version => 3
instead of protobuf_version_3 => true.
Please let me know if this works for you; if it doesn't and you still see data loss, please describe how you measure that you are losing messages. Thank you very much! :)

@MangeshBhamre

Hi @IngaFeick

I tried it and was not able to start logstash. Below are the errors seen.
My config
input {
  tcp {
    host => "0.0.0.0"
    port => 4040
    codec => protobuf {
      class_name => "PBDNSMessage.DNSResponse.DNSRR"
      include_path => ['/home/bitnami/dnsmessage.pb.rb']
      protobuf_version => 3
    }
  }
}

[2018-03-16T13:59:24,951][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.1.2"}
[2018-03-16T13:59:25,488][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-03-16T13:59:31,386][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"NoMethodError", :message=>"undefined method `msgclass' for nil:NilClass", :backtrace=>["/opt/bitnami/logstash/vendor/local_gems/41a7e7a9/logstash-codec-protobuf-1.1.0/lib/logstash/codecs/protobuf.rb:99:in `register'", "/opt/bitnami/logstash/logstash-core/lib/logstash/codecs/base.rb:20:in `initialize'", "/opt/bitnami/logstash/logstash-core/lib/logstash/plugins/plugin_factory.rb:90:in `plugin'", "/opt/bitnami/logstash/logstash-core/lib/logstash/pipe

@IngaFeick
Collaborator

@MangeshBhamre are you sure you are using a protobuf 3 file? The compiler typically generates _pb.rb files for version 3, and yours ends in .pb.rb. Unless you renamed the file manually you might want to check which pb version it compiled to. Make sure that it is really protobuf 3 and then please upload the definition to gist or somewhere so that I can check the class name. Thank you!

@MangeshBhamre

Hi @IngaFeick
Looks like I need to stick to protobuf 2. If you look at https://raw.githubusercontent.com/PowerDNS/pdns/master/pdns/dnsmessage.proto
it says proto2.

If I change it manually to proto3 and try to compile with protoc3, I get the error message below for all 'optional' fields.

dnsmessage.proto:78:14: Explicit 'optional' labels are disallowed in the Proto3 syntax. To define 'optional' fields in Proto3, simply remove the 'optional' label, as fields are 'optional' by default.

Additionally, as you said above, "It's not an elegant thing to do though."

Without moving to protobuf 3, how can I solve the problem ?

@IngaFeick
Collaborator

Hi @MangeshBhamre, I don't recommend sticking with pb2 because the new lib is really much faster. Let's approach this differently: I can convert your protobuf definition into a version 3 ruby file for you tomorrow or on sunday and then you just use that. Should do the trick. I will send you that file this weekend. If you cannot wait and want this fixed immediately: just remove the quantifiers (required / optional) from each row and change syntax = "proto2"; to syntax = "proto3";. Then run the google compiler with --ruby_out against it, and with high probability it should work. If not, then let's talk on the weekend. Cheers!

@MangeshBhamre

Hi @IngaFeick

I did what you recommended: removed optional/required and set 'proto3'. Additionally, enums in proto3 require the first enum value to be 0, so I added a dummy value as well. It now compiles successfully.

I tried the new _pb.rb with logstash, however it didn't work. I got the error below. No protobuf messages were processed; I needed to revert to the old config.

[2018-03-17T02:15:37,832][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.

Alternatively, I used your latest 1.1.0 gem with the older logstash proto2 config and .pb.rb file. I see messages coming in, but I'm still missing some messages and it looks a bit slow. Don't know if this is useful info. Anyway.

Thanks for all your inputs and work!

@IngaFeick
Collaborator

@MangeshBhamre can you please provide the new _pb.rb file that you generated?

@MangeshBhamre

@IngaFeick Here you go

_pb.rb - https://pastebin.com/7BTYBkLs

And edited .proto file as per your instructions - https://pastebin.com/JXWN6QSi

@MangeshBhamre

@IngaFeick
Hope above info is good enough for you. Let me know if you need anything more.

One important observation with ELK using this plugin over a long time (2-4 days)

  • If I use 1.0.3 gem, I see 15K documents in 15 minutes.
  • If I use 1.0.5 or 1.1.0 gem, I see only 4K documents in 15 minutes

Clearly, 1.0.5/1.1.0 is dropping documents due to errors, or 1.0.3 is faster. You may need to go back to 1.0.3 to see what it does right. Hope this helps.

@jake2184
Author

Still having many error messages from this.

It looks like the underlying library is throwing a lot of errors, e.g.:
ProtocolBuffers::DecodeError: invalid message
ProtocolBuffers::DecodeError: incorrect wire type
undefined method `<<' for false:FalseClass

I am using the proto file generated from https://raw.githubusercontent.com/PowerDNS/pdns/master/pdns/dnsmessage.proto without edits.

I have updated the codec to 1.1.0 with no change.

@IngaFeick
Collaborator

@jake2184 would you mind sharing your logstash input configuration? Thank you

@jake2184
Author

Here:
dns_logstash.txt

@IngaFeick
Collaborator

@jake2184, you are using protobuf 2. The improvements that I announced are only available when you switch the codec to protobuf 3. Some simple steps are required:

  • in your protobuf definitions,
    • change the syntax version (first line) from 2 to 3.
    • remove all 'optional' and 'required' modifiers. They don't exist anymore in pb3.
  • recompile them with the official google pb compiler. You should probably be getting _pb.rb files now instead of .pb.rb. If the compiler complains about syntax, more changes from version 2 to 3 might be necessary. In that case please paste the error message.
  • change your logstash input config like so:
input { 
    tcp {
       port => 5555
       type => "dns-proto"
       codec => protobuf {
         class_name => "PBDNSMessage"
         include_path => ['/etc/logstash/dns_pb.rb']
         protobuf_version => 3
       }
    }
}

@jake2184
Author

As Mangesh described above, the sending software uses v2. I shouldn't need to upgrade to v3. I want to deviate from the v2 template as little as possible.

Are you stating that the plugin does not support v2 anymore?

I'm concerned with making it work, not about performance. Will upgrading to v3 fix any of the issues?

@IngaFeick
Collaborator

Okay, background information:
In the first versions of the plugin we were using a protobuf library that was not written by Google. Reason: the Google protobuf lib does not support protobuf 2, but the other library did. Now more people are migrating to protobuf 3, so it made sense to add that option. So we ship the codec with 2 completely different libraries, and it uses the "old" one for protobuf version 2 and the official Google one for protobuf 3. I cannot say whether the official Google one will solve your problem, which is why I am asking you to test, but given that it is written by Google and very actively maintained, I think the probability of a bug fix for that particular problem is very high.

Please also be aware that protobuf 2 and 3 are (mostly, and in your case most certainly) wire compatible. Meaning: the producer requires no changes. You keep sending protobuf 2 data, you just pretend it is protobuf 3 when you parse it in logstash. I'm doing the exact same thing at work and it's fast and stable. Wouldn't recommend it otherwise :)
You just need to compile the pseudo syntax 3 definitions once for ruby, but your producer keeps using the original syntax 2 definitions.

Also: of course the plugin still supports version 2. When you don't specify a version it will default to protobuf 2.

@jake2184
Author

Thank you for that - it now makes more sense.

Upgrading to v3, I get the same errors as Mangesh above - invalid tag (zero).

In response to your comments, the producer of data is using the original v2 protobuf file from the PDNS website.

@IngaFeick
Collaborator

@jake2184 thank you! Sorry to hear that. Can you estimate the percentage of messages for which this happens? Do they all fail or do some messages come through? If the 2nd applies, is it possible for you to log outgoing messages on the producer side and if so, could you identify and send me 1 or 2 of those that fail to be decoded?

@jake2184
Author

With v3, no messages get through.

I can do a packet capture, which probably isn't much help, but I can't easily log what the software is sending - it's compiled C.

@jake2184
Author

I managed to log messages that were sent, using a Python decoder provided by PDNS: https://github.com/PowerDNS/pdns/blob/master/contrib/ProtobufLogger.py. The template was generated with protoc, but uses Python, not Ruby.

The incoming serialized bytes:

08021210c398556ac5fb48c2b68a7d09b87f5e9e2001280132040a1622293a040a160263403448838bb5d80550908f125885a60162190a1365782d362e6f78616d2e696e7465726e616c2e100118016a30080012220a1365782d362e6f78616d2e696e7465726e616c2e1001180120b0092a040a16080a28838bb5d80530ed8e127a008a0100

Correctly decodes to:

[2018-05-29 13:20:19.296848] Response of size 52: 10.22.34.41 -> 10.22.2.99 (UDP), id: 21253, uuid: c398556ac5fb48c2b68a7d09b87f5e9e requestorid:  deviceid: 
- Question: 1, 1, ex-6.****.internal.
- Query time: 2018-05-29 13:20:19.296813
- Response Code: 0, RRs: 1
	 - 1, 1, ex-6.****.internal., 1200, 10.22.8.10

I don't think the issue is with the sent data, but due to some part of the logstash decoding.
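For reference, the leading bytes of that hex dump can be checked by hand with a minimal wire-format reader (a sketch, independent of the codec and not from the thread): tag 0x08 is field 1 as a varint, and tag 0x12 is field 2 as a 16-byte length-delimited value, which matches the uuid in the decoded output, so the capture is well-formed protobuf.

```ruby
# Sketch: a minimal protobuf wire-format reader, used here only to show
# that the captured bytes parse cleanly (field 1 as a varint, field 2
# as 16 length-delimited bytes).
def read_varint(bytes, i)
  value = 0
  shift = 0
  loop do
    byte = bytes[i]
    value |= (byte & 0x7F) << shift
    i += 1
    break if (byte & 0x80).zero?
    shift += 7
  end
  [value, i]
end

hex = "08021210c398556ac5fb48c2b68a7d09b87f5e9e" # prefix of the capture above
bytes = [hex].pack("H*").bytes
i = 0
fields = []
while i < bytes.length
  tag, i = read_varint(bytes, i)
  field_number = tag >> 3
  wire_type = tag & 0x07
  case wire_type
  when 0 # varint
    _value, i = read_varint(bytes, i)
  when 2 # length-delimited: skip over the payload
    len, i = read_varint(bytes, i)
    i += len
  else
    break # other wire types don't occur in this prefix
  end
  fields << [field_number, wire_type]
end
p fields # => [[1, 0], [2, 2]]
```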

@IngaFeick
Collaborator

IngaFeick commented May 29, 2018

Thank you @jake2184 , I will look into this. Will be next week though, due to vacation (sorry!).

@jake2184
Author

No problem, thank you.

@IngaFeick
Collaborator

@jake2184, a couple of updates:
First, my apologies: I gave you incomplete instructions on how to migrate the protobuf definitions to version 3. You also need to make sure that all enums have an option with value 0. Here's a ruby version of it that definitely works; you can maybe use it for testing again while I look at the other stuff. It might (or might not) address the zero-tag issue. Just guessing here.

Secondly, I was able to reproduce the problem while decoding your example using just the google lib, a version 3 protobuf definition and a ruby script, so the "problem" is not rooted in logstash but rather in the protobuf definition itself. Please note that there are two string-y datatypes used in this definition: bytes and string (for example in field 15). Looking at the datatype specification, the only difference between them is that string must contain a valid ASCII or UTF-8 string, while bytes does not. Now I do not know why the author of that protobuf definition didn't use string for the fields that you expect to contain strings. Maybe they couldn't guarantee that they would really always contain strings and strings only? We can only guess. But, as a matter of fact, the library seems to be doing exactly what it is told to do. It's just that the definition and the intended usage don't fit each other.

The cleanest way to fix this problem would be to change the protobuf definition for both consumer and producer to use string instead of bytes. I understand that this is out of scope for you though.
Sadly I don't know how else to help you here, since the problem is provably not caused by logstash and I am not under the impression that this is even a bug in the google lib. My recommendation would be to continue using the modifications that you already implemented.

@jake2184
Author

The .pb file I am using already has the dummy values, and completely matches the one you provided.

@IngaFeick
Collaborator

@jake2184 👍
Did you see my explanation above on why this bug is actually desired behaviour?

@jake2184
Author

jake2184 commented Jun 4, 2018

Yes. Perhaps what is not clear is that the original issue topic, turning the bytes into an IP address, doesn't matter so much any more; the Ruby code I wrote fixed that.

My current problem is the incessant errors being logged, and the fact that when I upgrade to proto3 as described I cannot get any messages to decode.

@IngaFeick
Collaborator

@jake2184 it has just been pointed out in #26 that we seem to be having a documentation issue when it comes to the kafka plugin specifically. You have to set the deserializer classes in your kafka config (outside the protobuf config) like so:

input {
	kafka {
		key_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
		value_deserializer_class => "org.apache.kafka.common.serialization.ByteArrayDeserializer"
	}
}

Could you please test if that solves the issue?
If this solves the problem: a thousand times sorry for not noticing this earlier. I'm using a different/custom version of the kafka plugin, so I didn't have to set this and never ran into the issue. 😬

@jake2184
Author

As my config uploaded above shows, I am not using a kafka input. I am using tcp input, using the protobuf codec.

@IngaFeick
Collaborator

@jake2184 my apologies. My mind was elsewhere :)

IngaFeick added a commit to IngaFeick/logstash-codec-protobuf that referenced this issue Jun 15, 2018
@ZAZmaster

@jake2184 hi, did you manage to collect logs from pdns?

@mngan

mngan commented May 14, 2019

It looks like logstash is UTF-8 encoding the data from the protobuf bytes field. If I have an IP address such as 205.196.6.2, I end up with a field containing "\uFFFD\uFFFD\u0006\u0002". The \uFFFD is the UTF-8 replacement character, since there isn't a valid mapping for the byte values 205 and 196. So there is no way to recover the original data. Is there some way to stop logstash from encoding the data so we can preserve the raw bytes?

In my case, I am using GoFlow to send Netflow data, so I have the source code to change the IP to say Base64 encoding. But I would rather not have to do that if I can help it. For the guys using commercial software, they have no chance of making that change. So it would seem that Logstash needs to handle this case better, or admit we should really not be sending binary data through Logstash.
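The loss mngan describes can be reproduced in a couple of lines of Ruby (a sketch, not codec code): once the raw bytes are reinterpreted as UTF-8 and invalid sequences are scrubbed, the original octets are gone.

```ruby
# 205.196.6.2 as the raw 4 bytes a protobuf 'bytes' field would carry.
raw = [205, 196, 6, 2].pack("C*")

# Reinterpreting them as UTF-8 and scrubbing invalid sequences (roughly
# what an encoding-normalization step does) replaces bytes 205 and 196
# with U+FFFD, so the original address is unrecoverable afterwards.
mangled = raw.dup.force_encoding("UTF-8").scrub
p mangled # => "\uFFFD\uFFFD\u0006\u0002"
```

This matches the observed field value exactly, which supports the conclusion that the mangling happens at the bytes-to-string boundary rather than in the wire decoding.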

@al-henrychen

@mngan I am working on a similar project as well. I changed flow.proto to use string for IP addresses and used protoc to compile a new flow.pb.go and flow_pb.rb, but this logstash plugin still ran into a UTF-8 issue.

I must have missed something. How did you change encoding for IP addresses?

@mngan

mngan commented Jan 28, 2020 via email

@al-henrychen

al-henrychen commented Jan 29, 2020 via email

@LuciferInLove

Hello. I'm trying to send messages from Loki promtail to a logstash http input with the protobuf codec, because promtail uses protobuf. I've generated logproto_pb.rb from this file using protoc-3.5.0. I get errors like the ones in this issue:

[WARN ] 2021-04-16 16:16:35.672 [http-input-handler-executor[T#2]] protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message tag had invalid wire type.>.

Sample logstash config:

input {
  http {
    port => 10558
    codec => protobuf {
      class_name => "logproto.AnyClass"
      class_file => "./logproto_pb.rb"
      protobuf_version => 3
    }
  }
}
output {
  file {
    path => "/var/log/logstash/test.log"
  }
}

In the test.log file I see:

{"@timestamp":"2021-04-16T16:35:44.910Z","headers":{"content_length":"703","http_version":"HTTP/1.1","request_method":"POST","request_path":"/","content_type":"application/x-protobuf","http_accept":null,"http_host":"127.0.0.1:10558","http_user_agent":"promtail/2.2.0"},"host":"127.0.0.1","tags":["_protobufdecodefailure"],"@version":"1","message":"�\u0012t\n�\u0003\n�\u0003{app=\"calico\", container\u0011\u0014p-node\", file=\"int_dataplane.g\u00012\u0001\u0019|name=\"/var/log/pods/kube-system_\t_\u0005K�-v6zqj_4157024e-3a16-4663-9e8e-b4219b34a86f/c\u0005�\u000570/0.log\", host\to�ip-127-0-0-1\", job=\"k\u0019�\r\\\u0018\", k8s_=\u0004\u0005n0\", level=\"inf\u0001%,line=\"953\", \u0001~\u0014space=.Q\u0000\u0018\", pars99\u0001\u0011\u0004od2G\u0001\t�$\", release\u0011\u001D�\", stream=\"stdout\"}\u0012+\n\f\b���\u0006\u0010���\u0002\u0012\u001BApplying d1�4 updates\n\n�\u0003\n���\u0001\u0010psets��\u0001��\u0001��\u0001��\u0001\u000422��\u0001��\u0001\u0000Q5�x����\u0002\u0012AAsked to resync with the=�\u0018on next-�\u001C. familyA�\u0014et\"\n\n���\u0001��\u0001��\u0001��\u0001��\u0001\b306��\u0003��\u0003\u0000?5�\u001C����\u0002\u0012/R%�\fing ig)�9>�\u0001\u0000���\u0001��\u0001��\u0001��\u0001��\u0001\u00005��\u0001��\u0001\u0000b5�<����\u0002\u0012RFinished m�2�\u0003d numInconsistenciesFound=0\r/TDuration=1.298474ms\n\n���\u0007��\u0007��\u0007��\u0007<U+0084>\u0007\u000467��\u0003��\u0003\u0000MU\u0001\u0018���\u0002\u0012=U\u0001\u0004ap�탡�}�PmsecToApply=1.627523\n"}

Is it possible to decode message right via plugin?

@BBQigniter

After playing around for hours with this logstash plugin and dnsdist, I found this thread :| It would be wonderful if this worked correctly.

@hari819

hari819 commented Mar 6, 2023

Please, do we have a solution/workaround for this issue? I am facing the same parse exception:

12:49:42.006  WARN  logstash.codecs.protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.
12:49:42.006  WARN  logstash.codecs.protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.
12:49:42.006  WARN  logstash.codecs.protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.
12:49:42.007  WARN  logstash.codecs.protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.
12:49:42.007  WARN  logstash.codecs.protobuf - Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.

and we see the "_protobufdecodefailure" error in kibana.

I am trying to send data from kafka to an opensearch 2.3 cluster. I have tried multiple versions of the logstash protobuf plugin as suggested above, but no luck.

Please could you help?

@IngaFeick
Collaborator

@hari819 could you please provide your logstash configuration (input section only, assuming you use the decoder) and the protobuf definition? thanks
