Cannot decode string of bytes sent from pdns #15
Hi @jake2184, is this still relevant? I'm currently working on a new version and you could test-drive it in a couple of days; maybe it solves the issue.
I came to a rather hacky workaround, involving unpacking the data:

```ruby
event.set("from", event.get("from").unpack("C4").join("."))
event.set("to", event.get("to").unpack("C4").join("."))
event.set("messageId", event.get("messageId").unpack("H*").join(""))
```

And for the rData response, which may be IPv6 or IPv4:

```ruby
response = event.get("response")
if ( response and response["rrs"] and response["rrs"][0] and response["rrs"][0]["rData"] )
  rdata = response["rrs"][0]["rData"]
  hex_value = rdata.unpack("H*").join("")
  ip_value = rdata.unpack("C4").join(".")
  length_rdata = rdata.unpack("L*").length
  if ( length_rdata >= 4 )
    event.set("[response][decoded_rdata]", hex_value)
  else
    event.set("[response][decoded_rdata]", ip_value)
  end
end
```

This is not great, but mostly works for the time being. I look forward to a working solution and am happy to test, but cannot guarantee having time to do so (depending on work projects).
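As a sanity check on the unpack calls above, here is a minimal plain-Ruby sketch (the 4-byte value is a made-up example for illustration, not data captured from pdns):

```ruby
# A hypothetical 4-byte value, as a pdns 'from' field for 10.22.2.98 would look.
raw = "\x0A\x16\x02\x62".b

ip  = raw.unpack("C4").join(".")   # each byte as an unsigned integer, dot-joined
hex = raw.unpack("H*").join("")    # the whole string as one hex run

puts ip    # prints "10.22.2.98"
puts hex   # prints "0a160262"
```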
Hi @jake2184, I am trying to do a similar thing with pdns_recursor and logstash. In which file did you add the above code? Regards,
@MangeshBhamre are you using protobuf 2 or 3?
@IngaFeick protobuf 3, latest. In addition to the above issue highlighted by @jake2184, I get the errors below. While I was on logstash 5.2.x earlier this was working fine; now I am on logstash 6.2.x.
[2018-02-18T09:52:18,226][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<NoMethodError: undefined method `<<' for false:FalseClass>.
@MangeshBhamre the current version of the protobuf codec does not support protobuf 3 yet. No worries though, I'm about to write a PR for version 3 support. Can I interest you in testing the new version so that I can pick up your feedback before I send the PR?
@IngaFeick Sure, I can test it out. How can I fix the above errors?
@MangeshBhamre with the 1.0.4 version that is the official latest: you cannot fix the errors because, as I said, it doesn't support protobuf 3 yet. Please install this gem
The important change is the last line, which activates the protobuf 3 library.
@IngaFeick There is no URL or version given. How can I install the one you specified? Sorry, I am a bit new here with logstash/protobuf.
Hi @MangeshBhamre, no problem, let me quickly walk you through this.
You just need to change the last part of the command to the file that you pulled. Let me know if it worked, will you?
Hi @IngaFeick I successfully installed your gem and also updated the logstash config: codec => protobuf { However logstash is not starting...
@MangeshBhamre could be a problem with the class name. Can you show me the protobuf ruby file, dnsmessage.pb.rb, please?
Here it is on pastebin: https://pastebin.com/1vbCKYVk
@MangeshBhamre you're using protobuf 2, not 3. If you want to switch to proto3, which I recommend for performance reasons, then please use the official protobuf compiler to generate proto3 ruby files. Afterwards please send me a pastebin link to the new ruby class. You should get something like dnsmessage_pb.rb instead of dnsmessage.pb.rb. But I can confirm that only after you have generated actual proto3 ruby files.
@IngaFeick For me to use proto3, does that mean pdns also needs to send the protobuf in v3 format? (I guess so.) If I wish to stay on proto2 in logstash and resolve the error I am getting, how do I go about that? As I said earlier, this worked fine with logstash 5.2.x, but for some reason we need to use logstash 6.2.x. Thanks for all your help here.
Not necessarily. You can read protobuf 2 data with a protobuf 3 definition. It's not an elegant thing to do though.
I haven't tested with logstash 6 yet. I will try to do that asap, but it might take a couple of days; high workload currently. What you could test in the meantime is what jake did. As for your question:
You'd put that in the filter section of your logstash config. There's a plugin called "ruby" that allows you to execute custom code like the snippet he mentioned. The documentation is here:
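A minimal sketch of how those pieces fit together (the port, file path, and class name here are assumptions for illustration, not values taken from this thread):

```
input {
  tcp {
    port  => 5544                                       # assumed port
    codec => protobuf {
      class_name   => "PBDNSMessage"                    # assumed class name
      include_path => ["/etc/logstash/dnsmessage.pb.rb"]
    }
  }
}

filter {
  ruby {
    code => '
      event.set("from", event.get("from").unpack("C4").join("."))
      event.set("to",   event.get("to").unpack("C4").join("."))
    '
  }
}
```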
@IngaFeick I downloaded the gems for all versions and installed them one by one. 1.0.0 did not give me any errors; 1.0.2 and above throw all the errors. Either 1.0.0 is not emitting the errors and swallowing them internally, or it is working correctly. Is it OK to use 1.0.0, or are there bugs that make you recommend 1.0.2 and above? I will try the ruby filter plugin you suggested. Looking forward to a 1.0.5+ that works for me without any errors and speeds up my logstash.
@MangeshBhamre interesting observation, and actually a helpful hint. I will look into it asap, thanks!
@MangeshBhamre it should be safe to use 1.0.0, yes. The versions afterwards were related to speed improvements and code quality, afair, not bugfixes.
@IngaFeick I ran 1.0.0 overnight and found it is too slow: it missed almost 90% of the protobuf messages. I am using a later version so as not to miss events. It would be great if a newer version becomes available that addresses my list of errors.
Hi @MangeshBhamre!
Please note that the config has changed; you need to set
Hi @IngaFeick I tried it and am not able to start logstash. Below are the errors seen.
@MangeshBhamre are you sure you are using a protobuf 3 file? The compiler typically generates _pb.rb files for version 3, and yours ends in .pb.rb. Unless you renamed the file manually, you might want to check which pb version it was compiled for. Make sure that it is really protobuf 3 and then please upload the definition to gist or somewhere so that I can check the class name. Thank you!
Hi @IngaFeick If I change it manually to proto3 and try to compile with protoc3, I get the error message below for every 'optional' field:
dnsmessage.proto:78:14: Explicit 'optional' labels are disallowed in the Proto3 syntax. To define 'optional' fields in Proto3, simply remove the 'optional' label, as fields are 'optional' by default.
Additionally, as you said above, "It's not an elegant thing to do though." Without moving to protobuf 3, how can I solve the problem?
Hi @MangeshBhamre, I don't recommend sticking with pb2 because the new lib is really much faster. Let's approach this differently: I can convert your protobuf definition into a version 3 ruby file for you tomorrow or on Sunday, and then you just use that. That should do the trick. I will send you that file this weekend. If you cannot wait and want this fixed immediately: just remove the quantifiers from each row (required / optional) and change
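For illustration only, the conversion described above would look roughly like this (the message name follows the pdns definition, but the field and its number are placeholders, not the real schema):

```proto
// proto2 original (hypothetical excerpt):
//   syntax = "proto2";
//   message PBDNSMessage {
//     optional bytes from = 6;
//   }

// proto3 version: change the syntax line and drop the labels.
syntax = "proto3";

message PBDNSMessage {
  bytes from = 6;   // same field number, no 'optional' label
}

// Note: any enums in the file additionally need a value mapped to 0 in proto3.
```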
Hi @IngaFeick I did what you recommended: removed optional/required and set 'proto3'. Additionally, an enum in proto3 requires the first enum value to be 0, so I added a dummy value as well. It now compiles successfully. I tried the new _pb.rb on logstash, however it didn't work; I got the error below, no protobuf messages were processed, and I needed to revert to the old code.
[2018-03-17T02:15:37,832][WARN ][logstash.codecs.protobuf ] Couldn't decode protobuf: #<RuntimeError: Protocol message contained an invalid tag (zero).>.
Alternatively, I can use your latest 1.1.0 gem with the older logstash proto2 config and .pb.rb file. I see messages coming in, though still missing some, and it looks a bit slow. I don't know if this is useful info. Anyway, thanks for all your inputs and work!
@MangeshBhamre can you please provide the new _pb.rb file that you generated?
@IngaFeick Here you go: the _pb.rb - https://pastebin.com/7BTYBkLs - and the edited .proto file as per your instructions - https://pastebin.com/JXWN6QSi
@IngaFeick One important observation with ELK using this plugin over a longer time (2-4 days):
Clearly, 1.0.5/1.1.0 is dropping documents due to errors, OR 1.0.3 is faster. You may need to go back to 1.0.3 to see what it is doing right. Hope this helps.
Still getting many error messages from this. It looks like the underlying library is throwing a lot of errors, e.g.: I am using the proto file generated from https://raw.githubusercontent.com/PowerDNS/pdns/master/pdns/dnsmessage.proto without edits. I have updated the codec to 1.1.0 with no change.
@jake2184 would you mind sharing your logstash input configuration? Thank you
Here:
@jake2184, you are using protobuf 2. The improvements that I announced are only available when you switch the codec to protobuf 3. Some simple steps are required:
As Mangesh described above, the sending software uses v2, so I shouldn't need to upgrade to v3; I want to deviate from the v2 template as little as possible. Are you stating that the plugin does not support v2 anymore? I'm concerned with making it work, not with performance. Will upgrading to v3 fix any of the issues?
Okay, background information: Also: of course the plugin still supports version 2. When you don't specify a version, it will default to protobuf 2.
Thank you for that; it now makes more sense. Upgrading to v3, I get the same errors as Mangesh above. In response to your comments: the producer of the data is using the original v2 protobuf file from the PDNS website.
@jake2184 thank you! Sorry to hear that. Can you estimate the percentage of messages for which this happens? Do they all fail, or do some messages come through? If the second applies, is it possible for you to log outgoing messages on the producer side, and if so, could you identify and send me one or two of those that fail to decode?
With v3, no messages get through. I can do a packet capture, which probably isn't much help, but I can't easily log what the software is sending; it's compiled C.
I managed to log some messages that were sent, using a Python decoder provided by PDNS: https://github.com/PowerDNS/pdns/blob/master/contrib/ProtobufLogger.py. The template was generated with protoc, but uses Python, not Ruby. The incoming serialized bytes:
Correctly decode to:
I don't think the issue is with the sent data, but with some part of the logstash decoding.
Thank you @jake2184, I will look into this. It will be next week though, due to vacation (sorry!).
No problem, thank you.
@jake2184, a couple of updates: Secondly, I was able to reproduce the problem while decoding your example using just the google lib, the version 3 protobuf definition and a ruby script, so the "problem" is not rooted in logstash but rather in the protobuf definition itself. Please note that there are two string-y datatypes used in this definition: The cleanest way to fix this problem would be to change the protobuf definition for both consumer and producer to set this to
The .pb file I am using already has the dummy variables, and matches the one you provided completely.
@jake2184 👍
Yes. What is perhaps not clear is that the original issue topic, being able to turn the bytes into an IP address, doesn't matter so much anymore; the Ruby code I wrote fixed that. My current problem is the incessant errors being logged, and the fact that when I upgrade to proto3 as described I cannot get any messages to decode.
@jake2184 it has just been pointed out in #26 that we seem to have a documentation issue when it comes to the kafka plugin specifically. You have to set the deserializer classes in your kafka config (outside the protobuf config) like so:
Could you please test if that solves the issue?
As my config uploaded above shows, I am not using a kafka input. I am using a tcp input with the protobuf codec.
@jake2184 my apologies. My mind was elsewhere :)
@jake2184 hi, did you manage to collect logs from pdns?
It looks like logstash is UTF-8 encoding the data from the protobuf bytes field. If I have an IP address such as 205.196.6.2, I end up with a field containing "\uFFFD\uFFFD\u0006\u0002". The \uFFFD is the UTF-8 replacement character, since there isn't a valid mapping for the byte values 205 and 196, so there is no way to recover the original data. Is there some way to stop logstash from encoding the data so we can preserve the raw bytes? In my case, I am using GoFlow to send Netflow data, so I have the source code and could change the IP to, say, Base64 encoding. But I would rather not have to do that if I can help it. The folks using commercial software have no chance of making that change. So it would seem that logstash needs to handle this case better, or we should admit that we really should not be sending binary data through logstash.
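A small plain-Ruby sketch of both halves of that argument (the byte values come from the 205.196.6.2 example above; the force_encoding/scrub step only approximates what happens inside logstash):

```ruby
require "base64"

# The four bytes of 205.196.6.2 as a producer would put them in a bytes field.
raw = [205, 196, 6, 2].pack("C4")

# Forcing them into UTF-8 and scrubbing replaces the bytes >= 0x80
# with U+FFFD; the original byte values are gone.
mangled = raw.dup.force_encoding("UTF-8").scrub
puts mangled.codepoints.inspect   # [65533, 65533, 6, 2]

# Base64 on the producer side survives the trip as plain ASCII.
encoded = Base64.strict_encode64(raw)
decoded = Base64.strict_decode64(encoded)
puts decoded.unpack("C4").join(".")   # prints "205.196.6.2"
```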
@mngan I am working on a similar project. I changed flow.proto to use string for IP addresses and used protoc to compile a new flow.pb.go and flow_pb.rb, but this logstash plugin still ran into a UTF-8 issue. I must have missed something. How did you change the encoding for the IP addresses?
It isn't just a matter of changing the format of the field in the protobuf. The IP needs to be formatted as a string of characters representing the IP, rather than a string of bytes, which is what I suspect you are doing. Logstash is really text oriented, so the data really needs to be formatted as text, skipping binary formats.
Thanks. I will see what I can do.
Hello. I am trying to send messages from Loki promtail to a logstash http input with the protobuf codec, because promtail uses protobuf. I've generated logproto_pb.rb from this file using protoc-3.5.0. I get errors like the ones in this issue:
Sample logstash config:
In the test.log file I see:
Is it possible to decode the message correctly via the plugin?
After playing around for hours with this logstash plugin and dnsdist I found this thread :| It would be wonderful if this worked correctly.
Please, do we have a solution/workaround for this issue? I am facing the same parse exception,
and we see the "_protobufdecodefailure" error in kibana. I am trying to send data from kafka to an opensearch 2.3 cluster. I have tried multiple versions of the logstash plugin and of the protobuf plugin as suggested above, but no luck. Please could you help?
@hari819 could you please provide your logstash configuration (input section only, assuming you use the decoder) and the protobuf definition? Thanks
I'm using pdns_recursor to send protobuf messages to logstash.
However, the output of some fields is pretty incomprehensible.
The proto file is found here, and was converted using protocol-buffers.
Of particular annoyance are the 'from' and 'to' fields of the protobuf message. In the proto file they are declared as 'bytes', and looking at the pdns source code it appears that what is sent is a C++ string.
When decoded and printed using logstash stdout, or passed into elasticsearch, instead of a nice string of the IP address (even in hex format and without '.' delimiters), I get:
"from" => "\n\x16\x02b"
- which corresponds to 10.22.2.98Changing the pb.rb file to attempt to decode as a string rather than bytes gives:
"from" => "\n\u0016\u0002b"
What I am trying to get is some consistent representation that I can manipulate, with the end goal of having "10.22.2.98" as the output.
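For what it's worth, the escaped value above does contain the expected address; in plain Ruby the same unpack trick used in the workaround in this thread recovers it:

```ruby
# The raw bytes logstash printed for the "from" field.
raw = "\n\x16\x02b".b   # "\n" = 10, "\x16" = 22, "\x02" = 2, "b" = 98

# Read each byte as an unsigned integer and join with dots.
puts raw.unpack("C4").join(".")   # prints "10.22.2.98"
```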