add a heka message decoder config that doesn't clobber the message metadata #79

Closed
wants to merge 3 commits into from

relud
Member

relud commented Feb 17, 2017

Useful metadata about a message is stored in default_headers, which the default heka json decoder clobbers. For the cloudops logging pipeline it's important to preserve those fields, because they are required to route messages properly.

@relud changed the title from "Json preserve meta" to "add a heka message decoder config that doesn't clobber the message metadata" on Feb 17, 2017
@trink
Contributor

trink commented Feb 17, 2017

This decoder is for processing messages that conform to the Heka JSON schema. It appears this is some derivation of that schema and should really be its own cloudops decoder.

@relud
Member Author

relud commented Feb 20, 2017

I'm fine with making a separate decoder, but the schema of the logs is the same. The difference is where I put the decoded log in the heka message, which I do in order to preserve default_headers, and I would like a way to preserve those. Perhaps the decoder could allow a prefix under which it would store the default_headers as metadata, like metadata_group in s3_parquet.lua#L60?
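The prefix idea above could look roughly like the following. This is a minimal sketch on plain Lua tables, not code from this PR or from s3_parquet.lua; the function name `store_headers_with_prefix` and the `"meta."` prefix are hypothetical, and it assumes default_headers arrives as a table of header values.

```lua
-- Sketch only: copy default_headers into Fields under a prefix so the
-- decoded log cannot overwrite them. The function name and the prefix
-- value are hypothetical, not part of any existing decoder.
function store_headers_with_prefix(msg, dh, prefix)
    msg.Fields = msg.Fields or {}
    for k, v in pairs(dh or {}) do
        -- Copy only scalar headers; nested tables such as dh.Fields are skipped.
        if type(v) ~= "table" then
            msg.Fields[prefix .. k] = v
        end
    end
    return msg
end

-- Example: the log's own Type stays in place, the header copy is prefixed.
local decoded = {Type = "app.log", Fields = {agent = "curl"}}
local headers = {Type = "docker", Logger = "host1"}
local merged = store_headers_with_prefix(decoded, headers, "meta.")
```

With this layout the log keeps its own Type and Logger, and the values from default_headers stay reachable under the prefixed field names.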

@relud
Member Author

relud commented Feb 20, 2017

It would be nice to be able to have access to default_headers for Type, Logger, and Timestamp, as well as the values for those that are in the log message itself, and potentially to access either or both from message_matcher.

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

function decode(data, dh)
    local msg = cjson.decode(data)
    if cfg.preserve_metadata then
        if type(msg.Fields) == "table" then
            for k,v in pairs(msg.Fields.Fields) do
Contributor

I need to see your original input. This loop implies that it does not conform to the Heka message schema, as there is no nested Fields-of-Fields hash.

Member Author

That is a typo; it is supposed to be msg.Fields.

@trink
Contributor

trink commented Feb 20, 2017

> It would be nice to be able to have access to default_headers

I don't understand what this comment means, as you have access to everything in the current message and the current sandbox environment. As far as the message matcher goes, this is an input decoder: anything that is put into the newly created message is easily accessible through the message matcher downstream, provided it is well structured (that is, not a sub-encoding such as a JSON string inside the Heka structure, and even then it is still accessible, but only at the string-match level).

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

function decode(data, dh)
    local msg = cjson.decode(data)
    if cfg.preserve_metadata then
Member Author

> I don't understand what this comment means as you have access to everything in the current message and the current sandbox environment.

dh.Type and msg.Type may have different values here.

With the previous behavior there was no way to preserve dh.Type so that it could be used in a message matcher on an output.

With the new behavior proposed here, setting cfg.preserve_metadata would allow dh.Type and msg.Type to be referenced in a message matcher as Type and Fields[Type] respectively, with the caveat that msg.Fields.agent (or similar) would have to be referenced in a message matcher as Fields[Fields.agent].
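The mapping described above can be modelled on plain Lua tables. This is a sketch under assumptions, not the code from this PR: the function name `remap_preserving_headers` is hypothetical, the input is taken to be already decoded from JSON, and it applies the msg.Fields correction from the review above.

```lua
-- Sketch only: models the layout described for cfg.preserve_metadata.
-- `remap_preserving_headers` is a hypothetical name; msg is the table
-- decoded from the log and dh is the default_headers table.
function remap_preserving_headers(msg, dh)
    local out = {Fields = {}}
    -- Values from default_headers keep the top-level header slots.
    for k, v in pairs(dh) do
        if k ~= "Fields" then out[k] = v end
    end
    -- Headers from the log itself move under Fields, e.g. Fields[Type].
    for k, v in pairs(msg) do
        if k ~= "Fields" then out.Fields[k] = v end
    end
    -- The log's own Fields get a "Fields." prefix, e.g. Fields[Fields.agent].
    if type(msg.Fields) == "table" then
        for k, v in pairs(msg.Fields) do
            out.Fields["Fields." .. k] = v
        end
    end
    return out
end

local remapped = remap_preserving_headers(
    {Type = "app.log", Fields = {agent = "curl"}},
    {Type = "docker", Logger = "host1"})
```

Under this layout, a matcher that needs both values would look roughly like `Type == 'docker' && Fields[Type] == 'app.log'`, which is the trade-off noted in the caveat: every field from the log itself gains an extra level in its name.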

@relud
Member Author

relud commented Mar 2, 2017

closing this in favor of a new decoder

@relud closed this on Mar 2, 2017
2 participants