Inconsistent handling of dots in field names by json_parser #2814

Closed
tyrken opened this issue Jun 15, 2020 · 2 comments · Fixed by #2823
Labels
domain: data model (Anything related to Vector's internal data model)
domain: logs (Anything related to Vector's log events)
have: must (We must have this feature, it is critical to the project's success. It is high priority.)
type: bug (A code related bug.)

Comments

@tyrken
Contributor

tyrken commented Jun 15, 2020

When the json_parser transform unpacks JSON whose key names contain dots, it handles those field names inconsistently; see the config, setup, and output below. I'm working around this for now, but I hope it can be fixed and isn't a designed-in ambiguity.

# Convenience
data_dir = "/tmp"

[sources.in]
  type = "file"
  include = ["./input.log"]
  start_at_beginning = true
  # Make vector read the small input file
  fingerprinting.strategy = "device_and_inode"

[transforms.t1]
  inputs = ["in"]
  type = "json_parser"
  drop_field = true
  drop_invalid = false
  field = "message"
  # Uncomment the target_field to see "correct" behaviour when inserting always under a key
  # target_field = "json"

[sinks.out]
  type = "console"
  inputs = ["t1"]
  target = "stdout"
  encoding = "json"

Run Vector v0.9.2 (on Linux, FWIW) with the input file below:

echo '{"field.with.dots": true, "sub_field": {"another.one": false}}' >input.log

... and you get the result below (pretty-printed for clarity). Note that "field.with.dots" is now a three-level hierarchy rather than a single key, while the nested "another.one" keeps its dot:

{
    "field": {
        "with": {
            "dots": true
        }
    },
    "file": "input.log",
    "host": "my-host-name",
    "source_type": "file",
    "sub_field": {
        "another.one": false
    },
    "timestamp": "2020-06-15T08:14:00.312871271Z"
}
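
One possible reading of this output is that top-level keys are inserted by path, with dots treated as nesting separators, while nested values are stored whole. The sketch below reproduces the parsed part of the output under that assumption. The `insert_by_path` helper is hypothetical and is not taken from Vector's source.

```python
import json


def insert_by_path(event, path, value):
    """Hypothetical helper: insert `value`, treating dots in `path` as nesting."""
    parts = path.split(".")
    node = event
    for part in parts[:-1]:
        # Create intermediate objects for every dot-separated segment.
        node = node.setdefault(part, {})
    node[parts[-1]] = value


line = '{"field.with.dots": true, "sub_field": {"another.one": false}}'

event = {}
# Assumption: only the top-level keys go through path-based insertion;
# the value of "sub_field" is stored as-is, so "another.one" survives.
for key, value in json.loads(line).items():
    insert_by_path(event, key, value)

print(json.dumps(event, indent=4, sort_keys=True))
```

Here "field.with.dots" is split into three levels, while "another.one" is untouched because it is never passed through the path-splitting step.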

If you uncomment target_field so that the parsed fields are never inserted at the root, the problem disappears.
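
A plausible explanation, offered only as an assumption, is that with target_field set the parsed object is stored as a single value under one dot-free key, so none of its inner keys is ever interpreted as a path. The sketch below illustrates that idea; `insert_under_target` is a hypothetical helper, not Vector's implementation.

```python
import json


def insert_under_target(event, target_field, parsed):
    """Hypothetical helper: store the whole parsed object under one key."""
    # Assumption: the parsed object is inserted as one value, so the keys
    # inside it are never split on dots.
    event[target_field] = parsed
    return event


line = '{"field.with.dots": true, "sub_field": {"another.one": false}}'
event = insert_under_target({}, "json", json.loads(line))

print(json.dumps(event, indent=4, sort_keys=True))
```

Under this assumption both "field.with.dots" and "another.one" keep their dots, which matches the behaviour reported when target_field is uncommented.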

@bruceg added the event type: log and type: bug labels Jun 15, 2020
@binarylogic added the have: must label Jun 15, 2020
@Hoverbear
Contributor

@tyrken
Contributor Author

tyrken commented Jun 18, 2020

Thanks!
