Mis-encoded UTF-8 characters in differential log results #5288

Closed
NickTitle opened this issue Nov 9, 2018 · 2 comments
Labels: bug, logging, wishlist (open feature request not currently being worked on)

Comments

@NickTitle
Contributor

Bug report

What operating system and version are you using?

 version = 10.13.6
   build = 17G65
platform = darwin

What version of osquery are you using?

3.3.0, plus 2.11.2 to validate findings

What steps did you take to reproduce the issue?

Run osquery with this config and tail the logs:

{
  "options": {
    "logger_plugin": "filesystem",
    "logger_path": "logs/",
    "utc": "true"
  },
  "schedule": {
    "emoji-test-pack": {
      "query": "SELECT '🚲', unix_time from time",
      "interval": 5
    }
  }
}

What did you expect to see?

Results containing correctly encoded Unicode characters.

What did you see instead?

Results in which the Unicode character is logged as an escaped byte sequence instead of the character itself (actual output below):

{
  "name": "emoji-test-pack",
  "hostIdentifier": "🚲",
  "calendarTime": "Fri Nov  9 20:58:00 2018 UTC",
  "unixTime": 1541797080,
  "epoch": 0,
  "counter": 0,
  "columns": {
    "'🚲'": "\\xF0\\x9F\\x9A\\xB2",
    "unix_time": "1541797080"
  },
  "action": "added"
}
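
For reference, the four escaped bytes in the output above are exactly the UTF-8 encoding of the emoji, so the data appears to be intact but escaped somewhere on the way to the log. A minimal Python sketch of that check, assuming the columns object is parsed as JSON first (the variable names are illustrative only):

```python
import json

# The "columns" object from the logged result, copied from the output above.
logged = r'''{"'🚲'": "\\xF0\\x9F\\x9A\\xB2", "unix_time": "1541797080"}'''

columns = json.loads(logged)
value = columns["'🚲'"]

# After JSON decoding, the value is a run of literal ASCII characters
# (backslash, x, two hex digits, repeated), not a single emoji.
print(len(value), value)

# Turn each \xHH escape back into one byte.
raw = bytes(int(h, 16) for h in value.split("\\x")[1:])

# The recovered bytes match the UTF-8 encoding of the bicycle emoji.
print(raw == "🚲".encode("utf-8"))
print(raw.decode("utf-8"))
```
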
@NickTitle
Contributor Author

#520 looks like it ran into something similar way back in Dec 2014

obelisk added the bug, logging, and wishlist labels Nov 13, 2018
@obelisk
Contributor

obelisk commented Nov 13, 2018

Yeah, this has been a thing for a long time, but I'm not sure how complex the fix is. It could be something osquery is doing during serialization, which would probably make the fix quite easy, or it could be happening in RocksDB (though I think this is less likely), which would be much harder to fix.

I think the real motivation for a fix here will be the failure to render Chinese/Japanese characters correctly.

directionless added a commit to kolide/launcher that referenced this issue May 3, 2019
Osquery sometimes mis-encodes utf8 data osquery/osquery#5288

This is a broad attempt to repair log files that exhibit that issue. This runs against the entire log file. Hopefully, there isn’t going to be a case where it misfires.

Fixes: #445
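
A repair pass of the kind this commit message describes could look roughly like the following. This is a sketch in Python, not the code from the referenced commit; the function name `repair_escaped_utf8` and the regular expression are assumptions made for illustration, and it operates on a string value that has already been JSON-decoded rather than on the raw log file.

```python
import re

# Matches one or more consecutive literal "\xHH" escapes
# (a backslash, the letter x, then two hex digits).
ESCAPED_BYTES = re.compile(r"(?:\\x[0-9A-Fa-f]{2})+")


def repair_escaped_utf8(text: str) -> str:
    """Replace runs of literal \\xHH escapes with the UTF-8 text they encode."""

    def decode_run(match):
        # Strip the "\x" prefixes and parse the remaining hex digits as bytes.
        raw = bytes.fromhex(match.group(0).replace("\\x", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: leave the run untouched rather than guess.
            return match.group(0)

    return ESCAPED_BYTES.sub(decode_run, text)


print(repair_escaped_utf8("\\xF0\\x9F\\x9A\\xB2"))
```

The misfire risk mentioned in the commit message applies here too: a value that legitimately contains the literal text `\x41` would be rewritten to `A`, because any run of ASCII bytes is also valid UTF-8.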
directionless added a commit to kolide/launcher that referenced this issue Jan 21, 2021
Osquery sometimes mis-encodes utf8 data osquery/osquery#5288

This is a broad attempt to repair log files that exhibit that issue. This runs against the entire log file. Hopefully, there isn’t going to be a case where it misfires.

Fixes: #445