Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling non-finite float values #9212

Closed
urso opened this issue Nov 22, 2018 · 11 comments
Closed

Handling non-finite float values #9212

urso opened this issue Nov 22, 2018 · 11 comments
Labels
discuss Issue needs further discussion. needs_team Indicates that the issue/PR needs a Team:* label Stalled

Comments

@urso
Copy link

urso commented Nov 22, 2018

Dealing with floating point values can be somewhat tricky at points. We want to encode events to JSON, plus require values to be indexible in Elasticsearch. This is causing problems when a floating point value is NaN, -Inf, Inf for example. These special values are not included in the JSON standard and also not supported by Elasticsearch.

Normally the JSON encoding will fail and the event is dropped in the output. An error message is logged in this case.

In general modules are discouraged to publish events with NaN or other special values, but sometimes it can't be helped. When normalizing/checking the event we could alternatively remove fields from events or add placeholder values.

Special values are:

  • +Inf
  • -Inf
  • qNaN (quiet Not a Number): normally used for error propagation
  • sNaN (signalling NaN): can be anything... e.g. not available indicator

Fun fact: NaN values can even contain a payload.

@urso urso added the discuss Issue needs further discussion. label Nov 22, 2018
@ruflin
Copy link
Member

ruflin commented Nov 23, 2018

Agree that in general such values should be handled on event creation but as you mentioned, not always possible. Would be great to find some way to remove these values before we try to do the encoding and also log a better message which field it is.

@jsoriano
Copy link
Member

I think that generating these values should be considered a bug (at least from go-based modules).

I am ok with handling them some way so we don't drop whole events while it is handled on event creation, but for me it'd be even more important to log more information when this happens.
As talked offline, it'd be good to log the field name when an unsupported value is found. If not possible other option could be to log the whole event at the debug level. It is sometimes difficult to catch these bugs if you don't know what generates the wrong values.

@urso
Copy link
Author

urso commented Nov 23, 2018

We can't remove values when encoding. We would have to remove fields when normalizing the events. This is done before running processors.

My only concern is with us silently removing fields + conditionals. If a condition access a field that is not present, it will fail (not sure if it just returns false). But I guess it's no big issue.

We could also add a post-processing step iterating the event fields once again and remove fields after they've been processed, but before they are put into the queue. Yet, I'm not happy adding more work here.

@ruflin
Copy link
Member

ruflin commented Nov 27, 2018

I think a good starting point we can probably all agree on is that we at least log it on a debug level if we drop an event which should make investigating issues easier.

@urso
Copy link
Author

urso commented Nov 27, 2018

Events are not dropped silently. The outputs normally log the encoding errors on error level.

@jsoriano
Copy link
Member

I think that what @ruflin meant is to log the event (or some information about it) on the debug level if it is dropped, apart of current error.

@remmeier
Copy link

any update on this? an improvement to the error message "2019-05-16 10:22:32.651 CEST
2019-05-16T08:22:32.650Z ERROR elasticsearch/client.go:393 Failed to encode event: unsupported float value: NaN" would be great to see.

@kaiyan-sheng
Copy link
Contributor

I have a fix for this in progress #12084 which will ignore prometheus metrics with NaN/Inf values.

@kaiyan-sheng
Copy link
Contributor

One more fix #13790 for this problem.

@botelastic
Copy link

botelastic bot commented Aug 24, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added Stalled needs_team Indicates that the issue/PR needs a Team:* label labels Aug 24, 2020
@botelastic
Copy link

botelastic bot commented Aug 24, 2020

This issue doesn't have a Team:<team> label.

@botelastic botelastic bot closed this as completed Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs further discussion. needs_team Indicates that the issue/PR needs a Team:* label Stalled
Projects
None yet
Development

No branches or pull requests

5 participants