float type losing precision when doing terms aggregation #30529
Comments
Pinging @elastic/es-search-aggs
Your number does lose precision, but not the way you think. This is due to how floating-point numbers work: `0.62` cannot be represented exactly as a float, so the closest float value is indexed instead. If you print your float, then it will seem to work because the system prints the shortest string whose value is precise enough to distinguish it from adjacent float values. So you only happen to see more decimals in the aggregation response, where the float has been widened to a double. Because floats and doubles cannot accurately represent such a value, it is generally a bad idea to run terms aggregations on them.
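The shortest-string printing behavior described above can be seen directly in plain Java (a minimal sketch using the `0.62` value from this issue):

```java
public class FloatReprDemo {
    public static void main(String[] args) {
        float f = 0.62f; // the value from this issue

        // Float.toString prints the shortest decimal string that uniquely
        // identifies this float among adjacent float values, so the
        // precision loss is invisible here:
        System.out.println(Float.toString(f));  // 0.62

        // Widening to a double (as the terms aggregation does) exposes
        // the float's actual binary value:
        System.out.println((double) f);         // 0.6200000047683716
    }
}
```

The second printed value matches the bucket key reported in this issue: the aggregation is not corrupting anything, it is just rendering the float's true value at double precision.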
@jpountz I know this ticket is closed, but I wanted to add a bit of additional information that might help people who run across this ticket in the future better understand what's going on.

1. Index settings and insert two documents
2. Execute search
3. Execute search with stored fields
This is the confusing part. If you use stored fields, the hits and the bucket keys are not consistent with each other. My personal opinion is that either (a) the stored fields response should return the same widened double value that the aggregation reports, or (b) the aggregation keys should be rendered the same way the stored float is.

An easy way to achieve the former behavior is to modify the stored fields visitor so that its `floatField` callback widens the value to a double:

```java
@Override
public void floatField(FieldInfo fieldInfo, float value) throws IOException {
    addValue(fieldInfo.name, (double) value);
}
```

An easy (but potentially inefficient) way to achieve the latter behavior is to change the conversion of stored float values to look like this:

```java
static final class SingleFloatValues extends NumericDoubleValues {
    ...
    @Override
    public double doubleValue() throws IOException {
        String floatValue = Float.toString(
            NumericUtils.sortableIntToFloat((int) in.longValue())
        );
        return Double.parseDouble(floatValue);
    }
    ...
}
```

With either of these options the search results and the aggregation results will be internally consistent. I'm inclined toward the former because it accurately represents the actual value stored in the field when you do the query.
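The two conversions discussed above can be compared in a small standalone sketch (plain Java, no Elasticsearch or Lucene classes):

```java
public class FloatConversionDemo {
    public static void main(String[] args) {
        float stored = 0.62f;

        // Former behavior: widen the raw float to a double.
        // The stored-field value then matches today's aggregation key.
        double widened = (double) stored;
        System.out.println(widened);    // 0.6200000047683716

        // Latter behavior: round-trip through the float's shortest
        // decimal string. The aggregation key then matches what the
        // user originally indexed.
        double viaString = Double.parseDouble(Float.toString(stored));
        System.out.println(viaString);  // 0.62
    }
}
```

Either conversion makes hits and bucket keys agree; they just disagree about which rendering of the float is the "true" one.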
Elasticsearch Version
5.3.2
Issue description
Step 1: I have a field named "float_numbers" with the field type "float".
Step 2: I inserted the value 0.62.
Step 3: When I run a terms aggregation on this field, it loses precision, as you can see below.
original value: 0.62
key in buckets: 0.6200000047683716
Clue
I've found some info in this Discuss thread.
It seems that all aggs will convert the values to a double before operating on them.
So I can set the field mapping to "double" to deal with this issue, but I don't think it's an effective solution, since the double type costs twice as much storage as float.
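A quick sketch of the trade-off: a double mapping avoids the symptom because the indexed string is parsed straight into a double, whose shortest representation round-trips cleanly, at the cost of wider doc values.

```java
public class DoubleMappingDemo {
    public static void main(String[] args) {
        // The nearest double to 0.62 prints back as 0.62, so a double
        // mapping keeps the aggregation key looking like the input:
        double d = Double.parseDouble("0.62");
        System.out.println(d);             // 0.62

        // The storage cost mentioned above: each value takes twice the
        // raw width of a float.
        System.out.println(Float.BYTES);   // 4
        System.out.println(Double.BYTES);  // 8
    }
}
```

Note that 0.62 is not exactly representable as a double either; the double is just precise enough that its shortest printed form matches the input.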
Expectation
I'm looking forward to any other solutions or some related info when elasticsearch updating.
Test Case
Here are my test cases; copying and pasting them into Kibana will work.
1. index settings and mapping
2. insert values
3. terms aggregation
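The original console snippets did not survive, but requests along these lines (a sketch, assuming an index named `test_index` with an ES 5.x single-type mapping and the default type name `doc`) reproduce the report:

```
PUT test_index
{
  "mappings": {
    "doc": {
      "properties": {
        "float_numbers": { "type": "float" }
      }
    }
  }
}

POST test_index/doc
{ "float_numbers": 0.62 }

GET test_index/_search
{
  "size": 0,
  "aggs": {
    "floats": { "terms": { "field": "float_numbers" } }
  }
}
```

The terms aggregation in the last request should return a bucket with key `0.6200000047683716` rather than `0.62`.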
Thanks for any suggestions