scaled_float still has precision problem #32570
Comments
Pinging @elastic/es-search-aggs |
I'm not sure this is a bug actually. The scaling factor is just too small as far as I understand the definition of the scaling factor. @jpountz probably knows this better than me though :) so ping :) |
Hmm -- seems like there's definitely some confusion then. From the Elasticsearch documentation:
In terms of "round tripping," my understanding according to the documentation referenced above is that internally, when a document with the value |
At minimum, we need to document more clearly how this works and what its limitations are. The documentation currently reads:
For an input value of 79.99 and a scaling factor of 100, I think it's reasonable for a user to expect 79.99 * 100 = 7999. I don't think the user reasonably expects that 79.99 * 100 = 7998.9999... unless they're familiar with the internals. The docs either need to imply round-down logic, or the feature needs to give you the closest value. Either way, we need to define how a user should understand it to work. |
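To make the arithmetic concrete, here is a minimal Java sketch of what plain double multiplication does with 79.99 and a scaling factor of 100. The class and method names are made up for illustration and are not taken from the Elasticsearch code base; the only point is how rounding down and rounding to nearest differ on this input.

```java
final class NaiveScalingSketch {
    // Hypothetical helper: scale with plain double arithmetic, then round down.
    static long scaleAndFloor(double value, double scalingFactor) {
        return (long) Math.floor(value * scalingFactor);
    }

    // Hypothetical helper: scale with plain double arithmetic, then round to nearest.
    static long scaleAndRound(double value, double scalingFactor) {
        return Math.round(value * scalingFactor);
    }

    public static void main(String[] args) {
        // 79.99 has no exact binary representation, so the product
        // lands slightly below 7999 instead of exactly on it.
        System.out.println(79.99 * 100);
        System.out.println(scaleAndFloor(79.99, 100)); // rounds down to 7998
        System.out.println(scaleAndRound(79.99, 100)); // rounds to nearest, 7999
    }
}
```

Rounding to nearest recovers 7999 for this input, although it would still misplace values whose product falls close to a half point.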
We discussed this in FixItFriday and the outcome is that we feel this is a bug as users do not know (and should not reasonably know) the ins and outs of binary floating point representations. The problem here seems to be centred around us doing the wrong thing to resolve the value of the |
This approach would still have similar problems, they would just occur on the half point between representable values rather than on representable values themselves? Alternatively maybe we could use big decimals to delay rounding to after the multiplication (which is where the fact that doubles do not represent numbers accurately gets amplified) by replacing |
I don't think using
returns I think this is a principal problem with
=> we can't really fix the feature to get the desired behaviour here as far as I can see. => We should probably fix the docs? |
@original-brownbear Try |
@jpountz huh you're right! That works :)
|
I think I suggested the
I think this should not yield all 3 docs, but it does. |
Yeah I was going to comment with an example like that after reading that "users do not know (and should not reasonably know) the ins and outs of binary floating point representations". I think this is a good example why we can't (reasonably) hide this problem entirely from our users. I'm not too worried about this issue since users will only face it if they represent numbers on client-side with something that is more accurate than doubles, which is uncommon. Also for the record, |
Even though this won't solve all issues, we agreed to implement the above proposal (#32570 (comment)), which will remove some surprises. We should also document that scaled floats are prone to floating-point accuracy issues, since our docs use scaling factors of 100 or 1000, which might suggest that this is actually a decimal type. |
If this is the final solution here, I might recommend that you change the example referenced in the documentation that specifically refers to pricing (which is where we used it). While I do understand the intricacies of floating point math, my reading of the documentation left me with the sense that I didn't have to worry about that because the work was being done behind the scenes by elastic. Since that's not really the case, I don't think using this feature for pricing is appropriate. Instead one should just store the smallest non-fractional amount (like cents, etc.) and deal with the rounding outside. |
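A minimal sketch of the "store the smallest non-fractional amount" idea, assuming prices arrive on the client as decimal strings and that the field would then be mapped as an integer type such as `long`. The class and method names are hypothetical; this is one possible way to handle it on the client side, not something the thread prescribes.

```java
import java.math.BigDecimal;

final class MinorUnitsSketch {
    // Hypothetical helper: parse a decimal price string into whole cents.
    // longValueExact() throws ArithmeticException if any sub-cent
    // fraction remains, so rounding has to be decided by the caller.
    static long toCents(String price) {
        return new BigDecimal(price).movePointRight(2).longValueExact();
    }

    // Hypothetical helper: format whole cents back into a decimal string.
    static String fromCents(long cents) {
        return BigDecimal.valueOf(cents, 2).toPlainString();
    }

    public static void main(String[] args) {
        long cents = toCents("79.99");
        System.out.println(cents);            // 7999
        System.out.println(fromCents(cents)); // 79.99
    }
}
```

Because the value never passes through a double, a range bound of 7999 cents compares exactly against the stored 7999.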
👍 We should show an example that uses a percentage or something like that instead. |
Is there a place where I can see when this gets slotted in for development? |
@sergiosalvatore When there is a PR for this feature, you will be able to see the link to it in this issue |
* Use `toString` and `BigDecimal` parsing to get intuitive behaviour for `scaled_float` as discussed in elastic#32570
* Closes elastic#32570
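The idea of going through `toString` and `BigDecimal` parsing can be sketched roughly as follows. This is not the code from the actual change: the class and method names are invented, and rounding down after the multiplication is an assumption chosen to mirror how an upper range bound would be treated. The second method is included only for contrast, since `new BigDecimal(double)` keeps the exact binary expansion of the double and therefore does not help.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalScalingSketch {
    // Hypothetical helper: parse the shortest decimal string of each double,
    // multiply in decimal, and only then round down.
    // Note: NaN and infinite inputs would make the BigDecimal constructor throw.
    static long scaleViaToString(double value, double scalingFactor) {
        BigDecimal v = new BigDecimal(Double.toString(value));
        BigDecimal f = new BigDecimal(Double.toString(scalingFactor));
        return v.multiply(f).setScale(0, RoundingMode.FLOOR).longValueExact();
    }

    // For contrast: the double constructor preserves the binary error,
    // so the product still ends up just below the intended integer.
    static long scaleViaExactExpansion(double value, double scalingFactor) {
        BigDecimal v = new BigDecimal(value);
        BigDecimal f = new BigDecimal(scalingFactor);
        return v.multiply(f).setScale(0, RoundingMode.FLOOR).longValueExact();
    }

    public static void main(String[] args) {
        System.out.println(scaleViaToString(79.99, 100));       // 7999
        System.out.println(scaleViaExactExpansion(79.99, 100)); // 7998
    }
}
```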
Reproduced on Elasticsearch version 6.3.2 in Elastic Cloud
Looks like this was addressed in [https://github.com//pull/27207] but the problem still exists.
Steps to reproduce:
PUT testindex
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "type1": {
      "properties": {
        "field1": {
          "type": "scaled_float",
          "scaling_factor": 100
        }
      }
    }
  }
}

PUT /testindex/type1/1?pretty
{
  "field1": 79.99
}

GET testindex/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "field1": {
              "gte": 0,
              "lte": 79.99
            }
          }
        }
      ]
    }
  }
}
The query returns 0 hits. It looks like the query is being resolved to `field1:[0 TO 7998]`, which is incorrect: I would expect it to return the document.
Looking at the code, could we cast `dValue` to a float?
hi = Math.round(Math.floor((float)dValue));
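As a rough check of the float-cast suggestion, the sketch below wraps the line above in a hypothetical standalone helper (the class and method names are not from the Elasticsearch source) and compares it with the same expression without the cast. A float carries only a 24-bit significand, so the cast happens to repair the 79.99 case but can push larger values up to the next representable float.

```java
final class FloatCastSketch {
    // The proposed line, as a hypothetical standalone helper.
    static long floorViaFloat(double dValue) {
        return Math.round(Math.floor((float) dValue));
    }

    // The same expression without the cast, for comparison.
    static long floorViaDouble(double dValue) {
        return Math.round(Math.floor(dValue));
    }

    public static void main(String[] args) {
        double scaled = 79.99 * 100;               // slightly below 7999
        System.out.println(floorViaDouble(scaled)); // 7998
        System.out.println(floorViaFloat(scaled));  // 7999, the cast rounds up to 7999.0f

        double large = 16777217.9;                  // above 2^24, where floats step by 2
        System.out.println(floorViaDouble(large));  // 16777217
        System.out.println(floorViaFloat(large));   // 16777218, one too high
    }
}
```

So the cast trades one surprise for another: bounds above 2^24 after scaling could be rounded to a value larger than the one the user supplied.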