-
Notifications
You must be signed in to change notification settings - Fork 765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix rounding problems #321
Conversation
Use BigDecimal in some cases in DataFormatter.formatCellValue, to avoid rounding problems.
@colinwjd could you provide test cases or examples of this issue in practice? Excel is a bit of a mine field when it comes to when to round and when not to round. We have no access to the Excel code or the mindsets of the Microsoft developers. We can only build code based on observation and interpretation of publicly available specs. |
@@ -950,7 +950,7 @@ private String getFormattedNumberString(Cell cell, ConditionalFormattingEvaluato | |||
if (numberFormat == null) { | |||
return String.valueOf(d); | |||
} | |||
String formatted = numberFormat.format(d); | |||
String formatted = numberFormat.format(new BigDecimal(String.valueOf(d))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting a double to a string and then to a bigdecimal and then to a formatted string - this seems dangerous to me - each conversion introducing more risk of rounding issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As shown above, the value 2.05 is correctly rounded to 2.1 in excel, but in the parsed result the value is 2.0
The reason is that the Double type has a loss of precision, so we need to use the BigDecimal type.
We need to use the String type to initialize the BigDecimal object, otherwise there will also be precision problems.
It is described in the Java api documentation as follows
This is generally the preferred way to convert a float or double into a BigDecimal, as it doesn't suffer from the unpredictability of the BigDecimal(double) constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you provide the xlsx file? the fact that you are using a formula evaluator suggests that the cells may not have numbers but have formulas instead - this could affect the result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
05e8a16 - this test case I just added seems to behave as expected without the change you are suggesting
POI is 20 year old code and if DataFormatter has a serious bug like the one you suggest it has then I'm surprised noone else has seen the problem (including me).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the code in getFormattedNumberString does not use BigDecimal at the moment, so your API note is off topic - prove there is an issue and we will fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a terrible workaround and not a fix. 2.05f is less than 2.05, so 2.0 would be the correct result when converting to a BigDecimal. The formatter should be fixed instead. Just check in JShell:
axel@xiaolong ~ % jshell
| Welcome to JShell -- Version 18
| For an introduction type: /help intro
jshell> 2.05
$1 ==> 2.05
jshell> new BigDecimal(2.05)
$2 ==> 2.04999999999999982236431605997495353221893310546875
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xzel23 this is not an argument about 2.05f and how it rounds in Java - it is about having Apache POI match the behaviour of Microsoft Excel. I'm happy to have the solution changed as long the test cases still pass - including the new test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I can confirm that Excel displays the value as 2.1, this is obviously done in the Excel formatter as the value in the XML is stored as 2.0499999999999998:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac xr xr2 xr3" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision" xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2" xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3" xr:uid="{7487740C-7451-9943-A628-9BEEC0E38BD2}">
<dimension ref="A1:A6" />
<sheetViews>
<sheetView tabSelected="1" workbookViewId="0">
<selection activeCell="A6" sqref="A6" />
</sheetView>
</sheetViews>
<sheetFormatPr baseColWidth="10" defaultRowHeight="16" x14ac:dyDescent="0.2" />
<sheetData>
<row r="1" spans="1:1" x14ac:dyDescent="0.2">
<c r="A1">
<v>2.04</v>
</c>
</row>
<row r="2" spans="1:1" x14ac:dyDescent="0.2">
<c r="A2" s="1">
<v>2.04</v>
</c>
</row>
<row r="3" spans="1:1" x14ac:dyDescent="0.2">
<c r="A3">
<v>2.0499999999999998</v>
</c>
</row>
<row r="4" spans="1:1" x14ac:dyDescent="0.2">
<c r="A4" s="1">
<v>2.0499999999999998</v>
</c>
</row>
<row r="5" spans="1:1" x14ac:dyDescent="0.2">
<c r="A5">
<v>2.06</v>
</c>
</row>
<row r="6" spans="1:1" x14ac:dyDescent="0.2">
<c r="A6" s="1">
<v>2.06</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3" />
<pageSetup paperSize="9" orientation="portrait" horizontalDpi="0" verticalDpi="0" />
</worksheet>
What should be done is check why the BigDecimal formatter displays this wrong (or in other words: what kind of workaround is implemented there). It looks like they map everything within one ULP to the rounded value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String formatted = numberFormat.format(new BigDecimal(NumberToTextConverter.toText(d)));
This was the first attempt I made - it fixes this github321 case but breaks some existing tests.
That NumberToTextConverter seems to have some logic that mimics how Excel takes the 2.0499999999999998 and converts it to 2.05 . If we can find out why the other tests break that this solution might be better.
Double.toString also seems to correct for some of these issues to because it too converts 2.0499999999999998 to 2.05.
In the end the madness is why do Microsoft store 2.05 as 2.0499999999999998 in their sheet XML in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I had time... Let's just keep it in like this until there is time to look into the formatter.
BTW just last week I found some hand crafted number formatting code in an arbitrary precision library I once wrote and while I don't remember why I would have implemented such a conversion myself I bet it must have been an issue like this one.
Thanks - I've applied your change based on the reproducible test case. |
I'm still tweaking this code because this change is breaking other tests. The Microsoft developers have come up with such terrible Excel code that it is gut-wrenching trying to hack something that matches their code. |
@colinwjd it could be a while until the next POI release. You can build your own version of POI, of course. One workaround that you could try is using the ROUND function, eg ROUND(2.05, 1). The current POI 5.2.2 code seems to correctly round this to 2.1. |
Use BigDecimal in some cases in DataFormatter.formatCellValue, to avoid rounding problems.