Skip to content
This repository has been archived by the owner on Mar 9, 2020. It is now read-only.

Worksheet Calculation Issue #379

Open
derouyag opened this issue Dec 22, 2018 · 12 comments
Open

Worksheet Calculation Issue #379

derouyag opened this issue Dec 22, 2018 · 12 comments
Assignees

Comments

@derouyag
Copy link

I will try to be thorough but brief. When calculating a formula within a cell in memory, the results returned are in a floating point value.


Cells C6:C10 on 'some worksheet' are formatted as 'Number(2)'
Cells D2:D10 on 'some other worksheet' are formatted as 'Number(3)'

e.g. =if(sum('some worksheet'!C6:C10)=sum('some other worksheet'!D2:D10), "Okay", "Error")

sum('some worksheet'!C6:C10) = floating point number = fp1
sum('some other worksheet'!D2:D10) = floating point number = fp2

fp1 <> fp2

If the cell is formatted as 'General' then Epplus uses a floating point type or double to determine the two sums. Because it is floating point, the result will always be "Error" (Think about this in code. worksheet.calculate(), what is the result of the cell? if (cell.value = "Error")... Don't think about Excel.)

To the best of my knowledge, if I format the e.g. formula cell to Number(3) and have "set precision as displayed' ON, then the formula will show "Okay" or "Error"

If you do the same thing in an Excel spreadsheet (Excel not Epplus) the result will correctly display "Okay" or "Error" because it probably uses a Decimal type instead. Or maybe it knows the precision used in both SUMs.

@swmal
Copy link
Collaborator

swmal commented Dec 28, 2018

All numbers are doubles in the context of EPPlus formula calculation, but we use double.Epsilon to avoid floating point issues when comparing doubles internally. How the cell is formatted does not affect how the numeric values are stored.

I assume that you are saying that two different ranges of integer values that sums up to the same integer value still are evaluated as different results?

Will try to recreate your problem with the formula you have supplied.

@swmal
Copy link
Collaborator

swmal commented Dec 28, 2018

The following test is passing when I try, maybe I have misunderstood. Can you change it so it fails according to your description?

        [TestMethod]
        public void Issue379()
        {
            using (var pck = new ExcelPackage())
            {
                var sheet1 = pck.Workbook.Worksheets.Add("Sheet1");
                sheet1.Cells["A1"].Value = 1;
                sheet1.Cells["A2"].Value = 2;
                sheet1.Cells["A3"].Value = 3;
                var sheet2 = pck.Workbook.Worksheets.Add("Sheet2");
                sheet2.Cells["A1"].Value = 1;
                sheet2.Cells["A2"].Value = 2;
                sheet2.Cells["A3"].Value = 3;
                var sheet3 = pck.Workbook.Worksheets.Add("Sheet3");
                sheet3.Cells["A1"].Formula = "IF(SUM(Sheet1!A1:A3) = SUM(Sheet2!A1:A3), \"Equals\", \"Not Equals\")";
                sheet3.Calculate();

                Assert.AreEqual("Equals", sheet3.Cells["A1"].Value);
            }
        }

@swmal swmal self-assigned this Dec 28, 2018
@derouyag
Copy link
Author

derouyag commented Dec 28, 2018 via email

@swmal
Copy link
Collaborator

swmal commented Dec 28, 2018

Ok, thanks. Just for clarity - EPPlus does not change any values in the cells or the way they are calculated due to the parameters you are mentioning, they are just forwarded to the spreadsheet program (i.e. Excel) to use. EPPlus just stores the numeric values as they are either set via its interface or read from the open office xml. So these parameters will not change the result of the calculation.

Will retry the test with your edited version of the test, but the logic in EPPlus is quite simple. If the difference between two decimal numbers is less than double.Epsilon they will be treated as equal, otherwise not.

@derouyag
Copy link
Author

derouyag commented Jan 2, 2019 via email

@derouyag
Copy link
Author

derouyag commented Jan 3, 2019 via email

@swmal
Copy link
Collaborator

swmal commented Jan 3, 2019

Hi Gordon!

The last test you provided was useful and the problem lies in that we cast all numbers to double in EPPlus. I could easily make your test pass by casting doubles to decimals in the SUM function.

We need some time to discuss how to solve this permanently in a generic way.

If you are in a hurry to have this resolved you can clone EPPlus and change Sum.cs in the namespace OfficeOpenXml.FormulaParsing.Excel.Function to the code below.

All the best!

Mats

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using OfficeOpenXml.FormulaParsing.ExpressionGraph;
using OfficeOpenXml.Utils;
using OfficeOpenXml.FormulaParsing.Exceptions;

namespace OfficeOpenXml.FormulaParsing.Excel.Functions.Math
{
    public class Sum : HiddenValuesHandlingFunction
    {
        public override CompileResult Execute(IEnumerable<FunctionArgument> arguments, ParsingContext context)
        {
            var retVal = 0m;
            if (arguments != null)
            {
                foreach (var arg in arguments)
                {
                    retVal += Calculate(arg, context);                    
                }
            }
            return CreateResult(Convert.ToDouble(retVal), DataType.Decimal);
        }

        
        private decimal Calculate(FunctionArgument arg, ParsingContext context)
        {
            var retVal = 0m;
            if (ShouldIgnore(arg))
            {
                return retVal;
            }
            if (arg.Value is IEnumerable<FunctionArgument>)
            {
                foreach (var item in (IEnumerable<FunctionArgument>)arg.Value)
                {
                    retVal += Calculate(item, context);
                }
            }
            else if (arg.Value is ExcelDataProvider.IRangeInfo)
            {
                foreach (var c in (ExcelDataProvider.IRangeInfo)arg.Value)
                {
                    if (ShouldIgnore(c, context) == false)
                    {
                        CheckForAndHandleExcelError(c);
                        retVal += Convert.ToDecimal(c.ValueDouble);
                    }
                }
            }
            else
            {
                CheckForAndHandleExcelError(arg);
                retVal += Convert.ToDecimal(ConvertUtil.GetValueDouble(arg.Value, true));
            }
            return retVal;
        }
    }
}

@derouyag
Copy link
Author

derouyag commented Jan 4, 2019 via email

@swmal
Copy link
Collaborator

swmal commented Jan 4, 2019

We haven’t looked much into the floating point challenge yet. We are handling it on operators, but haven’t had much issues on calculations so far. From the besinning we did the formula engine mostly for fun and it has not been clear to us how much it is used.

Most likely an interesting challenge, but as you write we need to look at how others (including Excel!) have solved it and find a generic solution for all functions.

@kjkrum
Copy link

kjkrum commented May 31, 2019

I was shocked and dismayed to find that my ExcelRange.Value contained a double, and found this issue while looking for a way to fix that. How feasible would it be to make the library read all numeric values as decimal?

@swmal
Copy link
Collaborator

swmal commented Feb 8, 2020

@kjkrum That is how EPPlus is built from the beginning - i.e. all numeric values are doubles all the way down to the cell store. Excel itself stores numeric values in the IEEE 754 binary 64-bit floating point format and the .NET double type is same size and standard.

We could of course change double to decimal, but that would consume twice the memory for storing them.

@kjkrum
Copy link

kjkrum commented Feb 11, 2020

@swmal Interesting. Apparently Excel uses some tricks to hide some (but not all) floating point errors, and its calculations are almost-but-not-quite IEEE 754. Accountants know and just live with it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants