Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inspection for Range.Value2 over Range.Value #3320

Open
IvenBach opened this issue Aug 30, 2017 · 17 comments
Open

Inspection for Range.Value2 over Range.Value #3320

IvenBach opened this issue Aug 30, 2017 · 17 comments
Labels
difficulty-02-ducky Resolving these involves the internal API, but with relatively easy problems to solve. enhancement Feature requests, or enhancements to existing features. Ideas. Anything within the project's scope. feature-inspection-quickfixes feature-inspections library-specific up-for-grabs Use this label in conjunction with a difficulty level label, e.g. difficulty-02-ducky

Comments

@IvenBach
Copy link
Member

There are be some instances where the rounding of .Value could/should be avoided with .Value2. Various sources have inclined me to believe .Value2 is sightly faster but I'm not overly worried about micro-optimizations.

@daFreeMan
Copy link
Contributor

daFreeMan commented Aug 30, 2017

Would that be a blanket "You should consider using .Value2 instead of .Value because {reasons}" with appropriate quick fixes, or are there certain conditions you're thinking about?

@retailcoder retailcoder added difficulty-02-ducky Resolving these involves the internal API, but with relatively easy problems to solve. feature-inspection-quickfixes feature-inspections library-specific up-for-grabs Use this label in conjunction with a difficulty level label, e.g. difficulty-02-ducky labels Aug 30, 2017
@retailcoder
Copy link
Member

retailcoder commented Aug 30, 2017

https://stackoverflow.com/a/17363466/1188513

From Charles William's blog:

  • .Value can seriously damage your numbers
  • .Value2 is faster than .Value with numbers (no significant difference with text)
  • .Value2 using a variant array is much the fastest way to go

I'm failing to see how not using an actual Date for a Date is an advantage in any way, so I'd go with:

  • For a Date value, use .Value; .Value2 gives you the date coerced into a Double.
  • For anything else, prefer .Value2.

So IMO the inspection should only fire when the assignment target isn't declared as a Date.

@FastExcel
Copy link

FastExcel commented Aug 31, 2017

The problem is that Excel does not have a true Date/Time or Currency data-type - excel dates and times are just doubles that depend on whatever format has been applied or changed by the user to appear as dates, times or currency or just a number. So Value is coercing an Excel double to a VBA date but Value2 is not doing any coercing ... For dates coercing the double to a date is probably not doing any damage as long as the code understands that its dependent on a changeable format: pros and cons either way - what we really need is more native Excel data types to avoid this problem. But given the need to grab a large range which could contain both dates, currency and numbers into a variant array I personally always prefer Value2 to avoid the possibility of losing precision on currency, and the performance gain is useful.
Also since Worksheet functions cannot handle VBA date or Currency Datatypes IMHO its unwise to implicitly create then using .Value.

@Vogel612 Vogel612 added the enhancement Feature requests, or enhancements to existing features. Ideas. Anything within the project's scope. label Nov 25, 2017
@retailcoder retailcoder added the hacktoberfest Tags issues for Hacktoberfest 2019 label Sep 26, 2018
@comintern
Copy link
Contributor

I'm dubious as to the rationale in regard to Currency and Date data types. TMO, the fractional portion of a Double exceeds the required precision for a Date type by several orders of magnitude. The coercion that Excel does is almost certainly just a call to VarDateFromR8, which is the same "cast" that VBA would make (and I highly suspect that it just changes the VARENUM from VT_R8 to VT_DATE and doesn't even touch the value. Same thing with Currency. If Excel doesn't use the currency type natively, I'd be shocked if either coercion did anything other than simply call VarCyFromR8. So... does it really matter in terms of precision loss if Excel is calling the oleaut function or VBA is calling it?

I'd want to see some reproducible examples of values that are getting mangled in translation before writing a recommendation based on differences in the cast values retrieved. I'm not seeing it:

Sub test()
    [A1].NumberFormat = "General"
    [A1].Value = 123.456789
    [A1].NumberFormat = "$#,##0.00"
    Debug.Print [A1].Value = [A1].Value2
End Sub

If it's just a matter of a small performance gain, I'm not sure this is something that RD should be concerning itself with. For example, if the code is using .Value once, is it really worthwhile to propose changing it to .Value2? If it's in a tight loop, are we supposed to guess a threshold at which a micro-optimization would be worth recommending?

Note also, that the two statements "Value can mangle your numbers" and "changing Value to Value2 will not alter the functioning of existing code" are mutually exclusive.

@FastExcel
Copy link

FastExcel commented Oct 25, 2018 via email

@comintern
Copy link
Contributor

comintern commented Oct 28, 2018

Still don't see it. Excel isn't "mangling" anything. It's casting it, which is the expected behavior based on its documentation. VBA casts it in exactly the same way...

I'd be shocked if either coercion did anything other than simply call VarCyFromR8. So... does it really matter in terms of precision loss if Excel is calling the oleaut function or VBA is calling it?

The result is exactly the same thing:

Sub test()
    [a1].NumberFormat = "General"
    [a1].Value = 123.456789
    [a1].NumberFormat = "$#,##0.00"
    
    Dim varValue As Variant, varValue2 As Variant
    varValue = [a1].Value
    varValue2 = [a1].Value2
    
    Debug.Print TypeName(varValue)          'Currency
    Debug.Print TypeName(varValue2)         'Double
    Debug.Print varValue = CCur(varValue2)  'True
End Sub

This isn't "seriously damaging your numbers". It's just as valid to say that "if a cell is formatted as Currency, retrieving its value as a Double will yield an incorrect result". This also reinforces my reluctance to make a blanket recommendation about the use of .Value over .Value2. If anything, it simply demonstrates that the .NumberFormat is part of the data, and the differences between the two are fairly well documented. This isn't a case of "" v. vbNullString - the two are not equivalent means of retrieving values from a worksheet at all, so IMO Rubberduck has no real business suggesting a change that has the potential to break existing code.

Note that none of this speaks to the performance claim, which is that VBA can cast a Double to Currency appreciably faster than Excel can. Even if that were true, suggesting changes to code while remaining fundamentally ignorant of the functional impact of those changes strikes me as irresponsible.

I'm recommending [status-declined] for this.

@FastExcel
Copy link

FastExcel commented Oct 29, 2018 via email

@comintern
Copy link
Contributor

First, to be clear, I do not consider all of the statements to be true. In fact, the example code shows at very least the statement "Excel does not have a native currency data type so cannot cast values to currency" to be demonstratively false. That said...

A) Which method should VBA programmers generally use to retrieve values from Excel? The choices are default method, explicit .Value or explicit .Value2

They should use .Value or .Value2 as the situation warrants.

B) Should Rubberduck (optionally) warn VBA programmers if they use the default or explicit .Value method?

Rubberduck should warn about using default member access. It already does that. Rubberduck should not issue a warning about using .Value that may mislead programmers into breaking existing code.

@bclothier
Copy link
Contributor

bclothier commented Oct 29, 2018

@FastExcel

I agree (for little what I know about Excel) except for this point:
The .Value2 method does not do this silent casting

We need to have a MCVE that demonstrates this. So I figure we'd use the famous "0.1 + 0.1 + 0.1 = 0.3". With Excel's equality check, they are obviously doing more than just a straight equality. See the screenshot:

image

In VBA (recall that by default fractional numbers are implicitly double):

?(0.1 + 0.1 + 0.1 ) = 0.3
False

Let's try and read the values directly and see what happens:

?Sheet1.Range("C1").Value + Sheet1.Range("C2").Value + Sheet1.Range("C3").Value = Sheet1.Range("D4").Value
False
?SHeet1.Range("C1").Value2 + Sheet1.Range("C2").Value2 + Sheet1.Range("C3").Value2 = Sheet1.Range("D4").Value2
False

Because VBA's equality is strict, even a single bit difference in the value is enough to upset the apple cart. Even though we were reading directly from the sheet's cells without any casts, Excel and VBA clearly don't agree on the results. That's not due to casting. That's due to difference in the implementation of equality check (and possibly the SUM vs 3 +s, too).

So, let's format them as currency:

image

No changes in the results from Excel's side. I will just read the values as-is, without assignment to any currency variable:

?Sheet1.Range("C1").Value + Sheet1.Range("C2").Value + Sheet1.Range("C3").Value = Sheet1.Range("D4").Value
True
?Sheet1.Range("C1").Value2 + Sheet1.Range("C2").Value2 + Sheet1.Range("C3").Value2 = Sheet1.Range("D4").Value2
False

.Value yields the same result as Excel does whereas Value2 still disagrees.

I've always attributed this to the fact that Excel uses some kind of less-than-strict equality or rounding. If it implemented the IEEE specifications on how to handle floating arithmetic, there'd be lot of angry customers because their financial reports would be off by several digits and to every 5th grade kids, it's obviously 0.1 + 0.1 + 0.1 = 0.3. Note that this also implies being careful in how you set up the formulas, which is what Excel team has done with their formulas. Though the doubles are subject to rounding errors, you can minimize it by re-arranging your formula. Wikipedia has a good example of how you can avoid this. So as such, I'm not 100% convinced it's just a casting problem.

@wqweto
Copy link

wqweto commented Oct 29, 2018

So, let's format them as currency:

. . . and now we start getting

? typename(Sheet1.Range("C1").Value)
Currency

. . . instead of Double the way Value2 data-type (still) remains.

@bclothier
Copy link
Contributor

Correct, but the point is that if you write VBA formula, you are going to be doing it without Excel's extra work to make a "little unequal" equal; and the casting wouldn't help you avoid that fundamental problem because it doesn't exist in VBA. That's why I said it's not just a casting problem.

@FastExcel
Copy link

FastExcel commented Oct 29, 2018 via email

@comintern
Copy link
Contributor

Could you explain how it shows that?
Sounds like your understanding of Excel internals is radically different to mine.

Yes.

    varValue = [a1].Value
    varValue2 = [a1].Value2
    
    Debug.Print TypeName(varValue)          'Currency

^^^ Who is casting that if Excel isn't? varValue is declared as Variant.

@FastExcel
Copy link

FastExcel commented Oct 29, 2018 via email

@FastExcel
Copy link

FastExcel commented Oct 29, 2018 via email

@comintern
Copy link
Contributor

Bottom line: formatting a cell as currency does not change the number in the cell or its Excel datatype in any way.

And this is the source of the mis-understanding. I'm not claiming that the underlying data that Excel is storing is being changed. I'm looking at this from the domain of VBA, and VBA interacts with Excel via its "object model interface", just like any COM client interacts with any COM server. The cast happens outside of VBA, and it is performed by Excel acting in the role of a COM server.

What I am saying is that when a COM client is presented with 2 different interfaces that behave in 2 different ways, it is fundamentally impossible to determine if the intent of the programmer was to have one behavior or the other. One is not inherently "better" than the other. They are simply different.

@FastExcel
Copy link

FastExcel commented Oct 29, 2018 via email

@bclothier bclothier removed the hacktoberfest Tags issues for Hacktoberfest 2019 label Nov 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty-02-ducky Resolving these involves the internal API, but with relatively easy problems to solve. enhancement Feature requests, or enhancements to existing features. Ideas. Anything within the project's scope. feature-inspection-quickfixes feature-inspections library-specific up-for-grabs Use this label in conjunction with a difficulty level label, e.g. difficulty-02-ducky
Projects
None yet
Development

No branches or pull requests

8 participants