Avoiding the precision loss footgun #15

harrysarson · 2019-05-24T10:22:44Z

UnitMath is shaping up nicely :) I have a question:

From the README (v0.3)

This means that if your custom type cannot be represented exactly by a floating-point number, you must use the two-argument unit(value, unitString) to construct units:
// Correct
let d = unit(Decimal('3.1415926535897932384626433832795', 'rad'))
let f = unit(Fraction(1, 2), 'kg') 
// Incorrect
let d = unit('3.1415926535897932384626433832795 rad') // Will lose precision
let f = unit('1 / 2 kg') // Parse error

This seems like an unnecessary footgun to introduce and I wonder if it is not possible to provide a better alternative.

The text was updated successfully, but these errors were encountered:

harrysarson · 2019-05-24T10:27:05Z

What I propose (largely thinking out loud) is the following (for one-argument constructor).

If the first character of the input is a number, split the input on the first whitespace character, give everything before the space to the constructor and parse everything after the space as the unit.
If the first character of the input is not a number parse the entire input as a unit.

The means that unit('1 / 2 kg') will still error (trying to parse unit of / 2 kg) but unit('1/2 kg') will work.
It is also now impossible to accidentally lose precision.

Thoughts?

ericman314 · 2019-05-24T13:47:53Z

I appreciate you giving this some thought, Harry. Would indeed be nice if that could work, especially for the Decimal.js case. The reason I wrote that message in the readme was that I considered it a good compromise between two extremes. On the one extreme, we could attempt to parse those custom types as you suggest, by passing everything up to the first whitespace to the type.conv function. But without knowing the details of that library, there's no guarantee the library even has the ability to parse a string. (And if it does, what if the custom type itself has whitespace?) On the other extreme, we could disallow the single-string constructor entirely, but I chose not to do that because I would have had to rewrite all the built-in unit definitions, which use the single-string version. The user is already required to provide a function to convert a number to the custom type, so I considered that that was where I would draw the line. UnitMath will convert a IEEE 754 number to your custom type, since that is what it already does with the built-in units, but anything beyond that and you'll have to use the two-argument constructor.

Another possibility is to allow the user to control how the custom type is parsed from a string, thus allowing the single string constructor, by providing a type.parse function. This would accept a string, consume the custom type portion of the string, and return the custom type and the unconsumed portion of the string. But that seems like unnecessary complexity. I feel like if the user has already gone to lengths to extend a custom type, they will be capable of constructing units of that type without losing precision.

One more possibility that might be a better compromise: The line below in Parser.js converts the parsed numeric portion of the string to a number, then to the custom type:

UnitMath/src/Parser.js

Line 194 in c7ce5df

unit.value = options.type.conv(parseFloat(valueStr))

We could skip the parseFloat step when there is a custom type.conv. In the case of Decimal.js at least, parseFloat is unnecessary since Decimal.js can convert strings directly to Decimals. Since the user is technically only required to provide a function to convert IEEE 754 numbers to the custom type, we would need to provide an option to re-enable parseFloat if the user so desires. The parser would still not work with 1/2, but at least it would not unnecessarily throw away extra digits of Decimals.

ericman314 · 2019-05-25T05:38:54Z

I've adjusted things so that those lines in Parser.js look like this now:

UnitMath/src/Parser.js

Lines 194 to 198 in 3260233

    
           unit.value = options.type.parse(valueStr) 
        
           if (options.type.parse._IS_UNITMATH_DEFAULT_FUNCTION && !options.type.conv._IS_UNITMATH_DEFAULT_FUNCTION) { 
        
             // User supplied a conv function but not a parse. Use default parse, THEN use conv. 
        
             unit.value = options.type.conv(unit.value) 
        
           }

Each of the type. functions does this:

Function	Converts
`type.parse` (default)	string -> number
`type.parse` (user-supplied)	string -> custom
`type.conv`	number -> custom

If there is a custom type.parse, it will convert the numeric portion of the input string to the custom type in one step. If there is no custom type.parse, then it converts the input string to a number, and then converts that to the custom type. This means that:

unit('3.1415926535897932384626433832795 rad')

results in no loss of precision when type.parse is provided. Unfortunately, this only works for numbers that are in decimal or scientific notation. For fractions and complex numbers, you still have to use the two-argument constructor, but now both arguments can be strings:

unit('1 / 2', 'kg') 
unit('4 + 3i', 'kVA')

Do you think this is acceptable?

harrysarson · 2019-05-25T10:03:51Z

I think this hits the nail on the head:

The parser would still not work with 1/2, but at least it would not unnecessarily throw away extra digits of Decimals.

Any solution that results in no loss of precision would be great in my book 👍

Do you think this is acceptable?

Yes, with one caveat:

If there is no custom type.parse, then it converts the input string to a number, and then converts that to the custom type.

My opinion is that this should throw an error.

I would go even further and say that simply throwing an error if the one argument constructor is invoked when using a custom type is acceptable. However, the solution you propose is much better than this.

I am arguing that if the constructor throws an error, a user can pull up the unitmath README and realise what they need to change. If the constructor silently loses precision then the user will have hard to track down bugs appearing in their program sometime in the future.

josdejong · 2019-05-25T12:36:14Z

Interesting discussion. Would it work to just have a parse option which should be able to parse both string or number input into the custom numeric type? So you don't need a separate parse and conv?

josdejong · 2019-05-25T12:39:37Z

And maybe it's just fine not to accept things like unit('1 / 2 m') to be parsed into a fraction? You can always parse via unit(fraction('1/2'), 'm'), and if needed, the user can write it's own wrapper to support this type of input when needed. It's quite an edge case I think.

ericman314 · 2019-05-25T13:43:26Z

@harrysarson, I was a little uneasy about that as well. I will add an error as appropriate to ensure that the string --> number --> custom conversion path is never invoked.

Would it work to just have a parse option which should be able to parse both string or number input into the custom numeric type?

@josdejong, possibly we could do that, but I thought it would be good to have two separate functions so that the user doesn't have to do their own type checking if they don't want to. Once Harry's latest suggestion is implemented, conv and parse will be used for totally different purposes in the library, so I think it's good to keep them logically separated as well. But you could use the same function for each. For the two custom types I've looked at, Decimal.js and Fraction.js, both are capable of converting either a string or a number to the custom type. The conv function will always be required, and will throw an error immediately if not supplied, but parse will be optional and will not throw unless you try to invoke a constructor where the value is a string.

So in summary, this is what will happen:

If the user supplies conv and parse:

unit('rad') // Works
unit('3.14159 rad') // Works
unit('3.14159', 'rad') // Works
unit(3.14159, 'rad') // Works
unit(Decimal('3.14159', 'rad') // Works
unit('1 / 2 rad') // Parse error
unit('1 / 2', 'rad') // Works
unit(Fraction('1 / 2'), 'rad') // Works

If the user only supplies conv:

unit('rad') // Works
unit('3.14159 rad') // Parse error
unit('3.14159', 'rad') // Parse error
unit(3.14159, 'rad') // Works
unit(Decimal('3.14159', 'rad') // Works
unit('1 / 2 rad') // Parse error
unit('1 / 2', 'rad') // Parse error
unit(Fraction('1 / 2'), 'rad') // Works

josdejong · 2019-05-25T14:02:58Z

I get it. Maybe though in that case it's not needed to embed parse into the library, since the user can wrap the constructor himself too? Like unit(parse('1 / 2 rad')). It would be nice to have just a single "parse" function, that's much easier to understand. If it's needed it's needed of course, but it would be great to keep the API as plain and simple as possible.

ericman314 · 2019-05-26T05:08:34Z

After considering all your comments and giving this a lot of thought, I have put together a PR that combines parse and conv, and adds error messages when omitting required type functions. I would appreciate any feedback you have. #17

josdejong mentioned this issue May 25, 2019

How to prevent users from accidentally forgetting to implement a custom type function? #16

Closed

ericman314 closed this as completed Jun 12, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoiding the precision loss footgun #15

Avoiding the precision loss footgun #15

harrysarson commented May 24, 2019

harrysarson commented May 24, 2019

ericman314 commented May 24, 2019 •

edited

Loading

ericman314 commented May 25, 2019

harrysarson commented May 25, 2019

josdejong commented May 25, 2019

josdejong commented May 25, 2019

ericman314 commented May 25, 2019

josdejong commented May 25, 2019

ericman314 commented May 26, 2019

Avoiding the precision loss footgun #15

Avoiding the precision loss footgun #15

Comments

harrysarson commented May 24, 2019

harrysarson commented May 24, 2019

ericman314 commented May 24, 2019 • edited Loading

ericman314 commented May 25, 2019

harrysarson commented May 25, 2019

josdejong commented May 25, 2019

josdejong commented May 25, 2019

ericman314 commented May 25, 2019

josdejong commented May 25, 2019

ericman314 commented May 26, 2019

ericman314 commented May 24, 2019 •

edited

Loading