Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better isXXX checks #321

Merged
merged 2 commits into from
Oct 24, 2011
Merged

better isXXX checks #321

merged 2 commits into from
Oct 24, 2011

Conversation

michaelficarra
Copy link
Collaborator

The current tests allow for too many false positives for my liking. This also changes _.isNumber(NaN) from false to true. It really should have been true in the first place.

@jashkenas
Copy link
Owner

So -- here's the history. Underscore.js isType checks were originally implemented like this, but a pull request was submitted -- and merged -- to make them duck-type instead. The rationale being that they're often present in the innermost of inner loops: argument checking, isEqual, dispatch, stuff along those lines. The variants you propose here are (or at least, were at the time) massively slower than simple property checks, to the point where it would make a difference for some applications.

But JavaScript engines, minus IE7 and IE8, have improved drastically since those times, so perhaps it's no longer something to worry about.

If you want to get these changed back to the way they used to be, it would be great to get a comprehensive JSPerf to go along with 'em.

@michaelficarra
Copy link
Collaborator Author

@jashkenas: will do.

@jashkenas
Copy link
Owner

So, finally went and benched this out of curiosity ... Underscore's current property checks are still far faster than the proposed alternatives. Up to 9x faster in Chrome.

http://jsperf.com/underscore-js-istype-alternatives

Closing this patch as a wontfix -- I'm still of the impression that although you can manufacture objects to break the isType checks, they tend to work pretty well in real world code.

@jashkenas jashkenas closed this Oct 17, 2011
@michaelficarra
Copy link
Collaborator Author

:( That's unfortunate, but an understandable decision.

@jashkenas
Copy link
Owner

If people are actually running into false positives in real programs, we should reconsider ... but I don't think I've ever seen a report of that happening.

Using JS objects as hashes is fraught with other problems ... but the isType checks should be able to tell strings from functions from arrays from regexes efficiently.

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

I edited the jsPerf test case. The speed differences are no longer as dramatic in Chrome and I added jQuery as a performance sanity check. As for a false positive, I did a quick search of the issues and came up with this. Here is a summary of my comments on the jsPerf test.

JSPerf Comments

Comment 1

I doubt the speed diff will matter in real world use.
I think using the internal [[Class]] is best because it's consistent with ES5.1 spec, _.isArray(), and with popular libs like jQuery which are often paired with underscore.js.

I believe @kitcambridge also mentioned that by using the internal [[Class]] underscore.js could simplify its _.isEqual() method too.

Also keep in mind there are differences in the results of these implementations. For example

_.isNumber(NaN); // false
_.isNumberNew(NaN); // true
jQuery.type(NaN); // "number"

As with any sugar if performance ever becomes an issue devs can drop back to de-sugared simple typeof checks where suitable.

Note: I tweaked the test to work in IE and added jQuery as a sanity check.


Comment 3

The concern over performance in these sugar methods is misplaced. It's better for these methods to follow spec precedents (set by Array.isArray) to avoid potential false positives and create consistent behavior in the API (aligning with _.isArray).

As for _.isNumber, I wasn't debating the merits of one implementation over the other. I was only giving a heads-up to their differences.

Though by your preference Underscore.js v1.2.0 is actually inconsistent with what it considers a number because it allows _.isNumber(Infinity); // true.

You can't go wrong following spec and checking the internal [[Class]]. If someone has a WTF moment with NaN or Infinity they can take that time to actually read the spec and learn something about the language.

NaN.constructor == Number; // true
Infinity.constructor == Number; // true
typeof NaN.toFixed // function
typeof Infinity.toPrecision // function

var n = Object(NaN);
isNaN(n); // true
n == n; // true

Number.prototype.toJSON = function() { return isFinite(this) ? this : String(this); };
JSON.stringify(n); // '"NaN"'

The concern over NaN and Infinity are edge cases and can be handled, when needed, with isFinite().

@michaelficarra
Copy link
Collaborator Author

@jdalton makes some very good points. Correctness (with isNumber and isFunction) should trump a 30-50% speed bonus. If we look at it the other way around (assume we had these proposed implementations and the current implementation was being proposed), would we accept a few incorrect results for a 2-3x faster test? I don't think so.

@jashkenas
Copy link
Owner

@jdalton: Whoa -- you just edited the original JSPerf comparison? I didn't realize it was possible for other people to do that on JSPerf.com

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

@jdalton: Whoa -- you just edited the original JSPerf comparison? I didn't realize it was possible for other people to do that on JSPerf.com

It's only possible by the jsPerf site maintainers. @mathias and I are the only ones that can.

(I wrote benchmark.js and the JS that handles jsPerf's UI and Browserscope reporting/charting. I also have commit rights to Browserscope).

@jashkenas
Copy link
Owner

Ok -- then do you mind explaining what exactly you changed so that a 90% performance difference yesterday is down to a 30% performance difference today, in the same browser?

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

Yap.

  1. I changed isArgumentsNew() from:
isArgumentsNew: function(obj) {
  return toString.call(obj) == '[object Arguments]' || !!(obj && hasOwnProperty.call(obj, 'callee'));
}

to

isArgumentsNew: toString.call(arguments) == '[object Arguments]'
  ? function(obj) {
      return toString.call(obj) == '[object Arguments]';
    }
  : function(obj) {
      return !!(obj && hasOwnProperty.call(obj, 'callee'));
    }

You were getting hit double for non-argument object values passed to isArgumentsNew().

  1. You forgot to define the hasOwnProperty and toString variables which caused IE to barf but also caused other browsers to do a global lookup of window.hasOwnProperty and window.toString.

  2. I removed the self executing function expression from each test because the compiled test has an arguments object and made the _, $, and other variables local.

(I should've probably used `isArgumentsNew() for the jQuery sanity check as it's more in line w/ its other implementations but it still gets the point across)

@jashkenas
Copy link
Owner

Now I'm just deeply mystified. Doing a new edit, and putting the arguments function back the was it was before @jdalton rewrote it -- I'm seeing the same 1.4x difference instead of yesterday's 9x difference.

@jdalton: Is it possible for you to revert the page to the original revision?

@michaelficarra
Copy link
Collaborator Author

@jashkenas: if you can reopen this pull request for me, I have another commit to push to it. Github doesn't add extra commits to closed pull requests, only open ones. I've redone the isArguments definition based on @jdalton's version:

// Is a given variable an arguments object?
_.isArguments = toString.call(arguments) == '[object Arguments]'
  ? function(obj) {
    return toString.call(obj) == '[object Arguments]';
  }
  : function(obj) {
    return obj ? hasOwnProperty.call(obj, 'callee') : false;
  };

I think an additional check for a length property would be appropriate for the duck-typed test:

return obj ? hasOwnProperty.call(obj, 'callee') && hasOwnProperty.call(obj, 'length') : false;

Thoughts?

@jashkenas jashkenas reopened this Oct 19, 2011
@jashkenas
Copy link
Owner

Sure, reopened. I'd actually perf test the two isArguments implementations side by side before choosing one over the other.

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

@jashkenas please revisit my previous comment. I've clarified it a bit.

@jashkenas
Copy link
Owner

There we go -- it was just the global lookups of hasOwnProperty and toString that made for most of the difference -- thanks for fixing it.

@jashkenas
Copy link
Owner

Let's be clear here -- is the only difference between the two implementations the result of:

_.isNumber(NaN)

Incompatibilities that arise when extending native objects and then trying to use Underscore don't count -- because large swaths of Underscore won't work properly with extended natives, not just the isType tests.

If it's just _.isNumber(NaN), and the difference on mobile browsers and Safari is still 4x or 5x in some cases, I think it's still murky which to prefer.

@michaelficarra
Copy link
Collaborator Author

@jashkenas: as a follow-up to my last comment, here's a perf test showing zero performance difference between the callee-only test and a test for both callee and length: http://jsperf.com/different-isarguments-implementations

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

@jashkenas:

Incompatibilities that arise when extending native objects and then trying to use Underscore don't count -- because large swaths of Underscore won't work properly with extended natives, not just the isType tests.

Sounds like some design flaws w/ underscore.js.
(design flaws most likely created by the same "premature optimizations above all" mentality)

If it's just _.isNumber(NaN), and the difference on mobile browsers and Safari is still 4x or 5x in some cases, I think it's still murky which to prefer.

I think you're getting hung up on these perf details and losing a bit of context.
I added jQuery as a sanity check to show that it is slower than both underscore.js implementations and the web isn't beating down jQuery's door to prematurely optimize its jQuery.type() method. Also the isType method with arguably the most utility, _.isArray(), already uses native/[[Class]]-lookup implementations.

As with other underscore.js issues, correctness and consistency trump these kinds of perf concerns.

@ghost
Copy link

ghost commented Oct 19, 2011

@michaelficarra In Chrome 14 and 15, though, the arguments object has a [[Class]] name of Arguments, so both proposed and strict are using the toString.call(obj) == '[object Arguments]' fork.

@michaelficarra
Copy link
Collaborator Author

@kitcambridge: crap, I forgot about that. Fixing...

update: fixed

@ghost
Copy link

ghost commented Oct 19, 2011

@michaelficarra Much better, thanks. The difference in performance is negligible in IE 6, where it matters most.

@jashkenas
Copy link
Owner

Quoting @broofa:

That isNumber(NaN) should return true is to me both theoretically bizarre
and practically useless. isNumber(x) should tell you whether or not you
can treat x like a number. If x is NaN, you can’t. (By definition!) And
there’s all sorts of behavioral evidence of this.
x+1 -> NaN, NaN == NaN -> false, and JSON.stringify(NaN) -> 'null'.

Show of hands -- because the check can be easily made to work either way. Do you think that _.isNumber(NaN) should return true or false? I lean towards false, because of the tautological truth that not a number is not a number.

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

Quoting myself:

Though by your preference Underscore.js v1.2.0 is actually inconsistent with what it considers a number because it allows _.isNumber(Infinity); // true.

You can't go wrong following spec and checking the internal [[Class]]. If someone has a WTF moment with NaN or Infinity they can take that time to actually read the spec and learn something about the language.

<code snippet>

The concern over NaN and Infinity are edge cases and can be handled, when needed, with isFinite().

My vote is that _.isNumber(NaN) and _.isNumber(Infinity) should return true.

@misfo
Copy link

misfo commented Oct 19, 2011

If

_.isNumber(NaN) === true

then shouldn't _.isNumber be renamed to _.isNumberOrIsNotANumber? :)

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

I think you're getting hung up the wording. Both Infinity and NaN have Number.prototype as their [[Prototype]].

@josh
Copy link
Contributor

josh commented Oct 19, 2011

Some definitions from ECMA spec

Number value primitive value corresponding to a double-precision 64-bit binary format IEEE 754 value. NOTE A Number value is a member of the Number type and is a direct representation of a number.
Number type set of all possible Number values including the special ―Not-a-Number (NaN) values, positive infinity, and
negative infinity
Number object member of the Object type that is an instance of the standard built-in Number constructor
NaN number value that is a IEEE 754 Not-a-Number value

The definition of number conversion is also describe such that undefined converted to type Number is NaN

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf

All of the underscore isXXX checks look at the type not the class of the value. So the prototype argument doesn't really apply. However, NaN is still stated to be included in the set of number values and is the result of a conversion to type Number.

> Number(undefined)
NaN

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

@arbales

Libraries like Underscore exist to help us handle otherwise troublesome scenarios in a way that maps to their expectations.

Libraries like jQuery and Prototype (whose API underscore.js seems to be a bit inspired by) also use the internal [[Class]] value in their _.isNumber() equivalents.
(meaning their methods would return true for NaN and Infinity too)

@josh

All of the underscore isXXX checks look at the type not the class of the value. So the prototype argument doesn't really apply.

Not so. The _.isArray() method uses the internal [[Class]] value and the rest duck-type (which this pull request would change).

@allenwb
Copy link

allenwb commented Oct 19, 2011

Quoting @broofa:

That isNumber(NaN) should return true is to me both theoretically bizarre
and practically useless. isNumber(x) should tell you whether or not you
can treat x like a number. If x is NaN, you can’t. (By definition!) And
there’s all sorts of behavioral evidence of this.
x+1 -> NaN, NaN == NaN -> false, and JSON.stringify(NaN) -> 'null'.

A NaN values are members of JS Number data type just like (and because) they are members of the IEEE 754 numeric types. The terminology "Not a Number" is a IEEE 754 term and note that they don't also use "number" as the name of a data type so in its original context there was no confusion about the name. BTW, it is IEEE 754 that defines that NaN values are not "equal" to anything else, including themselves.

Because NaN is a Number, you can use it in numeric computations. All the built-in numeric values and Math functions have explicit rules for how they operate upon and propagate NaN values.

@josh
Copy link
Contributor

josh commented Oct 19, 2011

@jdalton lol didn't even look what else changed. If the definition of the is check is changing to match the prototype then that works too. NaN is of value, type, and class a Number. Seems solid.

@keithamus
Copy link

@arbales and @michaelficarra said it best. NaN is a Number, as defined by IEEE 754. However, NaN is not numeric, so the method name for checking if a variable is a Number consisting of 0-9 & ., is isNumeric

@arbales
Copy link

arbales commented Oct 19, 2011

@jdalton I'm not talking about the implementation details of various libraries, but the reason for using a library in the first place. And, I'd say that direct comparisons to the Prototype or jQuery ways are less relevant to Underscore, as we should favor more semantically cogent API decisions. (imho)

@keithamus, I think you might have reversed your answer? Since both Infinity and NaN are things "of or relating to a number" I'd expect them to pass a isNumeric check, but not a isNumber check.

The big issue for me is the contradiction that _.isNumber(NaN) returning true presents. This reads like _.isTrue(kNotTrue) to me. I'm less interested in the specs and numeracy of NaN as implemented, and more interested in ensuring that the API is semantically clear. If being purely on-spec was our goal, then isEqual should really munge types, right? After all, the function name isn't _.isStrictlyEqual.

@jdalton
Copy link
Contributor

jdalton commented Oct 19, 2011

I'm less interested in the specs

Unfortunately others, at some point, have shared your opinion and it's caused nothing but problems in their projects.

@arbales
Copy link

arbales commented Oct 20, 2011

@jdalton, but… we're not talking about a function with a native equivalent. I think its a little unfair to equate "Incorrect ES5 Fallbacks," – which I believe were mostly unintentional – with a conscious decision to favor semantic clarity over unclear uniformity. I'm not saying ignore the spec, I'm saying read it with "plain meaning" — there is no spec for the behavior of isNumber.

Finally, returning true is wrong because NaN is Not A Number; if false is wrong because NaN is of the Number class, than I'd much rather see the addition of an isNumeric function that returns true for all numeric values. At least then there'd be no confusion.

✌️

@ghost
Copy link

ghost commented Oct 20, 2011

@arbales:

Edit: @kitcambridge's own wording supports this: "…Since NaN is a valid numeric value according to the floating-point standard". (Emphasis added)

To clarify: I used the term "numeric value" because writing "the number not-a-number" would have been extremely confusing. Yet, this is precisely what IEEE 754 appears to imply: NaN is a floating-point "number," just as 0.0, 5.5, and 42.3 are floating-point numbers. My example was intended to illustrate that the classification of NaN as a floating-point number is present in other languages.

Languages such as Ruby implement different numeric types (Complex, Rational, Bignum, Float, Fixnum, and Integer), but the NaN value can only be derived from float operations. Hence, 0/0 will raise a ZeroDivisionError, while 0.0/0 will produce the NaN "number" ((0.0/0).is_a? Float). Contrastingly, floating-point values are the only numeric type implemented in JavaScript; thus, NaN is a valid result for any mathematical expression.

@jdalton
Copy link
Contributor

jdalton commented Oct 20, 2011

@arbales In the spirit of fairness other libs like MooTools, ExtJS, and YUI use an isFinite(obj) check to prevent their isNumber equivalents from returning true for NaN/Infinity (which could be an addition to the [[Class]] check).

I would hate for this whole pull request to get flushed because of a debate over the finer points of _.isNumber(). Maybe the details of _.isNumber() should be hashed out in a separate pull request/issue.

@arbales
Copy link

arbales commented Oct 20, 2011

@jdalton very true — I believe ES5 has spec'd both isFinite and isNaN. I think this question is somewhat of a special case, as the those methods illustrate. Type checks for not types, using not the same methodology. Perhaps that is why isNumeric isn't included in the spec to begin with.

The _.isNumber question does seem to muddy this pull request, we've spent a ton of time talking about it. I'd hate to spam the topic to death too, since I think both of our viewpoints are fairly well-expressed at this point. I'd be down to continue in a separate issue though.

@broofa
Copy link

broofa commented Oct 20, 2011

@jdalton: Regarding performance considerations. The isType APIs are used in tight loops. There's real-world cases for that, not the least of which is backbone.js, where every attribute set on a model involves two or more isType() calls (via isEqual()). Updating 100 model objects with 10 attributes may result in 2K-4K isType() invocations which, based on the jsperf data, would translate to additional CPU time of 200-500ms on some mobile devices. More than enough to be noticeable to users and make for some annoying delays in the UI responsiveness.

You may argue that underscore should wait until mobile developers are "beating down the door" for performance improvements but is that realistic? Is it possible that rather than complain, they'll simply switch to one of the numerous other mobile JS frameworks available, that put more of an emphasis on performance?

@michaelficarra
Copy link
Collaborator Author

@broofa: to what other framework are they going to switch? One of the goals of @jdalton's test was to show that jQuery's type method is slower than even our slower, correct proposals.

@jdalton
Copy link
Contributor

jdalton commented Oct 20, 2011

@broofa:

Regarding performance considerations. The isType APIs are used in tight loops. There's real-world cases for that, not the least of which is backbone.js, where every attribute set on a model involves two or more isType() calls (via isEqual()).

Ya know, as I mentioned before, if performance is ever an issue you simply cut back on sugar (yes isType is sugar) in the affected code. You don't have to, and shouldn't, sacrifice correctness/consistency in the sugar though.

@keithamus
Copy link

Sorry @arbales, I misread your original comment. My view is that the following two methods should behave as such:

# Psuedo Code
isNumber: function(a) { return typeof a === 'number'; }
isNumeric: function(a) { return /^[0-9.]+$/.test(a); }

The reason being that if I am checking if something is a Number I want to check that it pertains to the Number class (inc. IEEE definition of NaN and Infinity). If I want to check something matches the human readable definition of a number, i.e something that contains only numeric values, then I would expect to use isNumeric.

@broofa
Copy link

broofa commented Oct 20, 2011

@jdalton:

Ya know, as I mentioned before, if performance is ever an issue you simply cut back on sugar (yes isType is sugar) in the affected code. You don't have to, and shouldn't, sacrifice correctness/consistency in the sugar though.

"Cutting back on sugar" is less of an option in code you don't control. How would a developer using backbone remove the sugar in underscore's isEqual(), upon which backbone relies?

@keithamus: In what circumstances are you testing for a number, but not want to rule out NaN? (I'm looking for something more concrete than "because I want to know if it's an instance of Number". Why do you want to know if it's a number?)

@broofa
Copy link

broofa commented Oct 20, 2011

Hmm. 'Just remembered the jsperf tests are doing 12x tests per iteration, which means the 200-500ms time I quoted earlier is probably more like 15-40ms... so not such a big deal after all. :) So, yeah, maybe the perf aspect of things isn't that big a deal.

@ghost
Copy link

ghost commented Oct 20, 2011

I think the performance concerns are becoming somewhat excessive. To provide some context, many of Underscore's collection methods already implement numerous nested function calls: _.sortBy, for example, calls pluck and map, in addition to the native sort function. pluck relies upon map as well, which, in turn, calls each. This is significantly less efficient than a single call to Object::toString in a one-line type checking function.

Regarding _.isEqual: the original algorithm upon which I based Underscore's current implementation uses [[Class]] checking. This is more efficient because there is no need to manually ensure commutative equality for specific object types. Thus, in the original implementation, if two native ECMAScript objects have different [[Class]] names (determined by a single [[Class]] check), then they are automatically presumed to be inequivalent. Contrastingly, in Underscore's current implementation, the following boilerplate must be implemented for every specific object type we'd like to check:

var isTypeA = _.isType(a), isTypeB = _.isType(b);
if (isTypeA || isTypeB) {
  // `...` represents the type-specific comparison routine.
  return isTypeA && isTypeB && ...;
}

Finally, it seems worthwhile to note that Underscore and other frameworks are already unlikely to appeal to developers who prioritize performance over readability and/or convenience.

@NHQ
Copy link

NHQ commented Oct 21, 2011

And another thing

_.is_belt_or_is_tie(_)

@SeanMcMillan
Copy link

@NHQ: Cravat.

`What a beautiful belt you've got on!' Alice suddenly remarked.

(They had had quite enough of the subject of age, she thought: and if they really were to take 
turns in choosing subjects, it was her turn now.) `At least,' she corrected herself on second 
thoughts, `a beautiful cravat, I should have said -- no, a belt, I mean -- I beg your pardon!' she 
added in dismay, for Humpty Dumpty looked thoroughly offended, and she began to wish she
 hadn't chosen that subject. `If I only knew,' the thought to herself, 'which was neck and which 
was waist!'

Evidently Humpty Dumpty was very angry, though he said nothing for a minute or two. When 
he did speak again, it was in a deep growl.

`It is a -- most -- provoking -- thing,' he said at last, `when a person doesn't know a cravat from a belt!'

`I know it's very ignorant of me,' Alice said, in so humble a tone that Humpty Dumpty relented.

`It's a cravat, child, and a beautiful one, as you say. It's a present from the White King and Queen. There now!' 

@NHQ
Copy link

NHQ commented Oct 21, 2011

@SeanMcMillan Touche!

Now somebody tell the website cuz:

Underscore is a utility-belt library for JavaScript that provides a lot of the functional
programming support that you would expect in Prototype.js (or Ruby), but without
extending any of the built-in JavaScript objects. It's the tie to go along with jQuery's tux.

@jdalton
Copy link
Contributor

jdalton commented Oct 22, 2011

@broofa

"Cutting back on sugar" is less of an option in code you don't control. How would a developer using backbone remove the sugar in underscore's isEqual(), upon which backbone relies?

I think Kit's comment was right on. Projects like backbone.js/underscore.js can reduce unnecessary function calls and use bare bones JS in areas where performance is critical. When I forked Prototype.js and began the process of converting it to my own lib I cut back on the number of API calls used in the internals of its methods. This, in part, helped move its overall performance profile away from Prototype's (which is one of the slowest JS libs) to closer to that of faster libs like jQuery.

@michaelficarra
Copy link
Collaborator Author

I also completely agree with @kitcambridge and @jdalton that, if performance is such a concern, the first place to look is reduction of the average call stack size.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet