-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Need comprehensive review of why libraries use .call/.bind/.apply #12
Comments
Excellent review. Good catch on readable-stream; I should have recognized what was going on. Although Gzemnid does not usually include transpiled build code in its dataset, it cannot distinguish code that was transpiled in the past and then subsequently used as base source code, as what appears to be happening with readable-stream. The original use case was motivated by @ljharb’s desire for syntax to change the But based on feedback from some TC39 representatives (see #6 (comment), #8, and some discussion on Matrix), I had become a little reluctant to emphasize specific use cases. I’m afraid that associating this proposal with certain use cases—such as individually importable, tree-shakable “extension methods” or the security use case—would make several representatives colder toward this proposal. Instead, I have tried to focus attention on the sheer magnitude of general Having said that, you are right: in order to strengthen the proposal, we must investigate specifically why libraries (and in particular the most popular libraries) use So next actionable items I see:
I invite you to download and run the Gzemnid dataset and script yourself. Thanks again for the excellent review. |
Yeah, I'll scour around and see what other use cases I can find 👍️ |
This is a highly subjective opinion; one that jQuery (arguably the most popular JS library ever) uses heavily; and community opinion on things like this has shifted over time. I don't think it's the committee's place to pass judgement on the use of a fundamental pattern of the language, especially one that every prototype method relies on. |
I also want to make it clear that I do consider method extraction (along with the calling of extracted methods) to be a core use case of this proposal. For example, method extraction is essential for security from prototype pollution and other global mutation. The non-goal of this proposal is what I currently call “tacit” method extraction: a term I made up to refer to when a method is implicitly bound to its original owner, creating a bound function. // “Explicit” method extraction results in a `this`-dependent ordinary function.
// This is handled by this proposal.
const { slice } = arr;
// The `this` value is supplied in the call.
arr::slice(...args);
// “Tacit” method extraction results in a `this`-independent bound function.
const slice = arr&.slice;
// The `this` value is not supplied in the call, because the function is bound.
slice(...args); I think these two concepts could use better names. I opened #13 about this. |
For what it’s worth, there are some matching lines in the top-1000-packages dataset that come from various packages’ > ./search.topcode.sh '.call' | awk '/standalone.js/' | awk 'END { print NR }'
745
> ./search.topcode.sh 'console.log' | awk '/standalone.js/' | awk 'END { print NR }'
93 To be clear, Gzemnid only includes code added to Git repositories. No transpiled code is included unless it is manually added to the repository. It doesn’t seem like Git-managed There also appear to be 53667 instances of > ./search.topcode.sh '.call\b' | awk '/super/' | awk 'END { print NR }'
53667 These constitute about 10.7% of all |
@js-choi - thanks for the clarification on tactic method extraction, that makes sense. @ljharb - to clarify, I was just talking about callbacks passed to a function, not methods on a prototype, and I wasn't claiming that it's wrong to bind custom "this" value to callbacks. I think there are numerous good, proper use cases for doing so. I've seen plenty of libraries do it for good, valid reasons. However, I do stand by my opinion that when you're uncertain what to do, it's better to air on the side of explicit parameters over implicit ones. People generally do this naturally - when you use array.map(), you can access the index and original array parameters by simply accepting more parameters. It could have been designed such that you access those values via If customizing the "this" on callbacks to provide extra context is the only other, good valid use case, and this syntax is provided primarily to support it, then the committee is in fact passing judgment on this use case. They're blessing it with special syntax, telling the world that they want to see more of this done, to the point that they're even willing to provide special syntax to help out. Now, I don't believe this use case is the only valid one. But, my objective here is to try and decompose the usages of .call/.bind/.apply to their different use cases and see how a bind operator can help each one of them. This means that, yes, we'd need to analyze each use case in isolation and see if it's really a use case we with to promote or not. If we do want to promote it, how much? How much weight do we give to each use case? Once we've distilled what we see to a list of use cases we want to promote, then we can focus on how to best support these use cases. For example, if we find that 99% of the time people really just want tactic method extraction, then maybe we need to change the focus of this proposal a bit. Or, if we find that 99% of the time people want to supply an implicit "this" paramter to callbacks, and the committee is willing to promote that idea, then maybe the .call() syntax I proposed in #3 of |
I don’t know how it could possibly have been designed that way; mutating the array to provide that info wouldn’t have ever been viable. |
To be clear, My understanding is that a receiver is a concept from smalltalk, where it represents the object that the a message (that the function represents) is being sent to. The .call API does not represent providing a receiver well, and never has, precisely because everyone is familiar with |
For what it’s worth, I’m still trying to focus attention on the sheer magnitude of general bind/call/apply usage, rather than specific use cases. People seem to rebind But whatever the reasons, they are frequent, and they are clunky. A goal of this proposal is to improve the word order’s ergonomics: I think the biggest valuable thing that this review is already giving us is making sure we don’t include any irrelevant examples, like At least, that’s my current approach. |
@js-choi - those are interesting findings. I've been looking at the 1000 line dump you shared and have been whittling it down - I've pretty much chopped it in half now after removing these types of things:
These are just superficial findings by looking at the overall file. Only so much can be found this way. I want to dig into more individual use cases as well to see if I can find other stuff. |
Excellent, thank you for your hard work. Lots of stuff for us to investigate here further. A lot of this is actually reassuring… How are you checking whether code is coming from transpilers? |
The array instance isn't the "this" argument in that example. It's just some object literal. Like this:
People design APIs like this, all the time. When I do this:
The "this" value is not someHTMLElement, it's an event object. They also pass the event object in as a parameter, so there's never really a need to use the special "this" and most people don't touch it. I would argue that this is one of the bad examples, that's we're now stuck with in browsers. |
If a line of code contained any of these strings, I considered it to be the fault of the transpiler that .call() got used:
I wasn't clear with what I originally stated. I was just looking for code that that was using .call() because it was transpiled, I left other transpiled code untouched. I'm sure I'm catching some false positives. For example, I did see some people using |
Thanks sharing for your methodology; it makes sense. > ./search.topcode.sh '\.call\b' | awk '/super|_getPrototypeOf|_possibleConstructorReturn|WEBPACK VAR INJECTION/' | awk 'END { print NR }'
85121 These constitute about 17.0% of It’ll be interesting to check the outputs for (CC: @tabatkins, who was wondering about this earlier too.) |
Honestly, I was thinking that transpiled code could potentially account for a lot larger fraction of these. 17% is a fair amount, but it's not the majority, so that's good to note. Certainly, there seems to also be a lot of legacy uses for .call(), and it's hard to filter out all of those. But, there also does seem to be a lot of legitimate uses for it. |
@theScottyJam: How did you determine that readable-stream’s |
That was this one:
For those first examples, I was going into the source code and looking through it by hand to figure out the reasons why they were using The basic text-searching approach I did wasn't able to pick up this initial scenario either, like you mentioned. The text searching should be taken as "at minimum, there's roughly X amount that fall into this category, there could be many, many more that we're missing", not "we've identified all entries from this file that fall into this category". However, I'm willing to bet a good portion of |
@theScottyJam: A good heuristic might be to exclude > ./search.topcode.sh '\.call\b' | awk '/[bB]ase([cC]lass)?\.call/'
139612972 readable-stream-3.4.0.tgz/errors-browser.js:26: return _Base.call(this, getMessage(arg1, arg2, arg3)) || this;
27506106 are-we-there-yet-1.1.5.tgz/tracker.js:6: TrackerBase.call(this, name)
27506106 are-we-there-yet-1.1.5.tgz/tracker-group.js:8: TrackerBase.call(this, name)
24117131 md5.js-1.3.5.tgz/index.js:9: HashBase.call(this, 64)
23595018 ripemd160-2.0.2.tgz/index.js:44: HashBase.call(this, 64)
23360946 asn1.js-5.0.1.tgz/lib/asn1/api.js:35: base.call(this, entity);
23155285 elliptic-6.4.1.tgz/lib/elliptic/curve/short.js:12: Base.call(this, 'short', conf);
23155285 elliptic-6.4.1.tgz/lib/elliptic/curve/mont.js:12: Base.call(this, 'mont', conf);
23155285 elliptic-6.4.1.tgz/lib/elliptic/curve/edwards.js:17: Base.call(this, 'edwards', conf);
23155285 elliptic-6.4.1.tgz/lib/elliptic/curve/edwards.js:12: // NOTE: Important as we are creating point in Base.call()
22396297 browserify-des-1.0.2.tgz/index.js:19: CipherBase.call(this)
22332516 create-hash-1.2.0.tgz/browser.js:9: Base.call(this, 'digest')
22316210 create-hmac-1.1.7.tgz/legacy.js:11: Base.call(this, 'digest')
22316210 create-hmac-1.1.7.tgz/browser.js:14: Base.call(this, 'digest')
21954287 cipher-base-1.0.4.tgz/test.js:9: CipherBase.call(this)
… There are 2358 occurrences (4.4%) that match Incidentally, I’m fine with people trying to call a parent constructor without |
That's a pretty hard thing to gage though - most of that kind of code that was written by hand without class syntax was probably done because it was written before class syntax existed, or because they wanted to write it without a transpiler, not necessarily because they didn't like using class syntax. But, point taken that there's probably some valid uses of |
I've started going through source code of different projects, one by one, to find more ways that .call() gets used. I've been findind some interesting stuff, that I'll report on later when I'm done. One major use case for .call() I'm seeing is when the library author wishes to wrap a callback and preserve the this binding of that callback. I hadn't thought of this use case before, but it appears every once in a while. Here's another interesting finding. I wanted to see how many packages used .call() at all. I presume the slim.topcode.1000.txt contains the first 1000 packages along with their dependencies? As they're many, many more packages listed in there than 1000. I took the dump you gave me of the first 1000 occurrences of .call() and figured out what the last package was that was listed in the dump (mocha). I then found that mocha package in slim.topcode.1000.txt and counted how many packages were analyzed from the start to this mocha package. There were 1001 (including mocha). Finally, I counted how many packages were listed in your dump. 421. I then removed what I thought were irrelevant use cases from the dump to see how it would change. This includes uses of This certainly doesn't get rid of all irrelevant use cases, just the ones that are easy to spot. After all of that, I was left with 346 packages that used .call(). It seems only roughly 1/3 of packages (at most) would benefit from a bind syntax to replace .call(). I bet .bind() would have more occurrences, because I know it's common for people to want to extract methods from an instance. Perhaps next I'll write a script that'll perform these calculations across the whole slim.topcode.1000.txt file, looking for both occurrences of .bind() and .call(), and seeing what percent of packages actually use them. Perhaps it would be easier to do this kind of analysis if we had a dataset that only contained packages that were published within the last three years or something. Then we could see how many non-legacy uses of .call() exist. But, I'm not sure how easy that is to do. And, it still wouldn't account for things like Object.prototype.hasOwnProperty.call() |
Benefiting 10% of users would be massive; benefiting a third would be quite high, so I’m encouraged by those results. |
Note that my dump was the first 10,000 occurrences, not the first 1,000 occurrences. Anyways great job and I’ll take a closer look at some of these findings soon. |
> ./search.topcode.sh '\.call\b' | awk '/\/\/.*\.call/' | awk 'END { print NR }' # Exclude comments.
3933
> ./search.topcode.sh '\.call\b' | awk '/hasOwnProperty[a-zA-Z0-9_$]*\.call/' | awk 'END { print NR }'
71479
> ./search.topcode.sh '\.call\b' | awk '/Error\.call\( *this/' | awk 'END { print NR }'
1607
> ./search.topcode.sh '\.call\b' | awk '/[^a-zA-Z][A-Z][a-zA-Z0-9_$]*\.call\( *this/' | awk 'END { print NR }'
38147 These respectively are 0.8%, 14.3%, 0.3%, and 7.6% of 500084. (Note that the fourth search is for any occurrences of Even excluding all of these, This review has already been very valuable for:
|
My opinions on this matter have certainly shifted as I've looked over all of this stuff. The uses of Certainly, I believe we've proved the point here that they're used quite often for non-obsolete use cases. I'm seeing similar results as I look over different examples by hand. I probably want to spend one more morning looking at more examples by hand - the information is intriguing to me, and I'm hoping to find as many different uses for it as I can. After which, I'll see if I can summarize what I've been learning and interesting things I've found as I've looked at specific usage scenarios.
I can pluck some of those out for you. |
(FYI: I actually would love a complete list of the array-like uses you find, because actually affects another proposal’s issue: tc39/proposal-array-from-async#7, whether |
Alright, so here's some finer details about what I've found. I analyzed the first 310 occurrences of .call() from the dump file. Of those, I categorized 113 of them using a script that simply does text search. The remaining 197 I looked over by hand, peering into the source code to understand the context of many of them. This was the categorization logic I used for the automatic categorization (in python): def auto_determin_type(line):
if '.hasOwnProperty.call' in line: return 'ownProp'
if 'Object.prototype.toString.call' in line: return 'objectProto'
if '{}.toString' in line: return 'objectProto'
if '.prototype.propertyIsEnumerable' in line: return 'objectProto'
if re.search(r'\d+:\s*//', line): return 'irrelevant'
if re.search(r'\d+:\s*\*', line): return 'irrelevant'
if 'Error.call' in line: return 'super'
# Bundlers/transpilers/compilers
if '_super.call(' in line: return 'transpiled' # Result of babel transpiling super()
if '_super.prototype.' in line: return 'transpiled' # Result of typescript compiling super.f() calls
if '_getPrototypeOf' in line: return 'transpiled' # Artifact of babel transpiling.
if '_possibleConstructorReturn' in line: return 'transpiled' # Artifact of babel transpiling.
if '__super__' in line: return 'transpiled' # Artifacts of CoffeeScript compilation, or people creating classes by hand.
if 'WEBPACK VAR INJECTION' in line: return 'transpiled' # Output of webpack merging files.
# Operating on array-like objects. (Some of these operated on non-array classes, like Uint8Array)
# (These didn't actually capture anything in the few hundred I was analyzing)
if '.slice.call' in line: return 'arrayLike'
if '.splice.call' in line: return 'arrayLike'
if '.indexOf.call' in line: return 'arrayLike'
if '.lastIndexOf.call' in line: return 'arrayLike'
if re.search(r'Array\.prototype\.[a-zA-Z]+\.call', line): return 'arrayLike'
if re.search(r'\[\].[a-zA-Z]+.call', line): return 'arrayLike'
# The following seem to either be picked off of the prototype and/or polyfilled. (for prototypal protection, or maybe tree shaking)
if 'hasOwnProperty.call' in line: return 'protect'
if 'hasOwnProperty[$\d]+.call' in line: return 'protect'
if 'propertyIsEnumerable.call' in line: return 'protect'
if 'toString.call' in line: return 'protect'
if 'toString[$\d]+\.call' in line: return 'protect'
if 'bind.call' in line: return 'protect'
if 'hasOwn.call' in line: return 'protect'
if 'slice.call' in line: return 'protect'
if 'slice[$\d]+\.call' in line: return 'protect'
if 'splice.call' in line: return 'protect'
if 'indexOf.call' in line: return 'protect'
if 'forEach.call' in line: return 'protect'
if 'call.bind' in line: return 'protect' Here are the results I got from the auto-categorization:
Of the 197 I looked over by hand, here's how I had categorized them:
The "multistepCall" category listed above is for scenarios where // Example 1
const method = obj.f ?? obj.g
method.call(obj, ...)
// Example 2
assertFunction(obj.f).call(obj, f) Here are some interesting specific scenarios I found, many of which didn't fit well into any of the specific categories above (so they went in the "etc" category). readable-stream was doing yargs did something interesting. They had this piece of code: core-js used Here are all of the scenarios I ran into where an array method was being invoked via 1. return arrayToLocaleString.apply(TO_LOCALE_BUG ? arraySlice.call(aTypedArray(this)) : aTypedArray(this), arguments); // core-js-3.1.3.tgz/modules/es.typed-array.to-locale-string.js:24
2. return arrayJoin.call(this); // core-js-3.1.3.tgz/modules/es.typed-array.to-string.js:13
3. arrayToLocaleString.call(new Int8Array(1)); // core-js-3.1.3.tgz/modules/es.typed-array.to-locale-string.js:13
4. return arraySort.call(aTypedArray(this), comparefn); // core-js-3.1.3.tgz/modules/es.typed-array.sort.js:10
5. return arrayKeys.call(aTypedArray(this)); // core-js-3.1.3.tgz/modules/es.typed-array.iterator.js:31
6. return arrayEntries.call(aTypedArray(this)); // core-js-3.1.3.tgz/modules/es.typed-array.iterator.js:26
7. return arrayValues.call(aTypedArray(this)); // core-js-3.1.3.tgz/modules/es.typed-array.iterator.js:20
8. return arrayCopyWithin.call(aTypedArray(this), target, start, arguments.length > 2 ? arguments[2] : undefined); // core-js-3.1.3.tgz/modules/es.typed-array.copy-within.js:10
9. var args = Array.prototype.slice.call(arguments, 0) // semver-6.1.1.tgz/semver.js:10
10. const args = [].slice.call(callerArguments) // yargs-13.2.4.tgz/lib/argsert.js:22
11. var options = [].slice.call(this.options); // commander-2.20.0.tgz/index.js:986
12. const args = [].slice.call(arguments); // chalk-2.4.2.tgz/index.js:35
13. const args = [].slice.call(arguments, 2); // chalk-2.4.2.tgz/index.js:213
14. return [].slice.call(arguments, 1).join(' '); // chalk-2.4.2.tgz/index.js:210
15. argv: [].slice.call(system.args), // esprima-4.0.1.tgz/bin/esvalidate.js:57 Example 1-8 is all from core-js wanting to use array methods on type array instances, to help implement polyfills for those type arrays. There's a lot of uses of Array.prototype.slice.call() that seem to not be doing anything - perhaps I'm just missing something important about how it works. Or, perhaps it's been a popular practice to program defensively and make sure whatever you have really is an array. Dunno. Those 15 scenarios of Array.prototype.whatever.call() are just what I found from doing a deep analysis of these 310 .call() scenarios. I plan on looking through other examples of using .call() on Array on a different day - I can just report what I find in the ticket you linked to. It seems like none of the above examples are that helpful for the purposes of Array.fromAsync(). Here are all of the scenarios I ran into while looking for usages of return nativeObjectToString.call(value); // lodash-4.17.11.tgz/lodash.js:6540:
var result = nativeObjectToString.call(value); // lodash-4.17.11.tgz/lodash.js:5999:
value = nativeObjectToString.call(value); // lodash-4.17.11.tgz/lodash.js:13249:
value = nativeObjectToString.call(value); // lodash-4.17.11.tgz/lodash.js:13214:
value = nativeObjectToString.call(value); // lodash-4.17.11.tgz/invertBy.js:46:
value = nativeObjectToString.call(value); // lodash-4.17.11.tgz/invert.js:36:
return nativeObjectToString.call(value); // lodash-4.17.11.tgz/core.js:1421:
return nativeObjectToString.call(value); // lodash-4.17.11.tgz/_objectToString.js:19:
var result = nativeObjectToString.call(value); // lodash-4.17.11.tgz/_getRawTag.js:35:
return Object.prototype.toString.call(obj) === '[object RegExp]'; // qs-6.7.0.tgz/lib/utils.js:205: All of these are examples of trying to safely stringify an object. In qs's case, it was done for the purpose of trying to check if the object was a regular expression. Here's a couple of complete examples of real-world scenarios for Object.prototype.toString.call (in lodash, nativeObjectToString is Object.prototype.toString, they also appear to participate in global mutation protection, which makes these examples a bit harder to read). /**
* A specialized version of `baseGetTag` which ignores `Symbol.toStringTag` values.
*
* @private
* @param {*} value The value to query.
* @returns {string} Returns the raw `toStringTag`.
*/
function getRawTag(value) {
var isOwn = hasOwnProperty.call(value, symToStringTag),
tag = value[symToStringTag];
try {
value[symToStringTag] = undefined;
var unmasked = true;
} catch (e) {}
var result = nativeObjectToString.call(value);
if (unmasked) {
if (isOwn) {
value[symToStringTag] = tag;
} else {
delete value[symToStringTag];
}
}
return result;
} /**
* Converts `value` to a string using `Object.prototype.toString`.
*
* @private
* @param {*} value The value to convert.
* @returns {string} Returns the converted string.
*/
function objectToString(value) {
return nativeObjectToString.call(value);
} And here's qs's complete example: var isRegExp = function isRegExp(obj) {
return Object.prototype.toString.call(obj) === '[object RegExp]';
}; These specific examples probably aren't the greatest for sharing around, they mostly seem to center on brand-checking, which I know that @ljharb is hoping to eventually have as its own proposal. I can also spend a bit of time to dig up better examples of using Object.prototype methods on a different day. Anyways, that's my deeper analysis of the first 300-ish occurrences of It's been really interesting looking into all of this and seeing the different ways it gets used. |
@theScottyJam: This is wonderful. Thank you so much for your hard work. It’s going to take a few days for me to properly process these results, particularly the automatically sorted stuff, since we should be able to run searches on all of those categories across the entire dataset. But with this the case will become even clearer. Would you be able to publish your 197 categorizations by hand, maybe inside a |
Sure thing. There are my raw notes for each entry. I prefixed each category with "##", e.g. I just went in order through your dump file, and stopped in the middle of the core-js package. They had a lot more usages of
Additional Notes These are some extra notes I took about specific "etc" categories. Some of these I already shared.
I also went ahead and ran my script against the whole data set, having the script filter everything that didn't have from collections import defaultdict
import re
def auto_determin_type(line):
if re.search(r'\d+:\s*//', line): return 'irrelevant'
if re.search(r'\d+:\s*\*', line): return 'irrelevant'
if '.hasOwnProperty.call' in line: return 'ownProp'
if re.search(r'Object\.prototype\.[a-zA-Z]+\.call', line): return 'objectProto'
if re.search(r'\{\s*\}\.[a-zA-Z]+\.call', line): return 'objectProto'
if 'Error.call' in line: return 'super'
# Bundlers/transpilers/compilers
if '_super.call(' in line: return 'transpiled' # Result of babel transpiling super()
if '_super.prototype.' in line: return 'transpiled' # Result of typescript compiling super.f() calls
if '_getPrototypeOf' in line: return 'transpiled' # Artifact of babel transpiling.
if '_possibleConstructorReturn' in line: return 'transpiled' # Artifact of babel transpiling.
if '__super__' in line: return 'transpiled' # Artifacts of CoffeeScript compilation, or people creating classes by hand.
if 'WEBPACK VAR INJECTION' in line: return 'transpiled' # Output of webpack merging files.
# Operating on array-like objects. (Some of these operated on non-array classes, like Uint8Array)
if re.search(r'Array\.prototype\.[a-zA-Z]+\.call', line): return 'arrayLike'
if re.search(r'\[\s*\].[a-zA-Z]+.call', line): return 'arrayLike'
# The following seem to either be picked off of the prototype and/or polyfilled. (for prototypal protection, or maybe tree shaking)
if 'hasOwnProperty.call' in line: return 'protect'
if 'hasOwnProperty[$\d]+.call' in line: return 'protect'
if 'propertyIsEnumerable.call' in line: return 'protect'
if 'toString.call' in line: return 'protect'
if 'toString[$\d]+\.call' in line: return 'protect'
if 'bind.call' in line: return 'protect'
if 'hasOwn.call' in line: return 'protect'
if 'slice.call' in line: return 'protect'
if 'slice[$\d]+\.call' in line: return 'protect'
if 'splice.call' in line: return 'protect'
if 'indexOf.call' in line: return 'protect'
if 'forEach.call' in line: return 'protect'
if 'call.bind' in line: return 'protect'
return 'UNKNOWN'
def parse_org_dump():
type_count = defaultdict(int)
total_call_entries = 0
with open('slim.topcode.1000.txt') as f:
line_num = 0
for line in f:
line_num += 1
if line_num % 1_000_000 == 0: print(str.format('{:,}/139,849,897', line_num))
if '.call' not in line: continue
if re.search(r'\.call\s*\(', line): continue # A slower, more accurate check
total_call_entries += 1
type = auto_determin_type(line)
type_count[type] += 1
return type_count, total_call_entries
type_count, total_call_entries = parse_org_dump()
print()
print('Total entries with .call():', total_call_entries)
print()
for key, value in type_count.items():
print(key + ':', value) And these are the results: Update: These results are bad, there was a bug in my script. Updated results will be posted below
The Object.prototype numbers are really, really low. I was expecting them to be higher. Perhaps people are just doing Object.prototype.whatever.call() in formats that can't be easily detected by a raw string search, e.g. if they store Object.prototype.toString into a separate variable and then .call() it. |
If interested, these are the specific occurrences of Object.prototype stuff it found.
Interestingly enough, it picked up a bunch of coffeescript files that were using it :) |
That’s because prior to ES6 - when coffeescript was a going concern - Object toString was an unforgeable brand check. ES6 broke this, and so its usage is much less common after that, including in Babel output. In some of my polyfills I also use |
Ah, that's interesting to know about the broken brand checking. With the chunk of stuff I looked over by hand, the algorithm could only pick up one occurrence of I did glance over other stuff by hand outside of those specific 300-ish I analyzed, and I know I saw other places where Object.prototype functions were used, but those were also done in indirect ways. Perhaps there's not much use for Object.prototype.toString besides legacy brand checking and that's why we're not seeing much of it. |
I think that’s the case; why robustly cache the method if the check is brittle by looking up Symbol.toStringTag |
My script had an embarrassing bug, so the results from it yesterday are way off. (I was finding every occurrence of ".call" that wasn't being called. I wanted to find the places where My post from two days ago should still all be accurate. Updated script: from collections import defaultdict
import re
def auto_determin_type(line):
if re.search(r'\d+:\s*//', line): return 'irrelevant'
if re.search(r'\d+:\s*\*', line): return 'irrelevant'
# Bundlers/transpilers/compilers
if '_super.call(' in line: return 'transpiled' # Result of babel transpiling super()
if '_super.prototype.' in line: return 'transpiled' # Result of typescript compiling super.f() calls
if '_getPrototypeOf' in line: return 'transpiled' # Artifact of babel transpiling.
if '_possibleConstructorReturn' in line: return 'transpiled' # Artifact of babel transpiling.
if '__super__' in line: return 'transpiled' # Artifacts of CoffeeScript compilation, or people creating classes by hand.
if 'WEBPACK VAR INJECTION' in line: return 'transpiled' # Output of webpack merging files.
if '_objectWithoutProperties' in line: return 'transpiled'
if '.hasOwnProperty.call' in line: return 'ownProp'
if re.search(r'Object\.prototype\.[a-zA-Z]+\.call', line): return 'objectProto'
if re.search(r'\{\s*\}\.[a-zA-Z]+\.call', line): return 'objectProto'
if 'Error.call' in line: return 'super'
# Operating on array-like objects. (Some of these operated on non-array classes, like Uint8Array)
if re.search(r'Array\.prototype\.[a-zA-Z]+\.call', line): return 'arrayLike'
if re.search(r'\[\s*\].[a-zA-Z]+.call', line): return 'arrayLike'
# The following seem to either be picked off of the prototype and/or polyfilled. (for prototypal protection, or maybe tree shaking)
if 'hasOwnProperty.call' in line: return 'protect'
if 'hasOwnProperty[$\d]+.call' in line: return 'protect'
if 'propertyIsEnumerable.call' in line: return 'protect'
if 'toString.call' in line: return 'protect'
if 'toString[$\d]+\.call' in line: return 'protect'
if 'bind.call' in line: return 'protect'
if 'hasOwn.call' in line: return 'protect'
if 'slice.call' in line: return 'protect'
if 'slice[$\d]+\.call' in line: return 'protect'
if 'splice.call' in line: return 'protect'
if 'indexOf.call' in line: return 'protect'
if 'forEach.call' in line: return 'protect'
if 'call.bind' in line: return 'protect'
return 'UNKNOWN'
def parse_org_dump():
type_count = defaultdict(int)
total_call_entries = 0
with open('slim.topcode.1000.txt') as f:
line_num = 0
for line in f:
line_num += 1
if line_num % 1_000_000 == 0: print(str.format('{:,}/139,849,897', line_num))
if '.call' not in line: continue
if not re.search(r'\.call\s*\(', line): continue # A slower, more accurate check
total_call_entries += 1
type = auto_determin_type(line)
type_count[type] += 1
return type_count, total_call_entries
type_count, total_call_entries = parse_org_dump()
print()
print('Total entries with .call():', total_call_entries)
print()
for key, value in type_count.items():
print(key + ':', value) Here are the updated stats:
Now it was able to automatically categorize 43% of the results, which is pretty good. Here are some use cases for Object.prototype: This is Object.prototype.propertyIsEnumerable. I found 315 total usages of this, a good portion of which seemed to be for use cases such as merging or cloning object, or aiding with polyfills. Babel is actually at fault for 146 of them, it looks like it sticks in a helper function when you do spreading, that utalizes propertyIsEnumerable. From hoek v6.1.3 index.js: internals.keys = function (obj, options = {}) {
return options.symbols ? Reflect.ownKeys(obj) : Object.getOwnPropertyNames(obj);
};
// Merge all the properties of source into target, source wins in conflict, and by default null and undefined from source are applied
exports.merge = function (target, source, isNullOverride /* = true */, isMergeArrays /* = true */) {
...Handling of unique scenarios, like arrays...
const keys = internals.keys(source);
for (let i = 0; i < keys.length; ++i) {
const key = keys[i];
if (key === '__proto__' ||
!Object.prototype.propertyIsEnumerable.call(source, key)) { // <-----
continue;
}
...merging logic...
}
return target;
}; Another use for isEnumerable is to check if a value is an arguments array. From marsdb v0.6.11 at lib/EJSON.js function _isArguments(val) {
return (
!!val && typeof val == 'object' &&
Object.prototype.hasOwnProperty.call(val, 'callee') &&
!Object.prototype.propertyIsEnumerable.call(val, 'callee')
);
} An example of .toString() This is from jscodeshift v-0.6.4, in src/code.js /**
* Returns a collection from a node, node path, array of nodes or array of node
* paths.
*
* @ignore
* @param {Node|NodePath|Array} source
* @return {Collection}
*/
function fromAST(ast) {
if (Array.isArray(ast)) {
if (ast[0] instanceof NodePath || ast.length === 0) {
return Collection.fromPaths(ast);
} else if (Node.check(ast[0])) {
return Collection.fromNodes(ast);
}
} else {
if (ast instanceof NodePath) {
return Collection.fromPaths([ast]);
} else if (Node.check(ast)) {
return Collection.fromNodes([ast]);
}
}
throw new TypeError(
'Received an unexpected value ' + Object.prototype.toString.call(ast) // <-----
);
} The vast majority of toString uses seem to be for tag checking, but every once in a while, it's used for stringifying an unknown value for an error message. Among the entire slim.topcode.1000.txt file, there are only 6 occurrences of Object.prototype.isPrototypeOf.call:
None of the pakcages used |
Should we try to completely exclude, from our statistics, any use of |
There’s still valid use cases for it - it just won’t be ”brand checking”. In particular, some things use the toString output as a tag for a conceptual type union, and don’t care that it’s forgeable. |
This proposal's README states the reason we need this bind operator is because:
It goes on to do an excellent job at explaining both of these points.
I was curious as to the reason why the .bind() family of functions was used so often, so I decided to look into each example the proposal presented to figure out why the code authors chose to use a bind function there. Here's my findings:
Update: To preserve context for what I'm talking about, here's the code snippets from the README I was originally referring to. It's comparing how current code looks like, and how it would look like if it used the proposed bind syntax.
The "kind-of" package uses .call() because they want to call Object.prototype.toString() on a passed-in object, and they're following the good practice of not trusting that .toString() will be present and well-behaved on the passed-in object (e.g. the object could have a null prototype).
There were two examples of .call() from the "debug" package. In both scenarios, they're binding "this" to a custom object, in order to provide additional context for that function. In the first scenario, they're binding "this" while calling a user-provided function - this seems to be an undocumented feature. In the second scenario, they're only binding "this" to internal callbacks.
The "readable-stream"'s usage of .call() is actually pretty irrelevant to this proposal. It's the result of a babel transformation, turning class syntax with super calls into functions and prototypes, with .call() being used to help emulate super().
The "yargs" package was just trying to do what we can do today with the spread operator. The code must have been written before the spread operator existed.
It was a little difficult to follow the pretty-format's code. if anyone else wants to give it a shot, feel free. I think ultimately they were trying to do a form of method extraction from one of their third-party libraries, but they were binding the "this" value at a different location from where they were extracting the methods. From my initial impression, it looked like they might even be binding the wrong "this" value to these methods, but the third-party library didn't care, because the third-party library doesn't actually use "this" anywhere in its source code (it was a smaller library, and a grep for "this" returned zero results). I suspect I'm just misinterpreting what was going on, but maybe not.
The "Q" package uses .apply() as part of some internal logic, which seems to mainly be used by publicly exposed functions such as promise.fapply(), promise.fcall(), and promise.fbind(). Ultimately the purpose is to mimic javascript's bind functions, but with some async logic added to them.
In the case of rxjs, they're exposing functions that accept a callback argument, and an optional this-value that will be applied to the supplied callback when it's called. This is mimicking built-in functions like Array.prototype.map(), which also accept both a callback and an optional "this" parameter.
Bluebird was using .call() for two different reasons. In the first scenario, they were wanting to define similar methods on two different classes. They did so by defining them all on one class, then creating a bunch of tiny, one-line methods on the other class that basically did
FirstClass.prototype.call(specialThisValue, ...args)
. Basically, they were using .call() for code reuse purposes. In the second scenario, .call() is being used because they expose their own promise.bind() function, which accepts a "this" value, then applies that "this" to every callback in the promise chain. Part of the reason for this behavior is to allow the end user to keep state around between callbacks by mutating "this". See here for their rational behind this feature.graceful-fs was simply a case of method extraction. They pulled off fs.read, and instead of binding it on the spot, they supplied the correct "this" value via .call() later on.
Ok, I think that's a good sample-size of many different possible use cases for bind functions. I'm going to go ahead and categorize these use cases, so that we can discuss the value a bind syntax provides to these general categories.
Irrelevant
Customizing the "this" value of callbacks
Both bluebird's second example and the debug package use .call() to customize the "this" value of callbacks. In general, I think a good rule of thumb to follow is "prefer providing explicit parameters over custom, implicit 'this' parameters". In other words, it's generally better to just pass a context object to a callback instead of trying to conveniently set it's "this" value to this context object.
I'm not saying it's always bad to customize the "this" value of callbacks, I'm just saying we probably shouldn't encourage this kind of behavior, which means I would be against adding a bind syntax if this was the only use case for it.
protection from global mutations
This category wasn't found in any of the examples above, but it's still worth discussing. As shown in the README, node has a lot of code that uses .bind() type functions to protect against mutation of global properties. This is a fair point, and a valid use case for this proposal, but it can't be the only driving force. As discussed here, it seems this proposal can't feed off of "global mutation protection" alone, it needs stronger reasons to exist.
Mimicking existing language semantics
Both Q and rxjs mimic existing language semantics related to this-binding. Q provides .apply()/.call()/.bind() equivalents for promises and rxjs provides a parameter to set the "this" value of a callback.
These types of features are only useful if the end user finds it useful to bind custom this values to their callbacks, which is the type of thing we're trying to discuss right now. Perhaps, there's really not a big need for these features, and the API authors didn't really need to include them. But, then again, perhaps there are good use cases, so lets keep exploring.
Language workarounds
It seems a couple of the packages were using .call() simply because the language itself didn't provide what they needed in a direct way.
For items that fall into this category, it's probably best to analyze each situation independently and figure out what really needs to be added to the language. Sure, a this-binding shorthand could provide some minor benefits to the readability of these workarounds, but what they really need is a way to do what they're trying to do without resorting to funky workarounds. We're already actively fixing a number of problems in this category, for example, there's the mixin proposal, and there's also the recently added Object.hasOwn() function which is basically a more direct way of doing Object.prototype.hasOwnProperty.call().
Method extraction
Both pretty-format and graceful-fs are basically doing a form a method extraction, except in both cases they're choosing to supply the "this" value at the call location instead of the extract location.
This proposal stated that it doesn't wish to focus on method extraction, but ironically, I think this use case might be the most important one among the ones that have been analyzed.
Others?
Looking at the above categories of use cases, it seems that many uses of .call()/.bind()/.apply() don't actually benefit much from a this-binding syntax, however, what's been presented is not an exhaustive list of possible use cases. So I want to pose the following question:
What specific uses of .call()/.bind()/.apply() does this proposal hope to help out with? Are there use cases beyond the ones listed above that this proposal seeks to make more user-friendly? If so, what are they? This might be a good thing to eventually document in the README as well. The README explains clearly that .call() gets used everywhere, but it doesn't dig deeper to find out why it's getting used all over the place - I think those underlying use cases are the real gold mine we're trying to hit. The uses of ".call()" seems to more-often-than-not be a symptom of a deeper problem that needs addressing.
The text was updated successfully, but these errors were encountered: