Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use Buffer.from if available(usually in node.js environments) in _.isEqual #2881

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sktguha
Copy link

@sktguha sktguha commented Sep 25, 2020

use Buffer.from if available(usually in node.js environments) in _.isEqual for faster checking in case of objects which can be cast to buffer

…ter checking in case of objects which can be cast to buffer
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 95.991% when pulling de96ce1 on sktguha:isEqualOptimisation into d9741b3 on jashkenas:master.

@coveralls
Copy link

coveralls commented Sep 25, 2020

Coverage Status

Coverage decreased (-0.2%) to 96.102% when pulling 48d5730 on sktguha:isEqualOptimisation into d9741b3 on jashkenas:master.

@sktguha
Copy link
Author

sktguha commented Sep 25, 2020

Hi @jgonggrijp Julian, hope you are doing well. So I made a draft PR to use native Buffer.from in isEqual function for buffers and related objects if available(usually in node.js environments) as per discussion here.

so you can run online the test here for new and existing versions(have extracted the code as independent js files, you can check the code in the repo I have linked in next few lines) .

Or else I have setup a handy repository for running on local machine (don't worry about the sudo permission when doing npm install, its just for installing nodejs versions 10 and 12) , so there is a definite 1.5 ms or more difference. So new version is 1.5 ms or more faster than older version, In nodejs version 10 and 12 the new version may be even more faster, by 5ms or more.

Also when running newly the difference in times may be very high, like 20 ms or more. But that I have mostly observed on the online repl I have linked, not sure if I have observed on my local machine. If you are running the test on your local, do note the time taken for first time ( It may be possible that subsequent runs are optimised due to some caching, maybe even at the cpu level, even when we exit from node.js )

Also the npm install resulted in change in version of underscore itself in package-lock.json . I am not sure if that should be committed , so have skipped it for now

@sktguha sktguha changed the title use Buffer.from if available(usually in node.js environments) use Buffer.from if available(usually in node.js environments) in _.isEqual Sep 25, 2020
@jgonggrijp
Copy link
Collaborator

Thank you Saikat, I'm doing well and I hope you are doing well, too.

Your code looks sensible and I like the care you've put into benchmarking this. We need to push the CPU a bit harder, though.

Imagine building an application that makes a single comparison between two typed arrays. That application might save 20 ms. Nobody is going to care about that.

If somebody wants to save time (and energy) on comparing typed arrays, it's probably because she is comparing thousands or even millions of pairs of typed arrays. She's hoping to shave off at least a few seconds in total, or to improve her throughput by thousands of comparisons per second.

JavaScript engines behave differently when you repeat code very often. The more often a piece of code repeats, the harder they try to make it run as fast as possible. Those 20 ms you need for a single comparison, might be sufficient for 100 comparisons or more once JSCore, V8 or SpiderMonkey has fully optimized the code. The performance difference between the existing and the new implementation is likely to change under those circumstances, too.

So please extend your benchmarks to make a minimum number of repetitions (I suggest at least 10k) and also to continue repeating until at least a whole second has passed. This will make them better represent a situation where performance actually matters.

Please also make sure that you vary the buffers that are compared between repetitions. If program flow is the same every time, an optimizing engine might notice that at some point and start cheating.

An example of a microbenchmark that follows these recommendations can be found over here. Some context for that benchmark over here (you might need to click "show resolved" in order to see my comment about the benchmark).

When you notice a significant difference between the first run and subsequent runs on your local machine, it is likely that disk latency is playing a role. In that case, you should actually ignore the first run, because a significant portion of its CPU time is spent on waiting for the file system.

@sktguha
Copy link
Author

sktguha commented Sep 26, 2020

Thanks Julian @jgonggrijp for your detailed comments and the link to the benchmark. Let me check that out and come back to you maybe sometime tomorrow.

Also btw are there any usage statistics for nodejs versions , like 20% use nodejs 10, 34% use nodejs 12 etc , because I see that in nodejs 10, the new version is significantly faster than existing one, like 2 secs vs 100ms or less for existing and new, respectively in the new randomized test I am working on currently for 1 million items, as you have mentioned in your comment, for older nodejs versions the difference could be same or higher. Have to check. I was looking at https://nodejs.org/en/user-survey-report/ but can't find any data on different versions. maybe npm has it ?

Edit: I found https://nodesource.com/node-by-numbers, any more ?

@jgonggrijp
Copy link
Collaborator

I don't know about usage statistics between Node versions, but I don't think it is very important, either. I'd like to suggest to focus on recent versus old instead, because that is how usage tends to develop; if the performance difference between existing and new implementation declines towards newer Node versions, that might suggest that the optimization has little added value in the long run.

I'm getting ahead of the facts, though. Let's just look at the numbers first.

@sktguha
Copy link
Author

sktguha commented Sep 27, 2020

Hi @jgonggrijp Julian , I probably won't get time today to work on this. Let me come back to you maybe tomorrow.

…equals directly after checking if its a function and also do isFunction check before bind as bind may throw exception if not a function
@jgonggrijp
Copy link
Collaborator

@sktguha Best wishes for 2021. Any updates on this PR?

@sktguha
Copy link
Author

sktguha commented Jan 11, 2021

Oh hi @jgonggrijp . Happy new year to you too 🙂
Really sorry , got busy in other things.
I think I have linked the node.js scripts to compare the perf improvements.
I might not have time this week. Let me try next week.
If possible, can you put something like up for grabs etc ?
someone could possible take a look in meantime or even take over this if they want to.

@jgonggrijp
Copy link
Collaborator

Alright, so if I understand correctly, the link you put up is https://github.com/sktguha/underscore-isequal-tests and my comments on that benchmark still need to be addressed, which you'd happily defer to somebody else.

If at second thought, you'd rather drop this PR, that's fine by me; no hard feelings.

@sktguha
Copy link
Author

sktguha commented Jan 11, 2021

@jgonggrijp oh no no , I do intend to come back to this 🙂
Contributing to a lib like underscore is really cool

I mean otherwise someone can take over. feel free to! . Why would I want to drop it 🙂 ?
I hope you are not on some kinda deadline ?

@sktguha
Copy link
Author

sktguha commented Jan 11, 2021

you could tag people here interested to tak over I guess

@jgonggrijp
Copy link
Collaborator

@sktguha No I'm not on a deadline (not for this particular PR at least)! Please take your time. I was asking rather directly whether you still want to continue with this, just to be very clear. If your answer is "yes but probably later", that's totally fine by me.

@sktguha
Copy link
Author

sktguha commented Jan 12, 2021

@jgonggrijp ok understood, "yes but probably later" yup that's what I would say. To add it, happy to hand it over if anyone is interested 🙂

@sktguha
Copy link
Author

sktguha commented Jan 12, 2021

@jgonggrijp btw just curious, do you work in USA timezone ?

@sktguha
Copy link
Author

sktguha commented Jan 12, 2021

oh ok, you seem to be based in Netherlands, well better timezone overlap with India 🙂

@jgonggrijp
Copy link
Collaborator

Yes, Netherlands. I'm a night owl though, so that might decrease the overlap again. However, if you'd like to chat about something, you can find me on Gitter.

@sktguha
Copy link
Author

sktguha commented Jan 12, 2021

oh that's awesome. can you post your gitter username maybe ? I have made an account there with this github login only.

@jgonggrijp
Copy link
Collaborator

I found you and I left you a message on Gitter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants