Just vs lodash #4

Closed
franciscolourenco opened this issue Jul 17, 2016 · 32 comments

Comments

@franciscolourenco

Lodash doesn't have dependencies and its methods can be imported individually, e.g.

extend = require('lodash/extend')
// vs 
extend = require('just-extend')

How do just modules compare to equivalent lodash methods?

@franciscolourenco changed the title from "just vs lodash" to "Just vs lodash" on Jul 17, 2016
@angus-c
Owner

angus-c commented Jul 17, 2016

Lodash (even lodash-modularized) has dependencies.

@jdalton

jdalton commented Jul 17, 2016

@aristidesfl
FWIW Angus created just before knowing Lodash was modular.

@angus-c
The lodash package has no dependencies. It's a collection of small modules, e.g. lodash/chunk.
Because of its popularity, it's already in everyone's local npm cache, which is nice.

In addition, there are babel and webpack plugins to make cherry-picking and bundle size optimizations a breeze.

The individual modularized method packages, e.g. lodash.chunk, may or may not have dependencies. However, their dependencies are shallow and many have been inlined to keep dep counts low.

@angus-c
Owner

angus-c commented Jul 17, 2016

Lodash is well used and well liked, and I have no problem with people using it.

The question was about the differences. Just had a philosophy of importing just one small function with zero dependencies.

Closing as this is straying off topic.

@angus-c closed this as completed on Jul 17, 2016
@thibaudcolas

@angus-c after looking at this lib's README, first thing I thought is "How does this compare to lodash?" (this is how I stumbled upon this issue). I'd suggest you address this directly in the README, as people picking a dependency for their project are likely to compare this to other libs.

@angus-c
Owner

angus-c commented Jul 21, 2016

@ThibWeb I'm planning a blog post on the rationale for Just, which I'll link to here.

@Rhernandez513

That would be amazing

@franciscolourenco
Author

It would be nice to have some concrete numbers in the blog post / comparison, like module size (including dependencies) and performance.

@jdalton

jdalton commented Jul 22, 2016

@aristidesfl

It kind of feels like retrofitting an argument, or picking a fight, since Angus wasn't aware that Lodash was modular or had individual method packages. There wasn't a particular failing or slight of Lodash that sparked the creation of Just. Since Angus isn't super familiar with Lodash it'd be harder to represent Lodash fairly in comparisons. You don't know what you don't know, you know?

@erquhart

erquhart commented Jul 22, 2016

Preface: Lodash user here, and I thought it strange for this lib to exist at first glance. Lodash is modular, duh.

Then I realized it was published by Angus Croll (Hemingway, anybody?), and had to figure out why he'd do this. Here's what I found:

  • Lodash does have dependencies - sort of. It depends on itself via shared, underlying modules, so each module isn't truly standalone. I think most of us know this, but never really stopped to consider whether it was ideal for our individual use cases.
  • Lodash is pretty verbose. Again, never thought much of it, and it's not necessarily a weakness, but it's relevant for comparison.

There are other points of general difference, but let's go with these.

Here's a great way to sum up the comparison:

lodash.union - 318 LOC: https://github.com/lodash/lodash/blob/4.1.1-npm-packages/lodash.union/index.js
just-union - 18 LOC: https://github.com/angus-c/just/blob/master/packages/array-union/index.js

Yes, the LOC count includes empty lines and source docs for both, which may not be fair since Lodash is such a mature, longstanding lib. This bears consideration.

That said, the just-union source is so succinct that I'm going to paste it right here:

module.exports = union;

/*
  union([1, 2, 5, 6], [2, 3, 4, 6]); // [1, 2, 3, 4, 5, 6]
*/

function union(arr1, arr2) {
  var result = arr1;
  var len = arr2.length;
  for (var i = 0; i < len; i++) {
    var elem = arr2[i];
    if (arr1.indexOf(elem) == -1) {
      result.push(elem);
    }
  }
  return result;
}

I love the idea of using small chunks of code for this utility stuff, chunks that are probably similar to what I'd write myself if we didn't have these libs. For debugging alone, there is a clear advantage to this approach.

Parting note: much love for Lodash, I just felt an uninvolved third party actually digging into the case for this lib could be helpful for others. There's plenty of room in the industry for both.

@jdalton

jdalton commented Jul 22, 2016

@erquhart

Lodash's _.union is either 2kB or 1kB depending on bundle/package and is ~840x faster. Long linear lookups, as with Array#indexOf, aren't great for things like union and friends.
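
To illustrate the point about linear lookups, here is a minimal sketch of a union that tracks seen values in a Set instead of scanning with Array#indexOf. It assumes an ES2015 environment with a native Set; `unionWithSet` is a hypothetical name and this is not Lodash's implementation.

```javascript
// Sketch only: assumes a native ES2015 Set is available.
// `unionWithSet` is a hypothetical name, not part of Lodash or Just.
function unionWithSet(arr1, arr2) {
  var seen = new Set();
  var result = [];
  var inputs = [arr1, arr2];
  for (var k = 0; k < inputs.length; k++) {
    var arr = inputs[k];
    for (var i = 0; i < arr.length; i++) {
      var elem = arr[i];
      // Set#has is sublinear on average, so `result` is never scanned.
      if (!seen.has(elem)) {
        seen.add(elem);
        result.push(elem);
      }
    }
  }
  return result;
}

console.log(unionWithSet([1, 2, 5, 6], [2, 3, 4, 6]));
// [ 1, 2, 5, 6, 3, 4 ]
```

Each element costs one membership check, so the whole union is roughly linear in the combined input length, whereas an indexOf scan per element grows quadratically. Unlike the indexOf version quoted above, this sketch also removes duplicates within the first array and returns a new array.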

Lodash is in the local npm cache of most users so it's not downloaded from the npm registry.
And since Lodash depends on itself there's less transitive dep worries to boot.

@erquhart

erquhart commented Jul 22, 2016

@jdalton yep - not pointing out weaknesses, just points of difference. That said, if Lodash's union is 840x faster than the just-union package, that'd be very compelling information for folks to be aware of. Might be worth posting a jsperf whenever they're back online.

@jdalton

jdalton commented Jul 22, 2016

@erquhart

Dependency count (esp. low to no) isn't a thing devs should get super hung up on.
Stack Overflow style copypasta implementations aren't necessarily a great thing either.

There's plenty of other compelling info like Lodash has ~100% code coverage, environment testing, and is completely customizable. If there's gaps, bugs, or failings they're usually addressed pretty quickly too.

@angus-c
Owner

angus-c commented Jul 22, 2016

I'm not very interested in a lodash versus just comparison. As I keep saying, I like lodash, the two libraries can and should coexist.

But if the question is why did I build a 10 line utility instead of a 1000 line utility, it's because I can.

node_modules
├── just-union
│   ├── README.md
│   ├── index.js
│   └── package.json
└── lodash.union
    ├── LICENSE
    ├── README.md
    ├── index.js
    ├── node_modules
    │   ├── lodash._baseflatten
    │   │   ├── LICENSE
    │   │   ├── README.md
    │   │   ├── index.js
    │   │   └── package.json
    │   ├── lodash._baseuniq
    │   │   ├── LICENSE
    │   │   ├── README.md
    │   │   ├── index.js
    │   │   ├── node_modules
    │   │   │   ├── lodash._createset
    │   │   │   │   ├── LICENSE
    │   │   │   │   ├── README.md
    │   │   │   │   ├── index.js
    │   │   │   │   └── package.json
    │   │   │   └── lodash._root
    │   │   │       ├── LICENSE
    │   │   │       ├── README.md
    │   │   │       ├── index.js
    │   │   │       └── package.json
    │   │   └── package.json
    │   └── lodash.rest
    │       ├── LICENSE
    │       ├── README.md
    │       ├── index.js
    │       └── package.json
    └── package.json

Also what does 840x faster mean? For arrays with Number.MAX_SAFE_INTEGER elements? For all practical purposes I'm more than happy with the results

var u1 = require('just-union')
var u2 = require('lodash.union')
var arr1 = [], arr2 = [];
var i = 1000; var j = 2000;
while(i--) arr1.push(i);
// Note: this pushes i, not j. i is -1 once the loop above ends,
// so arr2 holds 999 copies of -1.
while(j = j-2) arr2.push(i);
console.time('t'); u1(arr1, arr2); console.timeEnd('t')
// t: 5ms
console.time('t'); u2(arr1, arr2); console.timeEnd('t')
// t: 5ms

Now can we just co-exist? Please?

@erquhart

I would say the comparison is both necessary and unavoidable, not to determine superiority, but to determine which is the right tool for my project. There's no reason for such comparison to be anything other than amicable.

@jdalton there's no argument, lodash is an excellent lib. You should be proud.

@abozhilov

abozhilov commented Jul 22, 2016

@jdalton how did you measure that 840x? I'm pretty sure that you are using O(N ^ 2) worst case time complexity in your union in non-native Set environment, am I right?
All of those, union and unique, could be implemented with O(N) best case and O(N log N) worst case.

Test Lodash's unique against:

var unique = (function () {
    function defCompare(a, b) {
        var aType = typeof a,
            bType = typeof b;

        if (aType !== bType) {
            if (aType < bType) {
                return -1;
            }
            return 1;
        }

        if (a < b) {
            return -1;
        }
        else if (a > b) {
            return 1;
        }
        return 0;
    }

    function isSorted(arr, compareFunc) {
        for (var i = 1, len = arr.length; i < len; i++) {
            if (compareFunc(arr[i - 1], arr[i]) > 0) {
                return false;
            }
        }
        return true;    
    }

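    return function (arr, compareFunc) {
        var res = [];

        if (typeof compareFunc == 'undefined') {
            compareFunc = defCompare;
        }

        // Sort a copy only when needed, so already-sorted input stays O(N).
        if (!isSorted(arr, compareFunc)) {
            arr = arr.slice().sort(compareFunc);
        }

        // Always dedupe: "sorted" input may still contain adjacent duplicates.
        if (arr.length > 0) {
            res.push(arr[0]);
        }
        for (var i = 1, len = arr.length; i < len; i++) {
            // A non-zero comparison means the element differs from its predecessor.
            if (compareFunc(arr[i - 1], arr[i])) {
                res.push(arr[i]);
            }
        }
        return res;
    }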
})();

With the following input data:

var arr = [];
for (var i = 0; i < 1e6; i++) {
    arr.push(i);
}

@jdalton

jdalton commented Jul 23, 2016

@angus-c

I'm not very interested in a lodash versus just comparison.

Me neither. See above.

But if the question is why did I build a 10 line utility instead of a 1000 line utility, it's because I can.

The number of files or lines in the lodash.union package has no bearing on the quality of the package. The end result is still one or two kB. With Lodash the dependencies of those individual packages can and are inlined to strike a balance. It's trivial for me to inline less or more as needed.

That said, I prefer the primary lodash package over the individual method packages because it enables more code sharing and opportunities for smaller builds.

Also what does 840x faster mean? For arrays with Number.MAX_SAFE_INTEGER elements? For all practical purposes I'm more than happy with the results

var union = require('just-union');
var _ = require('lodash');

Naw. You can see similar with a smaller amount.
Try a union of 2 arrays with 5,000 elements in them in Node 6.

var a1 = _.shuffle(_.range(5000));
var a2 = _.shuffle(_.range(3500, 8500));

console.time('a');
union(a1, a2);
console.timeEnd('a');
// ~6266.933ms (6 seconds)
var a1 = _.shuffle(_.range(5000));
var a2 = _.shuffle(_.range(3500, 8500));

console.time('b');
_.union(a1, a2);
console.timeEnd('b');
// ~4.234ms

Careful though, just-union mutates the input array, so you have to use a fresh one when testing them back to back.
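
The mutation can be seen with a short sketch. The `union` function below is the source quoted earlier in this thread; `unionCopy` is a hypothetical variant, added here only for illustration, that copies the first array before appending.

```javascript
// The union function as quoted earlier in this thread.
// `result` is the same array object as `arr1`, so pushing mutates the input.
function union(arr1, arr2) {
  var result = arr1;
  var len = arr2.length;
  for (var i = 0; i < len; i++) {
    var elem = arr2[i];
    if (arr1.indexOf(elem) == -1) {
      result.push(elem);
    }
  }
  return result;
}

// Hypothetical non-mutating variant: work on a copy of arr1.
function unionCopy(arr1, arr2) {
  var result = arr1.slice();
  var len = arr2.length;
  for (var i = 0; i < len; i++) {
    var elem = arr2[i];
    if (result.indexOf(elem) == -1) {
      result.push(elem);
    }
  }
  return result;
}

var a = [1, 2];
var b = [2, 3];
union(a, b);
console.log(a); // [ 1, 2, 3 ], the input was modified

var c = [1, 2];
unionCopy(c, b);
console.log(c); // [ 1, 2 ], the input is untouched
```

When timing two implementations back to back, passing a fresh array (or a `slice()` copy) to each call avoids measuring the second one against an already-merged input.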

Making packages with naive implementations is nothing new. There's plenty of them out there. The devil is in the details though. Even something as simple as array-last or lodash/last can have surprises (see #9).

Update

I've revved up inlining, so lodash.union and friends are now zero-dependency modules:

node_modules
└── lodash.union
    ├── LICENSE
    ├── README.md
    ├── index.js
    └── package.json

@abozhilov

This is not fair! You are testing best case of Lodash against worst case of @angus-c implementation.
Please share results with the following input data:

var arr = [];
for (var i = 0; i < 100; i++) {
    arr[i] = {};
}

var arr1 = [];
var arr2 = [];

for (i = 0; i < 3000; i++) {
    arr1.push(arr[Math.floor(Math.random() * 100)]);
    arr2.push(arr[Math.floor(Math.random() * 100)]);
}

@jdalton

jdalton commented Jul 23, 2016

@abozhilov

Linear search bogs down the longer the array is. That's just how it is.
Small arrays aren't interesting as the cost is negligible (as is the case with tiny operations).

@angus-c
Owner

angus-c commented Jul 23, 2016

@jdalton good call with the mutation, I missed that

@abozhilov

@jdalton obviously you are using native Set. That's why I said this is not fair, since @angus-c's primary idea is an ES < 6 lib. In a non-Set environment both approaches will lead to O(N ^ 2) time with my input data.

@jdalton

jdalton commented Jul 23, 2016

@abozhilov

That's naive vs. a bit more robust for ya.
Lodash does a little bit more than an Array#indexOf crawl.

I get it. I do. You'll likely not hit new territory in this thread though.

@abozhilov

@jdalton should I? What I'm saying is: "In non ES6 environment union worst case is O(N ^ 2)". You and everyone else on that planet, cannot prove me wrong!

@jdalton

jdalton commented Jul 23, 2016

You and everyone else on that planet, cannot prove me wrong!

In older environments I use a Set-like fallback which at its core stores values in an object so that it avoids linear lookup for all but object values. Even in new environments I do this, in many cases, because it's faster than Set.
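
A rough sketch of what such a Set-like fallback could look like follows. It is an assumption for illustration only, not Lodash's actual internals: `createCache` and `unionWithCache` are hypothetical names, primitives are stored as keys of a plain object, and object values fall back to a linear search. Symbols are not handled.

```javascript
// Sketch only: hypothetical helpers, not Lodash's real implementation.
function createCache() {
  var primitives = Object.create(null); // keyed by type + value
  var objects = [];                     // objects need a linear search
  return {
    // Returns true if the value was newly added, false if already seen.
    add: function (value) {
      var type = typeof value;
      if (value !== null && (type === 'object' || type === 'function')) {
        if (objects.indexOf(value) !== -1) return false;
        objects.push(value);
        return true;
      }
      // Prefix with the type so that 1 and '1' get different keys.
      var key = type + ':' + String(value);
      if (key in primitives) return false;
      primitives[key] = true;
      return true;
    }
  };
}

function unionWithCache(arr1, arr2) {
  var cache = createCache();
  var result = [];
  var inputs = [arr1, arr2];
  for (var k = 0; k < inputs.length; k++) {
    for (var i = 0; i < inputs[k].length; i++) {
      if (cache.add(inputs[k][i])) result.push(inputs[k][i]);
    }
  }
  return result;
}

var obj = {};
console.log(unionWithCache([1, '1', obj], [1, obj, 2]).length); // 4
```

Primitive lookups are property accesses on the cache object, so only object values pay for the `indexOf` scan.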

@abozhilov

It does not prove anything. If I pass two arrays of objects, the time complexity will be O(N ^ 2).

@jdalton

jdalton commented Jul 23, 2016

Sure, but that's not the primary use case and not something Node folks have to worry about in 0.12 and beyond or browser folks with IE11 and beyond. So Lodash gets great perf most of the time vs. other packages. Seems like a win to me.

I'll put it another way. I pay special attention to methods like union and friends because folks tell me time and time again that the real world perf of those methods significantly improved their applications and was a motivating factor in switching packages.

Check out my ThunderPlains talk over performance. Optimizing for the common case is a theme.

FWIW, though I don't dig the technique, the core-js Set shim avoids linear search on objects too by setting a non-enumerable property on the visited object to make detecting if it's seen a breeze.

@abozhilov

Don't get me wrong. I've never said that Lodash is a bad library. My initial comment was about worst-case running time. JS is shitty when it comes to set theory APIs. We had to wait several years for native Set and Map, which support object keys, and as you said, their performance is controversial. Peace?

@jdalton

jdalton commented Jul 23, 2016

You derailed the thread pretty good there.
Next time hit me up in email. This time sink seems unavoidable :/

@abozhilov

Next time it would be better if you took an algorithms course, since you don't know what time complexity means! I'm tired of all those JavaScripters who lack a computer science degree. Anyway, write what you want, but don't tell me your unique or union is 840x faster, since you have an O(N ^ 2) worst case! If you are still uncertain what big O means, any algorithms course will help you!

@jdalton

jdalton commented Jul 23, 2016

Thanks!

I get it. ICYMI I did mention a solution that avoids linear lookup for objects.
I also covered why the edge case isn't a concern.
Having great perf most of the time is better than poor perf all of the time.

@abozhilov

FWIW, though I don't dig the technique, the core-js Set shim avoids linear search on objects too by setting a non-enumerable property on the visited object to make detecting if it's seen a breeze.

Is this an O(N) solution? What will happen with frozen objects? Or is this again an edge case?

Your algorithm can be described with O(N) best case and O(N ^ 2) worst case. Nothing more, nothing less. If you can't get over it, this is not my problem.

@jdalton

jdalton commented Jul 23, 2016

Is this an O(N) solution? What will happen with frozen objects? Or is this again an edge case?

Yep. If it's a frozen object (so an edge of an edge case) core-js falls back to linear search. Relying on Array#indexOf alone means O(N^2) for all cases while other more robust implementations minimize the paths which require linear search, enabling O(N) for most environments and use cases.
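
The tagging technique and its frozen-object fallback can be sketched as follows. This is an illustration under stated assumptions, not the core-js source: `uniqueObjects` and the `__seen__` property name are hypothetical, and the sketch assumes the input objects do not already have a property with that name.

```javascript
// Sketch only: hypothetical helper, not the core-js implementation.
var TAG = '__seen__'; // assumed property name, for illustration

function uniqueObjects(objects) {
  var result = [];
  var tagged = [];
  var untaggable = []; // frozen or non-extensible objects
  for (var i = 0; i < objects.length; i++) {
    var obj = objects[i];
    if (Object.isExtensible(obj)) {
      // Constant-time check: has this object been tagged already?
      if (!Object.prototype.hasOwnProperty.call(obj, TAG)) {
        Object.defineProperty(obj, TAG, {
          value: true,
          enumerable: false, // hidden from for-in and Object.keys
          configurable: true // so the tag can be removed afterwards
        });
        tagged.push(obj);
        result.push(obj);
      }
    } else if (untaggable.indexOf(obj) === -1) {
      // A tag cannot be added, so fall back to linear search.
      untaggable.push(obj);
      result.push(obj);
    }
  }
  // Remove the tags so the inputs are left as they were found.
  for (var j = 0; j < tagged.length; j++) {
    delete tagged[j][TAG];
  }
  return result;
}

var plain = {};
var frozen = Object.freeze({});
console.log(uniqueObjects([plain, frozen, plain, frozen]).length); // 2
```

Only the non-extensible objects take the `indexOf` path, which is why the quadratic behavior is limited to that case.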

@madnight

The differences from lodash should be listed in the README.md, because that will be the first question a typical JS developer comes up with.
