Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RedBlackTreeStrategy throws if value is already in set? #4

Closed
mxdubois opened this issue Oct 30, 2015 · 7 comments
Closed

RedBlackTreeStrategy throws if value is already in set? #4

mxdubois opened this issue Oct 30, 2015 · 7 comments

Comments

@mxdubois
Copy link

Here's the offending line of code: https://github.com/adamhooper/js-sorted-set/blob/master/src/SortedSet/RedBlackTreeStrategy.coffee#L68

I would expect the set to no-op when the value is already in the set. Is there a reason it throws? Or perhaps that just came over from the original source?

@adamhooper
Copy link
Owner

This is by design.

There's a problem with making that error optional: there are lots of ways to handle this case and they all make sense. When you try to add a element a2 to a set that already contains a1 which compares as its equal, which of the following should happen:

  1. a1 is replaced by a2.
  2. a2 is silently ignored.
  3. a2 is silently ignored, iff a1 === a2; otherwise, we throw an exception.
  4. a2 is added alongside a1 iff a1 !== a2.
  5. insert() returns false and a2 is ignored.
  6. we throw an exception.

I hate the first four options because they encourage buggy code: the differences are so subtle that developers will assume they know what's happening when they actually don't. C++ STL chooses 2; Java, C# and Scala choose 5; JavaScript objects (if you're treating them as hashes/sets of Strings) choose 2; Ruby has undefined behavior, but the implementation seems to be 2; ditto Python.

Option 2 is compelling, but it hinders some use cases. I originally used js-sorted-set to maintain a huge list of unique points; the exception helped me pinpoint where I was making mistakes. (In other words: after 10 insert()s, I expect forEach() to call its callback 10 times.) I could put a contains() call before each insert(), but that would give a performance hit; in contrast, the exception comes for free.

Option 5 is compelling, but that makes for an unintuitive API. insert() shouldn't return anything, just by its name.

So I chose Option 6. Crashing during development is far better than embedding subtle edge cases in production.

Perhaps we should add new convenience methods: testAndInsert() (5 above, inverted) and/or insertUnlessContains() (2 above).

I welcome a pull request.

@jacoscaz
Copy link
Contributor

@mxdubois would you welcome a PR that would address this issue by introducing the concept of "Insert Strategies"? The idea would be to export an enum of possible strategies and have the code react to whichever strategy has been set at instantiation time. Something like what follows:

SortedSet.ThrowInsertStrategy = 0;
SortedSet.IgnoreInsertStrategy = 1;
SortedSet.ReplaceInsertStrategy = 2;

Then the while() loop at https://github.com/adamhooper/js-sorted-set/blob/master/sorted-set.js#L572 could become something like this:

while (true) {
  cmp = compare(value, parent.value);
  if (cmp === 0) {
    switch(this._insertStrategy) {
      case ThrowInsertStrategy:
        throw 'Value already in set';
      case IgnoreInsertStrategy:
        return parent;
      case ReplaceInsertStrategy:
        parent.value = value;
        return parent;
      default:
        throw 'Unsupported insert strategy';
    }
  }
  leftOrRight = cmp < 0 ? 'left' : 'right';
  if (parent[leftOrRight] === null) {
    break;
  }
  parent = parent[leftOrRight];
}

@jacoscaz
Copy link
Contributor

Apologies Michael, I had meant to tag @adamhooper.

@adamhooper
Copy link
Owner

@jacoscaz sure!

SortedSet.insert() must return something saying whether there was a conflict -- so a strategy can mimic Java, C# and Scala. I imagine true ("it's a new value") /false ("it isn't") ... plus README documentation explaining this.

The default should be to throw (so the programmer can learn what the options are).

@jacoscaz
Copy link
Contributor

jacoscaz commented Apr 2, 2020

Fixed by #7 !

@adamhooper
Copy link
Owner

Yep -- once we release we can close this :).

@adamhooper
Copy link
Owner

For future readers, the new feature lets you write:

const set = new SortedSet({ onInsertConflict: SortedSet.OnInsertConflictReplace })
set.insert(1)
set.insert(1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants