Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Priority queue / self-sorting list [can PR] #14

Open
Methuselah96 opened this issue Oct 16, 2020 · 6 comments
Open

Feature request: Priority queue / self-sorting list [can PR] #14

Methuselah96 opened this issue Oct 16, 2020 · 6 comments

Comments

@Methuselah96
Copy link
Owner

From mcclure on Thu, 08 Oct 2020 05:15:11 GMT

Hi, I have a need for a "priority queue" style data structure, or more specifically I need an immutable structure that I can add items to in arbitrary order and then repeatedly efficiently iterate over according to some fixed property (for example if I have a list of objects obj, I might want to keep it sorted by the key obj.date).

I need this enough I plan to attempt an implementation myself. I would be happy to write a PR if there is a specific way I could write it that you would be willing to accept.

I find a number of algorithms online for immutable priority queues, but it seems like it might be a good idea to base my implementation on the existing list structure rather than implementing a totally new structure. For this reason I am curious if there is any "contributor documentation" that would help me understand the underlying principles of immutable.js List. I am pawing through the source but not having much luck understanding where the important parts are. List.js seems to be describing something like a linked list but each node has this "array" member rather than a next reference and I'm not sure what "array" contains.

Copied from original issue: immutable-js#1791

@Methuselah96
Copy link
Owner Author

From mcclure on Sat, 10 Oct 2020 15:31:06 GMT

ownerID in List is only needed for the implementation of mutable mode, right?

Copied from original issue: #14

@Methuselah96
Copy link
Owner Author

From mcclure on Mon, 12 Oct 2020 22:52:49 GMT

Made a StackOverflow question about my difficulties running npm run test / npm run perf: https://stackoverflow.com/questions/64325833/how-to-run-perf-tests-automated-tests-in-an-immutable-js-repo

Copied from original issue: #14

@Methuselah96
Copy link
Owner Author

From mcclure on Fri, 09 Oct 2020 15:54:00 GMT

Got some help on Twitter, here's the best I can understand: List is an "Vector Trie" apparently patterned on the Scala vector type, meaning it's a tree where every leaf is an array of maximum 32 elements. Bottom leaves other than the rightmost are assumed to be filled and when you lookup an index the node you're looking for is found by pure math, which is why midlist insert is O(N).

My plan is:

  • Subclass List; take the "key function" that turns values into keys as an constructor argument
  • Change SortedList so every node at every level of the tree knows its max and min relative to the "key value"
  • Kill the random-access get() and set() functions (maybe this is a sign List should not be a superclass), only pop front, pop back and iterate are allowed
  • Since every level knows its min/max, and random access is disabled, index arithmetic is no longer needed to calculate the leaf node corresponding to a value. This means when I remove an item from a leaf node I don't have to repack, and if a mid-list insertion would push a leaf node above 32 elements I can simply split it into two 16-leaf nodes.

On top of this basic plan, I may attempt the following two nice-to-haves:

  • If I get many pushes at once via withMutations, rather than doing a search for each push I can sort the list of elements-to-push conventionally and then do something resembling a mergesort merge.
  • Since "indexes" are not meaningful with this approach, some variants of slice() or toSliceSeq() might be nice that let you say, for example, "give me all of this list newer than key X" or "give me the ten values less than or equal to key Y".

I haven't done any proofs this is actually efficient, but my intuition is that because removal is generally only done at the beginning and end and insertion cannot reduce leaf size below 16, that this will give between O(log_16(n)) and O(log_32(n)) performance on insertions, front pops and end pops, and this is provable.

This approach would resolve the problem in my application. If I get it working, would you be interested in merging it? What requirements do you have that a new feature like this has adequate perf (ie, do you require proofs, or just verification of good average-case performance in testing?)

Things I'm not clear on yet:

  • What is the "_origin" member in the List.js implementation?
  • What is the "__hash" member in the List.js implementation?
  • Is there a difference between butLast() and pop()?

I am also seeing a problem where if I run npm install in an immutable-js checkout, I get node-gyp errors related to the v8 headers and ending in "Failed at the microtime@2.1.8 install script". This is on OS X 10.13.6 with npm 6.14.8, node v14.10.1. I have not investigated this yet. I do not get this error when npm installing a project with immutable-js as a dependency so I assume this is a problem with one of the devDependencies. EDIT: fixed this with PR immutable-js#1792. However I am still unable to run npm run build or npm run test because I get "Error: Cannot find module 'react/addons'", and I am seeing this mysterious error when I run git run perf:

Pearl:immutable-js mcc$ npm run perf

> immutable@4.0.0-rc.12 perf /Users/mcc/work/h/tempt2/fork/immutable-js
> node ./resources/bench.js

ugh Error: Command failed: git show master:dist/immutable.js
fatal: Path 'dist/immutable.js' exists on disk, but not in 'master'.

    at ChildProcess.exithandler (child_process.js:308:12)
    at ChildProcess.emit (events.js:314:20)
    at maybeClose (internal/child_process.js:1047:16)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:287:5)

I can't even figure out what script is trying to invoke git show.

Copied from original issue: #14

@Methuselah96
Copy link
Owner Author

From mcclure on Tue, 13 Oct 2020 05:34:26 GMT

I've started on an implementation of this, feature branch is here https://github.com/mcclure/immutable-js/tree/pq

This adds an exported SortedList and isSortedList. SortedList has a complete implementation of one function, add(). If you call it, it crashes. (The implementation of "add" turned out to be more complicated than I thought! I don't actually know whether my approach is actually more efficient than just using a balanced tree or something.)

Will make an initial PR once I have add, pop and shift working.

Copied from original issue: #14

@Methuselah96
Copy link
Owner Author

From mcclure on Sat, 10 Oct 2020 16:30:31 GMT

Also, what are the "@@transducer/" functions for? I do not find any description of them in the docs.

Copied from original issue: #14

@Methuselah96
Copy link
Owner Author

From mcclure on Fri, 09 Oct 2020 15:56:29 GMT

(One other small thing: an alternative to using the "key function" would be make this a key/value store and sort on the key. This would result in the same implementation but a slightly different external interface. I think this approach is a little weird because you can have multiple values sharing a single key, but I don't have a strong opinion either way about it.)

Copied from original issue: #14

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant