
createReadStream() #35

Closed
wants to merge 17 commits into base: master

3 participants
@e-e-e
Contributor

e-e-e commented Dec 31, 2017

Hey @mafintosh @noffle,

I started playing around with building a readable stream interface for hyperdb. I started this because createDiffStream is scaling badly, and I wanted to see if I could get a super lean stream working.

This is really rough, and not at all ready for merging (still using arrow functions and riddled with console logs), but I figured I should open a PR to get your input. I have been using @noffle's architecture description as a basis, and reading your code, but I might still be doing something totally stupid.

Any thoughts would be appreciated. It's quick and respects back pressure, so hopefully it will scale well.

One bug that I can't seem to figure out so far is illustrated in the first test - although there are multiple 'bar/cat' keys, it's only returning the latest, while for all other keys (for example foo/) the duplicates are returned. UPDATE: I changed the implementation of the read stream so that it only returns the latest entries.

Contributor

e-e-e commented Dec 31, 2017

also - happy new year 🍾
hope you're taking a break and relaxing.


Contributor

e-e-e commented Jan 4, 2018

@mafintosh @noffle I think I have got this to a point where it is working well. Would you mind having a look to see if there is anything obviously dumb, or if I have tackled this from completely the wrong angle?

There may be edge cases around conflicting nodes which I have not thought through.

I also have to write some nicer tests - the ones there at the moment are just what I was using while testing things for myself.

Let me know your thoughts.

@joehand referenced this pull request Jan 4, 2018

Closed

Two way replication #11

@e-e-e e-e-e changed the title from createReadStream() [WIP] to createReadStream() Jan 7, 2018


Contributor

e-e-e commented Jan 7, 2018

@noffle and @mafintosh I think this is ready now - if you want to take a look. I just experimented with this implementation of createReadStream on an instance of hyper-graph-db with 200k+ entries, and got search times down from 30-40s to 1-3s. It's really quick.


Collaborator

noffle commented Jan 7, 2018

Awesome work @e-e-e! I've been following your progress somewhat, though I'm not sure when I'll have time to go over it in detail.


Contributor

e-e-e commented Jan 8, 2018

I noticed that the node queue sort was a performance bottleneck, so I implemented a conditional check so that it only sorts when the next node is less than the last node in the queue. Strangely, it now looks like the queue does not actually need to be sorted at all - the next algorithm seemingly always returns nodes in order. I am skeptical that this is always the case, so I have left the sort as part of the stream - but perhaps you will have a better idea, @mafintosh, and maybe we don't need it at all.
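The conditional-sort idea can be sketched roughly like this (names are illustrative, not the PR's actual code): only re-sort the queue when a newly pushed node is out of order relative to the current tail, so an already-ordered source never pays for sorting.

```javascript
// Sketch of the conditional-sort optimization. `cmp` is the node
// comparator; the queue is assumed sorted before each push.
function pushSorted (queue, node, cmp) {
  const last = queue[queue.length - 1]
  queue.push(node)
  // Only sort if the new node breaks the order at the tail. If the
  // source happens to always yield nodes in order, this never fires
  // and every push is O(1).
  if (last !== undefined && cmp(node, last) < 0) {
    queue.sort(cmp)
  }
  return queue
}

// Usage with a numeric comparator standing in for node comparison:
const cmp = (a, b) => a - b
const queue = []
;[3, 5, 4, 7].forEach(n => pushSorted(queue, n, cmp))
// queue is now sorted; only the push of 4 triggered a sort
```

Keeping the guarded sort in place is a reasonable hedge: it costs nothing while the source stays ordered, but still protects correctness if an out-of-order node ever shows up.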


Owner

mafintosh commented Mar 12, 2018

We merged next :)

@mafintosh mafintosh closed this Mar 12, 2018
