How to get the data? #1

ciro-maciel · 2017-10-23T18:48:27Z

@tristanls, I liked your project, congratulations!

I'm looking at your documentation and trying to understand the architecture. I'm not finding anything on how to get the data.

How to get the data?

Thank you

Ciro

tristanls · 2017-10-23T20:49:02Z

Hey Ciro,

I never got around to implementing consolidation of the data intervals or reading from the written data intervals. The data is stored in LevelDB-backed storage, which is what alldata-storage-leveldb uses. In particular, here is where the storage is created: https://github.com/tristanls/alldata-storage-leveldb/blob/master/index.js#L344 and here is a write to it: https://github.com/tristanls/alldata-storage-leveldb/blob/master/index.js#L361. Reading the data would involve opening the underlying store using levelup.open and then reading from it, probably via levelup.createReadStream.
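As a rough sketch of what such a read could look like (alldata does not implement this): the helper below collects every entry that a levelup-style read stream emits. The name `readAll` is hypothetical, and `db` is assumed to be an already opened store that exposes `createReadStream(options)` and emits `{ key, value }` objects on `data`, as levelup does by default. How alldata lays out its keys and where each interval's store lives are not covered here.

```javascript
// Sketch only: gather all entries from a levelup-style store, in key order.
// `db` is assumed to be opened by the caller; `readAll` is a hypothetical helper.
function readAll(db, options = {}) {
  return new Promise((resolve, reject) => {
    const entries = [];
    db.createReadStream(options)
      // Each 'data' event carries one { key, value } pair.
      .on('data', ({ key, value }) => entries.push({ key, value }))
      .on('error', reject)
      // 'end' fires once the stream has emitted every matching entry.
      .on('end', () => resolve(entries));
  });
}
```

Range options such as `gte` and `lte` can be passed through `options` to limit the scan to part of the key space. For a large interval it is probably better to process each `data` event as it arrives rather than buffering everything in memory.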

In case you're curious, there is a similar system, implemented end-to-end, that I saw a presentation on. It has design elements similar to alldata's and is called OK Log. Check it out: OK Log video, repo link here.

I hope this helps.

Cheers,

Tristan

Edit: Added reference to levelup.createReadStream to demo streaming all the data from an interval, once opened.

tristanls · 2017-10-23T20:53:36Z

For a visual explanation of the architecture alldata implements, you can see a slide show explaining high-level concepts, starting here.

Edit: Updated link (accidentally linked to start of presentation instead of specific section).

ciro-maciel · 2017-10-24T11:35:27Z

Hello @tristanls,

I'm going to study the material you've sent and I'll return.

Thanks for the answer!

Ciro

ciro-maciel · 2017-10-26T14:18:53Z

Hello @tristanls,

Thank you for your point of view and the ample and detailed material, fantastic!

I believe I have understood how alldata works; its documentation is great.

I have some questions / ideas. What would be your opinion on them?

For searching the data, do you think the best way is to search the local LevelDB instance directly?
For full-text searches, would you create the indexes on these local instances and run the search there?

Ciro

tristanls · 2017-10-26T14:44:18Z

Hello,

I think if alldata were fully implemented using the ideas described in the slides I linked before, then there might end up being no room for the indices on the machines that store the data. I would expect the machines to eventually become full and migrate themselves into a read-only cluster. That said, this only implies that persistent storage is "full". Memory might still be available, and if your indices fit into memory, then that might work.

Another consideration is that the data in alldata is stored in-order, so sequential access via something like levelup.createReadStream would probably be most "performant" (for some arbitrary definition of "performant"). Doing random access/searches against in-order data might be less "performant". Then again, if you created indexes in memory, it might work just fine.
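To make the in-memory index idea concrete, here is a minimal sketch, not anything alldata provides. It assumes the entries have already been read sequentially (for example from a read stream) into `{ key, value }` objects with text values. The names `buildIndex` and `search`, and the whitespace/punctuation tokenization, are assumptions for illustration only.

```javascript
// Sketch only: build an in-memory inverted index from entries read in key order.
// The { key, value } shape and the tokenization are assumptions for illustration.
function buildIndex(entries) {
  const index = new Map(); // term -> Set of keys whose value contains the term
  for (const { key, value } of entries) {
    const terms = String(value).toLowerCase().split(/\W+/).filter(Boolean);
    for (const term of terms) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(key);
    }
  }
  return index;
}

// Look up the keys for one term; returns an empty array when nothing matches.
function search(index, term) {
  return [...(index.get(term.toLowerCase()) || [])];
}
```

The index is built with a single sequential pass over the data, which matches the in-order storage. Lookups then hit memory only, and the matching keys can be fetched from the store afterwards.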

Cheers,

Tristan

tristanls · 2017-10-26T14:46:39Z

Here's some info, more reliable than my opinion, about the performance of LevelDB :)
https://github.com/google/leveldb#performance

ciro-maciel · 2017-10-29T12:26:29Z

Understood @tristanls,

The numbers on LevelDB performance are interesting!

Your presentation is important, very instructive.

I found this mechanism (DHTs, bucket storage) very interesting. Once I set aside more time, I will study these structures further.

Thank you.

Ciro

tristanls · 2017-10-29T12:56:36Z

Regarding bucket storage, here's an implementation that's used in a bunch of production DHTs: k-bucket. As of v3.0.3, it is optimized to use less heap, but if you look at the code prior to v3.0.3, it is a very literal implementation of the presentation slides.

tristanls · 2017-10-29T13:03:29Z

By the way,

I'm happy to hear you find these helpful, thanks for letting me know.

Cheers,

Tristan
