encoding compatible hypercores #24

Open
3 tasks
Tracked by #71
serapath opened this issue Jan 14, 2020 · 6 comments

@serapath
Member

serapath commented Jan 14, 2020

@todo

  • write down and/or sketch the concept about how to deal with encoded and related hypercores (e.g. in case a dat address is a hyperdrive which comes with a related content hypercore)
  • if structures like hyperdrives are used, make sure the "datdot app" (when a user pastes a hyperdrive address) can figure out all related hypercores and have them seeded by the network
  • make a module to be used with or to patch hypercores, so that the underlying data is stored with a custom encoding and can be accessed, but a regular "dat peer" can still ask for chunks and receive the decoded chunks as if there was no encoding present in the first place
    • Check to see if replication logic uses storage directly or if things can be monkey patched
    • maybe monkey patch hypercores?
    • maybe use a custom random-access-??? module or a random-access patcher module?
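
As a rough illustration of the last task, here is a minimal sketch of a store that keeps chunks encoded at rest but returns the original bytes on read. It assumes zlib deflate as a stand-in for the custom encoding, and `EncodedChunkStore` is a hypothetical name, not part of hypercore or any random-access module.

```javascript
const zlib = require('zlib')

// Hypothetical helper (not a hypercore API): chunks are stored encoded,
// but readers always get the decoded bytes back.
class EncodedChunkStore {
  constructor (encode, decode) {
    this.encode = encode
    this.decode = decode
    this.chunks = new Map() // chunk index -> encoded Buffer
  }

  // Encode on the way in.
  put (index, data) {
    this.chunks.set(index, this.encode(data))
  }

  // Decode on the way out, so a peer never sees the encoding.
  get (index) {
    const encoded = this.chunks.get(index)
    if (encoded === undefined) return null
    return this.decode(encoded)
  }

  // Only here so the at-rest form can be inspected.
  getEncoded (index) {
    return this.chunks.get(index) || null
  }
}

// Assumption: deflate/inflate stands in for whatever encoding datdot uses.
const store = new EncodedChunkStore(
  (buf) => zlib.deflateSync(buf),
  (buf) => zlib.inflateSync(buf)
)

const original = Buffer.from('hello hypercore '.repeat(64))
store.put(0, original)
```

The open question from the list above remains where such a layer would attach: inside the hypercore, or underneath it at the random-access level.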

@RangerMauve

Working on this now

@RangerMauve

Here's some notes from looking at how the replication in hypercore works from the protocol down to the storage.

Data we get from a peer is handled here. This is the point where we can intercept the data and encode it / add it to the encoded hypercore.

This in turn calls _putBuffer on the unencoded hypercore. This bit of code is super complicated to understand. 😅😅😅 Most of it seems to be checking whether there's data that's missing, which I guess means data that we don't already have locally. Then it invokes _verifyAndWrite, which checks that the data is valid, and then invokes _write to actually write the data. It seems we can hook into this with the _onwrite hook, though it doesn't seem to do anything by default. Maybe this is where encoding could be messed with?

Eventually the data for the hypercore is written using the storage instance in the putData method. That will then calculate the offset within the file that it should write the data at and finally invoke write on the random-access-* instance for the data.

A custom random-access-storage thing would work, but we'd need to create the reverse of the dataOffset method where we get the data index given an offset. This might be kinda hard and I'm not sure how the method would become available within the random-access-* instance.
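
To make the forward and reverse lookups concrete, here is a minimal sketch that assumes the block lengths are simply known up front as an array. hypercore itself derives offsets from its tree data, so this is not how dataOffset is implemented, and `buildStarts`, `offsetOfIndex` and `indexOfOffset` are hypothetical names.

```javascript
// starts[i] is the byte offset where block i begins;
// the last entry is the total byte length.
function buildStarts (lengths) {
  const starts = [0]
  for (const len of lengths) starts.push(starts[starts.length - 1] + len)
  return starts
}

// Forward direction (what a dataOffset-style method gives us):
// block index -> byte offset.
function offsetOfIndex (starts, index) {
  if (index < 0 || index >= starts.length - 1) {
    throw new RangeError('index out of bounds')
  }
  return starts[index]
}

// Reverse direction (what a random-access-* instance would need):
// byte offset -> block index plus the offset inside that block.
function indexOfOffset (starts, offset) {
  const total = starts[starts.length - 1]
  if (offset < 0 || offset >= total) {
    throw new RangeError('offset out of bounds')
  }
  // Binary search for the last block whose start is <= offset.
  let lo = 0
  let hi = starts.length - 2
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2)
    if (starts[mid] <= offset) lo = mid
    else hi = mid - 1
  }
  return { index: lo, relative: offset - starts[lo] }
}

const starts = buildStarts([5, 3, 10])
```

With something like this, a write(offset, data) call arriving at the storage layer could be mapped back to a block index, but the storage instance would still need access to the block lengths somehow, which is the part that is unclear.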

Another option would be to subclass Storage and provide custom putData and getData methods which would proxy to the compressed hypercore. We would need a PR to hypercore to be able to pass in a storage instance instead of creating a new one each time, or we could do a gross hack and monkey-patch the methods in the storage instance of the hypercore after it's been initialized.
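
The monkey-patching variant could look roughly like the sketch below. It runs against a stand-in object rather than the real hypercore Storage class; the `putData(index, data, nodes, cb)` and `getData(index, cb)` shapes are assumptions about that class, and `createFakeStorage` and `patchStorage` are hypothetical names.

```javascript
// Stand-in for the storage instance. The real one has more methods
// and writes to random-access-* files instead of a Map.
function createFakeStorage () {
  const blocks = new Map()
  return {
    blocks,
    putData (index, data, nodes, cb) {
      blocks.set(index, data)
      cb(null)
    },
    getData (index, cb) {
      if (!blocks.has(index)) return cb(new Error('not found'))
      cb(null, blocks.get(index))
    }
  }
}

// Hypothetical patcher: swaps putData/getData on an already
// initialized storage object and routes data through encode/decode.
function patchStorage (storage, { encode, decode }) {
  const putData = storage.putData.bind(storage)
  const getData = storage.getData.bind(storage)

  storage.putData = function (index, data, nodes, cb) {
    putData(index, encode(data), nodes, cb)
  }

  storage.getData = function (index, cb) {
    getData(index, (err, data) => {
      if (err) return cb(err)
      cb(null, decode(data))
    })
  }

  return storage
}

// Hex is used here only because it is easy to eyeball.
const storage = patchStorage(createFakeStorage(), {
  encode: (buf) => Buffer.from(buf.toString('hex')),
  decode: (buf) => Buffer.from(buf.toString(), 'hex')
})
```

In the real setup the patched methods would forward to the compressed hypercore instead of re-encoding in place, since encoded blocks have different lengths and would not line up with the byte offsets the unpatched storage computes.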

@RangerMauve

Here's a HackMD with some diagrams talking about how the communication between the compressed / uncompressed hypercores could work. https://hackmd.io/uNQsTqDORmOaUD9-48X13w

@RangerMauve

Pausing for today

@RangerMauve

Wrote up ideas here: https://hackmd.io/6Wyij7_uTbGxfOSSlJOgZQ

@RangerMauve

Boom: https://github.com/RangerMauve/intercept-hypercore-storage

This lets us intercept storage events and stuff.

Next I'll work on encoders storing encoding data on hosts in a hypertrie, then I'll work on having the host set up a hypercore which will be intercepted to serve data stored in the hypertrie.
