Exposing archived versions as content-addressable hashes (i.e., content-addressable snapshots) #985

Open
karissa opened this Issue Apr 27, 2018 · 2 comments

@karissa
Collaborator

karissa commented Apr 27, 2018

In a public archiving and publication case (e.g., a static PDF), you might want to reference a version of the data by a content-addressable hash. That way, even if someone loses the original dat keys, they can still get the data as long as it is on the network, and anyone who has the content of the archive can regenerate the same hash and re-share it.

This would also be an interesting approach to use for implementing a CDN-style interface.

Security note: perhaps there's also a way to hide the original content-addressable hash from the network to protect reader privacy, similar to how we use discovery keys today. Because this method would likely be reserved for archives that are deemed public and static, it might be nice to decouple the snapshot from the original dat keys that generated it, to preserve the privacy of the original dat where that's desirable.
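To make the privacy idea concrete, here is a rough sketch of deriving a separate "announce key" from a content hash, so the swarm never sees the hash itself. It is only loosely modeled on the idea behind discovery keys; the function name `announce_key`, the `b"snapshot"` label, and the choice of keyed BLAKE2b are all assumptions for illustration, not an existing Dat API.

```python
import hashlib

# Hypothetical label; any fixed, agreed-upon constant would do.
ANNOUNCE_LABEL = b"snapshot"

def announce_key(content_hash: bytes) -> bytes:
    """Derive a 32-byte key to announce on the network.

    The content hash is used as the BLAKE2b key, so someone who only
    observes the announce key cannot recover the content hash from it.
    """
    if not 1 <= len(content_hash) <= 64:
        raise ValueError("content hash must be 1 to 64 bytes long")
    return hashlib.blake2b(
        ANNOUNCE_LABEL, key=content_hash, digest_size=32
    ).digest()
```

A reader who already knows the content hash can compute the same announce key and look it up, while passive observers of the swarm only see the derived value.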

Feature ideas:

  • a method to get the content-addressable hash of an archive
  • swarm broadcasts that hash on the network
  • when listening for a 'live' dat key, be able to find these static versions from the swarm if anyone has them
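As a minimal sketch of the first idea, the function below computes a hash that depends only on the paths and contents of the files in an archive, so anyone holding the same content can regenerate it. The name `snapshot_hash`, the in-memory `files` mapping, and the use of BLAKE2b over a sorted, length-prefixed listing are assumptions for illustration; this is not how Dat hashes its Merkle tree.

```python
import hashlib
from typing import Mapping

def snapshot_hash(files: Mapping[str, bytes]) -> str:
    """Return a hex digest that depends only on file paths and contents."""
    outer = hashlib.blake2b(digest_size=32)
    # Sort paths so the result does not depend on insertion order.
    for path in sorted(files):
        name = path.encode("utf-8")
        content_digest = hashlib.blake2b(files[path], digest_size=32).digest()
        # Length-prefix the path so entry boundaries are unambiguous.
        outer.update(len(name).to_bytes(8, "big"))
        outer.update(name)
        outer.update(content_digest)
    return outer.hexdigest()
```

Two peers with the same files get the same hash regardless of which dat keys originally published them, which is the property the snapshot idea relies on.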

I am reporting:

  • a bug or unexpected behavior
  • general feedback
  • feature request
  • security issue
@pfrazee

pfrazee commented Apr 27, 2018

A couple of thoughts I've had on this:

  • It used to be possible to create a dat archive that is content-addressed, and that would be a separate archive. You basically get all the files ahead of time, write them, and then a resulting hash becomes the key (I don't know what that hash was of; details, details). One method could be to bring that function back, and then a "snapshot" would basically be a totally separate archive that's hash-addressed, constructed from the state of the dynamic dat.
  • If needed, we can create identifiers that look like dat+snapshot://{key}/. Basically we can create addressing/protocol variants using the + in the scheme.
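A small sketch of how a client might split such an identifier into its base scheme, variant, and key. The helper `parse_dat_url` and the shape of its return value are hypothetical; only the `dat+snapshot://{key}/` form comes from the suggestion above.

```python
from urllib.parse import urlsplit

def parse_dat_url(url: str) -> dict:
    """Split a dat URL into its scheme variant, key, and path."""
    parts = urlsplit(url)
    # "dat+snapshot" -> base "dat", variant "snapshot"; plain "dat" has no variant.
    base, _, variant = parts.scheme.partition("+")
    if base != "dat":
        raise ValueError("not a dat URL: " + url)
    return {
        "variant": variant or None,
        "key": parts.netloc,
        "path": parts.path or "/",
    }
```

For example, `parse_dat_url("dat+snapshot://abc123/")` yields the variant `"snapshot"` and the key `"abc123"`, while a plain `dat://` URL yields a variant of `None`.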
@karissa

Collaborator

karissa commented Apr 27, 2018

@pfrazee nice ideas!

Bret also mentioned that people were musing about this but wanted to wait for implementation until multiwriter/hyperdb lands on mainline, which makes sense.
