Consider using ipfs or other existing content addressable stores for tarballs. #99
In the readme you mentioned that you want to use content-addressable storage.
There are existing content-addressable systems, such as IPFS, that you could leverage.
I’ve recently spoken with IPFS engineers and they are really interested in making IPFS easy to use for package managers, so they might be open to implementing the features you need.
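For context, content addressing just means the key for a tarball is derived from a hash of its bytes, so any store that can serve "bytes by hash" is interchangeable. A minimal sketch in TypeScript, assuming SHA-256 and Node's built-ins (the function name and file name are illustrative, not from this project):

```typescript
import { createHash } from "crypto";
import { readFile } from "fs/promises";

// Illustrative content addressing: the tarball's SHA-256 digest becomes its key,
// so the same bytes always map to the same address no matter which backend
// (local disk, S3, IPFS, ...) actually holds them.
async function contentAddress(tarballPath: string): Promise<string> {
  const bytes = await readFile(tarballPath);
  return createHash("sha256").update(bytes).digest("hex");
}

// Usage sketch: store and fetch blobs by hash rather than by name/version.
// contentAddress("my-pkg-1.0.0.tgz").then((key) => console.log(`blob key: ${key}`));
```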
IPFS perf is a worry, but pluggable backends are a good overall approach. My next task for the project is to make it possible to store the content blobs in S3 & other object stores, so people who have durability requirements (and don't want to deal with backing up disks) can have this option.
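To make the pluggable-backends idea concrete, here's a rough sketch of what such an interface might look like; the names and the filesystem implementation are assumptions for illustration, not this project's actual design:

```typescript
import { access, mkdir, readFile, writeFile } from "fs/promises";
import { dirname, join } from "path";

// Hypothetical backend interface: anything that can put/get blobs by content hash
// could back the registry (local filesystem, S3, IPFS, ...).
interface BlobStore {
  put(hash: string, data: Buffer): Promise<void>;
  get(hash: string): Promise<Buffer>;
  has(hash: string): Promise<boolean>;
}

// Simple filesystem-backed implementation, for illustration only.
class FsBlobStore implements BlobStore {
  constructor(private root: string) {}

  // Shard blobs into subdirectories by hash prefix to keep directories small.
  private pathFor(hash: string): string {
    return join(this.root, hash.slice(0, 2), hash);
  }

  async put(hash: string, data: Buffer): Promise<void> {
    const path = this.pathFor(hash);
    await mkdir(dirname(path), { recursive: true });
    await writeFile(path, data);
  }

  async get(hash: string): Promise<Buffer> {
    return readFile(this.pathFor(hash));
  }

  async has(hash: string): Promise<boolean> {
    try {
      await access(this.pathFor(hash));
      return true;
    } catch {
      return false;
    }
  }
}
```

An S3- or IPFS-backed store would just implement the same three methods against its own client, which is what makes the backend swappable.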
I am involved in the DAT community and saying hi! DAT is pretty cool for this, but I wouldn't use it... yet... because it doesn't yet cover enough of what a reasonably sized registry needs. That is changing, though: @andrewosh has made good progress adding a hypertrie structure (mafintosh/hyperdrive#233, available in an rc release). The new hyperdrive is tested with a lot of files and a lot of data (terabytes; a petabyte test is still running).
This makes it an interesting candidate as a decentralized data structure.
But there are still real challenges ahead that argue against using DAT for now.
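For anyone who hasn't played with it, this is roughly what writing and reading through hyperdrive looks like, based on the example in the mafintosh/hyperdrive readme; the hypertrie-based rc mentioned above may differ in details, and the folder name here is just illustrative:

```typescript
// Sketch based on the mafintosh/hyperdrive readme example.
const hyperdrive = require("hyperdrive");

// Content and metadata for the drive are stored in (and replicated from) this folder.
const archive = hyperdrive("./registry-drive");

archive.writeFile("/hello.txt", "world", (err: Error | null) => {
  if (err) throw err;
  archive.readFile("/hello.txt", "utf-8", (err: Error | null, data: string) => {
    if (err) throw err;
    console.log(data); // prints 'world'
  });
});
```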
Would a protocol like BitTorrent be an interesting technology to support a package registry? It's already widely used for large-scale file distribution and seems to be quite performant.
Similarly, as stated in #252, if you are interested in exploring blockchain options I can link a few experts from those communities here (Maidsafe, Skycoin, etc.).
Let me know what you think!
Perf issues were my instinct as well.
I'm all for 'decentralized', but it seems that unless someone ensures there is an always-on, current, connected source, there can be no certainty of file availability, and that's not acceptable. So there has to be one source of truth somewhere. But extra ad-hoc POPs for a CDN-like network is a cool idea, no matter the protocol.
May I suggest contacting jsDelivr for help with this? They built their own routing system that spreads file requests over 4+ CDNs, with backups for the backups. They might even be able to host the files through jsDelivr; they already mirror npm & a chunk of the JS on GitHub.
As stated above, is 'perf' short for performance?
The protocol is decentralized, sure, but it is also distributed. Copies of the files stored in this protocol are automatically replicated, which means...
if the main host is down, copies are still available via other peers on the network...
Ideally the file will still be available from anyone else who holds it, because it is a p2p protocol. This essentially allows it to behave like a CDN with redundant backups all over the net.
It's also faster and more efficient, because you never download everything from a single server that may be a considerable distance away from you. Instead, you download from the peers closest to you and grab the files incrementally from them.
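As a rough illustration of that incremental, multi-peer download with integrity checking (the peer list and `fetchChunkFromPeer` are hypothetical placeholders, not any specific protocol's API):

```typescript
import { createHash } from "crypto";

// Hypothetical placeholder: a real p2p client would connect to the peer and
// request one chunk of the file here.
async function fetchChunkFromPeer(peer: string, chunkIndex: number): Promise<Buffer> {
  throw new Error(`not implemented: chunk ${chunkIndex} from ${peer}`);
}

// Fetch chunks in parallel from the nearest known peers, then verify the
// reassembled file against its expected content hash before trusting it.
async function fetchFile(
  nearestPeers: string[],
  chunkCount: number,
  expectedSha256: string
): Promise<Buffer> {
  const chunks = await Promise.all(
    Array.from({ length: chunkCount }, (_, i) =>
      fetchChunkFromPeer(nearestPeers[i % nearestPeers.length], i)
    )
  );
  const file = Buffer.concat(chunks);
  const actual = createHash("sha256").update(file).digest("hex");
  if (actual !== expectedSha256) {
    throw new Error("content hash mismatch; rejecting download");
  }
  return file;
}
```

Because the blobs are content-addressed, the hash check at the end means you can accept data from untrusted peers without trusting the peers themselves.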
Downloading from a single centralized source has generally been slower than downloading from a p2p swarm.
Sources disappearing has also always been an issue regardless of the protocol or hosting service.
I think it should be allowed if the host wants their files to disappear. There's a reason GDPR was put into effect, and it's because sometimes people want this.