Make a P2P network for SponsorBlock database #1570

Open
jcastro opened this issue Nov 2, 2022 · 13 comments

Comments

@jcastro

jcastro commented Nov 2, 2022

I LOVE this service and would like to contribute. I don't know how to develop, but I do have some 24/7 servers that I could use to host an instance of the app. Would that be possible? Or could the DB live somewhere where we can all update it, so the service would rarely go offline?

@skeddles

skeddles commented Nov 2, 2022

yes please. being able to host a local copy of the db would be nice too, or at least cache certain channels so it doesn't become useless when the server goes down.

having an icon or something that says when it's down would be nice too, instead of just silently failing.

@ajayyy
Owner

ajayyy commented Nov 3, 2022

@jcastro
Author

jcastro commented Nov 3, 2022

@ajayyy this is awesome! thanks so much. I was wondering if it's possible to add several addresses to the 'SponsorBlock Server Address' config in the extension settings, so we can have several mirrors added there as well?
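
A rough sketch of what a list of server addresses could mean in practice: try each address in order and use the first one that answers. This is only an illustration, not how the extension works; `fetch_with_fallback`, `http_get`, and the hostnames in the usage note are hypothetical names made up for this example.

```python
import urllib.request
from typing import Callable, Sequence, Tuple


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Default fetcher: plain HTTP GET. Network failures raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(
    servers: Sequence[str],
    path: str,
    fetch: Callable[[str], bytes] = http_get,
) -> Tuple[str, bytes]:
    """Try each server in order; return (server, body) from the first that answers."""
    errors = []
    for server in servers:
        url = server.rstrip("/") + path
        try:
            return server, fetch(url)
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            errors.append(f"{server}: {exc}")
    # Every configured address failed; report all of them instead of failing silently.
    raise ConnectionError("all servers failed: " + "; ".join(errors))
```

With a hypothetical list such as `["https://main.example", "https://mirror.example"]`, the mirror is only contacted when the main server fails. As noted later in the thread, some mirrors are read-only, so this would suit fetching segments rather than submitting them.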


@lidel

lidel commented Nov 3, 2022

I've found this issue because I've got an error related to HTTP server hosting sponsor.ajay.app being temporarily unavailable:

[Screenshot of the error: 2022-11-03_16-19]

One way to mitigate this is to put the DB on IPFS (as immutable snapshots) and set up IPNS and/or DNSLink to publish updates pointing at the latest version.

This not only allows P2P retrieval and makes it easier for people to co-host the DB, but also provides an HTTP CDN for regular browsers thanks to public gateways.

Example (DNSLink):

Going the IPFS route has the nice property of being backward and forward compatible:

  • Regular browsers still benefit from having multiple mirrors that can be used as a fallback if the original server is down.
  • Browsers with built-in IPFS support (like Brave), or with the IPFS Companion browser extension plus the Desktop app, will be able to do an opportunistic protocol upgrade and load the DB over P2P.
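
To make the mirror fallback for regular browsers a bit more concrete, here is a minimal sketch (not the example from the original comment). It assumes a DNSLink TXT value of the form `dnslink=/ipfs/<CID>` and turns it into a list of gateway URLs that a client could try in turn. The function names, the CID, and the file name in the usage note are placeholders.

```python
from typing import List, Optional, Sequence


def parse_dnslink(txt_values: Sequence[str]) -> Optional[str]:
    """Return the content path from the first TXT value shaped like 'dnslink=/ipfs/...' or 'dnslink=/ipns/...'."""
    for value in txt_values:
        value = value.strip().strip('"')  # TXT values are often shown quoted
        if value.startswith("dnslink="):
            path = value[len("dnslink="):]
            if path.startswith(("/ipfs/", "/ipns/")):
                return path
    return None


def gateway_urls(content_path: str, file_name: str, gateways: Sequence[str]) -> List[str]:
    """Build one HTTP URL per gateway for a file under the given content path."""
    base = content_path.rstrip("/")
    return [f"{gw.rstrip('/')}{base}/{file_name}" for gw in gateways]
```

For example, `gateway_urls("/ipfs/<CID>", "db.csv", ["https://ipfs.io", "https://dweb.link"])` yields one URL per public gateway. Because the CID identifies the content itself, any gateway that can reach it should serve the same bytes.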

I am working on IPFS, happy to answer any questions if any of the above feels useful.

@lewisdoesstuff

@jcastro I've just submitted PR #1572 to add this - you can build it from my fork if you don't want to wait until it's merged :)

@jcastro
Author

jcastro commented Nov 3, 2022

> @jcastro I've just submitted PR #1572 to add this - you can build it from my fork if you don't want to wait until it's merged :)

beautiful! thanks so much

@FireMasterK

I am interested in making the database accessible over IPFS. This will require a lot less bandwidth for mirrors, as everyone would share the same files, instead of rsync mirrors which are centralized.

@lidel are you aware of any project that can follow an IPNS link and keep a mirror of it, even as the CID updates?

This would be helpful for mirror projects like https://github.com/TeamPiped/sponsorblock-mirror and would save bandwidth for everyone.

I'm also interested in any sort of real-time streaming of events, which could be used to keep mirrors up to date in real time (without polling for database updates every hour).

@ajayyy
Owner

ajayyy commented Nov 5, 2022

When looking into ideas for sb-mirror, rsync was chosen as it handled deduplication best. The way the CSVs are modified makes IPFS impractical, as most of the file needs redownloading.

@ajayyy
Owner

ajayyy commented Nov 5, 2022

For reference, read ajayyy/SponsorBlockServer#373, which lists the alternatives considered and the reasoning.

@mchangrh
Contributor

mchangrh commented Nov 5, 2022

> When looking into ideas for sb-mirror, rsync was chosen as it handled deduplication best. The way the csv's are modified makes ipfs impractical as most of the file needs redownloading

IPFS has no version control. None of the chunks were deduplicated when uploading a new version, since its chunking is naive; the views would return a different hash even for earlier segments.

Storing a single file would take up 120% of the space and take my machine (R5 3600 + NVMe) about 10 minutes to chunk, hash and split. Subsequent versions of the file would be uploaded under new addresses in the same folder, but each would use up another 120% of space.
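
A small sketch of why naive chunking can fail to deduplicate, assuming fixed-size chunks hashed independently (the data and chunk size here are toy values, not measurements from the real database). Appending rows keeps earlier chunks identical, while changing the length of anything near the start shifts every later chunk boundary.

```python
import hashlib
from typing import Set


def chunk_hashes(data: bytes, chunk_size: int) -> Set[str]:
    """Hash each fixed-size chunk of the data independently."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def shared_fraction(old: bytes, new: bytes, chunk_size: int) -> float:
    """Fraction of the new version's chunks that already exist in the old version."""
    new_hashes = chunk_hashes(new, chunk_size)
    return len(chunk_hashes(old, chunk_size) & new_hashes) / len(new_hashes)


# Toy CSV: 1000 unique rows of 17 bytes each.
old = b"".join(f"{i:08d},segment\n".encode() for i in range(1000))
appended = old + b"".join(f"{i:08d},segment\n".encode() for i in range(1000, 1010))
shifted = old[:10] + b"X" + old[10:]  # one extra byte near the start

print(shared_fraction(old, appended, 256))  # most chunks are reused
print(shared_fraction(old, shifted, 256))   # no chunk lines up any more
```

rsync avoids this by matching blocks at arbitrary byte offsets with a rolling checksum, which fits the deduplication result reported above.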

@mchangrh
Contributor

mchangrh commented Nov 5, 2022

https://docs.google.com/document/d/1YZwsVustQxC8sXsYqsq_sWZ6xnOx01aftjGclrW9zeI/edit?usp=sharing

quick write-up as to why rsync was chosen and why p2p was quickly disregarded

@Splarkszter

Some mirrors are read-only, and mirrors are also third party. So if you are not getting segments on new videos, make sure to disable the mirrors. As of the time of writing, the main SB server works fine.

@ccuser44

I think a good solution would be to have many servers in different locations and have the extension query the nearest API server
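
One possible reading of "nearest" is "lowest round-trip time". The sketch below, under that assumption, times a caller-supplied probe against each server and picks the fastest one that responds. `pick_fastest` and its `probe` and `clock` parameters are hypothetical names for this example, and the probe itself is left to the caller.

```python
import time
from typing import Callable, Optional, Sequence


def pick_fastest(
    servers: Sequence[str],
    probe: Callable[[str], None],
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the server whose probe completed fastest, or None if all probes failed."""
    best: Optional[str] = None
    best_time = float("inf")
    for server in servers:
        start = clock()
        try:
            probe(server)  # e.g. a lightweight status request
        except OSError:
            continue  # unreachable servers are skipped
        elapsed = clock() - start
        if elapsed < best_time:
            best, best_time = server, elapsed
    return best
```

The result could be cached and refreshed occasionally, so that normal requests do not pay the cost of probing every server.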
