Feature Request: Add support for sharing Snapshots with other ArchiveBox instances (to enable distributed / federated archiving) #50

Open
pirate opened this issue Nov 2, 2017 · 4 comments
Labels
expected: maybe someday, size: hard, status: idea-phase (Work is tentatively approved and is being planned / laid out, but is not ready to be implemented yet), type: enhancement
Milestone

Comments

@pirate
Member

pirate commented Nov 2, 2017

Next big thing I'm thinking about for BA is turning it into a distributed way-back machine! Everyone's personal archives can still be kept separate, but as part of the archive process you're prompted if you want to share the pages you've archived with a federated public archive. Each archived url then gets a deterministic "federated id" which other people will be able to use to find all archived versions of a specific url.

So when I visit my personal archive and see "example.com/blog/123.html" I can click a "show all versions" button which shows an archive in 2010 by alice, one in 2013 by bob, and one in 2017 by frank. I can click on links to view each of their versions in case mine is bad or corrupted somehow.

On the search page you'll be able to search for any url (like the wayback machine), and if it's not in your personal archive it'll show results from other people's archives.
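A minimal sketch of how a deterministic "federated id" and the "show all versions" lookup could work. Everything here is an assumption for illustration: the normalization rules, the choice of SHA-256, and the names `normalize_url`, `federated_id`, `all_versions` and `shared_index` are hypothetical, not an agreed design.

```python
import hashlib
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a canonical form (an assumed, illustrative rule set)."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # allow bare "example.com/page" input
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    normalized = host + (parts.path or "/")
    if parts.query:
        normalized += "?" + parts.query
    return normalized  # scheme and #fragment are deliberately ignored


def federated_id(url: str) -> str:
    """Same URL in, same id out, on every instance."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()


def all_versions(shared_index: dict, url: str) -> list:
    """Return every shared (year, owner) entry for a URL, oldest first."""
    return sorted(shared_index.get(federated_id(url), []))


# Hypothetical shared index: federated id -> list of (year, owner) entries.
shared_index = {}
for year, owner in [(2017, "frank"), (2010, "alice"), (2013, "bob")]:
    fid = federated_id("http://example.com/blog/123.html")
    shared_index.setdefault(fid, []).append((year, owner))

print(all_versions(shared_index, "example.com/blog/123.html"))
```

Because the id depends only on the normalized URL, two people who archive the same page independently end up under the same key without coordinating first.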

@pirate pirate changed the title from "Distributed archives!" to "Distributed/Decentralized/Federated archives!" Nov 2, 2017
@pirate
Member Author

pirate commented Nov 2, 2017

In theory, with enough people running BA we could archive a significant portion of Soundcloud before they go out of business. 😁

@pirate
Member Author

pirate commented Nov 3, 2017

Step 1: Merkle tree for identifying and querying archive blobs across a distributed system: https://gist.github.com/pirate/0a3545254615b985727b49bc5c3d99cf
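For readers unfamiliar with the idea, here is a generic Merkle-root sketch over the blobs of one snapshot. It is not the design from the linked gist (which is not reproduced here); the function name `merkle_root`, the SHA-256 choice and the leaf/node prefixes are assumptions.

```python
import hashlib
from typing import Sequence


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blobs: Sequence[bytes]) -> str:
    """Hash a list of blobs into one root digest (hex string)."""
    if not blobs:
        return _sha256(b"").hex()
    # Prefix leaves and inner nodes differently so a leaf can never be
    # mistaken for an inner node.
    level = [_sha256(b"\x00" + blob) for blob in blobs]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Duplicate the last node on odd levels. This is a simplification:
            # it makes [a, b, c] and [a, b, c, c] produce the same root.
            level.append(level[-1])
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Two instances holding identical blobs get the same root; any changed
# blob changes the root.
print(merkle_root([b"index.html", b"screenshot.png", b"page.pdf"]))
```

The root works as a compact identifier for a whole set of blobs, and two instances can compare roots before deciding whether they need to exchange anything.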

@pirate pirate added the status: idea-phase (Work is tentatively approved and is being planned / laid out, but is not ready to be implemented yet) label Dec 20, 2017
@pirate pirate modified the milestones: v0.1.0, v0.2.0 Jan 9, 2018
@pirate pirate added status: needs followup (Work is stalled awaiting a follow-up from the original issue poster or ArchiveBox maintainers) and removed good-first-pr labels Apr 6, 2018
@pirate pirate pinned this issue Jan 20, 2019
@pirate pirate changed the title from "Distributed/Decentralized/Federated archives!" to "Distributed/Federated archives!" Jan 22, 2019
@pirate
Member Author

pirate commented Mar 5, 2019

I like what ZeroNet is doing in this space: https://github.com/HelloZeroNet/ZeroNet

@pirate pirate unpinned this issue Mar 14, 2019
@pirate pirate added type: enhancement, expected: maybe someday, why: incentives and removed status: needs followup (Work is stalled awaiting a follow-up from the original issue poster or ArchiveBox maintainers) labels Jan 17, 2023
@pirate
Member Author

pirate commented Jan 17, 2023

Blocked by: #74

Once we have a good unique UUID/ULID ID scheme for Snapshots we can begin thinking about how to broadcast that with some metadata to other ArchiveBox instances / endpoints.
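As a rough illustration of what a ULID-style Snapshot ID plus broadcast metadata might look like: the ID layout below follows the public ULID spec (48-bit millisecond timestamp, 80 random bits, Crockford base32), but whatever scheme #74 settles on may differ. The names `new_ulid` and `snapshot_announcement` and the announcement fields are hypothetical.

```python
import os
import time
from typing import Optional

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Build a 26-character ULID: timestamp in the high bits, randomness below."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 chars * 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 31])
        value >>= 5
    return "".join(reversed(chars))


def snapshot_announcement(url: str, snapshot_id: str) -> dict:
    """Hypothetical metadata an instance might broadcast for a new Snapshot."""
    return {"id": snapshot_id, "url": url}


print(snapshot_announcement("https://example.com/", new_ulid()))
```

One property that matters for federation: IDs generated in different milliseconds sort by creation time as plain strings, so instances can merge feeds from each other without a central counter.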

Planned baby steps towards this goal in the far-far-future:

  1. Finalize the ArchiveBox REST API endpoint that allows other services to POST new URLs and snapshots to ArchiveBox
  2. Add functionality for ArchiveBox to announce new snapshots to the world via RSS/webhooks/realtime endpoint of some kind:
    • REST webhook support, i.e. add the ability to configure ArchiveBox to ping outside endpoints whenever a new Snapshot/ArchiveResult is created
    • RSS feed support, i.e. publish an RSS feed on the ArchiveBox server of all recent snapshots (like Pocket does for your Pocket bookmarks)
  3. Add native ArchiveBox UI support for searching some of these global federation mechanisms on your own instance so that you can browse snapshots from other instances and providers without leaving your one unified UI
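The announce step (step 2) could be sketched roughly as below, using only the Python standard library. None of this is an existing ArchiveBox API: the function names, the `snapshot.created` event name, the snapshot dict fields and the `/archive/<id>/` link format are all assumptions. The sketch only builds the webhook body and the RSS document; sending or serving them is left out.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def snapshot_webhook_payload(snapshot: dict) -> bytes:
    """JSON body that could be POSTed to each configured webhook endpoint."""
    body = {
        "event": "snapshot.created",  # hypothetical event name
        "snapshot": {
            "id": snapshot["id"],
            "url": snapshot["url"],
            "archived_at": snapshot["archived_at"].isoformat(),
        },
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")


def snapshots_rss(snapshots: list, instance_url: str) -> str:
    """Render recent snapshots as an RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Recent snapshots"
    ET.SubElement(channel, "link").text = instance_url
    ET.SubElement(channel, "description").text = "Newly archived pages"
    for snap in snapshots:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = snap["url"]
        # Hypothetical link format for viewing the snapshot on this instance.
        ET.SubElement(item, "link").text = f"{instance_url}/archive/{snap['id']}/"
        ET.SubElement(item, "guid", isPermaLink="false").text = snap["id"]
        ET.SubElement(item, "pubDate").text = format_datetime(snap["archived_at"])
    return ET.tostring(rss, encoding="unicode")


example = {
    "id": "01ABC",
    "url": "https://example.com/",
    "archived_at": datetime(2023, 1, 17, tzinfo=timezone.utc),
}
print(snapshots_rss([example], "https://archive.example.org"))
```

Using the Snapshot ID as the `guid` gives consumers a stable key to de-duplicate on, which is one reason the ID scheme needs to be settled first.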

External tools could then be developed that ingest this feed to publish ArchiveBox content on other platforms, e.g.:

  • ArchiveBox RSS -> proof-of-history blockchain e.g. Solana
  • ArchiveBox RSS -> BitTorrent's magnet DHT and tracker sites
  • ArchiveBox RSS -> IFTTT/zapier/slack/zulip/etc. webhooks
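An external relay of this kind might look roughly like the sketch below: parse the feed, skip items already seen, and hand each new item to a list of publisher callables (one per target platform). The feed contents, the `parse_feed` and `relay_new_items` names, and the publisher interface are all hypothetical; real publishers for a blockchain, a DHT or a chat webhook are out of scope here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Sequence, Set

# Made-up feed contents, standing in for what an instance would serve.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Recent snapshots</title>
  <item><title>https://example.com/a</title><guid>01A</guid></item>
  <item><title>https://example.com/b</title><guid>01B</guid></item>
</channel></rss>"""


def parse_feed(rss_xml: str) -> List[Dict[str, str]]:
    """Extract the guid and archived URL of every <item> in the feed."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "guid": item.findtext("guid", default=""),
            "url": item.findtext("title", default=""),
        }
        for item in root.iter("item")
    ]


def relay_new_items(
    rss_xml: str,
    seen: Set[str],
    publishers: Sequence[Callable[[Dict[str, str]], None]],
) -> int:
    """Send each not-yet-seen item to every publisher; return how many were new."""
    new_count = 0
    for item in parse_feed(rss_xml):
        if item["guid"] in seen:
            continue
        for publish in publishers:
            publish(item)
        seen.add(item["guid"])
        new_count += 1
    return new_count


received = []
seen_guids = set()
print(relay_new_items(SAMPLE_FEED, seen_guids, [received.append]))
```

Polling the same feed again relays nothing new, because the `seen` set tracks guids across runs; a real tool would need to persist that set somewhere.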

Then later we can add functionality for ArchiveBox to publish snapshots/metadata to global lookup systems like proof-of-history blockchains (e.g. Solana), DHTs like the one behind BitTorrent's magnet links, distributed filesystems like IPFS, etc.

@pirate pirate changed the title from "Distributed/Federated archives!" to "Feature Request: Add support for sharing Snapshots with other ArchiveBox instances (to enable distributed / federated archiving)" Jun 13, 2023