Mirroring Web to IPFS #96

lidel · 2016-03-26T16:28:30Z

Meta-issue tracking related work and discussions got moved to ipfs/in-web-browsers#94

Historical notes from before the move

Ready to Implement

Integrate js-ipfs library to handle multipart upload to API
- ~~there are issues with browserified version that need to be resolved first: missing os module, and when all shims are enabled global.XMLHttpRequest is missing~~
Image Rehosting via HTTP API (#59)
Save whole page to IPFS, creating a one-time shareable mirror/snapshot (Save entire Web page to IPFS #91)

More Design Work Required

Automatic mirroring of standard websites to IPFS as you browse them
- IMMUTABLE assets: very limited feasibility. So far only two types of immutable resources exist on the web:
  - JS, CSS etc marked with SRI hash (Subresource Integrity) (mapping SRI→CID) (see discussion from 2016-03-26 below)
  - URLs for things explicitly marked as immutable via Cache-Control: public, (..) immutable (mapping URL→CID)
- MUTABLE assets: what if we add every page to IPFS and store a mapping between URL and CID, so that if a page disappears we can fall back to the IPFS version?
  - a can of worms: a safe version would be like web.archive.org, but limited to the local machine. Sharing the cache with other people would require a centralized mapping service (a single point of failure and a vector for privacy leaks)
  - So what is needed to make it "right"?
    - keep it simple but robust: no HTTP, no centralization, no single point of failure
    - Ideally, URL2IPFS lookups would not rely on a centralized index.
      - rough idea (Automatic mirroring of HTTP websites to IPFS as you browse them #535 (comment)): what if we create a pubsub-based room per URL? For example:
        
        When you open a website, you subscribe to pubsub room unique for that URL
        
        If the pubsub room has entries under the "keepalive" threshold, grab the latest one

        If the room is empty or the keepalive timeout is hit, fall back to HTTP, but in the background add the HTTP page to IPFS and announce the updated hash on pubsub (with a new timestamp) for the next visitor

        There are still pubsub performance and privacy problems to solve (e.g. publishing banking pages), but at least we don't rely on the HTTP server anymore.
- Other notes
  - the "webpackage" standard proposal surfaced recently. Among other things, it aims to address the website snapshotting use case in a safe and reproducible manner:
    - webpackage: Save and share a web page (Use Case)
    - Sounds super relevant to what we want as the endgame here
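
The per-URL pubsub room idea in the notes above can be sketched as a small decision function. This is only an illustration under stated assumptions, not an existing ipfs-companion or IPFS API: the `Announcement` message shape, the `room_topic` naming scheme, and the one-hour `KEEPALIVE_SECONDS` value are all hypothetical, and the actual pubsub transport is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumed threshold; the notes do not specify a value.
KEEPALIVE_SECONDS = 3600.0


@dataclass(frozen=True)
class Announcement:
    """One entry seen in a per-URL room (hypothetical message shape)."""
    cid: str
    timestamp: float  # seconds since epoch, as claimed by the announcer


def room_topic(url: str) -> str:
    """Derive a room name from the URL (hypothetical naming scheme)."""
    return "url2ipfs/" + hashlib.sha256(url.encode("utf-8")).hexdigest()


def pick_fresh_cid(entries: Iterable[Announcement], now: float,
                   keepalive: float = KEEPALIVE_SECONDS) -> Optional[str]:
    """Return the newest CID announced within the keepalive window.

    None means the room is empty or stale: the caller would fall back to
    HTTP, add the page to IPFS in the background, and announce the new CID.
    """
    fresh = [e for e in entries if 0 <= now - e.timestamp <= keepalive]
    if not fresh:
        return None
    return max(fresh, key=lambda e: e.timestamp).cid
```

Note that the sketch trusts announced timestamps and CIDs as-is, which is exactly where the privacy and abuse problems mentioned above would need to be addressed.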

Related Discussions

2016-03-26

IRC log about mirroring SRI2IPFS

165958           geir_ │ lgierth: The web sites would have to link to ipfs content for this plugin to work. What i propose is a proxy that works like a transparent proxy and puts content into ipfs if it's not already there
170124            ed_t │ anyone know anything about ipfs-boards
170141            ed_t │ it keeps telling me I am in limited mode
170202            ed_t │ a full ipfs 0.40-rc3 node is running on localhost:5001
170217            ed_t │ but it does not seem to see it using the demo link
170228        +lgierth │ geir_: ah got what you wanna do -- i'm not sure you can easily just rewrite anything
170253        +lgierth │ for completely static pages, yes, but for slightly more dynamic stuff?
170303        +lgierth │ i'll be back in a bit, getting some coffee
170422           geir_ │ lgierth: I mean only for the static stuff like images, libs and so on. Should be pretty strait forward to implement. And a big bandwidth save for big networks
171542           lidel │ geir_, we are planning to add "host to ipfs" feature to the addon
171614           lidel │ when that is done, it should be easy to add option to automatically add every visited page
171634           lidel │ not sure how addon would do lookups tho
171734           lidel │ (meaning, how do i know the multihash of the page, how do we handle ipfs-cache expiration when page gets updated, etc)
171831           geir_ │ lidel: I see, thanks for the info. I still like the idea of a transparent proxy so every user/device on the network will use the "cdn" automatically
171852           lidel │ perhaps we could start with mirroring static assets that have SRI hash (https://www.srihash.org/)
171920           lidel │ and come up with a way for doing SRI2IPFS lookups
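
For context on what an SRI2IPFS lookup would key on: an SRI value is an algorithm name, a dash, and the base64-encoded digest of the resource bytes. The sketch below computes such a value and checks content against an `integrity` attribute. The function names are illustrative, and returning `False` for unsupported metadata is a choice made for the mirroring use case (browsers treat that case differently).

```python
import base64
import hashlib

# Hash algorithms defined for SRI, ranked from weakest to strongest.
STRENGTH = {"sha256": 1, "sha384": 2, "sha512": 3}


def sri_digest(data: bytes, algorithm: str = "sha384") -> str:
    """Return the SRI string ("<alg>-<base64 digest>") for the given bytes."""
    if algorithm not in STRENGTH:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(data: bytes, integrity: str) -> bool:
    """Check data against an integrity attribute value.

    Only hashes using the strongest algorithm present are considered.
    Returns False when no supported hash is listed, i.e. the content
    cannot be verified and should not be treated as immutable.
    """
    by_algorithm = {}
    for token in integrity.split():
        value = token.split("?", 1)[0]  # drop optional "?options" suffix
        algorithm, sep, _ = value.partition("-")
        if sep and algorithm in STRENGTH:
            by_algorithm.setdefault(algorithm, []).append(value)
    if not by_algorithm:
        return False
    strongest = max(by_algorithm, key=STRENGTH.__getitem__)
    return sri_digest(data, strongest) in by_algorithm[strongest]
```

A verified SRI string could then serve as the lookup key in an SRI→CID mapping, since any content that matches it is by definition the same bytes.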

2018-01-14

https://discuss.ipfs.io/t/web-browser-with-integrated-ipfs-node-support-for-browser-cache/1799/5

2018-03-08

[Suggestion] : IPFS browser extension as lite-node? https://github.com/ipfs/ipfs/issues/310

2018-07-09

https://discuss.ipfs.io/t/mirroring-standard-websites-to-ipfs-as-you-browse-them/3355

2018-07-23

http->ipfs translator proposal: Automatic mirroring of HTTP websites to IPFS as you browse them #535
webpackage standard draft
- https://github.com/WICG/webpackage/blob/master/explainer.md#save-and-share-a-web-page
- https://wicg.github.io/webpackage/draft-yasskin-webpackage-use-cases.html#snapshot


victorb · 2018-03-16T15:22:23Z

I don't think it's necessary to automatically mirror any scripts with ipfs-companion.

Instead, we can have a script that checks all script tags with an integrity attribute on the Alexa Top 1000 or something like that, adds those to IPFS, maps their hashes to IPFS hashes, and ships that index with ipfs-companion.

That way, when ipfs-companion hits a resource with integrity tag, it has a big chance of already being available on IPFS.
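
A rough sketch of that idea, assuming the index is a plain dictionary from SRI string to CID: the collector below pulls `(src, integrity)` pairs out of a page's script tags, and `lookup_cid` consults the shipped index. The class and function names and the index format are hypothetical, and the step that actually adds files to IPFS is omitted.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class IntegrityCollector(HTMLParser):
    """Collect (src, integrity) pairs from <script> tags in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        integrity = attributes.get("integrity")
        if src and integrity:
            self.found.append((src, integrity))


def collect_integrity(html: str) -> List[Tuple[str, str]]:
    """Return every external script that declares an integrity attribute."""
    collector = IntegrityCollector()
    collector.feed(html)
    collector.close()
    return collector.found


def lookup_cid(index: Dict[str, str], integrity: str) -> Optional[str]:
    """Look up each hash listed in the attribute in the prebuilt index."""
    for token in integrity.split():
        cid = index.get(token)
        if cid:
            return cid
    return None
```

On a miss the extension would simply load the resource over HTTP as usual, so the index only needs to cover popular libraries to be useful.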

timthelion · 2018-07-02T09:00:07Z

I'm not really sure if I fully understand this idea. Are you trying to automatically cache web-pages using IPFS? Or to fall back to IPFS when a page is down?

I came across this issue because I want to be able to "print a webpage to IPFS". I do a lot of visual programming research and most things I find are on short-lived personal sites. When I then cite or link to these articles, I find that a few years later, the web page is gone. I'd like to be able to click a button like "print to IPFS" which would generate an IPFS link that I could cite or something like that. Not automatic, but manual.

lidel · 2018-07-02T16:30:12Z

@timthelion automatic mirroring is a hard problem (for reasons noted in #96 (comment)), but creating shareable snapshots of webpages could be implemented as an on-demand action.

Actually, "print a webpage to IPFS" is something we want to add to the browser extension. If you have ideas on how it should work, check/comment on the initial vision in #91 (comment).

lidel · 2018-07-24T13:28:43Z

FYSA meta-issue tracking related work and discussions got moved to ipfs/in-web-browsers#94

lidel added the kind/enhancement, kind/discussion, and help wanted labels Mar 26, 2016

lidel added the status/blocked/missing-api label and removed the help wanted label Aug 3, 2016

lidel added this to Near Future in General Roadmap Oct 2, 2017

lidel moved this from Near Future to Possible Right Now in General Roadmap Oct 2, 2017

lidel removed the status/blocked/missing-api label Oct 2, 2017

lidel added the help wanted label Jan 14, 2018

lidel added the status/ready label Mar 7, 2018

lidel mentioned this issue Jul 23, 2018

Automatic mirroring of HTTP websites to IPFS as you browse them #535

Open

chpio mentioned this issue Jul 24, 2018

Save entire Web page to IPFS #91

Open

lidel mentioned this issue Jul 24, 2018

CID as a Subdomain ipfs/in-web-browsers#89

Open

ipfs locked as resolved and limited conversation to collaborators Jul 24, 2018

lidel closed this as completed Jul 24, 2018
