
The Future of "accessing API of remote IPFS node" #137

Open · 3 of 16 tasks
lidel opened this issue Dec 19, 2018 · 50 comments
Labels
specs, needs clarification, epic, ux, developers, dif/expert (Extensive knowledge (implications, ramifications) required), status/ready (Ready to be worked)

Comments

@lidel
Member

lidel commented Dec 19, 2018

Started as a discussion between @lidel & @olizilla (2018-12-19)

Granting access to local or remote node remains a challenge both on UX and security fronts.
This is an attempt to plot possible paths for going forward.

Disclaimer: below is not a roadmap, but a "what if" exercise to act as a starting point for the discussion and experimentation that follows in the comments

The initial idea is to think about the problem in three stages:

Stage 1: window.ipfs.enable(opts)

Stage 2A: Opaque Access Point with Service Worker

[Ongoing research]

  • ETA: 2019+
  • Thin static HTML+JS is loaded to establish Access Point Service Worker (APSW), which acts as a proxy to IPFS API provider and exposes limited API/Gateway endpoints
  • Progressive peer-to-peer Web Applications (PPWA) talk to IPFS over APSW
  • APSW automatically picks the best IPFS provider (js-ipfs, remote/local HTTP API, ipfs-companion)

Stage 2B: HTTP/WS /api/v1/ with access controls

A bit speculative: work on /api/v1 has not started yet; we are collecting requirements

  • ETA: 2019? 2020?
  • Websites and apps access API of IPFS Node directly
  • Access controls are done by the IPFS node itself; CORS is allowed by default (*)
  • /api/v1/ can start as an experimental overlay provided by ipfs-desktop
    • OAUTH-like flow introduced in Stage 1 remains the same
    • Real-time capabilities are supported over Websockets
  • window.ipfs in ipfs-companion implemented as a preconfigured js-ipfs-http-client rather than a proxy
    • The overhead of postMessage is removed
    • Access controls removed from ipfs-companion and now done by ipfs daemon itself

Stage 3: Nodes talking to each other over libp2p

This is a highly speculative idea with a lot of details to figure out, but the general idea is to replace legacy transports (HTTP, WebSockets) with libp2p

  • ETA: 2020+
  • Prerequisites:
    • pubsub is enabled by default and works in browser contexts
  • ipfs-companion == IPFS node (eg. runs an embedded js-ipfs node by default)
  • window.ipfs.enable() (and future API-provider libraries) give access to API from Stage 2 over p2p connection (eg. via ipfs p2p)
  • "follow" semantics exist and allow setting up various sync policies between nodes

Parking this here for now, would appreciate thoughts in comments below.

@lidel lidel added specs needs clarification epic ux developers dif/expert Extensive knowledge (implications, ramifications) required labels Dec 19, 2018
@lidel lidel changed the title Rethink "accessing API of remote node" The Future of "accessing API of remote IPFS node" Dec 19, 2018
lidel added a commit to ipfs/roadmap that referenced this issue Dec 19, 2018
@mitra42

mitra42 commented Dec 19, 2018

Stage 2 is when it gets interesting. Stage 1 requires installing IPFS Companion, and then any browser-based application has to detect the presence of both IPFS Companion and the local IPFS node, which complicates things to the point of being unlikely to happen.

If Stage 2 (or some version of it) were implemented, then for example the dweb.archive.org UI could detect the presence of a local node and use it as a persistent cache, rather than using js-ipfs with all the limitations that come from running in the browser (including lack of persistence after the browser window is closed, and the extreme CPU load that encourages people to close pages running IPFS).

Obviously, relying on CORS in a content-addressed filesystem makes no sense to me, since both trusted and untrusted content could come from anywhere (e.g. from https://ipfs.io). One option worth considering alongside authentication would be allowing a subset of the API to run without authentication - e.g. get, add, urlstore, pin - while reserving more sensitive operations (like editing the config) until authentication is implemented.
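The subset-without-authentication idea can be sketched as a simple method allowlist sitting in front of the API. The method names and request shape below are illustrative, not an actual go-ipfs/js-ipfs configuration:

```javascript
// Hypothetical guard in front of an IPFS API: a small set of low-risk
// methods is allowed without authentication, everything else requires it.
const UNAUTHENTICATED_METHODS = new Set(["get", "add", "urlstore", "pin"]);

function authorize({ method, authenticated }) {
  if (authenticated) return true;             // authenticated callers may do anything
  return UNAUTHENTICATED_METHODS.has(method); // others get the read/add subset only
}
```

Under this sketch, `authorize({ method: "config", authenticated: false })` would be rejected until an authentication flow exists.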

@lidel
Member Author

lidel commented Dec 20, 2018

@Gozala shared some relevant ideas in Progressive peer-to-peer web applications (PPWA). I need to think about this more, but my gut feeling is Stage 2 could be refined by introducing sw/iframe-based API provider as the universal entry point.

We could do access control there (before it lands in the actual API), and also iterate on graceful fallback / opportunistic upgrade mechanisms (eg. internally using window.ipfs if ipfs-companion is present, or trying local node directly via js-ipfs-http-client before falling back to js-ipfs).
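The graceful fallback / opportunistic upgrade order could be sketched as a simple chooser. The probes here are stubbed as plain values for illustration; real ones would be async checks for the companion-injected window.ipfs and a reachable local HTTP API:

```javascript
// Sketch of the fallback order: window.ipfs (ipfs-companion), then a
// local HTTP API via js-ipfs-http-client, then an embedded js-ipfs node.
function pickIpfsProvider({ windowIpfs, httpApiClient, spawnJsIpfs }) {
  if (windowIpfs) return { kind: "window.ipfs", ipfs: windowIpfs };    // best: shared node + access control
  if (httpApiClient) return { kind: "http-api", ipfs: httpApiClient }; // good: local daemon
  return { kind: "js-ipfs", ipfs: spawnJsIpfs() };                     // last resort: in-page node
}
```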

@mitra42 we started experimenting with a subset of the API that runs without authentication in ipfs-companion's window.ipfs proxy; the current whitelist is here. The lack of a permission prompt comes at the price of a rogue website being able to preload malicious content to your node via dht.get, or to find out your identity by adding unique content and doing dht.findprovs. Similar tracking is possible on the old web with XHRs, but an IPFS node also shares the preloaded data, which may be problematic in some scenarios.

@mitra42

mitra42 commented Dec 21, 2018

We really don't want to be running this through ipfs-companion. We want to run IPFS in the web browser, with the libraries (js-ipfs and js-ipfs-api) integrated in the page, so that the user doesn't NEED to do anything other than visit the page; but we do want to take advantage of a local peer if one exists. I acknowledge the risks, but I think they are much smaller than the loss of functionality from not being able to use a local IPFS peer at all, or even worse the current situation, where people running a peer must choose between not being able to use it for anything local (leaving CORS restrictions on) or exposing themselves to all kinds of malicious attacks by turning CORS restrictions off, since there is no authentication even for damaging activities.

@fiatjaf

fiatjaf commented Dec 21, 2018

To me it seems that IPFS Companion is great because it enables opt-in. I really don't want websites using my local IPFS node just because I have one. But if I enable IPFS Companion, then I'm telling them they can.

At the same time, IPFS Companion abstracts away the need to inject IPFS libraries and/or make manual calls to the IPFS API from webapps that may use a local IPFS node. You can just use window.ipfs (if it is present and allowed) and that's it; otherwise you don't use it, or fail entirely and tell the user about it.

@Gozala

Gozala commented Dec 21, 2018

To be clear, what I was suggesting is to make, say, companion.ipfs.io facilitate pretty much what the ipfs-companion add-on does today, through a service worker. If you also happen to have the add-on installed, the SW could leverage that as well.

As for opting in / permissions, companion.ipfs.io could do that based on the client origin.

@lidel lidel added the status/ready Ready to be worked label Dec 21, 2018
@mitra42

mitra42 commented Dec 21, 2018

@fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org) that wants to run in any situation: it can bundle js-ipfs (and js-ipfs-api), but it can't require users to download anything. We have code that tries to autodetect in our IPFSAutoConnect function at [https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81]. It currently fails in most cases because the local IPFS peer refuses CORS.

A vanishingly small portion of users will have IPFS Companion installed, because (as far as I can tell) it doesn't add anything unless they want to interact with IPFS directly. Some might have IPFS, or a nearby IPFS node, as part of the dweb-mirror project. We could include the IPFS code from ipfs-companion in the Wayback Machine extension, which a larger number will have installed, but we haven't had anyone (volunteer or paid) with the bandwidth and browser-extension expertise to either bundle js-ipfs directly into our extension, or bundle some part of ipfs-companion and figure out all the browser limitations.

@Gozala

Gozala commented Dec 27, 2018

@fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org) that wants to run in any situation: it can bundle js-ipfs (and js-ipfs-api), but it can't require users to download anything. We have code that tries to autodetect in our IPFSAutoConnect function at [https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81]. It currently fails in most cases because the local IPFS peer refuses CORS.

I am building a proof of concept of proposed idea. I'll be happy to share it here once it's ready.

@Gozala

Gozala commented Dec 30, 2018

I’ve put together a proof of concept showing that the proposed idea is possible. There is some good news and some bad news. I’ll start with what I have working:

https://github.com/gozala/lunet

  • I have a static site that installs the APSW - Access Point Service Worker. At the moment it just acts as a proxy to the native app service, but obviously it could also talk to gateways and whatnot in the future.

  • The above static site also serves a bridge.html file that is meant to be embedded through an iframe. It expects a message from the embedder with a MessagePort and forwards that to the APSW along with the origin of the embedder. That allows the APSW to check permissions per origin and either accept the connection, reject it, or ask the user for consent.

  • I have a simple systray app that exposes a REST API on https://127.0.0.1:9000. On first run it also generates a self-signed SSL key + certificate and adds it to the trusted roots on the system. That way it can serve over HTTPS, but sadly Firefox does not consult the system on whether the root is trusted, so it does require you to Accept the Risk and Continue.

  • At the moment I have the https://lunet.link DNS records configured to resolve to 127.0.0.1, so if you have npm run remote running, loading that URL will install the APSW.

  • I also have a demo on https://ipfs.io/ipfs/QmSYickcuNoda1ZNShbtT5WpRxMG1jqzUqEypYYpvYBLAY/ that will attempt to connect to the APSW installed by https://lunet.link and do all the ceremony to serve content from the systray app.
    P.S.: I expect that apps would instead have their own custom domains.
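The bridge.html handshake described above can be sketched as a relay that tags each request with the embedder's origin before forwarding it to the APSW. The names and message shapes here are illustrative, not lunet's actual protocol:

```javascript
// bridge.html-style relay: the embedder hands over a MessagePort, and
// every request received on it is forwarded to the APSW together with
// the embedder's origin, so the APSW can check permissions per origin.
function createBridge(apswPort, embedderOrigin) {
  return function connect(appPort) {
    appPort.onmessage = event => {
      apswPort.postMessage({ origin: embedderOrigin, request: event.data });
    };
  };
}
```

Keeping the origin tagging inside the iframe's code (rather than trusting the app) is what lets the APSW enforce per-origin permissions.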

As of bad news:

  • Keeping the APSW alive is virtually impossible. I have an imperfect workaround in place that repeats the reconnect flow if the APSW is no longer alive. I think it can be improved enough to be smooth.

  • Another problem with the APSW lifetime is that it could be terminated while it serves data through a MessagePort; that is because SWs are only kept alive while they are responding to fetch / install / activate events. This is a problem because if it serves, say, a video stream through a MessagePort, it can be terminated in the process. Pub/sub is also going to be a problem :(

    I think there is a way to overcome this limitation by keeping the iframe around and making it poll the SW on a regular basis. Unfortunately that is far from ideal, mostly because the app would need to consciously keep some iframe in the DOM. It might still be possible to make it smooth by having a script that creates and adds that iframe into head rather than body and does some coordination between the SW and the iframe, but I was really hoping for a smoother experience.

@Gozala

Gozala commented Jan 6, 2019

I made a little more progress on my prototype:

  • I have updated DNS records now and the static site that does APSW registration is now hosted on gh-pages https://github.com/Gozala/lunet/tree/master/docs

    I wanted to host it on IPFS itself, but I did not manage to find a reasonable deployment option that would keep it up, provide HTTPS on my domain, and reflect changes in a reasonable timeframe.

  • In theory the IPFS HTTP API should be sufficient, but because it is plain HTTP it only works on Chrome. So I still have the native app that pretty much acts as an HTTPS proxy for the IPFS HTTP API (assuming one runs on http://127.0.0.1:5001/), so it works on Safari & Firefox.

    I think it would make the most sense to do the self-signed certificate trick from the IPFS HTTP API itself and extend it in a few ways to handle permissions. @lidel I'd like to participate in the API v2 design if possible.

So with npm run local running & the IPFS daemon running (with Access-Control-Allow-Origin configured to respond to https://lunet.link), I'm able to access IPFS content through the Service Worker. In fact, I'm able to load webui and it seems to work with no changes (except on Safari, because it blocks http://127.0.0.1 from HTTPS; that should be fairly easy to fix, as webui would just need to talk to the SW instead).

[screenshot, 2019-01-05]

Disclaimer: I need to fix how the SW updates; right now the only way to update it is to manually unregister it from devtools and then load https://lunet.link so it can install a fresh one.

@Gozala

Gozala commented Jan 6, 2019

The next thing I want to do is create another site, say https://gozala.io/webui-demo, that would embed lunet.link to host just webui.

BTW, I think the IPFS HTTP API would need to learn to pick up some config changes through the API itself. Ideally, https://gozala.io/webui-demo during first run would do an OAuth-like flow with http://lunet.link and through that configure the IPFS HTTP API so that Access-Control-Allow-Origin includes the https://gozala.io/ origin.

@Gozala

Gozala commented Jan 10, 2019

After more research I am considering an alternative approach. I think it would work better than the current one, where the app SW needs to connect to the daemon SW, because SWs are really eager to terminate, and that problem is multiplied by the fact that we're trying to keep the daemon SW alive and connected to the app SW: as they both race to terminate, either of them succeeding breaks the MessageChannel, which also happens to be impossible (without hacks) to detect on the other end.

This is why I'm considering an alternative approach:

The daemon site (the one that is embedded in the iframe) will spawn a SharedWorker (and fall back to a dedicated-worker pool if the API is not available - thanks, Apple 😢). This way we don't have to fight to keep the daemon SW alive: as long as one daemon page is around, the worker will be able to keep the connection alive. In practice that should be the case as long as there is at least one active client app. The only case where that is not true is if all apps have been closed and you later open one; that case is fairly easy to detect (the SW has no clients), in which case it can serve a page that just embeds the daemon iframe and, once the connection between the daemon worker and the SW is established, redirects to the actual page that was requested. (Please note that this sounds complicated, but it is what happens in the current setup and works remarkably well.)

It does imply that client apps need to embed the daemon iframe, or else the corresponding worker will terminate. However, that was more or less a problem already, and I was already considering working around it by appending to navigation responses. Additionally, that added markup can be used to prompt the user for permissions (and it needs to be within the iframe so privileges can't be escalated).

This approach has an additional advantage for the in-browser node case, as frequent terminations don't exactly mix well with that.

The trickiest bit is going to be supporting browsers without the SharedWorker API. In that case the idea is as follows: once the iframe with the daemon loads, it says "hello" on a BroadcastChannel. If any document has already spawned a worker (let's call it the supervisor), it responds back with a MessagePort connected to its own worker and the index it was assigned (by incrementing). If no one responds within a short time frame, the document assumes supervision and starts at the first index. The supervisor, on its beforeunload event, broadcasts a "goodbye" message with the index of the next supervisor being nominated, at which point the next one in line spawns a worker and acts as supervisor. Every document messages the supervisor on beforeunload, so the supervisor can nominate a new supervisor on exit. That does mean the worker's lifetime is inconsistent; however, even in the worst possible scenario it would probably still be better than a SW already is.
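The nomination step of this SharedWorker polyfill can be sketched as pure logic, leaving out the BroadcastChannel plumbing. Picking the lowest surviving index is one plausible "next in line" policy, used here for illustration:

```javascript
// Supervisor hand-off: when the current supervisor unloads, nominate
// the lowest-indexed surviving document as the next supervisor.
function nominateNext(liveIndices, departingIndex) {
  const survivors = liveIndices.filter(index => index !== departingIndex);
  return survivors.length > 0 ? Math.min(...survivors) : null;
}
```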

It is also worth considering that if the daemon manages to connect to a companion add-on or a local daemon through the REST API, there will be no need to even spawn any workers. Still, there will be some extra work to consider, like propagating content added to the in-worker node to the local daemon.

Edit: Not sure what was supposed to follow this "Unfortunately it".

@lidel
Member Author

lidel commented Jan 14, 2019

This is great. I've been thinking about what developer-facing artifacts could be extracted from this, and I think a drop-in library/toolkit that acts as a replacement for standalone js-ipfs is the way to go, as it should help with addressing two high-level problems:

  1. "Running the same website (Origin) in multiple tabs without spawning multiple instances of js-ipfs"
    • Every website runs its own node, once per Origin(s)
    • No user prompt, hardcoded access control: the simple ability to whitelist multiple Origins to share the same worker would make various deployments a lot easier.
  2. "Global, shared ipfs instance that can be used by any Origin" (the original end goal)
    • Possible to run your own, but most people will use the default provided by the library
    • User prompt for access control.

@Gozala I agree that SharedWorker is worth investigating. To remove the need for access control and keep things simpler, we may want to focus on (1) initially, as its security perimeter is easier to understand.

Unfortunately it

...? (the suspense is killing me 😅)

@Gozala

Gozala commented Jan 14, 2019

...? (the suspense is killing me 😅)

Oops, I'm not sure how my comment ended up like that, nor can I remember if there was anything specific I was going to say. Sorry.

@Gozala

Gozala commented Jan 14, 2019

I spent a little more time on this and currently have implemented something in between what I originally made and the alternative option I described. Current status: things work really well on Chrome and Firefox, but I'm struggling to identify the issue with Safari.

At the moment setup looks as follows:

Client App / Site

The client site (in my example, https://gozala.io/peerdium) needs to serve two files:

  1. index.html that bootstraps everything up. It looks like this:

     <meta name="mount" content="/ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/" />
     <script type="module" async src="https://lunet.link/lunet/client.js"></script>

    Here lunet/client.js does the ceremony of embedding https://lunet.link in an iframe and registering the service worker ./lunet.js (the second file, described below); the path is also configurable via a meta tag.

  2. ./lunet.js just imports https://lunet.link/lunet/proxy.js, which takes care of serving content under the mounted path (as seen in the meta tag). That means https://gozala.io/peerdium/index.html will map to /ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/index.html and will be served through the client by means of the iframe it set up. lunet.js looks as follows:

    importScripts("https://lunet.link/lunet/proxy.js")

In terms of interaction this is what happens:

  • index.html is loaded, which sets up an embedded iframe + SW.
  • Once all is set up, client.js will fetch location.href; this time it goes through the SW, and therefore the response will be for /ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/.
  • The document tree is updated with the response. It is important to update the document rather than reload, because the SW messages the client (which will be this document), which then obtains the response through the embedded iframe. If you were to reload, there would be no client for the SW to get data through. In fact, if you navigate to any page, the SW will respond with <script type="module" async src="https://lunet.link/lunet/client.js"></script>, which will do the same thing: fetch its own location and update the document with the response.
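The path mapping described above can be sketched as a small pure function. The helper name and scope handling are assumptions; lunet's actual proxy.js may do this differently:

```javascript
// Map a site URL onto the mounted IPFS path from <meta name="mount">,
// e.g. /peerdium/index.html -> /ipfs/<cid>/index.html.
function toMountedPath(mount, scope, url) {
  const { pathname } = new URL(url);
  const relative = pathname.startsWith(scope) ? pathname.slice(scope.length) : pathname;
  return mount.replace(/\/$/, "") + "/" + relative.replace(/^\//, "");
}
```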

Host

The document that the client embeds in an iframe is what I refer to as the host. The host document is also pretty much just this <script type="module" src="https://lunet.link/lunet/host.js"></script> and is what is served under https://lunet.link, which is to say the interesting stuff happens in lunet/host.js, which:

  • It registers lunet.link/service.js SW.
  • It listens on message events from the embedder.
  • Adds a listener to message.ports[0], which is what lunet/client.js will pass during init.
  • Messages on the ports are just serialized Request instances, which it deserializes and passes on to its own SW, which in turn talks to ipfs-desktop (in the future it will also have an embedded js-ipfs) to get a response and forwards the response to the requesting message port, which lunet/client.js then forwards to the lunet/proxy.js SW.

Wishlist

Here are the things I would like to change about this setup

  1. As you can see, the only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally a dnslink TXT record should be all it takes.

  2. On one hand the host should not need to register a SW, because in practice a SharedWorker would do a better job here. I still have it, though, so that it can load https://lunet.link/lunet/host.js while offline; however, it would make sense to figure out a way to do it without a SW.

@Gozala

Gozala commented Jan 14, 2019

It turns out Safari does not implement BroadcastChannel either, so my SharedWorker polyfill idea isn't going to work out :(

@Gozala

Gozala commented Jan 14, 2019

Alright, I think something else could be done on Safari (or anywhere SharedWorker isn't available but ServiceWorker is): we can spawn a service worker which, once activated, will start broadcasting a ping to all its clients; they in turn respond with a pong message back, and this keeps repeating:

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

const when = (type, target) =>
  new Promise(resolve => target.addEventListener(type, resolve, { once: true }))

const extendLifetime = async () => {
  await sleep(1000 * 60 * 4) // Firefox will wait 5 min on an extendable event, then abort.
  const clients = await self.clients.matchAll({ includeUncontrolled: true })
  for (const client of clients) {
    client.postMessage("ping")
  }
  // Stay alive until the next "pong" message arrives from a client.
  await when("message", self)
}

self.addEventListener("activate", event => event.waitUntil(extendLifetime()))
self.addEventListener("message", event => event.waitUntil(extendLifetime()))

I believe this should keep the Service Worker alive and going as long as there are clients talking to it, which is in fact the case for a SharedWorker.

@Gozala

Gozala commented Jan 14, 2019

Got it working across Firefox, Chrome & Safari!

[screenshot, 2019-01-14]

@lidel
Member Author

lidel commented Jan 15, 2019

This is fantastic, especially getting it to work on Safari 👍

I really like the mount metaphor, and how small the amount of code is that the end developer needs to put on the static page. This is exactly what we should aim for.

@Gozala Regarding the first item from your Wishlist:

As you can see, the only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally a dnslink TXT record should be all it takes.

We have an API for DNSLink lookups, but we may want to support the <meta> header as an optional fallback. Something like this:

  1. Try to get the latest value for the mount point by reading DNSLink:
    • ipfs.dns(window.location.hostname, {r: true})
        .then(dnslinkPresent)
        .catch(dnslinkMissing)
  2. If (1) returns an error (no DNSLink, or the API is down), then fall back to the version from <meta> (if present)

PS: I also see how a hybrid approach could be supported, where static HTML with the regular website is returned with one extra <script>, and then the PPWA library replaces the document with a more recent version read from DNSLink.
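The two-step lookup could be sketched like this. The resolveDnslink / readMetaMount helpers are assumptions, supplied by the caller, e.g. wrapping ipfs.dns and a querySelector on the meta tag:

```javascript
// Prefer a live DNSLink record; fall back to the <meta name="mount">
// value when the lookup fails (no DNSLink, or the API is down).
async function resolveMount(hostname, { resolveDnslink, readMetaMount }) {
  try {
    return await resolveDnslink(hostname);
  } catch (_error) {
    return readMetaMount();
  }
}
```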

@Gozala

Gozala commented Jan 15, 2019

@lidel I’ve considered doing a DNS lookup instead of a meta tag (as per your suggestion); however, the goal is for the user not to have to host a static site for bootstrapping in the first place. Basically, I want the flow to be ipfs add -r ./ and adding the hash to a DNS record. Which is to say, ideally the gateway would just serve the bootstrap page and lunet.js.

@Gozala

Gozala commented Jan 24, 2019

Partially related to my last comment: in the Mozilla DevTools protocol I worked on a thing that allowed the server to describe its protocol during connection by passing down an API spec, which the client side then used to generate a full client. This avoids having to maintain a client that more or less just encodes / decodes / tracks the exchange and occasionally gets out of sync with the server or breaks due to some human error. It also made it possible to generate clients in other languages.

The IPFS API would be even easier, as there is no notion of GC-able objects in the server process (which complicates things) that you would want to reflect in the client. How feasible is it to compel stakeholders to do something along those lines?
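The spec-driven approach can be sketched in a few lines: the server ships a description of its methods, and the client fabricates stubs from it instead of hand-maintaining them. The spec shape and transport callback here are invented for illustration:

```javascript
// Generate a client object from a server-provided API spec. Each
// generated method forwards its arguments over the supplied transport.
function generateClient(spec, send) {
  const client = {};
  for (const { name, path } of spec.methods) {
    client[name] = (...args) => send({ path, args });
  }
  return client;
}
```

Because the stubs are derived from the spec at connect time, the client can never drift out of sync with the server's actual surface.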

@lidel
Member Author

lidel commented Jan 24, 2019

I think it would make much more sense if there were something like an IPFSDaemon() API that would just take a Request complying with the REST API and return a Response also corresponding to the REST API. That way it would avoid multiple serialization, deserialization, dispatch, etc. steps.

@Gozala I fully agree. The code responsible for the "REST API" (HTTP Gateway) in js-ipfs is disabled in browser contexts because it can't open a TCP port, but we could expose it as a JS function to simplify use in contexts like a Service Worker. See my proposal in ipfs/js-ipfs#1820; if you have ideas on how the browserified Gateway API should look, comment there.

[..] passing down an API spec, that then was used by a client side to generate full client to avoid having to maintain client. [..] How feasible is it to compel stakeholders to do something along those lines?

It sounds like something worth discussing when IPFS API v2 is designed this year. I am sure any idea that could decrease the maintenance burden of js-ipfs-http-client, ipfs-postmsg-proxy, etc. will be taken into consideration. I will CC you when there is a meta-issue on this.

@Gozala

Gozala commented Jan 31, 2019

Status update

I spent more time on this to get the in-browser fallback working. It took quite a bit more effort than I anticipated, but the good news is that it works. Below is an image of a fully functional webui loaded via lunet from IPFS through the in-browser node, with 0 changes (to the webui code).

[screenshot, 2019-01-31]

My peerdium demo also works with in-browser node with 0 changes as well. 🎉

At the moment this version lives in a separate branch because:

  • I commented out the local gateway proxy version to ease development.
  • The code could use some cleaning up.
  • I need to add logic that tries the optimistic path with the local gateway and degrades to the in-browser node version.

Details

  • I ended up replicating a lot of logic from js-ipfs that deals with daemon / gateway REST endpoint handling. A better long-term solution would be to factor out the relevant pieces from js-ipfs, as suggested in "Factor out Daemon / Gateway endpoints handlers" (js-ipfs#1855), and make them server/browser agnostic. I considered doing that, but decided it was not the most effective path for making progress on this exploration, as there are a bunch of non-trivial constraints to consider.
  • Host document (one in the iframe) spawns a SharedWorker

Open questions (would love feedback)

  • If the local gateway is unavailable, the host document will spawn an in-browser node; but what if the local node was only temporarily down and comes back up later? Should we occasionally retry connecting to the local node?

  • The user should never lose data, and the system should never be broken! This is really tricky to get right, I'm afraid; the specific challenges are:

    • What to do when the user used the in-browser node first and then the native IPFS node? It is possible to replicate data from the in-browser node to the native one, but it does not seem trivial to deal with conflicts, or with timing in case there is a lot of data to be pushed.
    • What to do when the user primarily uses the native node, but later the in-browser node? This is a trickier situation, as presumably the native node is unreachable, so data can't be synced. Maybe the in-browser node should always be active and replicating?

    I'm inclined to think that some lightweight version of the in-browser node should always be there: if not replicating data, then at least maintaining a list of relevant CIDs, so that in case the native node is down it can still present the user with a consistent library and lazily attempt to fetch the relevant data off the network.

  • The daemon REST API is not optimized for browser clients. For example, tar-encoded file lists are served on some requests, which requires both server and client to use a library to deal with that. On the client side that means pulling in an additional library for very little benefit; on the server it matters less. However, given that the in-browser node serves the same API through a worker, it not only requires that library on the other end but also wastes CPU cycles first encoding and then decoding. Can we redesign the v1 API to be optimized for a client instead, such that plain fetch would be a very reasonable choice?

  • OMG, sooo nodejs! js-ipfs is definitely optimized for Node, and undoubtedly that made a lot of sense at the time. However, there is a huge amount of code bringing in Node stuff that then needs to be adapted to the browser: streams, buffers, pull-streams, http... Browsers have come a long way since then; most of these have, if not superior, then definitely good-enough built-in alternatives. Any chance we could embrace all this progress in browser land? It would both make the library easier to use in the browser and reduce its size.
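The lightweight always-on node idea from the first open question could be sketched as a minimal CID registry. The class and method names are invented, and persistence (e.g. via IndexedDB) is omitted:

```javascript
// Track the CIDs the user cares about so the UI can show a consistent
// library even while the native node is down, and sync back up later.
class CidLibrary {
  constructor() {
    this.cids = new Set();
  }
  remember(cid) {
    this.cids.add(cid);
  }
  list() {
    return [...this.cids];
  }
  // When the native node is reachable again, ask it to pin everything.
  async syncTo(pin) {
    for (const cid of this.cids) await pin(cid);
  }
}
```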

@Gozala

Gozala commented Feb 1, 2019

I'm inclined to think that some lightweight version of in-browser node should always be there if not replicating data at least maintain list of relevant CIDs so that in case of native node being down it can still present a user with a consistent library & lazily attempt to fetch relevant data off of network.

Should I be looking at the IPFS Cluster stuff for this?

@MidnightLightning

@Gozala What to do when the user used the in-browser node first and then the native IPFS node? It is possible to replicate data from the in-browser node to the native one, but it does not seem trivial to deal with conflicts, or with timing in case there is a lot of data to be pushed.

Should I be looking at IPFS Cluster stuff for this stuff?

I found this issue while searching around a concept I've been thinking more about. The idea of having both an in-browser node and a native/standalone node is something I think should be fleshed out as a user norm.

Using as an example how users interact with services like Dropbox: I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than the ones I use most frequently on my laptop or desktop. I think it would be ideal if, among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, it additionally allowed a user to indicate another node as "theirs" (add authentication/credentials?), and then those nodes actively sync pins/virtual filesystem structures between them.

In that way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data. I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary), and would like the control to indicate that my workstation node should pin/keep a copy of everything the others pin (primary backup); the laptop should as well, when it's online (secondary backup); and the phone node would only pin important things (space concerns), though being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

From that perspective, it would be fine if all in-browser nodes stayed in-browser nodes (no need to "change over" to a standalone node if it came back online), but pinned/known file syncing could be very useful.

@mikeal

mikeal commented Feb 4, 2019

I'm inclined to think that some lightweight version of the in-browser node should always be there: even if it's not replicating data, it should at least maintain a list of relevant CIDs so that, in case the native node is down, it can still present the user with a consistent library & lazily attempt to fetch the relevant data off of the network.

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser IPFS node” would be and a js-ipfs service worker?

The reason I’d like to think about things this way is that this exercise might surface the difference between “features missing from the web platform” that we can’t have without native integration, and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent. Having a similarly scoped shared resource will need to drastically improve that security story, and it’s not yet clear to me whether this is the responsibility of IPFS or whether we’re actually missing a feature or integration in the browser.

@Gozala

Gozala commented Feb 4, 2019

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser IPFS node” would be and a js-ipfs service worker?

I can only speak for myself, and what I think & am going for is "the browser is your IPFS node". JS-IPFS, SW, IPFS-Desktop, etc. are just polyfills to deliver / explore that experience.

The reason I’d like to think about things this way is that this exercise might surface the difference between “features missing from the web platform” that we can’t have without native integration, and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

I think there is a general assumption that the web platform lacks the features needed to implement a full-fledged IPFS node in the web content context. I think that's an incorrect way to look at things. Even if browsers exposed all the low-level networking primitives to allow it (which is highly unlikely), each browser tab running its own IPFS node would be a terrible experience.

That is to suggest that if / when IPFS is adopted by a browser, the browser itself will become the IPFS node and will expose a limited API to access & store content off the network. And yes, it will impose the same / similar origin-separation concerns as it does today.

The goal of this exploration is to polyfill the described experience through the variety of tools available:

  • If IPFS-Desktop is available, use that, as it provides actual p2p access to the network.
  • Otherwise, fall back to JS-IPFS and routing servers instead.

That way applications:

  1. Can be loaded off the IPFS network
  2. Are isolated via origin
  3. Have a way to read / write data (based on the origin)
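The fallback order above can be sketched as a small selection function. This is a hedged illustration, not lunet's actual code: `probeNative` is a hypothetical injected check (e.g. an HTTP ping of the local daemon's API), kept abstract so the selection logic stands on its own.

```javascript
// Sketch of the provider selection described above: prefer a native node
// (IPFS-Desktop / local daemon) when reachable, otherwise fall back to an
// in-browser js-ipfs node plus routing servers.
async function chooseProvider (probeNative) {
  try {
    if (await probeNative()) return 'native' // IPFS-Desktop / local daemon
  } catch (_err) {
    // daemon unreachable; fall through to the in-browser fallback
  }
  return 'js-ipfs' // in-browser node + routing servers
}
```

For example, `chooseProvider(async () => { throw new Error('ECONNREFUSED') })` resolves to `'js-ipfs'`.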

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent.

Yes, and that is a huge issue waiting to be exploited. I would absolutely encourage locking it down. Last time I checked, the ipfs daemon / gateway comes with a default of Access-Control-Allow-Origin: *, which means any site could take control of it and exploit it.

Both should be locked down to one single origin, maybe access.ipfs.io or something along those lines. The same is true for the routing services that js-ipfs connects to: they should be open only to a single origin controlled by PL.
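For reference, a go-ipfs/Kubo daemon can be locked down along these lines via its config. This is a hedged example: `access.ipfs.io` is the hypothetical trusted origin from the comment above, not a real deployed service.

```shell
# Restrict the HTTP API's CORS headers to one trusted origin
# instead of the wide-open "*" default described above.
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Origin '["https://access.ipfs.io"]'
ipfs config --json API.HTTPHeaders.Access-Control-Allow-Methods '["POST"]'
# Restart the daemon for the new headers to take effect.
```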

Having a similarly scoped shared resource will need to drastically improve that security story and it’s not yet clear to me where this is the responsibility of IPFS or if we’re actually missing a feature or integration in the browser.

No features are missing on that end. What this PoC does is use a special origin (in this case lunet.link, but it should be access.ipfs.io or something) that provides access to the IPFS network and mediates access control with the user. This way it is able to control access rights based on the user's consent rather than allowing each app / site to do whatever it wants. This also allows the user to access a library of all their data in one place. While that is what the browser should be doing, I think it is worth doing it on the IPFS end right now to exercise this, learn from the experience, and also establish a cowpath.

@Gozala

Gozala commented Feb 7, 2019

I now have a cluster branch which runs an in-browser node and attempts to use the local native node through its REST API simultaneously. At the moment it's pretty dumb: it just forwards requests to both nodes and attempts to serve the response from the native node, with a fallback to the in-browser node.
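That dual-dispatch strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the branch's actual code: `nativeFetch` and `browserFetch` are hypothetical stand-ins for requests to the native node and the in-browser node.

```javascript
// Forward the request to both nodes up front, prefer the native node's
// response, and fall back to the in-browser node if the native one fails.
async function fetchWithFallback (path, nativeFetch, browserFetch) {
  const fromNative = nativeFetch(path)
  const fromBrowser = browserFetch(path)
  fromBrowser.catch(() => {}) // avoid an unhandled rejection if never awaited
  try {
    return await fromNative // native node answered
  } catch (_err) {
    return await fromBrowser // native node down: serve from in-browser node
  }
}
```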

At the moment there is no attempt to sync the two; for that it would probably make the most sense to borrow the logic from ipfs-cluster rather than trying to hack things together.

I'll focus on getting this working in Safari through a SW polyfill for SharedWorker, and then deploy the current version.

@Gozala

Gozala commented Feb 7, 2019

Responding to @MidnightLightning

Using as an example how users interact with services like Dropbox, I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than I use most frequently on my laptop or desktop.

That is a good point! However, the case of native node vs in-browser node is different, as from the user's point of view it's the same device.

I think it would be ideal that, among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, it additionally allows a user to indicate another node as "theirs" (add authentication/credentials?), and that those nodes then actively sync pins/virtual filesystem structures between them.

I have been thinking about this in a slightly different way. I imagine the library organized as "collections" (or threads, in Textile terms). The idea is that you can invite others to collaborate on those collections. Those others can be your other devices, your friends, or pinning services.

In that way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data. I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary), and would like the control to indicate my workstation node should pin/keep a copy of everything the others pin (primary backup), the laptop should as well, when it's online (secondary backup), and the phone node would only pin important things (space concerns), but being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

I think all those use cases fit nicely with the above-described solution; furthermore, it follows the interaction flow: during the sharing / publishing phase the user chooses whom to share with.

Implementation-wise, it seems that a "collection" should just be an "ipfs-cluster".

@lidel
Member Author

lidel commented Feb 11, 2019

@Gozala I like the idea of a seamless/self-healing abstraction for the browser context (Access Point Facade), but figuring out how to handle the surface of the IPFS API when the API provider is a facade on top of multiple nodes going online and offline will be a challenge.

Agree we should look at ipfs-cluster for inspiration, but the security considerations will not fit exactly. E.g. in the browser we want to build a security perimeter around the Origin and, based on it, introduce key/MFS write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config, etc.

Seems that the MVP would need:

  • add, cat, ls, refs, object, dag, block – for low-level file, DAG and block operations
    • I think dag will supersede object at some point, but for now we need both
  • scoped per origin: files – for per-dapp storage
  • scoped per origin: name + key – for publishing to IPNS
    • potential open issue: how to share the publishing key between nodes? Key import/export is only supported by js-ipfs; go-ipfs only allows creating a new key
  • some API to control "follow"/"sync" between nodes behind the Access Point Facade – even if it is just add/remove for a new node

@Gozala

Gozala commented Feb 11, 2019

@Gozala I like the idea of a seamless/self-healing abstraction for the browser context (Access Point Facade), but figuring out how to handle the surface of the IPFS API when the API provider is a facade on top of multiple nodes going online and offline will be a challenge.

I agree. I am also getting more and more convinced that exposing the full IPFS API may not be a good idea in the first place. While it is cool to have webui running over this, I think it's the wrong abstraction for most apps.

Agree we should look at ipfs-cluster for inspiration, but the security considerations will not fit exactly.

I need to write a coherent story about the experience I have in mind, but before I get around to doing it, here is the gist:

  • The exposed API allows the embedder to save data into the library without any permission requests. However, the data has a per-origin quota (to reduce abuse). Such data is local to a device (even though the local go-ipfs and in-browser js-ipfs will be syncing, in case one goes down). Think of it as local "draft" data.
  • There will be an API to publish a "draft". On a publish request, lunet will trigger a user interaction (under its own origin, to prevent the embedder from escalating privileges). Think of it as "save as...", except the user will have a choice to publish publicly under their own identity, or privately / with a group, in which case the actual data is encrypted and published to IPFS, providing the user a URL that embeds both the CID and the decryption key. On the recipient side the key will be saved and the data decrypted on reads by lunet. The goal is that the app knows nothing about the encryption / decryption going on at either end of the pipe.
  • During the publishing flow the user will be able to add "bots" that are essentially replication nodes forming an IPFS cluster. The actual UX needs quite a bit of thinking through, but the general idea is that there will be a kind of "groups" (or threads, in Textile terms) that form an IPFS Cluster to which you publish. Which is to say, IPFS clusters will just need to orchestrate replication; they don't need to have the full IPFS API.
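The "URL that embeds both the CID and the decryption key" idea can be sketched like this. The URL shape, host, and function names are illustrative assumptions, not lunet's actual format; the point is that the key travels in the URL fragment, which browsers never send to servers, so only the recipient's client ever sees it.

```javascript
// Build and parse a share URL of the shape https://lunet.link/<cid>/#key=<key>.
function makeShareURL (cid, keyBase64url) {
  return `https://lunet.link/${cid}/#key=${keyBase64url}`
}

function parseShareURL (url) {
  const u = new URL(url)
  const cid = u.pathname.split('/')[1]          // first path segment
  const key = new URLSearchParams(u.hash.slice(1)).get('key') // from fragment
  return { cid, key }
}
```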

E.g. in the browser we want to build a security perimeter around the Origin and, based on it, introduce key/MFS write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config, etc.

Absolutely! However, I think that should happen at the lunet (Access Point Facade) level, before any calls are issued to any of the IPFS nodes.

On the sandboxing, I'm still working out some details in my head, but I think there is a real opportunity to improve on the mess we're in on the conventional web by limiting read / write access to only the app's resources / the document being operated on. The largest issues on the web are due to third parties tracking and aggregating user data on their servers. I think it would be really great if we enforced a setup like ://${app}/${data}, where app maps to some CID that the app is therefore able to read / load stuff from, and ${data} is an MFS entry that the app is able to both read and write.

In this setup the app can't really spy on the user; sure, it can save some data, but that data is local and the user personally needs to choose to share it, and even then the app isn't really able to let its own server know where to grab it from.

There are things to be worked out, but I'm inclined to think that a combination of SW & sandboxed iframes might allow for such sandboxing.

@Gozala

Gozala commented Feb 13, 2019

I finally got it working in Safari 🥳 (Debugging a SW in Safari is quite a throwback to the old days of JS with no debuggers, except there’s no alert or any reliable way to print output either 😂)

Now the peerdium fork loads with no changes, using the in-browser node through a SharedWorker polyfill implemented with a ServiceWorker.

Issues

However, some content, like posts created by me, seems to fail to load; specifically, the js-ipfs call ipfs.get(cid) returns a promise that never resolves nor rejects. I also see a corresponding request to a (gateway?) server which succeeds with 200, but still nothing from js-ipfs, and given no reliable debugging of a SW in Safari I was unable to track down the issue. The same code with the same codepath works as expected in Firefox and Chrome, so 🤷‍♂️

I’ll do more digging tomorrow, but thought I’d post in case this is a known issue.

@Gozala

Gozala commented Feb 13, 2019

Safari also seems to reject POST requests with form data as the body.
Edit: It also appears that FormData isn't available in the ServiceWorker context in Safari, which might be the underlying reason.

@Gozala

Gozala commented Feb 13, 2019

It seems that in Safari the call to resolver.cid(ipfsNode, "/ipfs/QmedJqYfddTzygxpcDrtTJBupHn4qGHntPXBx8APNM5gE1") never resolves or rejects. That call is from the ipfs-http-response library, specifically:

https://github.com/ipfs/js-ipfs-http-response/blob/7746dab433e3a57652e3222bb1cc6051a09576be/src/index.js#L63
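A generic wrapper like the following makes "never resolves, never rejects" hangs visible by racing the suspect promise against a timeout. This is an illustrative debugging helper, not part of js-ipfs or ipfs-http-response.

```javascript
// Reject with a labeled error instead of hanging forever.
function withTimeout (promise, ms, label = 'operation') {
  let timer
  const timeout = new Promise((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    )
  })
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

E.g. wrapping the hanging call as `withTimeout(resolver.cid(ipfsNode, path), 5000, 'resolver.cid')` would surface a timeout error rather than silently stalling.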

@Gozala

Gozala commented Feb 16, 2019

I am exploring an alternative approach for loading, described here: Gozala/lunet#2 (comment)

@Winterhuman
Copy link

Winterhuman commented Dec 21, 2022

Just to partly revive the discussion, https://datatracker.ietf.org/doc/draft-ietf-dnsop-alt-tld seems like an interesting idea to remember. It reserves .alt as a non-DNS namespace which applications can reserve for themselves, like CID.ipfs.alt leading to a local IPFS node if desired (and it says the names don't need to be DNS compliant!)
