RFC6249: Metalink/HTTP: Mirrors and Hashes #179

Open
lidel opened this issue Feb 13, 2021 · 2 comments
Labels
need/analysis Needs further analysis before proceeding need/community-input Needs input from the wider community needs clarification P2 Medium: Good to have, but can wait until someone steps up topic/http-gateway

Comments

@lidel
Member

lidel commented Feb 13, 2021

This is a more powerful alternative to Alt-Svc, discussed in #144.
There is also an IETF draft with Content-Digest and Want-Content-Digest headers (we track that in #185), but this one seems to be more flexible.

RFC6249 enables metalink hints to be returned as HTTP response headers:

1. Introduction

Metalink/HTTP is an alternative and complementary representation of
Metalink information, which is usually presented as an XML-based
document format RFC5854. Metalink/HTTP attempts to provide as much
functionality as the Metalink/XML format by using existing standards,
such as Web Linking RFC5988, Instance Digests in HTTP RFC3230,
and Entity Tags (also known as ETags) RFC2616. Metalink/HTTP is
used to list information about a file to be downloaded. This can
include lists of multiple URIs (mirrors), Peer-to-Peer information,
cryptographic hashes, and digital signatures.

1.1. Example Metalink Server Response

This example shows a brief Metalink server response with ETag,
mirrors, Peer-to-Peer information, Metalink/XML, OpenPGP signature,
and a cryptographic hash of the whole file:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: <http://www2.example.com/example.ext>; rel=duplicate
   Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
   Link: <http://example.com/example.ext.torrent>; rel=describedby;
   type="application/x-bittorrent"
   Link: <http://example.com/example.ext.meta4>; rel=describedby;
   type="application/metalink4+xml"
   Link: <http://example.com/example.ext.asc>; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=9HVXcpSXzGTuTNHu/JcJIggAJSzgRWF8GzWGCMe8hgo=
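
To make the client side of the RFC example concrete, here is a minimal Python sketch that pulls the mirror URIs out of `Link` header values. It assumes one link per header line, as in the example above, and does not handle several comma-separated links in a single header. The names `parse_link_value` and `mirrors` are made up for illustration and are not part of any existing library.

```python
def parse_link_value(value: str) -> dict:
    """Parse one Link header value such as '<uri>; rel=duplicate; type="x"'.

    Assumption: exactly one link per value (no comma-separated lists).
    """
    target, _, rest = value.strip().partition(">")
    if not target.startswith("<"):
        raise ValueError(f"not a link-value: {value!r}")
    link = {"uri": target[1:]}
    for param in rest.split(";"):
        name, sep, val = param.strip().partition("=")
        if sep:
            # Parameter values may be quoted (type="...") or bare (rel=duplicate).
            link[name.lower()] = val.strip().strip('"')
    return link


def mirrors(link_values) -> list:
    """Return the URIs of all links marked rel=duplicate, in header order."""
    parsed = (parse_link_value(v) for v in link_values)
    return [link["uri"] for link in parsed if link.get("rel") == "duplicate"]


# Link values taken from the RFC example above.
EXAMPLE_LINKS = [
    "<http://www2.example.com/example.ext>; rel=duplicate",
    "<ftp://ftp.example.com/example.ext>; rel=duplicate",
    '<http://example.com/example.ext.meta4>; rel=describedby; '
    'type="application/metalink4+xml"',
]

print(mirrors(EXAMPLE_LINKS))
```

A real client would more likely rely on its HTTP library's Link parsing; the point here is only that `rel=duplicate` entries are the mirror list and `rel=describedby` entries carry metadata.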

Ideas for using this on HTTP Gateways

Having this defined in an existing HTTP standard makes it much easier for us to implement things we have always wanted, without inventing IPFS-specific proprietary semantics. Below is a short list of the most obvious ideas; comments with additional ideas are welcome.

(A) Return a hash in the Digest field, using HTTP-native semantics to enable verifiable gateway responses (#128)

If we are returning a small file that fits in a single IPFS block and was hashed with SHA (or another function supported by the web platform), we could return the hash as-is.

We could also return a raw Multihash or the CID of the entire DAG. Details would have to be worked out around our plans to standardize Multihash before CID, etc., but in broad strokes it could look something like this (either MH or CID):

    Digest: SHA-256=e7EpE2zVw5H2okAeXLcxdXXc95NSJJU2vqOpN675vZw=
    Digest: MH=QmWfVY9y3xjsixTgbd9AorQxH7VtMpzfx2HaWtsoUYecaX
    Digest: CID=bafybeid3weurg3gvyoi7nisadzolomlvoxoppe2sesktnpvdve3256n5tq     
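
For the plain SHA-256 case, client-side verification could look roughly like the Python sketch below. It assumes the `Digest` value covers the whole response body and carries a single `SHA-256=<base64>` entry. The `MH=` and `CID=` forms above are only proposals, so they are deliberately not handled, and `verify_sha256_digest` is a hypothetical helper name.

```python
import base64
import hashlib


def verify_sha256_digest(body: bytes, digest_header: str) -> bool:
    """Check a response body against a 'SHA-256=<base64>' Digest value.

    Assumption: the header holds exactly one digest, for the full body.
    """
    algo, _, encoded = digest_header.partition("=")
    if algo.strip().upper() != "SHA-256":
        raise ValueError(f"unsupported digest algorithm: {algo!r}")
    expected = base64.b64decode(encoded.strip())
    return hashlib.sha256(body).digest() == expected


# Usage: build a header for a known body, then verify it.
body = b"hello"
header = "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
print(verify_sha256_digest(body, header))
print(verify_sha256_digest(b"tampered", header))
```

This only works when the digest algorithm is one the client already has, which is why the single-block SHA case is the easy one; verifying an `MH=` or `CID=` value for a multi-block DAG would need IPFS-aware code.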

(B) URI hint that the content is available on IPFS

Opening https://en.wikipedia-on-ipfs.org/wiki/ would return mutable and immutable links to content on IPFS:

    Link: <ipns://en.wikipedia-on-ipfs.org/wiki/>; rel=duplicate
    Link: <ipfs://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq/wiki/>; rel=duplicate

To facilitate automated fallback, the list of supported formats (ipfs/kubo#8234) could be included as well.
For example, a dag-cbor CID could have:

    Link: <ipfs://bafy?format=block>; rel=describedby; type="application/octet-stream"
    Link: <ipfs://bafy?format=car>; rel=describedby; type="application/octet-stream"
    Link: <ipfs://bafy?format=dag-json>; rel=describedby; type="application/json"
    Link: <ipfs://bafy?format=dag-cbor>; rel=describedby; type="application/cbor"

(C) URI hint that the content is available on other peered gateways

go-ipfs already has a concept of Peering: friendly peers can add each other to the Peering section of their config, which ensures they stay connected to each other and can exchange blocks over bitswap without needing the DHT.

I believe we could add an opt-in field for the name of a subdomain gateway backed by a peer and, when it is present, return a Link header for each "peered gateway".

For example, if dweb.link was peered with cf-ipfs.com (Cloudflare), example.com and example.net, the response for https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.dweb.link could include:

    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.cf-ipfs.com>; rel=duplicate
    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.example.com>; rel=duplicate
    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.example.net>; rel=duplicate    
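
As a rough illustration of the server side of (C), the Python sketch below turns a CID and a list of peered gateway hostnames into `Link` header values. The opt-in config field does not exist yet, so the hostname list is passed in directly, and `peered_gateway_links` is a hypothetical name.

```python
from typing import Iterable, List


def peered_gateway_links(cid: str, gateway_hosts: Iterable[str]) -> List[str]:
    """Build one rel=duplicate Link value per peered subdomain gateway.

    Assumption: every peer exposes a subdomain gateway at <cid>.ipfs.<host>.
    """
    return [f"<https://{cid}.ipfs.{host}>; rel=duplicate" for host in gateway_hosts]


# Hostnames from the example above; in practice they would come from config.
cid = "bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq"
for value in peered_gateway_links(cid, ["cf-ipfs.com", "example.com", "example.net"]):
    print(f"Link: {value}")
```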
@lidel lidel added P1 High: Likely tackled by core team if no one steps up need/analysis Needs further analysis before proceeding need/triage Needs initial labeling and prioritization topic/http-gateway labels Feb 13, 2021
@bertrandfalguiere

I love (B). Brave could remember that for visited websites and fetch automatically via IPFS if the source website is down. Should probably be opt-in.

(A) could also enable torrent websites to indicate that the file is also available on IPFS. If they do, some people should be able to (aka: will) make indexes of equivalence between torrent hashes and IPFS CIDs. Torrent clients could then choose trustworthy indexes to rely on, and multiply sources of fetching (even though you can't download partly from BitTorrent and partly from IPFS, alternative sources could be useful for poorly seeded files).

(D) A website like DTube could point at peers that requested the file recently, so they can seed from each other and offload the server. The server would then act more as a coordinator and a last-resort seeder than as the main provider. Similar to (B), but with the extra step of explicitly giving likely providers to accelerate the discovery step of IPFS fetching.

@lidel lidel added need/community-input Needs input from the wider community needs clarification P2 Medium: Good to have, but can wait until someone steps up and removed need/triage Needs initial labeling and prioritization P1 High: Likely tackled by core team if no one steps up labels May 14, 2021
@SuzanneSoy

See also ipfs/ipfs-companion#1013. In the same spirit, it would be nice to have a way to indicate these headers directly in the HTML: <link rel="duplicate" href="ipfs://…" />, <link rel="canonical" href="ipfs://…" />, or <meta http-equiv="Link" content="<ipfs://bafy…>; rel=duplicate" /> would be possible choices. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link seems to say that the Link HTTP header is equivalent to the <link…/> HTML tag, so supporting the <link…/> tag in addition to the header seems desirable.
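
A client that wanted to honor the `<link rel="duplicate">` variant could collect the hints with something like the Python sketch below, which uses only the standard-library `html.parser`. The class and function names are hypothetical, and the `<meta http-equiv="Link">` form is not covered.

```python
from html.parser import HTMLParser


class DuplicateLinkFinder(HTMLParser):
    """Collect href values of <link> tags whose rel includes 'duplicate'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "duplicate" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_duplicate_links(html: str) -> list:
    finder = DuplicateLinkFinder()
    finder.feed(html)
    return finder.hrefs


page = (
    '<html><head>'
    '<link rel="stylesheet" href="/style.css" />'
    '<link rel="duplicate" '
    'href="ipfs://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq/wiki/" />'
    '</head><body></body></html>'
)
print(find_duplicate_links(page))
```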
