Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support ED25519 at subdomain gw with TLS #7441

Merged
merged 1 commit into from
Jul 10, 2020

Conversation

lidel
Copy link
Member

@lidel lidel commented Jun 8, 2020

The goal of this PR is to be alternative to #7358 and solve #7318 in a way that makes it possible to have functional TLS for IPNS on public subdomain gateways.

Rationale

If we split original CID into multiple sub labels (#7358) DNS resolution will work, but no public gateway will be able to get TLS certificate that supports double wildcards like *.*.ipfs.example.com.

When splitting is used, visitors using web browsers will see a TLS Error page, which effectively kills UX and defeats the purpose of subdomain gateway (one step forward by increasing security by providing Origin isolation, two steps backward by removing ability to do TLS validation when loading website backed by a long CID).

Given this reality, I believe a better trade off could be to try shortening CID by converting it to Base36, and if that fails return an error before redirect to subdomain occurs.

This PR

  • adds subdomain gateway support for ED25519 CIDs in a way that fits in a single DNS label to enable TLS for every IPNS website.
  • cleans up subdomain redirect logic and adds more explicit error handling so user gets human-readable error with proper IPFS context instead of generic DNS failure

TL;DR on router logic:

When CID is longer than 63 characters, router at /ipfs/* and /ipns/* converts to Base36, and if that does not help, returns a human readable 400 Bad Request error.

Examples

Given a subdomain gateway at example.com:

ED25519 fits perfectly when converted to Base36, so it gets loaded just fine:

http://example.com/ipns/12D3KooWP3ggTJV8LGckDHc4bVyXGhEWuBskoFyE6Rn2BJBqJtpa 
  → HTTP301 – http://k51qzi5uqu5dl2yn0d6xu8q5aqa61jh8zeyixz9tsju80n15ssiyew48912c63.ipns.example.com

CID produced by ipfs add --cid-version 1 --hash sha2-512 is too long, so an error is returned instead:

http://example.com/ipfs/bafkrgqhhyivzstcz3hhswshfjgy6ertgmnqeleynhwt4dlfsthi4hn7zgh4uvlsb5xncykzapi3ocd4lzogukir6ksdy6wzrnz6ohnv4aglcs 
  → HTTP 400 "CID incompatible with DNS label length limit of 63"

Note that returning this error overrides #6982 which means tools like curl won't be able to load content from long CIDs via the subdomain gateway. I don't think that is a bug, pretty sure it is how we want it, but mentioning here for completenes.

Why we dropped the previous iteration

We looked into automatically replacing root node which has shorter CID, but as noted in #7441 (comment) its not worth it.

Click here to expand notes from that older version

Note: This is WIP – don't look at the code yet, opening draft PR for early feedback on the idea

TL;DR

The idea evaluated here is to redirect example.com/ipfs/$CID to $dnsCID.ipfs.example.com where $dnsCID is a new CID created with a hash function known to fit in a single DNS label (<=63chars).

This way public gateways can enable HTTPS without TLS errors for content roots backed by longer hash functions.

Rationale

If we split original CID into multiple sub labels (#7358) DNS resolution will work, but no public gateway will be able to get TLS certificate that supports double wildcards like *.*.ipfs.example.com. This means errors like this.

This means when splitting is used, visitors using web browsers will see a TLS Error page, which effectively kills UX and defeats the purpose of subdomain gateway (one step forward by increasing security by providing Origin isolation, two steps backward by removing ability to do TLS validation when loading website backed by a long CID).

Given this reality, I believe a better trade off could be to create a new root CID that is compatible with DNS and make websites work in existing user agents that way.

Thoughts?

Open questions

  • Is this worth it?
  • Should we redirect to a warning page, that explains original CID can't be loaded over DNS, and gives user a choce if they want to rehash and proceed to "new CID" (..or open it from a well-known non-subdomain gateway?)
  • Which API to use? How to ensure newly minted CID is provided to the network?
  • Should we do the same from IPNS? (un-inlining ed25519) or do we want to opportunistically use Base36 there instead?
  • Should we keep support for split CIDs, and redirect them to a new CID that fits in a single label?

core/corehttp/hostname.go Outdated Show resolved Hide resolved
core/corehttp/hostname.go Outdated Show resolved Hide resolved
@ribasushi

This comment has been minimized.

@lidel lidel force-pushed the fix/long-cids-in-subdomain-gw-with-tls branch from cebf519 to 842ff8e Compare June 29, 2020 17:51
@ribasushi
Copy link
Contributor

@lidel I love this last iteration! :shipit:
Though this makes the case for #7357 even stronger ( reduce the "why is my hash different?!?!" questions )

@lidel lidel force-pushed the fix/long-cids-in-subdomain-gw-with-tls branch from 842ff8e to 4081ba1 Compare June 30, 2020 11:45
@@ -334,11 +390,16 @@ func toSubdomainURL(hostname, path string, r *http.Request) (redirURL string, ok
// if object turns out to be a valid CID,
// ensure text representation used in subdomain is CIDv1 in Base32
// https://github.com/ipfs/in-web-browsers/issues/89
rootID, err = cid.NewCidV1(multicodec, rootCid.Hash()).StringOfBase(mbase.Base32)
rootCID = cid.NewCidV1(multicodec, rootCID.Hash())
rootID, err = rootCID.StringOfBase(mbase.Base32)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we still default to Base32 here..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that even if we changed the default libp2p-key output to be base36 that people who put base32 RSA libp2p-keys into the browser domain or path wouldn't end up with the domain (and therefore origin) changed?

Similarly, is this solution too flexible? Do we need a canonical base encoding for the domains so that data stored in browser local storage isn't lost?

Copy link
Member Author

@lidel lidel Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea here is to PR a surgical change to use Base36 only where it is really needed and continue with Base32 as default for subdomains (so already existing subdomain Origins don't change, and we don't need to worry about any data lost).

I may miss your point(?), but I don't believe we need to add any additional normalization.

This comment was marked as duplicate.

Copy link
Contributor

@aschmahmann aschmahmann Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was basically about what we want to happen if I use base36 where I could've used base32.

Leaving as is may lead to me having two different views (with different local data) for the same website.

Making them canonical is both a bit of a pain and uses an arbitrary rule that we likely end up stuck with (like the Cidv0-1 size cutoff).

Wanted to double check we are ok with the pros and cons of our solution now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formal 👍 from me on flipping as much as possible to b36. I fully recognize the pain it will cause. That pain is utterly dwarfed by the cost of doing another switch further down the road.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to be as consistent as possible. Given that IPNS isn't widely used for obvious reasons anyways, I'd suggest at least using base36 for IPNS. I'm fine not canonicalizing CIDs.

The same HTML can be loaded from different hostnames on the web, and each gets distinct Origin.

That's fine. The problem is if we decide to start canonicalizing CIDs to base36 for consistency in the future. If we do that, we'll be changing the origin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I tend to think that we might as well do the b36 encoding for IPNS. IPNS users tend to be pretty happy when fixes show up and there'll probably be at least a few fixes in 0.7 or 0.8 that make it worth their while.

As an aside I've been thinking about this a bit and I suspect the confusion here stems from us not really telling people what the "correct" way is for users to describe an IPFS/IPNS URL. Do we even have a single answer here, or is it context dependent? For example, we'd like people to ideally use ipfs://b32encodedID, but since ipfs:// isn't yet widely supported we probably like dweb.link/ipfs/b32encodedID. Not having an canonical answer means we're stuck guessing about how people "could" be using URLs and so makes reasoning about upgrade paths difficult.

Copy link
Member Author

@lidel lidel Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I've updated this PR to use b36 for ALL libp2p-keys in subdomains (79d571f)

@aschmahmann I believe that was the last thing blocking this PR – mind taking a final look and merging if it is ok?

ps. regarding canonical address, for the time being it is:

https://dweb.link/ipfs/{cid}
https://dweb.link/ipns/{libp2p-key}

Works everywhere, can be shared and loads fine, and is future proof.
We need to work with gui/docs/ecosystem to make that more evident in our apps, docs and comms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks will take a look. Is there any base encoding recommended for the dweb.link paths, or is that left unspecified? 💯 on documenting the path based recommended addressing though since it will at least allow us to point people at something if they ever run into issues copy-pasting a subdomain-based URL.

@lidel
Copy link
Member Author

lidel commented Jun 30, 2020

Finished tests (go&sharness), and updated PR description to reflect current version.
CI is green – ready for review / comments.

cc @autonome @Gozala @MichaelMure for visibility as this impacts behavior of subdomain gateways in the browser

@lidel lidel changed the title WIP: support long CIDs in subdomains with TLS feat: support ED25519 at subdomain gw with TLS Jun 30, 2020
@lidel lidel marked this pull request as ready for review June 30, 2020 12:15
@lidel lidel added topic/cidv1b32 Topic cidv1b32 topic/ed25519 Issues related to ed25519 Peer IDs topic/gateway Topic gateway labels Jun 30, 2020
Copy link
Contributor

@jacobheun jacobheun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall approach looks good to me. Deferring code review to @aschmahmann

Copy link
Contributor

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Left a few questions to clarify what's going on. Thanks 😀

core/corehttp/hostname.go Outdated Show resolved Hide resolved
core/corehttp/hostname.go Outdated Show resolved Hide resolved
core/corehttp/hostname.go Show resolved Hide resolved
core/corehttp/hostname.go Outdated Show resolved Hide resolved
core/corehttp/hostname_test.go Outdated Show resolved Hide resolved
@@ -334,11 +390,16 @@ func toSubdomainURL(hostname, path string, r *http.Request) (redirURL string, ok
// if object turns out to be a valid CID,
// ensure text representation used in subdomain is CIDv1 in Base32
// https://github.com/ipfs/in-web-browsers/issues/89
rootID, err = cid.NewCidV1(multicodec, rootCid.Hash()).StringOfBase(mbase.Base32)
rootCID = cid.NewCidV1(multicodec, rootCID.Hash())
rootID, err = rootCID.StringOfBase(mbase.Base32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that even if we changed the default libp2p-key output to be base36 that people who put base32 RSA libp2p-keys into the browser domain or path wouldn't end up with the domain (and therefore origin) changed?

Similarly, is this solution too flexible? Do we need a canonical base encoding for the domains so that data stored in browser local storage isn't lost?

core/corehttp/hostname.go Show resolved Hide resolved
core/corehttp/hostname.go Show resolved Hide resolved
lidel added a commit that referenced this pull request Jul 1, 2020
#7441 (review)

License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
lidel added a commit that referenced this pull request Jul 10, 2020
Consensus reached in
#7441 (comment)
#7441 (comment)
#7441 (comment)

License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
Copy link
Contributor

@ribasushi ribasushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful! Thank you so much for shepherding this through

Copy link
Contributor

@aschmahmann aschmahmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for getting this shipped 🚢

@aschmahmann
Copy link
Contributor

aschmahmann commented Jul 10, 2020

@lidel. I merged the other PR you tagged me in first which led to this having a small merge conflict. Do you mind rebasing, pushing, and tagging me for merge?

This:

- adds subdomain gateway support for ED25519 CIDs in a way that fits in
  a single DNS label to enable TLS for every IPNS website.

- cleans up subdomain redirect logic and adds more explicit error
  handling.

TL;DR on router logic:

When CID is longer than 63 characters, router at /ipfs/* and /ipns/*
converts to Base36, and if that does not help, returns a human readable
400 Bad Request error.

Addressing code review:
#7441 (review)

refactor: use b36 for all libp2p-keys in subdomains
Consensus reached in
#7441 (comment)
#7441 (comment)
#7441 (comment)
@lidel lidel force-pushed the fix/long-cids-in-subdomain-gw-with-tls branch from 357155c to b0af543 Compare July 10, 2020 18:44
@lidel
Copy link
Member Author

lidel commented Jul 10, 2020

@aschmahmann rebased, merge at will 👍

@aschmahmann aschmahmann merged commit 231fab8 into master Jul 10, 2020
petar pushed a commit that referenced this pull request Jul 13, 2020
#7441 (review)

License: MIT
Signed-off-by: Marcin Rataj <lidel@lidel.org>
@lidel lidel deleted the fix/long-cids-in-subdomain-gw-with-tls branch July 13, 2020 23:41
lidel added a commit to multiformats/cid-utils-website that referenced this pull request Aug 26, 2020
This adds Base36 version  of CIDv1 if it fits inside of a single DNS
label. Follows convention introduced in:
ipfs/kubo#7441
lidel added a commit to multiformats/cid-utils-website that referenced this pull request Sep 18, 2020
This adds Base36 version  of CIDv1 if it fits inside a single DNS
label. Follows convention introduced in:
ipfs/kubo#7441
hacdias pushed a commit to ipfs/boxo that referenced this pull request Jan 27, 2023
This:

- adds subdomain gateway support for ED25519 CIDs in a way that fits in
  a single DNS label to enable TLS for every IPNS website.

- cleans up subdomain redirect logic and adds more explicit error
  handling.

TL;DR on router logic:

When CID is longer than 63 characters, router at /ipfs/* and /ipns/*
converts to Base36, and if that does not help, returns a human readable
400 Bad Request error.

Addressing code review:
ipfs/kubo#7441 (review)

refactor: use b36 for all libp2p-keys in subdomains
Consensus reached in
ipfs/kubo#7441 (comment)
ipfs/kubo#7441 (comment)
ipfs/kubo#7441 (comment)


This commit was moved from ipfs/kubo@231fab8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/cidv1b32 Topic cidv1b32 topic/ed25519 Issues related to ed25519 Peer IDs topic/gateway Topic gateway
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants