IPIP: Gateway _redirects File #290

Merged
merged 32 commits into from
Sep 23, 2022
b695910
Initial pass at gateway redirects spec
Jun 15, 2022
0abadca
Clarify error handling and fix some typos
Jun 17, 2022
abb348d
Wording cleanup. Make things more clear.
Jun 17, 2022
6c354a2
The redirects file can have comments.
Jun 17, 2022
00c2215
More corrections and clarifications.
Jun 17, 2022
b6347d6
Mention DNSLink as well
Jun 17, 2022
851de62
Redirects File, not Redirect File
Jun 24, 2022
5d3ab19
Update RFC/0000-gateway-redirects.md
Jun 27, 2022
d16c76d
Update http-gateways/REDIRECTS_FILE.md
Jun 27, 2022
c78507e
Update RFC/0000-gateway-redirects.md
Jun 27, 2022
877a334
Reorg test fixtures, per feedback
Jul 2, 2022
e231593
Reorg security, per feedback
Jul 2, 2022
9ccd750
linting
Jul 11, 2022
b2f7dad
more linting, and move to IPIP folder
Jul 11, 2022
418b6b4
more linting
Jul 11, 2022
4fee0a7
Update IPIP/0000-gateway-redirects.md
Aug 11, 2022
9b8574e
Update IPIP/0000-gateway-redirects.md
Aug 11, 2022
d8d76f3
Update IPIP/0000-gateway-redirects.md
Aug 11, 2022
97a0257
Update IPIP/0000-gateway-redirects.md
Aug 11, 2022
5e198d2
Update http-gateways/REDIRECTS_FILE.md
Aug 11, 2022
6086582
Update http-gateways/REDIRECTS_FILE.md
Aug 11, 2022
5be7fc3
Update http-gateways/REDIRECTS_FILE.md
Aug 11, 2022
2b46076
Address feedback
Aug 11, 2022
8b330b6
Update CID for test fixture
Aug 11, 2022
82175fe
More feedback
Aug 11, 2022
d065377
Update CIDs for test cases
Aug 12, 2022
f3c1d57
Give comments and line termination to their own headings
Aug 19, 2022
9b8cb7d
- Adding missing TOC entries
Sep 16, 2022
5525995
Rename to IPIP 00002
Sep 16, 2022
8b8ccf9
Update CIDs
Sep 22, 2022
67f7c7b
Update CIDs
Sep 23, 2022
33f4f44
IPIP 0002: final editorial changes
lidel Sep 23, 2022
77 changes: 77 additions & 0 deletions IPIP/0002-gateway-redirects-file.md
@@ -0,0 +1,77 @@
# IPIP 0002: _redirects File Support on Web Gateways

- Start Date: 2022-06-15
- Related Issues:
  - [ipfs/specs/issues/257](https://github.com/ipfs/specs/issues/257)
  - [ipfs/kubo/pull/8890](https://github.com/ipfs/kubo/pull/8890)
  - [ipfs-docs/pull/1275](https://github.com/ipfs/ipfs-docs/pull/1275)

## Summary

Provide support for URL redirects and rewrites for web sites hosted on Subdomain or DNSLink Gateways, thus enabling support for [single-page applications (SPAs)](https://en.wikipedia.org/wiki/Single-page_application), and avoiding [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-backed hosting.

## Motivation

Web sites often need to redirect from one URL to another, for example, to change the appearance of a URL, to change where content is located without breaking existing links (see [Cool URIs don't change](https://www.w3.org/Provider/Style/URI), [link rot](https://en.wikipedia.org/wiki/Link_rot)), to redirect invalid URLs to a pretty 404 page, or to enable URL rewriting.
URL rewriting in particular is a critical feature for hosting SPAs, allowing routing logic to be handled by front end code. SPA support is the primary impetus for this IPIP.

Currently the only way to handle URL redirects or rewrites is with additional software such as NGINX sitting in front of the Gateway. This software introduces operational complexity and decreases the uniformity of experience when navigating to content hosted on a Gateway, thus decreasing the value proposition of hosting web sites in IPFS.

This IPIP proposes the introduction of redirect support for content hosted on Subdomain or DNSLink Gateways, configured via a `_redirects` file residing underneath the root CID of the web site.

## Detailed design

Allow developers to configure redirect support by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.
The format for this file is similar to those of [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects), but supports only a subset of their functionality.

The format for the file is `from to [status]`.

- `from` - specifies the path to intercept (can include placeholders and a trailing splat)
- `to` - specifies the path or URL to redirect to (can include placeholders or splat matched in `from`)
- `status` - optional [HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) (301 if not specified)

Rules in the file are evaluated top to bottom.

For performance reasons this proposal does not include forced redirect support (i.e. redirect rules that are evaluated even if the `from` path exists). In other words, redirect logic will be evaluated if and only if the requested path does not exist. If the requested path exists, we won't even check for the existence of the `_redirects` file.

If a `_redirects` file exists but cannot be processed (for example, because it fails to parse), errors will be returned to the user viewing the site via the Gateway.

The detailed specification is added in [`http-gateways/REDIRECTS_FILE.md`](../http-gateways/REDIRECTS_FILE.md).

### Test fixtures
QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4

See spec for testing details.

## Design rationale

Popular services today such as [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects) allow developers to configure redirect support
using a `_redirects` file hosted at the top level of the web site. While we do not intend to provide all of the same functionality, it seems desirable to use a similar approach to provide a meaningful subset of the functionality offered by these services.

- The format is simple and low on syntax
- Many developers are already familiar with this file name and format
- Using a text file for configuration enables developers to make changes without using other IPFS tools
- The configuration can be easily versioned in both version control systems and IPFS by virtue of the resulting change to the root CID for the content

### User benefit

Provides general URL redirect and rewrite support, which enables three important features:
1. Developers will be able to host single-page applications in IPFS.
2. Developers will be able to set up pretty 404 pages using the same configuration file.
3. The cost of switching hosting of an existing website to IPFS is lowered by making it possible to keep all legacy URLs working.

### Compatibility

If by some chance developers are already hosting sites that contain a `_redirects` file that does something else, they may need to update the contents of the file to match the new functionality. Errors returned to the user due to parsing errors will guide them regarding the required updates.

### Alternatives

- There was some discussion early on about a [manifest file](https://github.com/ipfs/specs/issues/257) that could be used to configure redirect support in addition to many other things. While the idea of a manifest file has merit, manifest files are much larger in scope and it became challenging to reach agreement on functionality to include.
There is already a large need for redirect support for SPAs, and this proposal allows us to provide that critical functionality without being hampered by further design discussion around manifest files.
In addition, similar to how Netlify allows redirect support to be configured in either a `_redirects` file or a more general [configuration file](https://docs.netlify.com/configure-builds/file-based-configuration/#redirects), there is nothing precluding IPFS from allowing developers to configure redirect support in an app manifest later on.
- There was some discussion with the [n0](https://github.com/n0-computer/) team about potential ways to improve the performance of retrieving metadata such as redirect rules, possibly including it as metadata with the root CID such that it would be included with the request for the CID to begin with.
I believe the performance concerns are alleviated by not providing forced redirect support, and looking for `_redirects` only if the DAG is missing a requested path. Nevertheless, if a more generic metadata facility were to be introduced in the future, it may make sense to reconsider how redirect rules are specified.

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
5 changes: 5 additions & 0 deletions http-gateways/DNSLINK_GATEWAY.md
@@ -36,6 +36,7 @@ In short:
- [HTTP Response](#http-response)
- [Appendix: notes for implementers](#appendix-notes-for-implementers)
- [Leveraging DNS for content routing](#leveraging-dns-for-content-routing)
- [Redirects, single-page applications, and custom 404s](#redirects-single-page-applications-and-custom-404s)

# HTTP API

@@ -98,3 +99,7 @@ Same as [HTTP Response section in `PATH_GATEWAY.md`](./PATH_GATEWAY.md#http-resp
TXT records with known content providers for the data behind a DNSLink. IPFS
clients will be able to detect DNSAddr and preconnect to known content
providers, removing the need for expensive DHT lookup.

## Redirects, single-page applications, and custom 404s

DNSLink Gateway implementations are free to include `_redirects` file support defined in [`REDIRECTS_FILE.md`](./REDIRECTS_FILE.md).
1 change: 1 addition & 0 deletions http-gateways/README.md
@@ -40,3 +40,4 @@ model](https://en.wikipedia.org/wiki/Same-origin_policy).

* [SUBDOMAIN_GATEWAY.md](./SUBDOMAIN_GATEWAY.md)
* [DNSLINK_GATEWAY.md](./DNSLINK_GATEWAY.md)
* [REDIRECTS_FILE.md](./REDIRECTS_FILE.md)
204 changes: 204 additions & 0 deletions http-gateways/REDIRECTS_FILE.md
@@ -0,0 +1,204 @@
# `_redirects` File Specification

![draft](https://img.shields.io/badge/status-draft-yellow.svg?style=flat-square)

**Authors**:

- Justin Johnson ([@justincjohnson](https://github.com/justincjohnson))

----

**Abstract**

The Redirects File specification is an extension of the Subdomain Gateway and DNSLink Gateway specifications.

Developers can enable URL redirects or rewrites by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.

This can be used, for example, to enable URL rewriting for hosting a single-page application, to redirect invalid URLs to a pretty 404 page, or to avoid [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-based website hosting.

# Table of Contents

- [File Name and Location](#file-name-and-location)
- [File Format](#file-format)
- [From](#from)
- [To](#to)
- [Status](#status)
- [Placeholders](#placeholders)
- [Splat](#splat)
- [Comments](#comments)
- [Line Termination](#line-termination)
- [Max File Size](#max-file-size)
- [Evaluation](#evaluation)
- [Subdomain or DNSLink Gateways](#subdomain-or-dnslink-gateways)
- [Order](#order)
- [No Forced Redirects](#no-forced-redirects)
- [Error Handling](#error-handling)
- [Security](#security)
- [Appendix: notes for implementors](#appendix-notes-for-implementors)
- [Test fixtures](#test-fixtures)

# File Name and Location

The Redirects File MUST be named `_redirects` and stored underneath the root CID of the web site.

# File Format

The Redirects File MUST be a text file containing one or more lines with the following format (brackets indicating optionality).

```
from to [status]
```

## From

The path to redirect from.

## To

The URL or path to redirect to.

## Status

An optional integer specifying the HTTP status code to return from the request. Supported values are:

- `200` - OK
  - Redirect will be treated as a rewrite, returning OK without changing the URL in the browser.
- `301` - Moved Permanently (default)
- `302` - Found (commonly used for Temporary Redirect)
- `303` - See Other (replacing PUT and POST with GET)
- `307` - Temporary Redirect (explicitly preserving body and HTTP method of original request)
- `308` - Permanent Redirect (explicitly preserving body and HTTP method of original request)
- `404` - Not Found
  - Useful for redirecting invalid URLs to a pretty 404 page.
- `410` - Gone
- `451` - Unavailable For Legal Reasons
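
As a rough illustration of the `from to [status]` format and the supported status codes, a single rule line could be parsed as shown below. This is a minimal Python sketch rather than the reference implementation: the names `Rule`, `parse_rule`, and `SUPPORTED_STATUSES` are hypothetical, and rejecting any other field count or status code is an assumption in the spirit of the Error Handling section.

```python
from dataclasses import dataclass

# Status codes listed in this specification; 301 is the default.
SUPPORTED_STATUSES = {200, 301, 302, 303, 307, 308, 404, 410, 451}
DEFAULT_STATUS = 301


@dataclass(frozen=True)
class Rule:
    from_path: str
    to: str
    status: int = DEFAULT_STATUS


def parse_rule(line: str) -> Rule:
    """Parse one `from to [status]` line (hypothetical helper)."""
    fields = line.split()
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 'from to [status]', got: {line!r}")
    status = DEFAULT_STATUS
    if len(fields) == 3:
        try:
            status = int(fields[2])
        except ValueError:
            raise ValueError(f"status is not an integer: {fields[2]!r}") from None
        if status not in SUPPORTED_STATUSES:
            raise ValueError(f"unsupported status code: {status}")
    return Rule(fields[0], fields[1], status)
```

Omitting the third field yields the default status, so `/home /index.html` and `/home /index.html 301` parse to the same rule.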

## Placeholders

Placeholders are named variables that can be used to match path segments in the `from` path and inject them into the `to` path.

For example:

```
/posts/:month/:day/:year/:slug /articles/:year/:month/:day/:slug
```

This rule will redirect a URL like `/posts/06/15/2022/hello-world` to `/articles/2022/06/15/hello-world`.
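
The placeholder behavior described above can be sketched in Python as follows. The helpers `match_placeholders` and `substitute` are hypothetical names, and ignoring leading and trailing slashes when comparing segments is an assumption of this sketch, not something the specification states.

```python
from typing import Dict, Optional


def match_placeholders(pattern: str, path: str) -> Optional[Dict[str, str]]:
    """Return captured placeholder values if `path` matches `pattern`, else None."""
    pattern_segments = pattern.strip("/").split("/")
    path_segments = path.strip("/").split("/")
    if len(pattern_segments) != len(path_segments):
        return None
    captured = {}
    for expected, actual in zip(pattern_segments, path_segments):
        if expected.startswith(":"):
            captured[expected[1:]] = actual  # named placeholder
        elif expected != actual:
            return None  # literal segment mismatch
    return captured


def substitute(to: str, captured: Dict[str, str]) -> str:
    """Inject captured values into the `to` path, segment by segment."""
    return "/".join(
        captured.get(seg[1:], seg) if seg.startswith(":") else seg
        for seg in to.split("/")
    )


params = match_placeholders(
    "/posts/:month/:day/:year/:slug", "/posts/06/15/2022/hello-world"
)
target = substitute("/articles/:year/:month/:day/:slug", params)
```

Because values are looked up by name, the placeholders may appear in a different order in `to` than in `from`.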

### Splat

If a `from` path ends with an asterisk (i.e. `*`), the remainder of the `from` path is slurped up into the special `:splat` placeholder, which can then be injected into the `to` path.

For example:

```
/posts/* /articles/:splat
```

This rule will redirect a URL like `/posts/2022/06/15/hello-world` to `/articles/2022/06/15/hello-world`.

Splat logic MUST only apply to a single trailing asterisk, as this is a greedy match, consuming the remainder of the path.
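
A minimal Python sketch of the trailing splat, assuming hypothetical helpers `match_splat` and `apply_splat` and ignoring named placeholders for brevity:

```python
from typing import Optional


def match_splat(pattern: str, path: str) -> Optional[str]:
    """Return the text captured by a trailing `*`, or None if there is no match."""
    if not pattern.endswith("*"):
        return None  # this sketch only handles splat rules
    prefix = pattern[:-1]  # e.g. "/posts/"
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]  # greedy: everything after the prefix


def apply_splat(to: str, splat: str) -> str:
    """Inject the captured remainder wherever `:splat` appears in `to`."""
    return to.replace(":splat", splat)


splat = match_splat("/posts/*", "/posts/2022/06/15/hello-world")
target = apply_splat("/articles/:splat", splat)
```

The capture spans path separators, which is what distinguishes the splat from a named placeholder that matches a single segment.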

### Comments

Any line beginning with `#` will be treated as a comment and ignored at evaluation time.

For example:

```
# Redirect home to index.html
/home /index.html 301
```

is functionally equivalent to

```
/home /index.html 301
```

### Line Termination

Lines MUST be terminated by either `\n` or `\r\n`.
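
Comment handling and line termination could be combined in a small pre-processing step, sketched here in Python. The `rule_lines` helper is hypothetical, and skipping blank lines is an assumption of this sketch rather than a requirement stated above.

```python
from typing import List


def rule_lines(text: str) -> List[str]:
    """Split on LF or CRLF and drop comment lines (and, by assumption, blank lines)."""
    lines = []
    for raw in text.split("\n"):
        # A line that ended in CRLF still carries the CR after splitting on LF.
        line = raw[:-1] if raw.endswith("\r") else raw
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        lines.append(line)
    return lines
```

Both `\n` and `\r\n` inputs produce the same list of rule lines, so later parsing does not need to care which terminator the author's editor used.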

### Max File Size

The file size MUST NOT exceed 64 KiB.
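
A size check might look like the following Python sketch. The constant and function names are hypothetical, and a real gateway might prefer to compare the size reported by the file's metadata before reading any content.

```python
MAX_REDIRECTS_FILE_SIZE = 64 * 1024  # 64 KiB limit from this specification


def check_redirects_size(data: bytes) -> None:
    """Raise ValueError if the file content is larger than 64 KiB."""
    if len(data) > MAX_REDIRECTS_FILE_SIZE:
        raise ValueError(
            f"_redirects is {len(data)} bytes; limit is {MAX_REDIRECTS_FILE_SIZE}"
        )
```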

# Evaluation

## Subdomain or DNSLink Gateways

Rules MUST only be evaluated when hosted on a Subdomain or DNSLink Gateway, so that we have [Same-Origin](https://en.wikipedia.org/wiki/Same-origin_policy) isolation.

## Order

Rules MUST be evaluated in order, redirecting or rewriting using the first matching rule.
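
To illustrate why order matters, here is a minimal Python sketch of first-match evaluation. The `first_match` helper and the tuple representation of a rule are hypothetical, and matching is simplified to exact paths plus a trailing `*` prefix.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str, int]  # (from, to, status); hypothetical representation


def matches(from_path: str, path: str) -> bool:
    """Simplified matching: exact path, or prefix match for a trailing `*`."""
    if from_path.endswith("*"):
        return path.startswith(from_path[:-1])
    return from_path == path


def first_match(rules: List[Rule], path: str) -> Optional[Tuple[str, int]]:
    """Walk the rules top to bottom and stop at the first one that matches."""
    for from_path, to, status in rules:
        if matches(from_path, path):
            return to, status
    return None


rules = [
    ("/gone/*", "/410.html", 410),
    ("/*", "/index.html", 200),  # catch-all, so it goes last
]
```

If the catch-all were listed first, a request for `/gone/old` would be rewritten to `/index.html` instead of returning the 410 page, so more specific rules belong above more general ones.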

## No Forced Redirects

All redirect logic MUST only be evaluated if the requested path is not present in the DAG. This means that any performance impact associated with checking for the existence of a Redirects File or evaluating redirect rules will only be incurred for non-existent paths.
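
The control flow implied here can be sketched in Python as follows. Everything in this block is illustrative: `handle_request`, `dag_paths`, and `load_rules` are hypothetical names, rules are matched exactly for brevity, and falling back to a normal 404 when no rule matches is an assumption.

```python
from typing import Callable, List, Set, Tuple

Rule = Tuple[str, str, int]  # (from, to, status); hypothetical representation


def handle_request(
    path: str, dag_paths: Set[str], load_rules: Callable[[], List[Rule]]
) -> tuple:
    """Serve existing content directly; consult `_redirects` only on a miss."""
    if path in dag_paths:
        return ("serve", path)  # rules are never loaded for existing paths
    for from_path, to, status in load_rules():  # exact match only, for brevity
        if from_path == path:
            return ("rule", to, status)
    return ("not-found", path)  # assumption: fall back to the normal 404
```

Passing `load_rules` as a callable makes the point explicit: the Redirects File is not even read unless the requested path is missing from the DAG.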

# Error Handling

If the Redirects File exists but there is an error reading or parsing it, the errors MUST be returned to the user with a 500 HTTP status code.
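
One way to surface such errors is sketched below in Python. The function name, the return shape, and the choice of UTF-8 decoding are all assumptions made for illustration; the specification only requires that errors reach the user with a 500 status code.

```python
from typing import List, Optional, Tuple


def load_rules_or_500(
    raw: bytes,
) -> Tuple[Optional[List[tuple]], Optional[Tuple[int, str]]]:
    """Return (rules, None) on success, or (None, (500, message)) on failure."""
    try:
        text = raw.decode("utf-8")  # assumption: the file is UTF-8 text
        rules = []
        for number, line in enumerate(text.split("\n"), start=1):
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # blank line (assumption) or comment
            if len(fields) not in (2, 3):
                raise ValueError(f"line {number}: expected 'from to [status]'")
            rules.append(tuple(fields))
        return rules, None
    except ValueError as err:  # also covers UnicodeDecodeError
        return None, (500, f"could not process _redirects: {err}")
```

Including the line number in the message helps site authors locate the offending rule.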

# Security

This functionality will only be evaluated for Subdomain or DNSLink Gateways, to ensure that redirect paths are relative to the root CID hosted at the specified domain name.

Parsing of the `_redirects` file should be done safely to prevent any sort of injection vector or daemon crash.

The [max file size](#max-file-size) helps to prevent an additional [denial of service attack](https://en.wikipedia.org/wiki/Denial-of-service_attack) vector.

# Appendix: notes for implementors

## Test fixtures

Sample files for various test cases can be found in `QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4`.
Implementations are free to use it for internal testing.

```
$ ipfs ls QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4
QmcBcFnKKqgpCVMxxGsriw9ByTVF6uDdKDMuEBq3m6f1bm - bad-codes/
QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj - examples/
QmU7ysGXwAtiV7aBarZASJsxKoKyKmd9Xrz2FFamSCbg8S - forced/
QmWHn2TunA1g7gQ7q9rwAoWuot2hMpojZ6cZ9ERsNKm5gE - good-codes/
QmRgpzYQESidTtTojN8zRWjiNs9Cy6o7KHRxh7kDpJm3KH - invalid/
QmYzMrtPyBv7LKiEAGLLRPtvqm3SjQYLWxwWQ2vnpxQwRd - newlines/
QmQTfvjGmvTfxFpUcZNLdTLuKV227KJkGiN6xooHVeVZAS - too-large/
```

For example, the "examples" site can be found in `QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj`.

```
$ ipfs ls /ipfs/QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj
Qmd9GD7Bauh6N2ZLfNnYS3b7QVAijbud83b8GE8LPMNBBP 7 404.html
QmSmR9NShZ89VEBrn9SBy7Xxvjw8Qe6XArD5GqtHvbtBM3 7 410.html
QmVQqj9oZig9tH3ENHo4bxV5pNgssUwFCXUjAJAVcZVbJG 7 451.html
QmZU3kboiyi9jV59D8Mw8wzuvsr3HmvskqhYRRhdFA8wRq 317 _redirects
QmaWDLb4gnJcJbT1Df5X3j91ysiwkkyxw6329NLiC1KMDR - articles/
QmS6ZNKE9s8fsHoEnArsZXnzMWijKddhXXDsAev8LdTT5z 9 index.html
QmNwEgMrExwSsE8DCjZjahYfHUfkSWRhtqSkQUh4Fk3udD 7 one.html
QmVe2GcTbEPZkMbjVoQ9YieVGKCHmuHMcJ2kbSCzuBKh2s - redirected-splat/
QmUGVnZaofnd5nEDvT2bxcFck7rHyJRbpXkh9znjrJNV92 7 two.html
```

The `_redirects` file is as follows.

```
$ ipfs cat /ipfs/QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj/_redirects
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/gone/* /410.html 410
/unavail/* /451.html 451
/* /index.html 200
```

Requests for non-existent paths should be intercepted and redirected to the destination path, returning the specified HTTP status code. The rules are evaluated in the order they appear in the file.

Any request for an existing file should be returned as is, and not intercepted by the last catch-all rule.
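
The expected outcomes for the `examples` fixture can be approximated with the Python sketch below. It embeds the fixture rules verbatim, assumes the requested path has already been found missing from the DAG, and uses hypothetical helpers `match` and `resolve`; it is an illustration, not the test suite used by any implementation.

```python
from typing import Dict, Optional, Tuple

FIXTURE = """\
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/gone/* /410.html 410
/unavail/* /451.html 451
/* /index.html 200
"""


def match(from_path: str, path: str) -> Optional[Dict[str, str]]:
    """Match `path` against a pattern with placeholders and a trailing splat."""
    pattern = from_path.split("/")
    actual = path.split("/")
    captured = {}
    for i, seg in enumerate(pattern):
        if seg == "*" and i == len(pattern) - 1:
            captured["splat"] = "/".join(actual[i:])  # greedy remainder
            return captured
        if i >= len(actual):
            return None
        if seg.startswith(":"):
            captured[seg[1:]] = actual[i]
        elif seg != actual[i]:
            return None
    return captured if len(actual) == len(pattern) else None


def resolve(path: str) -> Optional[Tuple[str, int]]:
    """Return (destination, status) from the first matching fixture rule."""
    for line in FIXTURE.splitlines():
        fields = line.split()
        status = int(fields[2]) if len(fields) == 3 else 301
        captured = match(fields[0], path)
        if captured is not None:
            to = "/".join(
                captured.get(seg[1:], seg) if seg.startswith(":") else seg
                for seg in fields[1].split("/")
            )
            return to, status
    return None
```

For instance, `resolve("/gone/foo")` stops at the `/gone/*` rule, while an unknown route such as `/some/spa/route` falls through to the final catch-all rewrite.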
4 changes: 4 additions & 0 deletions http-gateways/SUBDOMAIN_GATEWAY.md
@@ -50,6 +50,7 @@ Summary:
- [DNS label limits](#dns-label-limits)
- [Security considerations](#security-considerations)
- [URI router](#uri-router)
- [Redirects, single-page applications, and custom 404s](#redirects-single-page-applications-and-custom-404s)

# HTTP API

@@ -269,3 +270,6 @@ which in turn should redirect to

From there, regular subdomain gateway logic applies.

## Redirects, single-page applications, and custom 404s

Subdomain Gateway implementations are free to include `_redirects` file support defined in [`REDIRECTS_FILE.md`](./REDIRECTS_FILE.md).