IPIP: Gateway _redirects File #290

Open
wants to merge 27 commits into
base: main
27 commits
2549df3
Initial pass at gateway redirects spec
justincjohnson Jun 15, 2022
6a7da01
Clarify error handling and fix some typos
justincjohnson Jun 17, 2022
3fedc57
Wording cleanup. Make things more clear.
justincjohnson Jun 17, 2022
1105f22
The redirects file can have comments.
justincjohnson Jun 17, 2022
92e66e6
More corrections and clarifications.
justincjohnson Jun 17, 2022
90767b7
Mention DNSLink as well
justincjohnson Jun 17, 2022
4da43e4
Redirects File, not Redirect File
justincjohnson Jun 24, 2022
fc8c485
Update RFC/0000-gateway-redirects.md
justincjohnson Jun 27, 2022
9be3f91
Update http-gateways/REDIRECTS_FILE.md
justincjohnson Jun 27, 2022
d9fd9cc
Update RFC/0000-gateway-redirects.md
justincjohnson Jun 27, 2022
a622b73
Reorg test fixtures, per feedback
justincjohnson Jul 2, 2022
e3f89f6
Reorg security, per feedback
justincjohnson Jul 2, 2022
e5036f3
linting
justincjohnson Jul 11, 2022
0e0837c
more linting, and move to IPIP folder
justincjohnson Jul 11, 2022
43f49a9
more linting
justincjohnson Jul 11, 2022
4aff981
Update IPIP/0000-gateway-redirects.md
justincjohnson Aug 11, 2022
8a927bf
Update IPIP/0000-gateway-redirects.md
justincjohnson Aug 11, 2022
71a9230
Update IPIP/0000-gateway-redirects.md
justincjohnson Aug 11, 2022
9babd0c
Update IPIP/0000-gateway-redirects.md
justincjohnson Aug 11, 2022
167c1af
Update http-gateways/REDIRECTS_FILE.md
justincjohnson Aug 11, 2022
14d5435
Update http-gateways/REDIRECTS_FILE.md
justincjohnson Aug 11, 2022
695d2db
Update http-gateways/REDIRECTS_FILE.md
justincjohnson Aug 11, 2022
d7e3b00
Address feedback
justincjohnson Aug 11, 2022
a726d13
Update CID for test fixture
justincjohnson Aug 11, 2022
50d5678
More feedback
justincjohnson Aug 11, 2022
976b2d9
Update CIDs for test cases
justincjohnson Aug 12, 2022
c15ae5c
Give comments and line termination to their own headings
justincjohnson Aug 19, 2022
@@ -0,0 +1,76 @@
# IPIP 0000: Gateway Redirects

- Start Date: 2022-06-15
- Related Issues:
  - [ipfs/specs/issues/257](https://github.com/ipfs/specs/issues/257)
  - [ipfs/kubo/pull/8890](https://github.com/ipfs/kubo/pull/8890)

## Summary

Provide support for URL redirects and rewrites for web sites hosted on Subdomain or DNSLink Gateways, thus enabling support for [single-page applications (SPAs)](https://en.wikipedia.org/wiki/Single-page_application), and avoiding [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-backed hosting.

## Motivation

Web sites often need to redirect from one URL to another, for example, to change the appearance of a URL, to change where content is located without breaking existing links (see [Cool URIs don't change](https://www.w3.org/Provider/Style/URI), [link rot](https://en.wikipedia.org/wiki/Link_rot)), to redirect invalid URLs to a pretty 404 page, or to enable URL rewriting.
URL rewriting in particular is a critical feature for hosting SPAs, allowing routing logic to be handled by front-end code. SPA support is the primary impetus for this IPIP.

Currently the only way to handle URL redirects or rewrites is with additional software such as NGINX sitting in front of the Gateway. This software introduces operational complexity and decreases the uniformity of experience when navigating to content hosted on a Gateway, thus decreasing the value proposition of hosting web sites in IPFS.

This IPIP proposes the introduction of redirect support for content hosted on Subdomain or DNSLink Gateways, configured via a `_redirects` file residing underneath the root CID of the web site.

## Detailed design

Allow developers to configure redirect support by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.
The format for this file is similar to those of [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects) but only supporting a subset of their functionality.

The format for the file is `from to [status]`.

- `from` - specifies the path to intercept (can include placeholders and a trailing splat)
- `to` - specifies the path or URL to redirect to (can include placeholders or splat matched in `from`)
- `status` - optional [HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) (301 if not specified)

Rules in the file are evaluated top to bottom.

For performance reasons this proposal does not include forced redirect support (i.e. redirect rules that are evaluated even if the `from` path exists). In other words, redirect logic will be evaluated if and only if the requested path does not exist. If the requested path exists, we won't even check for the existence of the `_redirects` file.

If a `_redirects` file exists but cannot be processed (for example, because it fails to parse), errors will be returned to the user viewing the site via the Gateway.

The detailed specification is added in [`http-gateways/REDIRECTS_FILE.md`](../http-gateways/REDIRECTS_FILE.md).

### Test fixtures
QmcZzEbsNsQM6PmnvPbtDJdRAen5skkCxDRS8K7HafpAsX

See spec for testing details.

## Design rationale

Popular services today such as [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects) allow developers to configure redirect support
using a `_redirects` file hosted at the top level of the web site. While we do not intend to provide all of the same functionality, it seems desirable to use a similar approach to provide a meaningful subset of the functionality offered by these services.

- The format is simple and low on syntax
- Many developers are already familiar with this file name and format
- Using a text file for configuration enables developers to make changes without using other IPFS tools
- The configuration can be easily versioned in both version control systems and IPFS by virtue of the resulting change to the root CID for the content

### User benefit

Provides general URL redirect and rewrite support, which enables three important features:
1. Developers will be able to host single-page applications in IPFS.
2. The same configuration file can be used to set up pretty 404 pages.
3. The cost of switching hosting of an existing website to IPFS is lowered by making it possible to keep all legacy URLs working.

### Compatibility

If by some chance developers are already hosting sites that contain a `_redirects` file that does something else, they may need to update the contents of the file to match the new functionality. Errors returned to the user due to parsing errors will guide them regarding the required updates.

### Alternatives

- There was some discussion early on about a [manifest file](https://github.com/ipfs/specs/issues/257) that could be used to configure redirect support in addition to many other things. While the idea of a manifest file has merit, manifest files are much larger in scope and it became challenging to reach agreement on functionality to include.
There is already a large need for redirect support for SPAs, and this proposal allows us to provide that critical functionality without being hampered by further design discussion around manifest files.
In addition, similar to how Netlify allows redirect support to be configured in either a `_redirects` file or a more general [configuration file](https://docs.netlify.com/configure-builds/file-based-configuration/#redirects), there is nothing precluding IPFS from allowing developers to configure redirect support in an app manifest later on.
- There was some discussion with the [n0](https://github.com/n0-computer/) team about potential ways to improve the performance of retrieving metadata such as redirect rules, possibly including it as metadata with the root CID such that it would be included with the request for the CID to begin with.
I believe the performance concerns are alleviated by not providing forced redirect support, and looking for `_redirects` only if the DAG is missing a requested path. Nevertheless, if a more generic metadata facility were to be introduced in the future, it may make sense to reconsider how redirect rules are specified.

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
@@ -40,3 +40,4 @@ model](https://en.wikipedia.org/wiki/Same-origin_policy).

* [SUBDOMAIN_GATEWAY.md](./SUBDOMAIN_GATEWAY.md)
* [DNSLINK_GATEWAY.md](./DNSLINK_GATEWAY.md)
* [REDIRECTS_FILE.md](./REDIRECTS_FILE.md)
@@ -0,0 +1,190 @@
# Redirects File Specification

![](https://img.shields.io/badge/status-wip-orange.svg?style=flat-square)

**Authors**:

- Justin Johnson ([@justincjohnson](https://github.com/justincjohnson))

----

**Abstract**

The Redirects File specification is an extension of the Subdomain Gateway and DNSLink Gateway specifications.

Developers can enable URL redirects or rewrites by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.

This can be used, for example, to enable URL rewriting for hosting a single-page application, to redirect invalid URLs to a pretty 404 page, or to avoid [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-based website hosting.

# Table of Contents

- [File Name and Location](#file-name-and-location)
- [File Format](#file-format)
  - [From](#from)
  - [To](#to)
  - [Status](#status)
  - [Placeholders](#placeholders)
    - [Splat](#splat)
    - [Comments](#comments)
    - [Line Termination](#line-termination)
- [Evaluation](#evaluation)
  - [Subdomain or DNSLink Gateways](#subdomain-or-dnslink-gateways)
  - [Order](#order)
  - [No Forced Redirects](#no-forced-redirects)
- [Error Handling](#error-handling)
- [Security](#security)
- [Appendix: notes for implementors](#appendix-notes-for-implementors)
  - [Test fixtures](#test-fixtures)

# File Name and Location

The Redirects File MUST be named `_redirects` and stored underneath the root CID of the web site.

# File Format

The Redirects File MUST be a text file containing one or more lines with the following format (brackets indicate optionality).

```
from to [status]
```

## From

The path to redirect from.

## To

The URL or path to redirect to.

## Status

An optional integer specifying the HTTP status code to return from the request. Supported values are:

- `200` - OK
  - Redirect will be treated as a rewrite, returning OK without changing the URL in the browser.
- `301` - Moved Permanently (default)
- `302` - Found (commonly used for Temporary Redirect)
- `303` - See Other (replacing PUT and POST with GET)
- `307` - Temporary Redirect (explicitly preserving body and HTTP method of original request)
- `308` - Permanent Redirect (explicitly preserving body and HTTP method of original request)
- `404` - Not Found
  - Useful for redirecting invalid URLs to a pretty 404 page.
- `410` - Gone
- `451` - Unavailable For Legal Reasons
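
As a rough illustration of the `from to [status]` format and the status list above, here is a minimal Python sketch. The names `parse_rule`, `SUPPORTED_STATUSES`, and `DEFAULT_STATUS` are hypothetical, and splitting fields on whitespace is an assumption rather than something this specification states.

```python
# Hypothetical helper names; only the status codes come from the list above.
SUPPORTED_STATUSES = {200, 301, 302, 303, 307, 308, 404, 410, 451}
DEFAULT_STATUS = 301


def parse_rule(line: str) -> tuple[str, str, int]:
    """Parse one `from to [status]` line into (from, to, status)."""
    fields = line.split()  # assumption: fields are separated by whitespace
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 'from to [status]', got: {line!r}")
    from_path, to = fields[0], fields[1]
    status = DEFAULT_STATUS
    if len(fields) == 3:
        status = int(fields[2])  # raises ValueError if not an integer
    if status not in SUPPORTED_STATUSES:
        raise ValueError(f"unsupported status code: {status}")
    return from_path, to, status
```

For instance, `parse_rule("/home /index.html")` falls back to the default status of 301, while a line with an unlisted status such as `999` is rejected.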

## Placeholders

Placeholders are named variables that can be used to match path segments in the `from` path and inject them into the `to` path.

For example:

```
/posts/:month/:day/:year/:slug /articles/:year/:month/:day/:slug
```

This rule will redirect a URL like `/posts/06/15/2022/hello-world` to `/articles/2022/06/15/hello-world`.
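
One possible way to implement placeholder matching is to compare the pattern and the requested path segment by segment. The sketch below is illustrative only: `match_placeholders` and `substitute` are hypothetical names, and rejecting empty segments is an assumption.

```python
from typing import Optional


def match_placeholders(from_pattern: str, path: str) -> Optional[dict[str, str]]:
    """Return the captured placeholder values, or None if the path does not match."""
    pattern_segments = from_pattern.split("/")
    path_segments = path.split("/")
    if len(pattern_segments) != len(path_segments):
        return None
    captured: dict[str, str] = {}
    for pattern_segment, path_segment in zip(pattern_segments, path_segments):
        if pattern_segment.startswith(":"):
            if not path_segment:
                return None  # assumption: a placeholder needs a non-empty segment
            captured[pattern_segment[1:]] = path_segment
        elif pattern_segment != path_segment:
            return None
    return captured


def substitute(to_template: str, captured: dict[str, str]) -> str:
    """Inject captured values into the `to` path, one segment at a time."""
    return "/".join(
        captured.get(segment[1:], segment) if segment.startswith(":") else segment
        for segment in to_template.split("/")
    )


captured = match_placeholders(
    "/posts/:month/:day/:year/:slug", "/posts/06/15/2022/hello-world"
)
print(substitute("/articles/:year/:month/:day/:slug", captured))
```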

### Splat

If a `from` path ends with an asterisk (i.e. `*`), the remainder of the requested path is captured in the special `:splat` placeholder, which can then be injected into the `to` path.

For example:

```
/posts/* /articles/:splat
```

This rule will redirect a URL like `/posts/2022/06/15/hello-world` to `/articles/2022/06/15/hello-world`.

Splat logic MUST only apply to a single trailing asterisk, as this is a greedy match, consuming the remainder of the path.
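
A trailing splat can be sketched as a simple prefix match. This is a minimal illustration under the assumption that everything after the prefix is captured verbatim; `match_splat` is a hypothetical name.

```python
from typing import Optional


def match_splat(from_pattern: str, path: str) -> Optional[str]:
    """Return the remainder captured by a trailing `*`, or None if there is no match."""
    if not from_pattern.endswith("*"):
        return None  # only a single trailing asterisk is treated as a splat
    prefix = from_pattern[:-1]
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]  # greedy: consumes the rest of the path


remainder = match_splat("/posts/*", "/posts/2022/06/15/hello-world")
print("/articles/:splat".replace(":splat", remainder))
```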

### Comments

Any line beginning with `#` will be treated as a comment and ignored at evaluation time.

For example:

```
# Redirect home to index.html
/home /index.html 301
```

is functionally equivalent to

```
/home /index.html 301
```

### Line Termination

Lines MUST be terminated by either `\n` or `\r\n`.
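
Taken together with the comment rule, line handling might look like the following sketch. The function name `rule_lines` is hypothetical, and skipping blank lines is an assumption that this specification does not spell out.

```python
def rule_lines(content: str) -> list[str]:
    """Split file content into rule lines, dropping comments and blank lines."""
    lines = []
    for raw in content.split("\n"):
        # Accept both `\n` and `\r\n` by trimming one trailing carriage return.
        line = raw[:-1] if raw.endswith("\r") else raw
        if not line.strip():
            continue  # assumption: blank lines are ignored
        if line.startswith("#"):
            continue  # comment line
        lines.append(line)
    return lines
```

With this sketch, a file containing a comment followed by `/home /index.html 301` yields the same single rule line whether it uses `\n` or `\r\n` terminators.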

# Evaluation

## Subdomain or DNSLink Gateways

Rules MUST only be evaluated when hosted on a Subdomain or DNSLink Gateway, so that we have [Same-Origin](https://en.wikipedia.org/wiki/Same-origin_policy) isolation.

## Order

Rules MUST be evaluated in order, redirecting or rewriting using the first matching rule.
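
The effect of ordering can be illustrated with a small sketch. Here `first_match` and `simple_matches` are hypothetical names, and the matcher is deliberately simplified to exact paths and trailing splats.

```python
from typing import Callable, Optional

Rule = tuple[str, str, int]  # (from, to, status)


def simple_matches(pattern: str, path: str) -> bool:
    """Simplified matcher: exact match, or prefix match for a trailing `*`."""
    if pattern.endswith("*"):
        return path.startswith(pattern[:-1])
    return pattern == path


def first_match(
    rules: list[Rule], path: str, matches: Callable[[str, str], bool]
) -> Optional[Rule]:
    """Return the first rule whose `from` matches; later rules are never consulted."""
    for rule in rules:
        if matches(rule[0], path):
            return rule
    return None


rules = [("/not-found/*", "/404.html", 404), ("/*", "/index.html", 200)]
```

With the rules in this order, `/not-found/x` resolves to the 404 rule. If the two rules were swapped, the catch-all `/*` would win for every path, which is why more specific rules belong earlier in the file.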

## No Forced Redirects

All redirect logic MUST only be evaluated if the requested path is not present in the DAG. This means that any performance impact associated with checking for the existence of a Redirects File or evaluating redirect rules will only be incurred for non-existent paths.
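
The lookup order this implies can be sketched as follows. The DAG is modeled here as a plain dictionary from path to content, which is purely an assumption for illustration; `resolve` is a hypothetical name and only exact-match rules are handled.

```python
def resolve(site: dict[str, bytes], path: str) -> tuple[int, str]:
    """Return (status, path to serve or redirect to) for a request."""
    if path in site:
        # Existing content is served as is; `_redirects` is never consulted.
        return 200, path
    redirects = site.get("/_redirects")
    if redirects is None:
        return 404, path
    for line in redirects.decode("utf-8").split("\n"):
        fields = line.split()
        if len(fields) >= 2 and fields[0] == path:
            status = int(fields[2]) if len(fields) == 3 else 301
            return status, fields[1]
    return 404, path


site = {
    "/index.html": b"home",
    "/one.html": b"one",
    # The first rule targets an existing path, so it can never take effect.
    "/_redirects": b"/index.html /one.html 302\n/redirect-one /one.html\n",
}
```

In this sketch a request for `/index.html` is answered directly even though a rule names it as `from`, while `/redirect-one` is redirected with the default 301.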

# Error Handling

If the Redirects File exists but there is an error reading or parsing it, the errors MUST be returned to the user with a 500 HTTP status code.
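
One way to surface such errors is to collect every problem with its line number and return them together in the 500 response body. The sketch below is an assumption about how an implementation might do this: the function name `redirects_file_error` and the message format are hypothetical, and only the field count and status code are checked.

```python
from typing import Optional

SUPPORTED_STATUSES = {"200", "301", "302", "303", "307", "308", "404", "410", "451"}


def redirects_file_error(content: str) -> Optional[tuple[int, str]]:
    """Return (500, error text) if the file is malformed, or None if it looks valid."""
    problems = []
    for number, line in enumerate(content.split("\n"), start=1):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank or comment line
        if len(fields) not in (2, 3):
            problems.append(f"line {number}: expected 'from to [status]'")
        elif len(fields) == 3 and fields[2] not in SUPPORTED_STATUSES:
            problems.append(f"line {number}: unsupported status {fields[2]!r}")
    if problems:
        return 500, "could not parse _redirects:\n" + "\n".join(problems)
    return None
```

Reporting all problems at once, rather than stopping at the first, lets a site author fix the file in a single pass.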

# Security

This functionality will only be evaluated for Subdomain or DNSLink Gateways, to ensure that redirect paths are relative to the root CID hosted at the specified domain name.

Parsing of the `_redirects` file should be done safely to prevent any sort of injection vector or daemon crash.

# Appendix: notes for implementors

## Test fixtures

Sample files for various test cases can be found in QmfHFheaikRRB6ap7AdL4FHBkyHPhPBDX7fS25rMzYhLuW, which comes from
sharness test data for the implementation of this feature in Kubo.

```
ipfs ls QmfHFheaikRRB6ap7AdL4FHBkyHPhPBDX7fS25rMzYhLuW
QmcBcFnKKqgpCVMxxGsriw9ByTVF6uDdKDMuEBq3m6f1bm - bad-codes/
QmcZzEbsNsQM6PmnvPbtDJdRAen5skkCxDRS8K7HafpAsX - examples/
QmU7ysGXwAtiV7aBarZASJsxKoKyKmd9Xrz2FFamSCbg8S - forced/
QmWHn2TunA1g7gQ7q9rwAoWuot2hMpojZ6cZ9ERsNKm5gE - good-codes/
QmRgpzYQESidTtTojN8zRWjiNs9Cy6o7KHRxh7kDpJm3KH - invalid/
QmYzMrtPyBv7LKiEAGLLRPtvqm3SjQYLWxwWQ2vnpxQwRd - newlines/
```

For example, the "examples" site can be found in QmcZzEbsNsQM6PmnvPbtDJdRAen5skkCxDRS8K7HafpAsX.

```
$ ipfs ls /ipfs/QmcZzEbsNsQM6PmnvPbtDJdRAen5skkCxDRS8K7HafpAsX
Qmd9GD7Bauh6N2ZLfNnYS3b7QVAijbud83b8GE8LPMNBBP 7 404.html
QmUaEwhw7255s4M2abktMYFL8pwCDb1v5yi6fp7ExJv3e7 270 _redirects
QmaWDLb4gnJcJbT1Df5X3j91ysiwkkyxw6329NLiC1KMDR - articles/
QmS6ZNKE9s8fsHoEnArsZXnzMWijKddhXXDsAev8LdTT5z 9 index.html
QmNwEgMrExwSsE8DCjZjahYfHUfkSWRhtqSkQUh4Fk3udD 7 one.html
QmVe2GcTbEPZkMbjVoQ9YieVGKCHmuHMcJ2kbSCzuBKh2s - redirected-splat/
QmUGVnZaofnd5nEDvT2bxcFck7rHyJRbpXkh9znjrJNV92 7 two.html
```

The `_redirects` file is as follows.

```
$ ipfs cat /ipfs/QmcZzEbsNsQM6PmnvPbtDJdRAen5skkCxDRS8K7HafpAsX/_redirects
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/* /index.html 200
```

Requests for non-existent paths should be intercepted and redirected to the destination path, with the specified HTTP status code returned. The rules are evaluated in the order they appear in the file.

Any request for an existing file should be returned as is, and not intercepted by the last catch-all rule.
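
The expected behavior for this fixture can be walked through with a small Python sketch. It is not the Kubo implementation: `match`, `evaluate`, and `EXISTING_FILES` are hypothetical, the site is reduced to its top-level files, and a 404 for unmatched paths is an assumption.

```python
from typing import Optional

FIXTURE_REDIRECTS = """\
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/* /index.html 200
"""

# Simplification: only the top-level files of the "examples" site.
EXISTING_FILES = {"/404.html", "/index.html", "/one.html", "/two.html"}


def match(pattern: str, path: str) -> Optional[dict[str, str]]:
    """Return captured placeholders if `path` matches `pattern`, else None."""
    captured: dict[str, str] = {}
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    for i, part in enumerate(pattern_parts):
        if part == "*" and i == len(pattern_parts) - 1:
            captured["splat"] = "/".join(path_parts[i:])  # greedy remainder
            return captured
        if i >= len(path_parts):
            return None
        if part.startswith(":"):
            captured[part[1:]] = path_parts[i]
        elif part != path_parts[i]:
            return None
    return captured if len(pattern_parts) == len(path_parts) else None


def evaluate(path: str) -> tuple[int, str]:
    """Return (status, destination) for a request against the fixture."""
    if path in EXISTING_FILES:
        return 200, path  # existing files bypass the rules entirely
    for line in FIXTURE_REDIRECTS.split("\n"):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        captured = match(fields[0], path)
        if captured is None:
            continue
        to = fields[1]
        # Replace longer names first so `:day` cannot clobber a longer name.
        for name in sorted(captured, key=len, reverse=True):
            to = to.replace(":" + name, captured[name])
        status = int(fields[2]) if len(fields) == 3 else 301
        return status, to
    return 404, path
```

Under these assumptions, `evaluate("/posts/2022/06/15/hello")` follows the placeholder rule, `evaluate("/not-found/x")` resolves to the 404 page, and `evaluate("/one.html")` is served directly without reaching the catch-all rule.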