[Static Website] Getting 404 on nested paths, and security headers are not applied #172

adriencaccia · 2022-02-27T19:11:13Z

Description

When trying to directly access a nested path of a SPA deployed with the static website construct, we get a 404 response and the security headers are not applied.

Although a 404 is received, the content is the page is still the desired one.

Regarding the 404:

When building a SPA with a modern framework, we want to leave the handling of "bad routes" to the application itself. Since we cannot know in advance which routes are served by the application, I think a better approach would be to always return a 200 on nested routes.

Why we get the 404

The construct is deploying a Cloudfront with a S3 with website hosting as an origin. The default behavior of a S3 with website hosting when trying to request a non-existing file is to return a 404 and display the error document if it is set.
For SPAs, we want this error document to be the same as the root document of our app. Lift is configured to work like this by default.
So, when hitting a nested route, Cloudfront receives a 404 from S3 and then return it to the client.

Why the security headers are not applied

The security headers are applied thanks to a Cloudfront function deployed by Lift, which is set up to be invoked by viewer response events.
Per the Restrictions on edge functions docs, CloudFront does not invoke edge functions for viewer response events when the origin returns HTTP status code 400 or higher.
Thus when trying to get a nested route, Cloudfront will not apply the function because S3 (the origin) is returning a 404.

Proposed solution

Adding a Lambda@Edge triggered by the origin response trigger to rewrite 4xx to 200 will solve the problem.
This is a common use case for Lambda@Edge, per the docs.
I will submit a PR following this issue to implement it !

Kudos to @CorentinDoue and @guillaumem-theodo for finding this bug !

How to Reproduce

A created a single file SPA with react and nested routes to showcase the problem.
Accessing the deployed cloudfront domain with the nested routes /login or /register will result in the bug described above.

Lift config in serverless.yml:

constructs:
  landing:
    type: 'static-website'
    path: 'build'

build/index.html

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <link rel="shortcut icon" type="image/jpg" href="logo.svg" />

    <!-- Note: When deploying to production, replace "development.js"
      with "production.min.js" in each of the following tags -->

    <!-- Load React and React DOM -->
    <!-- See https://reactjs.org/docs/add-react-to-a-website.html to learn more -->
    <script
      src="https://unpkg.com/react@>=17/umd/react.development.js"
      crossorigin
    ></script>
    <script
      src="https://unpkg.com/react-dom@>=17/umd/react-dom.development.js"
      crossorigin
    ></script>

    <!-- Load history -->
    <script
      src="https://unpkg.com/history@5/umd/history.development.js"
      crossorigin
    ></script>

    <!-- Load React Router and React Router DOM -->
    <script
      src="https://unpkg.com/react-router@6/umd/react-router.development.js"
      crossorigin
    ></script>
    <script
      src="https://unpkg.com/react-router-dom@6/umd/react-router-dom.development.js"
      crossorigin
    ></script>

    <!-- Load babel standalone -->
    <script src="https://unpkg.com/@babel/standalone/babel.min.js"></script>
  </head>
  <body>
    <div id="root"></div>
    <script type="text/babel" data-type="module">
      const Link = ReactRouterDOM.Link;
      const Route = ReactRouterDOM.Route;
      const BrowserRouter = ReactRouterDOM.BrowserRouter;
      const Routes = ReactRouterDOM.Routes;

      const App = () => (
        <BrowserRouter>
          <Routes>
            <Route path="/" element={<Home />} />
            <Route path="/login" element={<Login />} />
            <Route path="/register" element={<Register />} />
          </Routes>
          <ul>
            <li>
              <Link to="/">Home</Link>
            </li>
            <li>
              <Link to="/login">Login</Link>
            </li>
            <li>
              <Link to="/register">Register</Link>
            </li>
          </ul>
        </BrowserRouter>
      );

      const Home = () => <h1>Home</h1>;
      const Login = () => <h1>Login</h1>;
      const Register = () => <h1>Register</h1>;

      ReactDOM.render(<App />, document.querySelector("#root"));
    </script>
  </body>
</html>

Additional Information

Versions of npm packages:

  npmPackages:
    serverless: ^3.2.0 => 3.2.0 
    serverless-lift: ^1.12.1 => 1.12.1

The text was updated successfully, but these errors were encountered:

mnapoli · 2022-02-27T19:37:52Z

Adding a Lambda@Edge triggered by the origin response trigger to rewrite 4xx to 200 will solve the problem.
This is a common use case for Lambda@Edge, per the docs.

Thanks a lot for all these details!

Lambda@Edge is unfortunately very invasive and unpractical, at least from my original experience (creates Lambda functions in many regions, logs in many different regions). For hosting a static website I'd love to avoid that if possible.

Do you think there is any other option than using Lambda@Edge?

adriencaccia · 2022-02-27T19:51:10Z

If we want to keep the S3 set up with website hosting, thus leaving the configuration of error behaviors on its side, I'm afraid we cannot 😕 .

A solution that does not require Lambda@Edge would be to revert #165, manually generate the OAI to allow s3:ListBucket and confgure the 404 behavior on cloudfront side.
But even then, the security headers will still not be applied, because the error behavior (turning 404 into 200) is applied after the attempt to invoke the viewer response cloudfront function.

I am not 100% sure of this, I'm guessing that it is the problem that @CorentinDoue is facing.
He is using serverless-lift@1.11.0 (a version with error behavior on cloudfront side), and he receives 200 on nested routes, but the security headers are not applied.

CorentinDoue · 2022-02-28T10:55:33Z

Yes, if cloudfront gets an error from S3 (a 404 for example), the CloudFront functions are not triggered. Even if this error is explicitly handled (returning a 200 and index.html for example).

With serverless-lift@1.11.0 I don't have problems with 404, the errors are handled by CloudFront. But there are no security headers on nested routes because the CloudFront function responsible for adding them is not triggered

CorentinDoue · 2022-02-28T11:10:00Z

We also found another possible solution that doesn't require Lambda@Edge on StackOverflow: https://stackoverflow.com/a/69479703/11788663

It adds a viewer request CloudFront function to transform the request from my-site.com/my-sub-route into mysite.com/index.html for all resources except a list of allowed extensions. The S3 will find the index.html and respond a 200. The CloudFront function will add the security headers.

@adriencaccia suggest we split the static-website construct into static-website and spa-website construct because spa requires special treatments

adriencaccia · 2022-02-28T11:17:16Z

@adriencaccia suggest we split the static-website construct into static-website and spa-website construct because spa requires special treatments

Indeed, we can see that to make SPA work we will require special configurations.

What do you think of adding a SPA construct with the solution specified in @CorentinDoue comment ? ie: using a viewer request cloudfront function to rewrite the URI to index.html ?

mnapoli · 2022-02-28T12:01:30Z

Yeah, the more we progress on this construct the more it seems like a separating SPA from purely static websites makes sense.

This is a big change though (which I personally still prefer over Lambda@Edge), @fredericbarthelet is that OK with you?

fredericbarthelet · 2022-03-01T08:56:57Z

Thanks @adriencaccia for opening this issue and @CorentinDoue for the additional insights.
I took a quick look at what Amplify offers as port of their hosting strategy in order to have a best practice point of comparaison of what we're trying to achieve here.

Amplify provides 2 hosting strategies:

Amplify console hosting
AWS Cloudfront + S3

The later one, which is closer to our current solution does not implement any lambda@edge. It is not possible to add custom headers as port of Cloudfront response.

The first one however supports the requested feature described above. You can:

set custom headers in the Cloudfront response, and especially implement security headers - https://docs.aws.amazon.com/amplify/latest/userguide/custom-headers.html#example-security-headers
implement a SPA redirect strategy - https://docs.aws.amazon.com/amplify/latest/userguide/redirects.html#redirects-for-single-page-web-apps-spa

The SPA redirect strategy uses a dedicated regex pattern to ensure only SPA paths are redirected to index.html. All other static assets are not rewritten. Would such strategy with the correct regex allow to maintain a single construct ? Static site which are not SPA would not be match by the regex and will allow those kind of app to function as expected. It would still require the implementation of a lambda@edge to enforce such redirect.

What I suggest:

not split the construct and implement the lamda@edge at viewer request level to redirect requests not made to assets
if this configuration works fine both for SPA and static website, we ship it as is
it this configuration does not work with both (regex does not allow distinction in some case), we build 2 constructs, but keep core logic as an abstract construct being extended by 2 specific usable constructs - one for SPA, one for static sites. Allowing most of the logic not to be duplicated.

@adriencaccia @CorentinDoue @mnapoli, WDYT ?

adriencaccia · 2022-03-01T17:28:16Z

It seems to me that using the regex for redirects in static website (not SPAs) is risky. You can see that the regex of amplify does redirect *.html files, which is something we do not want to do for static websites.

I think your recommendation to have 2 constructs is great.

not split the construct and implement the lamda@edge at viewer request level to redirect requests not made to assets

A small correction, it will be a cloudfront function and not a lambda@edge

If we all agree, I'd be happy to implement it 🙂

adriencaccia added the bug Something isn't working label Feb 27, 2022

adriencaccia mentioned this issue Mar 16, 2022

Add single-page-app construct #175

Merged

fredericbarthelet closed this as completed in #175 Apr 4, 2022

mnapoli mentioned this issue Apr 28, 2022

Static websites fail to rewrite URL for sub directories #192

Closed

fargito mentioned this issue May 8, 2022

fix(static-website): re-enable static website hosting #202

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Static Website] Getting 404 on nested paths, and security headers are not applied #172

[Static Website] Getting 404 on nested paths, and security headers are not applied #172

adriencaccia commented Feb 27, 2022 •

edited

Loading

mnapoli commented Feb 27, 2022 •

edited

Loading

adriencaccia commented Feb 27, 2022 •

edited

Loading

CorentinDoue commented Feb 28, 2022

CorentinDoue commented Feb 28, 2022

adriencaccia commented Feb 28, 2022

mnapoli commented Feb 28, 2022

fredericbarthelet commented Mar 1, 2022

adriencaccia commented Mar 1, 2022 •

edited

Loading

[Static Website] Getting 404 on nested paths, and security headers are not applied #172

[Static Website] Getting 404 on nested paths, and security headers are not applied #172

Comments

adriencaccia commented Feb 27, 2022 • edited Loading

Description

Regarding the 404:

Why we get the 404

Why the security headers are not applied

Proposed solution

How to Reproduce

Additional Information

mnapoli commented Feb 27, 2022 • edited Loading

adriencaccia commented Feb 27, 2022 • edited Loading

CorentinDoue commented Feb 28, 2022

CorentinDoue commented Feb 28, 2022

adriencaccia commented Feb 28, 2022

mnapoli commented Feb 28, 2022

fredericbarthelet commented Mar 1, 2022

adriencaccia commented Mar 1, 2022 • edited Loading

adriencaccia commented Feb 27, 2022 •

edited

Loading

mnapoli commented Feb 27, 2022 •

edited

Loading

adriencaccia commented Feb 27, 2022 •

edited

Loading

adriencaccia commented Mar 1, 2022 •

edited

Loading