Preventing 301 redirects on URLs with no trailing slashes (Netlify) #9207

Closed
lloydh opened this issue Oct 18, 2018 · 66 comments

Comments

@lloydh
Contributor

@lloydh lloydh commented Oct 18, 2018

Summary

URLs with no trailing slash on sites hosted by Netlify lead to an immediate 301 redirect to the page with a trailing slash.

foo.com/bar --> foo.com/bar/

This has a performance cost and implications for SEO.

Is there a Netlify configuration that resolves these URLs without redirecting?

Relevant information

While this question is specific to Netlify, I did a quick review of other Gatsby sites featured in the Showcase and saw the same behaviour in many, but not all cases, for example:

Hopper /company - 301 redirect (Netlify)
Impossible Foods /mission - 301 redirect (unknown)
Cajun Bow Fishing /bows - 301 redirect (Netlify)
Braun /shavers-for-men - 200 no redirect (unknown)

Environment (if relevant)

Same behaviour in Gatsby v1 and v2.
I'm using gatsby-plugin-remove-trailing-slashes and gatsby-plugin-netlify.
Within the project all Links point to the non-trailing slash version.

@Yurickh
Contributor

@Yurickh Yurickh commented Oct 18, 2018

I'm not really familiar with how Netlify serves static sites, but I know this is the default behaviour for Express when serving static folders.
I'm not sure which performance costs you mean here. Can you link me to a reference? Would love to read more about that.

@lloydh
Contributor Author

@lloydh lloydh commented Oct 18, 2018

The performance cost is the synchronous delay for the first byte of useful data caused by the redirect.

Using the Hopper example above, visiting https://www.hopper.com/company takes 150-300ms for the redirect, before any page data is received. On cellular connections with high latency it can add up to 1s.

@Yurickh
Contributor

@Yurickh Yurickh commented Oct 18, 2018

I see.

From my experience with static sites, we always redirect to the /-ending url whenever it represents a folder (with an implicit index.html), not a file (like shavers-for-men.html). If you don't, the static resolution will simply fail.

This is also what the "pretty url" option on netlify does.

I guess the best approach here is to use the /-ending url as the canonical one, unless you have really solid reasons to do otherwise.

EDIT: Note that my position here is nowhere near an official position from the gatsby team. This is solely my personal opinion on the subject.

@lloydh
Contributor Author

@lloydh lloydh commented Oct 20, 2018

@Yurickh I agree specifying the /-ending urls as canonical is a pragmatic option but it does seem like advocating a schizophrenic url scheme for something that was easily solved with nginx / apache but not today's popular static site hosts.

If this really is the best approach then perhaps gatsby-plugin-remove-trailing-slashes should have this as an option (at least mentioned in the docs). Alternatively gatsby-plugin-canonical-urls could have an option for trailing slashes. I haven't found any other plugins that can set canonical meta tags.

I did come across #9025 but to be honest I'm surprised this hasn't been a bigger issue for a lot of folks given the popularity of /-less Gatsby sites.

@kakadiadarpan
Contributor

@kakadiadarpan kakadiadarpan commented Oct 23, 2018

@lloydh did you have a chance to look at the Redirects | Netlify documentation?

@luukdv

@luukdv luukdv commented Oct 26, 2018

Same issue here. Turned on 'Pretty URLs' in Netlify, but when I visit a page and remove the trailing slash in the address bar afterwards, I land on the non-trailing variant.

Maybe there should be an option (or by default?) in gatsby-plugin-canonical-urls to enforce a trailing slash. Right now it's not really a canonical, since the current pathname is used (https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-plugin-canonical-urls/src/gatsby-browser.js#L9). @kakadiadarpan what do you think?
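
A minimal sketch of what "enforcing a trailing slash" on the canonical URL could look like. The `canonicalUrl` helper is hypothetical and is not part of gatsby-plugin-canonical-urls; it only shows the normalisation step, and its result would go into the page's `<link rel="canonical">` tag.

```javascript
// Hypothetical helper (not a Gatsby API): one canonical URL per page,
// always ending in a trailing slash.
function canonicalUrl(siteUrl, pathname) {
  // Resolve the path against the site origin
  const url = new URL(pathname, siteUrl);
  // Query string and hash never belong in a canonical URL
  url.search = '';
  url.hash = '';
  // Leave file-like paths (e.g. /feed.xml) alone; add the slash everywhere else
  const lastSegment = url.pathname.split('/').pop();
  if (!url.pathname.endsWith('/') && !lastSegment.includes('.')) {
    url.pathname += '/';
  }
  return url.href;
}

console.log(canonicalUrl('https://foo.com', '/bar'));  // https://foo.com/bar/
console.log(canonicalUrl('https://foo.com', '/bar/')); // https://foo.com/bar/
```

With this, /bar and /bar/ both report the same canonical URL, which is what the current pathname-based behaviour does not guarantee.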

@Yurickh
Contributor

@Yurickh Yurickh commented Oct 26, 2018

Just to clarify, the 'Pretty URLs' option in netlify WILL redirect you to the trailing slash variant:

In addition to forwarding paths like /about to /about/ (a common practice in static sites and single page apps), it will also rewrite paths like /about.html to /about/.

@dimoFeeld

@dimoFeeld dimoFeeld commented Nov 2, 2018

We are having exactly the same problem, and we are now getting penalties from Google. Is there a solution?

@lloydh
Contributor Author

@lloydh lloydh commented Nov 2, 2018

Just to clarify, the 'Pretty URLs' option in netlify WILL redirect you to the trailing slash variant

This is true but I haven't noticed any difference in behaviour with or without "Pretty URLs"; /foo redirects to /foo/ in either case. The desired behaviour is to rewrite slashless urls to resolve without redirecting.

Making /foo/ canonical is a workaround for the SEO penalties but AFAIK a full solution would only be possible in Netlify's routing layer… or by abandoning Netlify in favour of custom url rewriting …or by adopting slash/ urls. It's an unfortunate situation.

@gatsbot

@gatsbot gatsbot bot commented Jan 27, 2019

Old issues will be closed after 30 days of inactivity. This issue has been quiet for 20 days and is being marked as stale. Reply here or add the label "not stale" to keep this issue open!

@gatsbot gatsbot bot added the stale? label Jan 27, 2019
@abohannon
Contributor

@abohannon abohannon commented Jan 29, 2019

I'm not having this issue with Netlify specifically, but my UTM query params are being deleted for this same reason. /mypage?utm_source=google becomes /mypage/ and this is causing tracking issues.
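
For illustration, a redirect that adds the slash does not have to drop the query string. The sketch below assumes you control the layer that issues the redirect (an edge function, a custom server, and so on); `addTrailingSlash` is a hypothetical name, not an API of any host mentioned here.

```javascript
// Hypothetical helper: compute the redirect target for a slashless URL
// while carrying the query string (e.g. UTM parameters) across.
function addTrailingSlash(requestUrl) {
  const url = new URL(requestUrl);
  if (!url.pathname.endsWith('/')) {
    url.pathname += '/';
  }
  // Only the pathname was changed, so search and hash are still attached
  return url.href;
}

console.log(addTrailingSlash('https://foo.com/mypage?utm_source=google'));
// https://foo.com/mypage/?utm_source=google
```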

@gatsbot

@gatsbot gatsbot bot commented Feb 9, 2019

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!

@gatsbot gatsbot bot closed this Feb 9, 2019
@dja

@dja dja commented Mar 4, 2019

This is definitely still an issue, and we’ve experienced SEO penalties as well. Additionally, turning off Netlify’s pretty URLs feature seems to result in errors stating “Missing resources for /” or “Missing resources for /slash/”. We’ve tried solutions recommended here: #11524 but haven’t had any luck.

@0505gonzalez

@0505gonzalez 0505gonzalez commented Mar 6, 2019

Experiencing this issue as well.

@0505gonzalez

@0505gonzalez 0505gonzalez commented Mar 6, 2019

@dja Are you also on Netlify or are you using Github pages?

@dja

@dja dja commented Mar 6, 2019

@0505gonzalez

@0505gonzalez 0505gonzalez commented Mar 6, 2019

@dja I'm on github pages, but the issue you're facing might be the same as mine. I've opened a new ticket and plan to create a PR shortly: #12364

@0505gonzalez

@0505gonzalez 0505gonzalez commented Mar 6, 2019

TLDR: Github pages (and probably Netlify) add trailing forward slashes to folders. If you have something like /public/somepage/index.html and you visit https://yourpage.com/somepage, Github (and probably Netlify) will add a trailing slash because somepage is a directory.
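
The behaviour described above can be modelled in a few lines. This is a toy model of a directory-based static host, written as an assumption about the general pattern, not the actual code of GitHub Pages or Netlify; `resolve` and the file set are made up for the example.

```javascript
// Toy model of a directory-based static host (an assumption, not any host's real code).
function resolve(files, requestPath) {
  // An exact file match is served as-is
  if (files.has(requestPath)) {
    return { status: 200, file: requestPath };
  }
  // A trailing slash means "this directory": serve its index.html if present
  if (requestPath.endsWith('/')) {
    const index = requestPath + 'index.html';
    return files.has(index) ? { status: 200, file: index } : { status: 404 };
  }
  // No trailing slash, but a directory with an index exists: redirect to it
  if (files.has(requestPath + '/index.html')) {
    return { status: 301, location: requestPath + '/' };
  }
  return { status: 404 };
}

const files = new Set(['/index.html', '/somepage/index.html']);
console.log(resolve(files, '/somepage'));  // { status: 301, location: '/somepage/' }
console.log(resolve(files, '/somepage/')); // { status: 200, file: '/somepage/index.html' }
```

The 301 comes from the third branch: the path names a directory, so the host sends the browser to the slash version before serving anything.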

@dja

@dja dja commented Mar 6, 2019

@0505gonzalez

@0505gonzalez 0505gonzalez commented Mar 7, 2019

@dja That's what I'm currently trying to figure out in the other github issue I opened

@himynameistimli
Contributor

@himynameistimli himynameistimli commented Apr 3, 2019

@0505gonzalez were you able to figure out a solution for this issue?

Just landed here as we're having the same problems with the url parameters getting lost in the redirect from the version without the trailing slash to the version with the trailing slash.

Only difference is that we're on S3 + Cloudfront.

We might look into using Lambda@Edge to handle the redirect unless we can figure out a way to get it to work with gatsby.

Update:

For our case, we implemented the fix from https://www.ximedes.com/2018-04-23/deploying-gatsby-on-s3-and-cloudfront/ with the following lambda js function:

const querystring = require('querystring');
exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;

    /* Parse request query string to get javascript object */
    const params = querystring.parse(request.querystring.toLowerCase());
    const sortedParams = {};
    const uri = request.uri;

    /* Sort param keys */
    Object.keys(params).sort().forEach(key => {
        sortedParams[key] = params[key];
    });

    /* Serve the directory's index.html directly, without a redirect */
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    /* Update the request query string with the normalized (sorted) params */
    request.querystring = querystring.stringify(sortedParams);

    callback(null, request);
};

I'm still trying to figure out what's the best thing to do here, because I think as-is, there's a negative impact to our SEO just because now we're delivering the same page for the trailing and non-trailing slash version. Likely I will add a permanent redirect for the trailing slash version here too.

If you're not using AWS/Cloudfront, I think you'd be able to accomplish this with Cloudflare Workers.

@0505gonzalez

@0505gonzalez 0505gonzalez commented Apr 9, 2019

@himynameistimli I did find a solution, proposed a code change in another thread. But seems like it might not be accepted so have not created a PR.

The gist of it:

  • Hosting on GitHub Pages. GitHub follows the directory structure when serving files. E.g. if you hit /somepage, it will redirect to /somepage/ because it's a directory (the actual file is /somepage/index.html).
  • My proposed solution was that gatsby generate /somepage.html instead of /somepage/index.html

@garethgd

@garethgd garethgd commented Apr 15, 2019

Still experiencing this trailing slash redirect with or without Netlify's pretty URL option enabled.

@decimoseptimo

@decimoseptimo decimoseptimo commented May 1, 2019

@KyleAMathews

@himynameistimli I did find a solution, proposed a code change in another thread. But seems like it might not be accepted so have not created a PR.

The gist of it:

  • Hosting on GitHub Pages. GitHub follows the directory structure when serving files. E.g. if you hit /somepage, it will redirect to /somepage/ because it's a directory (the actual file is /somepage/index.html).
  • My proposed solution was that gatsby generate /somepage.html instead of /somepage/index.html

@KyleAMathews this is an SEO danger.
This subject isn't explained at all in https://www.gatsbyjs.org/docs/gatsby-link/
If someone decides to use URLs without trailing slashes, a couple of days later the Google SERP fills up with 301 redirects for the site.

@dospolov
Contributor

@dospolov dospolov commented Aug 5, 2020

Two years later - still an issue

@stldo

@stldo stldo commented Aug 6, 2020

I made a plugin from code that I normally use in websites hosted on Netlify. It creates a .html file for each page, which disables the 301 redirect for paths without trailing slashes. It works very well with simple websites; I hope it helps someone. More info can be found on the plugin page.

@ayZagen

@ayZagen ayZagen commented Aug 27, 2020

this also happens with nginx server.

EDIT: this issue has nothing to do with Gatsby; it is a web server misconfiguration. I checked the output files, and it seems Gatsby creates a directory for each page with an index.html in it. So I had to change my nginx URL resolution as follows:

location / {
  try_files $uri $uri/index.html $uri.html =404;
}

The $uri/index.html entry resolves the correct file without a redirect. If that entry is missing, or if it is written as $uri/ (as most nginx config examples do), nginx issues a redirect to the trailing-slash URL. This is also stated in the nginx documentation.

In response to a request with URI equal to this string, but without the trailing slash, a permanent redirect with the code 301 will be returned to the requested URI with the slash appended.

I have not used Netlify but I believe same thing would apply to it.

@lyxious

@lyxious lyxious commented Sep 7, 2020

this also happens with nginx server.

EDIT: this issue has nothing to do with Gatsby; it is a web server misconfiguration. I checked the output files, and it seems Gatsby creates a directory for each page with an index.html in it. So I had to change my nginx URL resolution as follows:

location / {
  try_files $uri $uri/index.html $uri.html =404;
}

The $uri/index.html entry resolves the correct file without a redirect. If that entry is missing, or if it is written as $uri/ (as most nginx config examples do), nginx issues a redirect to the trailing-slash URL. This is also stated in the nginx documentation.

In response to a request with URI equal to this string, but without the trailing slash, a permanent redirect with the code 301 will be returned to the requested URI with the slash appended.

I have not used Netlify but I believe same thing would apply to it.

This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.

@atkinson

@atkinson atkinson commented Sep 9, 2020

@ayZagen

@ayZagen ayZagen commented Sep 9, 2020

This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.

I didn't say anything about try_files; this is about how file resolution works in nginx. try_files is just a helper directive. That code is only an example showing my usage and solution. I agree with you that the documentation is poor. The redirect to a directory with a slash is performed by an undocumented module named ngx_http_static_module. If you want to disable that behaviour, you need to compile nginx yourself. I doubt anyone here would go that far for it.

@lyxious

@lyxious lyxious commented Sep 17, 2020

This is not the case, even if you have a blank nginx config, any attempt to access a valid directory will result in a 301 with the trailing slash. It's a common misconception that it is related to the try_files.
The documentation is also poor surrounding this, the location documentation makes it seem like the 301 redirect is only for routes that are proxied.

I didn't say anything about try_files; this is about how file resolution works in nginx. try_files is just a helper directive. That code is only an example showing my usage and solution. I agree with you that the documentation is poor. The redirect to a directory with a slash is performed by an undocumented module named ngx_http_static_module. If you want to disable that behaviour, you need to compile nginx yourself. I doubt anyone here would go that far for it.

OP's question is about why foo.com/bar redirects to foo.com/bar/. Your initial answer suggests it has to do with the try_files config, but it doesn't. The documentation you provided also has nothing to do with why this request returns a 301. The location documentation, with respect to the 301, concerns specific *_pass directives, which are independent of try_files.

@wardpeet wardpeet self-assigned this Sep 17, 2020
@wardpeet
Member

@wardpeet wardpeet commented Sep 17, 2020

Hey, sorry I don't have an update right now but I'll read over this thread again and see if I can get some action items out of it and create a task list so y'all can help us fix these issues 🙏

@mlenser

@mlenser mlenser commented Sep 21, 2020

This is a bug in the Netlify UI.

Here's a fix: https://community.netlify.com/t/remove-trailing-slash-redirect-for-gatsby-gatsby-cloud-netlify-website/20976/8

This is indeed the case.

Here is what your Netlify config should look like:
image

Disabling optimization at the top level apparently turns on the pretty URLs, even though it visually looks like that isn't the case:
image

So don't check the checkbox next to "Disable asset optimization"

johnnyoshika added a commit to jobcast/jobcast-www that referenced this issue Sep 30, 2020
@flackjap

@flackjap flackjap commented Oct 22, 2020

This is a bug in the Netlify UI.
Here's a fix: https://community.netlify.com/t/remove-trailing-slash-redirect-for-gatsby-gatsby-cloud-netlify-website/20976/8

This is indeed the case.

Here is what your Netlify config should look like:
image

Disabling optimization at the top level apparently turns on the pretty URLs, even though it visually looks like that isn't the case:
image

So don't check the checkbox next to "Disable asset optimization"

I've just lost 3 hours because of this.

PAIN.

@alvinometric

@alvinometric alvinometric commented Oct 25, 2020

@jlengstorf Do you know if this is intended behaviour ?

@sohamsj

@sohamsj sohamsj commented Nov 1, 2020

slash

@jlengstorf
Contributor

@jlengstorf jlengstorf commented Nov 3, 2020

@alvinometric I'm not sure; I've sent this over to our UI team for review. It does look like, if this isn't a bug, it could do with some clarification.

@leomelzer

@leomelzer leomelzer commented Nov 11, 2020

Thanks for all the helpful comments, which led us in the right direction.

If it still doesn't work after applying the settings from @mlenser's comment (#9207 (comment)), make sure to check your netlify.toml for pretty_urls.

Settings in the toml take precedence, see the docs: https://docs.netlify.com/configure-builds/file-based-configuration/#deploy-contexts

UI settings are overridden if a netlify.toml file is present in the root folder of the repo and there exists a setting for the same property/redirect/header in the toml file.

@LekoArts
Contributor

@LekoArts LekoArts commented Nov 27, 2020

Since multiple people have reported this as a bug in Netlify's UI / the behavior being the result of a misconfigured hosting, I'll close this one here as resolved (and not an issue with Gatsby). Please follow the linked issues to see how/when it's resolved. Thanks for providing your context and solutions here for future Google users (hello 👋 ).

If you see this issue on a platform other than Netlify, please create a new issue with a reproduction -- as this issue here is specific to Netlify, it's resolved.

@LekoArts LekoArts closed this Nov 27, 2020
@jon-sully

@jon-sully jon-sully commented Dec 29, 2020

Hey 👋🏻

I know this issue lapsed and got closed but I really think it's important to recap on a couple of things here. The impetus for this issue is that a) there's a disconnect between Gatsby's routing defaults and Netlify's routing configurations, and b) there are serious SEO penalties in play if a Gatsby site doesn't have the trailing-slash / no-trailing-slash issue solved, since a site serving the same content on both URLs (duplicate content) gets knocked on SEO pretty hard. Technically this isn't Gatsby's fault, but as it pertains to all Gatsby users hosting on Netlify, it does seem like a major issue.. or a major risk at the very least.

Solving this problem by disabling "Pretty URLs" in the (yes, awfully borked / painful UX'd) Netlify Asset Optimization panel can open your site up to the duplicate content issue since content may be available at both the un-slashed and the slashed version of your URL path. It's important too to note that if a Gatsby site is available on both /test and /test/ but 'fixes itself', you may just be seeing the Gatsby runtime adjust your address bar via the Browser History API - the super important part has nothing to do with what happens when Gatsby actually runs in the browser - it's the part where Netlify is serving the same content on multiple URLs - the slash and the non-slash paths.

This is fixable and there is a way to get everything working smoothly and on a unified path / slash structure, but it's not disabling 'Pretty URLs'. The tl;dr: is that Netlify really works best / has biases toward using the trailing slash, and unified content pathing on Netlify requires the trailing slash. I elaborated on this in another Gatsby GH thread here:

#27889 (comment)

But I would definitely urge folks to carefully check (from a CLI HTTP tool preferably) which paths (slash and/or no-slash) are resolving to their content on their sites. If both the slash and no-slash paths are resolving to your content, your SEO will hurt for it.
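
If you'd rather script that check than run it by hand, here is a rough Node sketch. It assumes Node 18+ (which ships a global fetch); `classify` and `checkPath` are hypothetical names, and the labels are just one way to summarise the result.

```javascript
// Classify how a host treats the two variants of a path, given their HTTP status codes.
function classify(noSlashStatus, slashStatus) {
  const ok = s => s >= 200 && s < 300;
  const redirect = s => s >= 300 && s < 400;
  if (ok(noSlashStatus) && ok(slashStatus)) return 'duplicate';
  if (redirect(noSlashStatus) && ok(slashStatus)) return 'slash-canonical';
  if (ok(noSlashStatus) && redirect(slashStatus)) return 'no-slash-canonical';
  return 'inconclusive';
}

// Request both variants without following redirects, so a 301 stays visible.
async function checkPath(origin, path) {
  const bare = path.replace(/\/+$/, '');
  const statusOf = async url =>
    (await fetch(url, { method: 'HEAD', redirect: 'manual' })).status;
  return classify(await statusOf(origin + bare), await statusOf(origin + bare + '/'));
}

// Usage (performs real requests):
// checkPath('https://example.com', '/some-page').then(console.log);
```

A result of 'duplicate' is the case to worry about: both URLs answer 200 with the same content.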

Hope that helps 😕

@yanneves

@yanneves yanneves commented Feb 10, 2021

Given gatsbyjs.com itself resolves HTTP 200 with or without trailing slash (duplicate content), it seems this issue is a bit of an afterthought.

$ curl -I https://www.gatsbyjs.com/plugins/
HTTP/2 200
$ curl -I https://www.gatsbyjs.com/plugins
HTTP/2 200

I noticed this issue like others here when I migrated from WordPress to Gatsby, where the previous WordPress configuration stripped trailing slashes. The only way I can see to avoid duplicate content is to cave to the 301 redirects and introduce trailing slashes. Is the SEO penalty for redirecting existing pages like this still relevant? It could be this is a lingering idea in SEO that no longer matters. But otherwise it's a dangerous assumption in Gatsby.

This appears to be baked into the directory structure of the static generated site:

public/
├── index.html
├── some-other-page
│   └── index.html
└── some-page
    └── index.html

A static file server would map that to example.com/some-page/, and it would be hacky to force it to drop the trailing slash. Generated content would need to respect a configuration option to instead output the following when we want to remove the trailing slash:

public/
├── index.html
├── some-other-page.html
└── some-page.html

The above would then be resolved by the server as example.com/some-page.

Do we know if Gatsby core is strictly expecting a directory structure for pages? There may be assumptions elsewhere that these are always output as directories. If we can identify those assumptions (or the lack of) configuring a non-trailing slash output like above would solve this issue.

@krzysieqq

@krzysieqq krzysieqq commented Mar 30, 2021

Any updates?

@MakowskiHubert

@MakowskiHubert MakowskiHubert commented Apr 4, 2021

The same problem with the nginx server and solved by #9207 (comment)

@hazem3500
Contributor

@hazem3500 hazem3500 commented Jul 30, 2021

If you want your website to not have any trailing slash and also work with Netlify you can use the gatsby-plugin-netlify and in gatsby-node.js add

const replacePath = path => (path === `/` ? path : path.replace(/\/$/, ``))

exports.onCreatePage = ({ page, actions }) => {
  const { createRedirect } = actions
  if(!page.path.includes('.html') && page.path !== '/') {
    createRedirect({ fromPath: `${page.path}/`, toPath: page.path, isPermanent: true })
  }
}

This redirects every trailing-slash path to its non-slash equivalent with a 301 status code.
Note: if you also use the createPages Gatsby Node API, you'll need to add the redirect there too:

exports.createPages = async ({ actions, graphql }) => {
  const { createPage, createRedirect } = actions
  // ...
  pages.forEach(page => {
    // ...
    // Redirect the trailing-slash variant of each created page to the slashless path
    createRedirect({ fromPath: `${page.path}/`, toPath: page.path, isPermanent: true })
  })
}

@WhiteHoodHacker

@WhiteHoodHacker WhiteHoodHacker commented Aug 15, 2021

By disabling "Pretty URLs" in Netlify and ending up with duplicate content at trailing-slash URLs and non-trailing-slash URLs, wouldn't a simple fix be to add a <link> tag with the canonical URL using the preferred scheme? Plenty of sites serve duplicate content fine with this.
