New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix 301 redirect to trailing slash when not specified in createPage.path #19567
Fix 301 redirect to trailing slash when not specified in createPage.path #19567
Conversation
Tests pass locally, maybe I missed something? |
@@ -7,7 +7,8 @@ const generatePathToOutput = outputPath => { | |||
let outputFileName = outputPath.replace(/^(\/|\\)/, ``) // Remove leading slashes for webpack-dev-server | |||
|
|||
if (!/\.(html?)$/i.test(outputFileName)) { | |||
outputFileName = path.join(outputFileName, `index.html`) | |||
outputFileName = | |||
outputFileName + `${outputFileName.endsWith(`/`) ? `index` : ``}.html` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is dangerous - it relies on hosting matching /path/something
to /path/something.html
which I'm not sure how common is (probably less common than matching index.html).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any ideas on how we could test this? Are you just thinking this is a difference in OS's potentially?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see your point. Do you have any other idea to avoid the 301 redirect? For example, loading this page triggers it: https://reactjs.org/languages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any ideas on how we could test this?
Other than trying it out on top X of hosting services/software (or reading docs for them), I see no other way :(
Ah, I see your point. Do you have any other idea to avoid the 301 redirect?
Ideally we would only change gatsby serve
here and not the output that Gatsby produces right now (that change have breaking potential). Ideally we could configure express.static
to not do redirects, but last time I checked this wasn't possible. We might need to:
- make feature request to
serve-static
to make 301 behaviour configurable (so it use regular 200 response instead of 301 for those paths) - vendor
serve-static
and apply patch we need ourselves? - or look for something that will not do 301 instead of
serve-static
?
(serve-static
is package used by express.static
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, fact that we need to change express.static
config to support new format is what I worry about with this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the implications... And I see why my changes are too dangerous.
What about... looping through the public folder instead of using express.static()? Pseudo-code:
const serve = (dir = './public') => {
const folders = listFolders(dir);
const files = listFiles(dir);
for each folder in folders {
serve(dir + folder);
}
[custom logic to serve HTML files without trailing slash]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few additional alternatives:
- We can add "200 redirect" ourselves in express middleware before we register
express.static
:
router.use((req, res, next) => {
if (!/.+\..+$/g.test(req.url) && req.url.slice(-1) !== '/') {
req.url = `${req.url}/`;
}
next();
})
this would "200 redirect non-slash non-extension requests"
Problem is that those might be actual files :/
- We can use
redirect: false
(and maybefallthrough: true
) in option forexpress.static
and add new route handler afterexpress.static
to try to send file on our own?
router.use((req, res, next) => {
if (!/.+\..+$/g.test(req.url) && req.url.slice(-1) !== '/') {
// let's try adding trailing `/index.html`, see if file exist and send it if it does
} else {
next();
}
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good idea. I just updated the PR, and also updated the test repository, it works well! Just re-install node_modules and run patch-package to test it.
const matchPaths = await readMatchPaths(program) | ||
router.use(matchPathRouter(matchPaths, { root })) | ||
router.use((req, res, next) => { | ||
if (!/.+\..+$/g.test(req.url) && req.url.slice(-1) !== `/`) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for clarity, could we store this regex in a variable that has a descriptive name? mentally grokking this regex is a bit confounding 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or at least add a comment above explaining what this is doing, what it's solving, etc. I'm just worried about the long term maintenance of this fix getting lost in 6-12m
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is copied from my examples, and this is probably not good test to have in retrospect (it infer that resource should be file if it has .
in it, but that's true at all), so we should remove at least first part of condition leaving only test for trailing slash.
I wonder what we will get now for legitimate 404s tho, now that serve-static
won't immediately return 404
@adrienharnay any chance for an update? |
Hey, sorry for being silent so long! I've updated the PR (and the example repository), and tested that the 404s behave the same before and after the patch (they do). Please let me know if I can do anything else :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I'll get this merged and released soon! Thanks for your hard work and going back and forth with us @adrienharnay !
@adrienharnay unfortunately, it looks like this fix is breaking some other things. Our e2e tests for image path prefixes is breaking. Here's a screenshot from the issue. I suppose we probably should check that the file exists before sending it? |
Maybe the tests are testing an implementation detail of the previous 404 page, and now it's changed so the test fails? Can you give me pointers on what is exactly tested here please, not sure I understand the error message |
I'm not 100% sure this is the correct path for I think this approach also fixes your issue. The above comment is just my opinion, so I'm happy to hear your views. |
if (req.url.slice(-1) !== `/`) { | ||
res.sendFile(req.url + `/index.html`, { | ||
root, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the file doesn't exist? A callback argument can be added here to deal with this case and allow for the regular 404 logic to occur:
res.sendFile(...., (error) => {
if (error) next()
// if there's no error, request.end() has been called and there's nothing to do.
})
@adrienharnay @wardpeet the diff in this PR is rather small. Can we come to a conclusion on what to do here? Merge or close the PR? |
Well, I'm still convinced the choice to force trailing slashes or not should belong to the user, and this PR solves this. But I won't lie, the lag on this PR got me pretty demotivated, and I now depend on a fork. If the maintainers want me to finalize the PR so that it can get merged, I'll gladly do it though. |
I've been playing with this PR a bit and what a "default" webserver does. I'm wondering why does My demo of a default nginx/apache server running on Plesk gives me the same behaviour as serve gives me. |
@wardpeet I had a look at what's happening at the HTTP level. Gatsby and the server disagree on whether a trailing slash is needed.
So really gatsby is hiding the redirects the server is doing by doing its own clientside redirects. Our team had a similar issue with gatsby before, and the google SEO tools told us about it. Eventually we moved to serving our things from AWS S3 so we stopped needing this, but I'm very partial to slash-less URLs. They're just pretty :) |
I'm aware of gatsby is hiding this implementation and that a redirection is happening. The problem is that when we build, we always build like you have trailing slashes and it's the server implementation that needs to handle pretty URLs, not us.
So I'm still unsure what the use case is of not doing a 301 in serve. |
Hey @adrienharnay! First and foremost, I want to thank you for opening this PR and taking the time to improve Gatsby! It has been 14 days until a response. Gatsby serve is doing exactly what a basic webserver would do. Serve is a small shell around express. You can create your own if you need to. |
@fabiosantoscode If I can ask, how did you fix this in S3? I still see this redirect behavior using standard |
Description
When using
gatsby serve
, for pages created withcreatePage
, for a given path/hello
there is a 301 redirect to/hello/
even whenpath
does not end with a slash.The issue can be reproduced on this repository.
The fix is to keep generating
public/hello/index.html
ifcreatePage.path === '/hello/'
, but instead to generatepublic/hello.html
ifcreatePage.path === '/hello
.Related Issues
Fixes #19543