Enable compressed sitemap.xml #1130
Conversation
/cc @foxik0070
Don't worry about the CI failure, we need to remove Python 2.6 :)
This is good, but do we want to perhaps have a more comprehensive solution which can compress multiple files, not just the sitemap? For that matter, this may be a good candidate for a plugin? Although, I can see the value in having this feature out-of-the-box.
Hmm, now that I think more about it. Do we even need this in MkDocs at all? I think most web servers can be configured to automatically serve gzip files. Wouldn't it be better for them to handle it? For example, we already have gzip files on MkDocs.org because GitHub pages does it automatically. I tested with:
True, but most basic shared hosting services don't let the end user configure such things. And what do services like ReadTheDocs or pythonhosted.org do (not sure, I didn't check)? I seem to recall that some servers only return the gzipped file if it already exists. In other words, MkDocs needs to create it before the server will return it; the client would request the plain file and the server would transparently serve the pre-compressed copy. In any case, this may need some investigation.
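The behaviour being recalled here (nginx's `gzip_static` module works this way, for example) can be sketched as a small decision function. The name and signature below are illustrative only, not any server's real API:

```python
import os

def pick_response_file(requested_path, accept_encoding, root):
    """Pick which file a static-gzip server would send.

    Hypothetical sketch of the behaviour described above: if the client
    advertises gzip support and a pre-compressed sibling already exists
    on disk, serve it; otherwise serve the plain file. (Real servers
    such as nginx with gzip_static also handle Vary, ETags, etc.)
    """
    plain = os.path.join(root, requested_path.lstrip("/"))
    gzipped = plain + ".gz"
    if "gzip" in accept_encoding and os.path.exists(gzipped):
        return gzipped, "gzip"  # pre-compressed file is served as-is
    return plain, None          # identity encoding; nothing compressed on the fly
```

The key point for MkDocs is the `os.path.exists` check: such a server never compresses on the fly, so the `.gz` file has to be written at build time.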
There's another point of view: you can specify
@waylan
@davidhrbac I'm assuming that is a server config for a specific server (Apache?). If users are using something different (nginx), then the solution would be different. And many users may not actually be able to configure anything. One of the benefits of a static site generator is that you can upload the output to a cheap shared host. The downside is that those cheap shared hosts offer almost no configurability. Which is my point: we need a solution that works for most users, and requiring server configuration is not such a solution.
I understand, but what I'm saying is that (I think) some servers fake it with static content. They only serve the gzipped response if a gzipped static file exists on the file system next to the non-gzipped one. Therefore, there is a valid reason for MkDocs to generate gzipped content. Unfortunately, I don't recall which servers do that or where I got that idea from. Although on further reflection it occurs to me that your request may not be about
@waylan yes, we are on the same page. That's why there's this PR. Anyway, this solution reduces resource usage. A use case is the robots.txt file. We can also extend it to
Most users don't care about gzip either 😄
Where (or by whom) does this get requested as a gzipped file?
The webmaster is the one to define that. Every byte saved on communication counts...
Yes, but do web crawlers request the robots.txt (or its alias) specifically with the `Accept-Encoding: gzip` header?
True, which is why I suggested this might be a good candidate for a plugin. Those who do care should be able to easily work out how to install and enable a plugin (let alone configure their server), while the rest of the users never even need to give it a thought.
Web crawlers always request robots.txt. There's no robots.txt.gz file and no request for a robots.txt.gz file; the client and server can agree on gzipped communication. Here is where the sitemap directives come in: you can declare the sitemap file and sitemap file type in robots.txt. A sitemap can be plain text or compressed, so you can point the Sitemap entry at the compressed file. This is the very same concept as why we minimise JS and CSS files.
It's also clear that Google is trying to get the gzipped sitemap.
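For illustration, a robots.txt advertising a compressed sitemap would read like the following (the domain is a placeholder, not any site discussed here):

```
User-agent: *
Sitemap: https://example.com/sitemap.xml.gz
```

The `Sitemap:` directive is part of the sitemaps.org protocol, and crawlers that honour it fetch whatever URL is declared, compressed or not.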
Sorry, it's been a long time since I've looked at how robots.txt works. For some reason I was thinking that a server-specific configuration was aliasing robots.txt to the sitemap. Instead, the robots.txt file just needs to point to the sitemap, and that is server agnostic and something any user should be able to do.

I was hung up on why only the sitemap should be gzipped and not any other files (HTML, CSS, JS, etc.). In the event a server is not configured to serve gzipped content automatically, the robots.txt file can point directly to the gzipped sitemap, regardless of whether the requesting client includes the `Accept-Encoding: gzip` header.

Given the above, I don't see any reason to not accept this as-is.
Re status: As the Python 2.6 tests are failing, I'm just waiting for us to officially drop 2.6 support (remove tests, update docs, etc) before merging this. We might do a small bugfix release or two in the 0.16.x series and this should wait till after that. |
We can create a branch for 0.16 from the tag and backport if you like. I don't have strong feelings here, but I don't want any progress to be blocked. |
No rush. I do not need it to be back-ported. I can create the compressed version in CI script. |
@d0ugal correction: I do NOT need... |
I've updated this to the latest code in master, which includes the Plugin API. It feels really weird to have this there: various refactors have removed any 'single page' specific code, so this feels out of place. It would be very easy to do this via a plugin.

I'm not so convinced this should be there by default, despite the benefits. In fact, I think a more general solution for gzipping various files would make for a better default solution. If someone really wants a gzipped sitemap.xml file, then adding a plugin for that special case should be trivial. After all, that user would need to create their custom robots.txt file to point to the gzipped file anyway.

Finally, if this were to be accepted, it should have a test or two first. The failing AppVeyor tests can be ignored; they are being addressed in #1299.
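As a sketch of what such a plugin's post-build step might do (the helper name is hypothetical and this is not the patch's actual code; a MkDocs plugin would call something like it from its `on_post_build` hook):

```python
import gzip
import shutil
from pathlib import Path

def gzip_sitemap(site_dir):
    """Write sitemap.xml.gz alongside sitemap.xml in the built site.

    Hypothetical helper illustrating the idea. Returns the path of the
    compressed file, or None when there is no sitemap.xml to compress.
    """
    src = Path(site_dir) / "sitemap.xml"
    if not src.is_file():
        return None
    dst = Path(str(src) + ".gz")
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)  # stream, so large sitemaps are fine
    return dst
```

Keeping the logic this small is part of the argument above: a special-case plugin for sitemap.xml is trivial, so it need not live in MkDocs core.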
A small patch to introduce a compressed sitemap.xml.gz.