[FR]: Support for sitemap.xml in standard plugin #1129
Labels
Component-Plugins-Standard
Status-Fixed
Ticket is resolved.
Type-Enhancement
This is a request for a brand new feature.
Brief description of the feature request
Nearly every site has a "sitemap".
https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap#xml
https://www.sitemaps.org/protocol.html
https://developers.google.com/search/docs/crawling-indexing/sitemaps/large-sitemaps
Sitemaps are usually linked from the "robots.txt" file, so they are easy to discover, and they are a simple alternative to RSS/ATOM.
Some websites extend sitemaps with the "sitemap-news" extension. Other extensions are supported as well; see the first link.
For example, https://www.techwar.gr/robots.txt links to the "sitemap-news.xml" file at https://www.techwar.gr/sitemap-news.xml, which lists all articles in readable XML format.
Something like 95 % of good websites provide sitemaps, and many of those include additional "news" metadata.
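To illustrate the discovery step, here is a minimal sketch of pulling sitemap URLs out of a robots.txt body. Python is used only for illustration; the function name `find_sitemaps` and the example.com URLs are hypothetical, and fetching the file over HTTP is left out.

```python
def find_sitemaps(robots_txt: str) -> list[str]:
    """Return the URLs of all 'Sitemap:' directives in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the "https://" in the value survives.
        key, sep, value = line.partition(":")
        value = value.strip()
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap" and value:
            urls.append(value)
    return urls


# Hypothetical robots.txt content, not taken from a real site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin
Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/sitemap-news.xml
"""

if __name__ == "__main__":
    for url in find_sitemaps(SAMPLE_ROBOTS):
        print(url)
```

A robots.txt file may list several sitemaps, so the sketch returns a list rather than a single URL.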
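As a rough sketch of how a "sitemap-news" file could be turned into feed-like items, the following reads `loc`, `news:title` and `news:publication_date` from each `url` entry. The two namespace URIs are the ones used by the sitemap protocol and the Google news extension; the function name `parse_news_sitemap`, the dictionary keys and the sample document are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the XPath-like queries below.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}


def parse_news_sitemap(xml_text):
    """Return one dict per <url> entry; news fields are empty when absent."""
    root = ET.fromstring(xml_text)
    items = []
    for url in root.findall("sm:url", NS):
        items.append({
            "link": url.findtext("sm:loc", default="", namespaces=NS).strip(),
            "title": url.findtext(
                "news:news/news:title", default="", namespaces=NS).strip(),
            "published": url.findtext(
                "news:news/news:publication_date", default="", namespaces=NS).strip(),
        })
    return items


# Hypothetical sitemap content, not taken from a real site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/articles/1</loc>
    <news:news>
      <news:publication>
        <news:name>Example News</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2024-01-15T08:00:00+00:00</news:publication_date>
      <news:title>First article</news:title>
    </news:news>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>"""

if __name__ == "__main__":
    for item in parse_news_sitemap(SAMPLE_SITEMAP):
        print(item)
```

Entries without news metadata still yield a link, so a plain sitemap.xml can go through the same code path and simply produces items without a title or date.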
More examples:
https://www.idnes.cz/sitemap
https://www.allstate.com/sitemap-video.xml
https://www.allstate.com/sitemap-image.xml
https://www.hobbyconsolas.com/sitemap-video.xml
https://www.allstate.com/sitemap.xml
https://howolddoyou.com/sitemap-news.xml
https://www.dell.com/index-sitemap.xml.gz
https://www.dell.com/gaming-hub-href-sitemap.xml.gz
Sometimes a sitemap or sitemapindex is compressed with gzip/deflate.
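A minimal sketch of how the compressed case might be handled: detect gzip by its magic bytes, fall back to zlib (deflate), and otherwise treat the payload as plain XML. The root element then tells a `sitemapindex` apart from a regular `urlset`. The function names `decompress_if_needed` and `read_sitemap` and the sample document are hypothetical, and the download step is omitted.

```python
import gzip
import zlib
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def decompress_if_needed(data: bytes) -> bytes:
    """Return the uncompressed payload for gzip, zlib/deflate or plain input."""
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(data)
    try:
        return zlib.decompress(data)  # zlib-wrapped deflate stream
    except zlib.error:
        return data  # assumption: anything else is already plain XML


def read_sitemap(data: bytes):
    """Return (kind, locations), where kind is 'sitemapindex' or 'urlset'."""
    root = ET.fromstring(decompress_if_needed(data))
    kind = root.tag.replace(SITEMAP_NS, "")
    # An index lists child sitemaps; a urlset lists pages.
    child = "sitemap" if kind == "sitemapindex" else "url"
    locations = [
        entry.findtext(SITEMAP_NS + "loc", default="").strip()
        for entry in root.findall(SITEMAP_NS + child)
    ]
    return kind, locations


# Hypothetical sitemap index, not taken from a real site.
SAMPLE_INDEX = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<sitemap><loc>https://example.com/part-1.xml.gz</loc></sitemap>"
    b"<sitemap><loc>https://example.com/part-2.xml.gz</loc></sitemap>"
    b"</sitemapindex>"
)

if __name__ == "__main__":
    print(read_sitemap(gzip.compress(SAMPLE_INDEX)))
```

When the result is a `sitemapindex`, each returned location would be fetched and passed through `read_sitemap` again, which is how large sites split their sitemaps across several files.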