Generating a sitemap is really expensive. Every time /sitemap.xml is loaded we make:
- an API call to the index topic to get the list of URLs and topics
- an API call to each topic to get its last modified date

This becomes really expensive as the number of topics grows. For example, the engage pages have 278 Discourse topics, so we make 279 API calls (the index topic plus each topic) to generate the sitemap. This makes the engage pages sitemap time out constantly (https://ubuntu.com/engage/sitemap.xml).
We are already using the Discourse Data Explorer plugin [1] with a custom query to fetch multiple topics [2]. Why not create another custom query that returns the last-updated date for multiple topics at once? It should be very cheap to execute in batch, and the response would be much smaller if we limit the query to that single date field.
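A minimal sketch of the batch idea, to show how the per-topic calls could collapse into one. Everything named here is an assumption for illustration: `run_batch_query` stands in for whatever wrapper would run the new Data Explorer query, and the row shape (`id`, `updated_at`) assumes a query roughly like `SELECT id, updated_at FROM topics WHERE id IN (...)`.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical callable: takes a list of topic IDs and returns rows such as
# {"id": 123, "updated_at": "2024-01-01T00:00:00Z"} from ONE Data Explorer call.
BatchQuery = Callable[[List[int]], List[Dict[str, object]]]


def last_modified_by_topic(
    topic_ids: Iterable[int], run_batch_query: BatchQuery
) -> Dict[int, str]:
    """Map each topic ID to its last-updated timestamp using a single batch call."""
    ids = sorted(set(topic_ids))  # de-duplicate so the query stays small
    if not ids:
        return {}  # nothing to look up, so skip the API call entirely
    rows = run_batch_query(ids)
    return {int(row["id"]): str(row["updated_at"]) for row in rows}
```

With this shape, generating the engage sitemap would need 2 API calls (index topic plus one batch query) instead of 279, assuming the query can accept all topic IDs in one request.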
To solve this we could paginate the sitemap:
- /sitemap.xml would become a sitemap index
- /sitemap-<PAGE>.xml would list the topics for page PAGE, with 10 topics per page (number to be defined)

This would make the sitemaps lighter, much faster to load, and easier to parse in case we need to process them.
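The pagination proposal could look roughly like the sketch below. It is only an illustration: the function names are hypothetical, pages are assumed to be numbered from 1, and `PAGE_SIZE = 10` is the placeholder value from the proposal, not a decided number. The index uses the standard sitemaps.org `sitemapindex` format.

```python
import math
from typing import List, Sequence
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
PAGE_SIZE = 10  # placeholder; the proposal leaves the page size to be defined


def page_count(total_topics: int, page_size: int = PAGE_SIZE) -> int:
    """Number of /sitemap-<PAGE>.xml files needed to cover every topic."""
    return math.ceil(total_topics / page_size)


def topics_for_page(
    topics: Sequence[str], page: int, page_size: int = PAGE_SIZE
) -> List[str]:
    """Return the slice of topics listed on a given page (assumed 1-indexed)."""
    start = (page - 1) * page_size
    return list(topics[start:start + page_size])


def sitemap_index(
    base_url: str, total_topics: int, page_size: int = PAGE_SIZE
) -> str:
    """Build the XML served at /sitemap.xml, pointing at each paginated sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for page in range(1, page_count(total_topics, page_size) + 1):
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url}/sitemap-{page}.xml"
    return ET.tostring(root, encoding="unicode")
```

For the 278 engage topics and a page size of 10, this yields 28 paginated sitemaps, and the index itself needs no per-topic API calls because it only depends on the topic count.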