Implement a performant handling of sitemap pages #39
Comments
v3 "Supporting 50k/(26*26) = 76 post-types. Scales up to 26*26*2000 = 1.3 million posts per post-type."
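The arithmetic in the v3 note can be checked with a few lines of Python. This is only a sketch under stated assumptions: the bucket count per post type is read as 26*26 = 676 (the product that makes the 1.3 million figure work out), "50k" is taken to be the 50,000-entry limit of a sitemap index, and 2000 is taken to be the number of posts per sitemap page. The constant and function names are hypothetical and do not come from the plugin.

```python
# Assumed inputs, inferred from the v3 note; none of these names exist in the plugin.
INDEX_LIMIT = 50_000          # assumed: maximum entries in one sitemap index
BUCKETS_PER_TYPE = 26 * 26    # assumed reading of the bucket count per post type
POSTS_PER_BUCKET = 2_000      # assumed: posts per sitemap page


def max_post_types(index_limit: int = INDEX_LIMIT,
                   buckets_per_type: int = BUCKETS_PER_TYPE) -> int:
    """Whole number of post types whose buckets fit into one sitemap index."""
    return index_limit // buckets_per_type


def max_posts_per_type(buckets_per_type: int = BUCKETS_PER_TYPE,
                       posts_per_bucket: int = POSTS_PER_BUCKET) -> int:
    """Upper bound on posts of a single post type across all of its buckets."""
    return buckets_per_type * posts_per_bucket


if __name__ == "__main__":
    print(max_post_types())
    print(max_posts_per_type())
```

Under these assumptions the per-type capacity is 1,352,000 posts, which matches the quoted 1.3 million. The division gives just under 74, so 73 whole post types, slightly below the quoted 76; the quoted figure may rest on inputs that are not shown in the note.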
Blocked until the technical doc is updated with @joemcgill's and @swissspidy's thoughts.
Thanks for kicking off this discussion, @svandragt. If I'm understanding your description above correctly, you're exploring the idea of a hash lookup table: we would automatically create sitemaps/buckets that distribute a large number of URLs evenly into groups, and we could quickly look up the location of each object using a deterministic algorithm (in this case, based on object type and ID).

This is a really smart solution for doing fast lookups, but I'm concerned that we'll end up with a large number of buckets containing artificially low numbers of objects on sites that have many custom post types and/or custom taxonomy types, which could create performance issues when generating the sitemap index. Ideally, I think we want a solution that optimizes the objects:buckets ratio, so that we pack a large number of objects into the smallest number of buckets possible while still being able to quickly look up which bucket an object is in, so we can update/delete buckets whenever an object within that bucket is updated/deleted.

The simplest solution for that lookup would be to save the bucket ID as metadata of the object (e.g., post_meta for a post), but as you pointed out in the requirements above, that would lead to a huge increase in meta rows in the database as we add references for each object. If we're storing each bucket as a post of a custom post type, perhaps we can save the maximum and minimum post ID from each bucket in the post meta of each bucket and give each bucket a name which identifies which object type it includes, then we could look up all buckets for a particular object type using a
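The range-based lookup proposed in this comment can be illustrated with a short Python sketch. It is not the plugin's implementation: it assumes each bucket records its object type plus the minimum and maximum object ID it contains, and that buckets of one type do not overlap. The `Bucket` class, the `find_bucket` function, and the bucket names are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Bucket:
    """Hypothetical stand-in for a sitemap bucket and its stored ID range."""
    name: str         # assumed naming scheme that encodes the object type
    object_type: str  # e.g. "post" or "page"
    min_id: int       # smallest object ID stored in this bucket
    max_id: int       # largest object ID stored in this bucket


def find_bucket(buckets: Iterable[Bucket], object_type: str,
                object_id: int) -> Optional[Bucket]:
    """Return the bucket whose ID range contains object_id, or None."""
    # Keep only buckets of the requested type, ordered by their lower bound.
    candidates = sorted(
        (b for b in buckets if b.object_type == object_type),
        key=lambda b: b.min_id,
    )
    starts = [b.min_id for b in candidates]
    # Binary search for the last bucket that starts at or before object_id.
    i = bisect_right(starts, object_id) - 1
    if i >= 0 and candidates[i].max_id >= object_id:
        return candidates[i]
    return None


buckets = [
    Bucket("post-1", "post", 1, 2000),
    Bucket("post-2", "post", 2001, 4000),
    Bucket("page-1", "page", 1, 2000),
]
```

With this layout, an update to post 2500 only needs to touch the bucket returned by `find_bucket(buckets, "post", 2500)`, and no per-object meta row is required. The trade-off is that the stored ranges must be kept accurate whenever objects are added to or removed from a bucket.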
I've started on a proof of concept in #64, based on a more fleshed-out description in the (still in progress) design document from #11.
Closing this one, as this optimization is off the table for now.
Description
A performant, scalable way of assigning posts of all registered content types to sitemap pages and of processing updates and deletions.
A sitemap page is a sitemap linked from the index containing a subset of posts. A post is a piece of content with any registered post type.
Which feature is your enhancement request related to?
#31
Describe the solution you'd like
WIP
Acceptance Criteria