WordPress 5.5 Core XML Sitemap Includes Hidden WooCommerce Products In the Sitemap #27954

Open
5 tasks done
linux4me opened this issue Oct 13, 2020 · 7 comments
Labels
  • focus: product management (Related to product creation and editing.)
  • plugin: woocommerce (Issues related to the WooCommerce Core plugin.)
  • priority: normal (The issue/PR is of normal priority: not many people are affected or there’s a workaround, etc.)
  • team: Mothra
  • type: enhancement (The issue is a request for an enhancement.)

Comments

@linux4me

Prerequisites (mark completed items with an [x]):

  • I have carried out troubleshooting steps and I believe I have found a bug.
  • I have searched for similar bugs in both open and closed issues and cannot find a duplicate.

Describe the bug
I'm using WP 5.5.1 and WooCommerce 4.5.2. I just discovered that products with their catalog visibility set to "hidden" in WooCommerce show up in the WordPress XML sitemap, and are then crawled and indexed by Google.

I first posted this to WP Core, but I was told it wasn't a Core issue and to post here.

Expected behavior
Hidden products shouldn't be in the sitemap.

Actual behavior
Products with their catalog visibility set to "hidden" appear in WordPress XML Sitemap.

Steps to reproduce the bug (We need to be able to reproduce the bug in order to fix it.)

  1. Create a new product in WooCommerce
  2. In the "Publish" section in the right sidebar, set the "catalog visibility" to "hidden."
  3. Save the product.
  4. Copy the link to the product.
  5. Browse to the site's sitemap.
  6. Select the posts sitemap for the product post type.
  7. Use your browser's search function to search for the product's URL.

Isolating the problem (mark completed items with an [x]):

  • I have deactivated other plugins and confirmed this bug occurs when only WooCommerce plugin is active.
  • This bug happens with a default WordPress theme active, or Storefront.
  • I can reproduce this bug consistently using the steps above.

WordPress Environment
We use the WooCommerce System Status Report to help us evaluate the issue.
Without this report we won't be able to fully evaluate this issue.

### WordPress Environment ###

WordPress address (URL): https://thedomain.com
Site address (URL): https://thedomain.com
WC Version: 4.5.2
REST API Version: ✔ 4.5.2
WC Blocks Version: ✔ 3.1.0
Action Scheduler Version: ✔ 3.1.6
WC Admin Version: ✔ 1.5.0
Log Directory Writable: ✔
WP Version: 5.5.1
WP Multisite: –
WP Memory Limit: 256 MB
WP Debug Mode: –
WP Cron: ✔
Language: en_US
External object cache: –

Server Environment

Server Info: Apache
PHP Version: 7.4.11
PHP Post Max Size: 8 MB
PHP Time Limit: 90
PHP Max Input Vars: 1000
cURL Version: 7.29.0
NSS/3.44

SUHOSIN Installed: –
MySQL Version: 5.5.5-10.3.25-MariaDB
Max Upload Size: 8 MB
Default Timezone is UTC: ✔
fsockopen/cURL: ✔
SoapClient: ✔
DOMDocument: ✔
GZip: ✔
Multibyte String: ✔
Remote Post: ✔
Remote Get: ✔

Database

WC Database Version: 4.5.2
WC Database Prefix: wp_
Total Database Size: 237.68MB
Database Data Size: 136.85MB
Database Index Size: 100.83MB
wp_woocommerce_sessions: Data: 2.02MB + Index: 0.05MB + Engine InnoDB
wp_woocommerce_api_keys: Data: 0.02MB + Index: 0.03MB + Engine InnoDB
wp_woocommerce_attribute_taxonomies: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_woocommerce_downloadable_product_permissions: Data: 0.02MB + Index: 0.06MB + Engine InnoDB
wp_woocommerce_order_items: Data: 4.52MB + Index: 1.52MB + Engine InnoDB
wp_woocommerce_order_itemmeta: Data: 28.55MB + Index: 21.06MB + Engine InnoDB
wp_woocommerce_tax_rates: Data: 0.02MB + Index: 0.06MB + Engine InnoDB
wp_woocommerce_tax_rate_locations: Data: 0.02MB + Index: 0.05MB + Engine InnoDB
wp_woocommerce_shipping_zones: Data: 0.02MB + Index: 0.00MB + Engine InnoDB
wp_woocommerce_shipping_zone_locations: Data: 0.02MB + Index: 0.05MB + Engine InnoDB
wp_woocommerce_shipping_zone_methods: Data: 0.02MB + Index: 0.00MB + Engine InnoDB
wp_woocommerce_payment_tokens: Data: 0.05MB + Index: 0.02MB + Engine InnoDB
wp_woocommerce_payment_tokenmeta: Data: 0.09MB + Index: 0.11MB + Engine InnoDB
wp_woocommerce_log: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_actionscheduler_actions: Data: 0.27MB + Index: 0.42MB + Engine InnoDB
wp_actionscheduler_claims: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_actionscheduler_groups: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_actionscheduler_logs: Data: 0.22MB + Index: 0.19MB + Engine InnoDB
wp_commentmeta: Data: 0.41MB + Index: 0.41MB + Engine InnoDB
wp_comments: Data: 10.52MB + Index: 9.09MB + Engine InnoDB
wp_links: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_options: Data: 3.45MB + Index: 0.56MB + Engine InnoDB
wp_postmeta: Data: 54.58MB + Index: 34.11MB + Engine InnoDB
wp_posts: Data: 6.52MB + Index: 2.59MB + Engine InnoDB
wp_relevanssi: Data: 9.48MB + Index: 16.81MB + Engine InnoDB
wp_relevanssi_log: Data: 0.02MB + Index: 0.03MB + Engine InnoDB
wp_relevanssi_stopwords: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_termmeta: Data: 0.14MB + Index: 0.16MB + Engine InnoDB
wp_terms: Data: 0.11MB + Index: 0.13MB + Engine InnoDB
wp_term_relationships: Data: 0.20MB + Index: 0.13MB + Engine InnoDB
wp_term_taxonomy: Data: 0.11MB + Index: 0.11MB + Engine InnoDB
wp_tinvwl_analytics: Data: 0.06MB + Index: 0.05MB + Engine InnoDB
wp_tinvwl_items: Data: 0.02MB + Index: 0.00MB + Engine InnoDB
wp_tinvwl_lists: Data: 0.06MB + Index: 0.00MB + Engine InnoDB
wp_usermeta: Data: 6.52MB + Index: 5.03MB + Engine InnoDB
wp_users: Data: 0.28MB + Index: 0.25MB + Engine InnoDB
wp_wcpdf_invoice_number: Data: 0.25MB + Index: 0.00MB + Engine InnoDB
wp_wc_admin_notes: Data: 0.02MB + Index: 0.00MB + Engine InnoDB
wp_wc_admin_note_actions: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_wc_category_lookup: Data: 0.05MB + Index: 0.00MB + Engine InnoDB
wp_wc_customer_lookup: Data: 1.52MB + Index: 0.33MB + Engine InnoDB
wp_wc_download_log: Data: 0.02MB + Index: 0.03MB + Engine InnoDB
wp_wc_order_coupon_lookup: Data: 0.06MB + Index: 0.06MB + Engine InnoDB
wp_wc_order_product_lookup: Data: 4.52MB + Index: 6.06MB + Engine InnoDB
wp_wc_order_stats: Data: 1.52MB + Index: 0.59MB + Engine InnoDB
wp_wc_order_tax_lookup: Data: 0.08MB + Index: 0.06MB + Engine InnoDB
wp_wc_product_meta_lookup: Data: 0.27MB + Index: 0.44MB + Engine InnoDB
wp_wc_reserved_stock: Data: 0.02MB + Index: 0.00MB + Engine InnoDB
wp_wc_tax_rate_classes: Data: 0.02MB + Index: 0.02MB + Engine InnoDB
wp_wc_webhooks: Data: 0.02MB + Index: 0.02MB + Engine InnoDB

Post Type Counts

attachment: 1492
boxzilla-box: 1
cfredux_contact_form: 2
mc4wp-form: 1
nav_menu_item: 143
page: 113
post: 8
product: 606
product_variation: 2110
revision: 215
shop_coupon: 23
shop_order: 8029
shop_order_refund: 190
soliloquy: 1

Security

Secure connection (HTTPS): ✔
Hide errors from visitors: ✔

Active Plugins (15)

Autoptimize: by Frank Goossens (futtta) – 2.7.7
Boxzilla: by ibericode – 3.2.23
Contact Form Redux: by linux4me – 1.1.1
MC4WP: Mailchimp for WordPress: by ibericode – 4.8.1
MainWP Child: by MainWP – 4.1.2
Relevanssi: by Mikko Saari – 4.8.3
Simple Lightbox: by Archetyped – 2.8.1
Responsive WordPress Slider - Soliloquy Lite: by Soliloquy Team – 2.6.1
TI WooCommerce Wishlist: by TemplateInvaders – 1.21.11
User Role Editor: by Vladimir Garagulya – 4.56.1
WooCommerce Elavon Converge Gateway: by SkyVerge – 2.7.1
WooCommerce PDF Invoices & Packing Slips: by Ewout Fernhout – 2.6.1
WooCommerce - ShipStation Integration: by WooCommerce – 4.1.39
WooCommerce: by Automattic – 4.5.2
WP Super Cache: by Automattic – 1.7.1

Inactive Plugins (0)

Dropin Plugins (1)

advanced-cache.php: advanced-cache.php

Settings

API Enabled: –
Force SSL: ✔
Currency: USD ($)
Currency Position: left
Thousand Separator: ,
Decimal Separator: .
Number of Decimals: 2
Taxonomies: Product Types: external (external)
grouped (grouped)
simple (simple)
variable (variable)

Taxonomies: Product Visibility: exclude-from-catalog (exclude-from-catalog)
exclude-from-search (exclude-from-search)
featured (featured)
outofstock (outofstock)
rated-1 (rated-1)
rated-2 (rated-2)
rated-3 (rated-3)
rated-4 (rated-4)
rated-5 (rated-5)

Connected to WooCommerce.com: ✔

WC Pages

Shop base: #2799 - /shop/
Cart: #14 - /cart/
Checkout: #15 - /checkout/
My account: #16 - /my-account/
Terms and conditions: ❌ Page not set

Theme

Name: Twenty Sixteen Child
Version: 1.0.0
Author URL: https://somedomain.com/
Child Theme: ✔
Parent Theme Name: Twenty Sixteen
Parent Theme Version: 2.2
Parent Theme Author URL: https://wordpress.org/
WooCommerce Support: ✔

Templates

Overrides: –

Elavon Converge Credit Card

Environment: Production
Tokenization Enabled: ✔
Debug Mode: Off

TI WooCommerce Wishlist Templates

Overrides: –

Action Scheduler

Complete: 935
Oldest: 2020-09-12 16:25:56 -0700
Newest: 2020-10-13 14:00:28 -0700

Pending: 1
Oldest: 2020-10-15 09:55:44 -0700
Newest: 2020-10-15 09:55:44 -0700

@knutsp

knutsp commented Oct 13, 2020

Hidden products are still public content. They are visible whenever they are linked to, whether by custom internal links or by external ones (e.g. search engines). The visibility property only controls which kinds of pages list the product. One might say that "Hidden" is a bit misleading, as "Unlisted" would be more accurate, IMHO. But that's another issue.

If a product is not meant to be public, it has to be unpublished (given a main status that is not visible to the visitor/user role), as with posts/pages.

Since anyone can view a "hidden" product on a standard WooCommerce setup, it may not be intuitive, but it is expected, that the product is listed in a sitemap. A published "hidden" page with no navigation (menu/widget) link is also listed there.

Removing it from a sitemap could actually contribute to the false impression that "hidden" products are not public. And I assume there are no plans to make "hidden" products truly invisible to visitors, as that would be unwanted and a huge BC break.

@linux4me
Author

I'm fairly confident that most users using the "hidden" setting to make a product not visible in the catalog or search will not want the product showing up on the sitemap or Google search results, either.

The whole point is to have products that can be viewed by the public if desired--by providing a customer a direct link--but that aren't otherwise obvious on the site. If they pop up on search engines, it defeats the whole point of hiding them from the catalog and search on the site.

@kraftbj
Contributor

kraftbj commented Oct 14, 2020

Just some notes for possible implementation. The Core logic for posts, including CPTs, is in https://github.com/WordPress/WordPress/blob/master/wp-includes/sitemaps/providers/class-wp-sitemaps-posts.php

You can filter individual posts via the wp_sitemaps_posts_entry filter.

One approach, though, could be to remove the product CPT via the wp_sitemaps_post_types filter and create your own provider (a class extending https://github.com/WordPress/WordPress/blob/master/wp-includes/sitemaps/class-wp-sitemaps-provider.php ). Then, register it with the wp_register_sitemap_provider function.
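
The first half of that approach (dropping the product CPT from the core posts provider) might be sketched as below. This is only an illustration, assuming the core sitemaps introduced in WordPress 5.5; the callback name myprefix_remove_product_sitemap is hypothetical. On its own it removes every product from the core sitemap, so it only makes sense together with a custom provider (not shown here) that lists the products you do want included.

```php
/**
 * Sketch only: drop the `product` post type from the core posts sitemap provider.
 * `myprefix_remove_product_sitemap` is a hypothetical name.
 *
 * @param WP_Post_Type[] $post_types Post type objects keyed by post type name.
 * @return WP_Post_Type[] The same array without the `product` entry.
 */
function myprefix_remove_product_sitemap( $post_types ) {
	// The array is keyed by post type name, so unsetting the key is enough.
	unset( $post_types['product'] );

	return $post_types;
}
add_filter( 'wp_sitemaps_post_types', 'myprefix_remove_product_sitemap' );
```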

@peterfabian
Contributor

Hi,

thanks for creating the bug report. I think we have this documented, although it's not very explicit:

Hidden – Only visible on the single product page – not on any other pages

I guess that's why the option is also called 'Catalog visibility' and not just visibility.
I tend to agree with you @linux4me that it might be confusing for store owners, as this does not explicitly state whether the product is included in the sitemap, but it says clearly that it's visible on the single product page.

That being said, I also think that making these products inaccessible from the single product page would break backward compatibility considerably. However, removing products just from the sitemap could be seen as similar to removing them from the public catalog, so I tend to agree that filtering them out might be the more logical approach.

Searching through past issues, I can't see anyone else raising this, so it seems to me like a low-priority bug/enhancement. If this is important to you, @linux4me, would you consider submitting a PR based on the pointers from @kraftbj?

(Thanks @kraftbj !)

@peterfabian added the "type: enhancement" and "priority: normal" labels on Oct 23, 2020
@linux4me
Author

linux4me commented Oct 23, 2020

Hi @peterfabian

I wonder if the reason you haven't had more people raising this is that they have no idea it's happening. I would never have known had Google Search Console not alerted me that I had a product without an image or description indexed. The product Google found was a hidden product we use programmatically but not directly on the site, so it doesn't need an image or description.

I may not have been clear. I think the products should show up on the single product page, just not in the sitemap.

I'm not sure what a "PR" is, but I've been using the following in my child theme's functions.php as a workaround to remove the hidden products from the sitemap:

add_filter( 'wp_sitemaps_posts_query_args', function( $args, $post_type ) {
	if ( 'product' !== $post_type ) {
		return $args;
	}
	$args['post__not_in'] = isset( $args['post__not_in'] ) ? $args['post__not_in'] : array();
	array_push( $args['post__not_in'], 12176, 9504, 9500, 9496, 9492, 9489, 9486, 9483, 9478, 9473, 9467, 9450, 9443, 9436 );
	return $args;
}, 10, 2 );

@ajgw

ajgw commented Nov 13, 2020

@peterfabian I can't agree with you, because only developers are able to add a filter to a theme's functions file. Most of my clients are not able to do so. I also often use product bundles, and bundled items are hidden from the catalog too if you do not want to show product pages for the individual bundled items. The same goes for grouped products, where the grouped items do not have to show up in the catalog. Such items often have no description because the parent products have one. In one case, a client of mine in the health business had trouble because we had set up a few products for internal use only, so they were hidden from the catalog. Those products are only used for manual orders.

For that last client I had to use the Yoast SEO plugin, where I could set the affected products to noindex for search engines. After that they were gone from the sitemap, which Yoast SEO had taken over.

Please give us some kind of noindex setting for products that should be hidden from the catalog, so we do not need an SEO plugin for this case.

For me this is urgent.
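
A noindex rule of the kind requested here could be sketched with the wp_robots filter. This is an illustration only, not existing WooCommerce behavior. It assumes WordPress 5.7 or later, where wp_robots was introduced (so it would not work on the 5.5.1 install in the report above), and the function name myprefix_noindex_hidden_products is hypothetical.

```php
/**
 * Sketch only: ask search engines not to index "hidden" products.
 * Assumes WordPress 5.7+ (wp_robots filter) and an active WooCommerce.
 * `myprefix_noindex_hidden_products` is a hypothetical name.
 *
 * @param array $robots Associative array of robots directives.
 * @return array The directives, with noindex added for hidden products.
 */
function myprefix_noindex_hidden_products( $robots ) {
	// Only act on single product pages, and only if WooCommerce is loaded.
	if ( ! function_exists( 'is_product' ) || ! is_product() ) {
		return $robots;
	}

	$product = wc_get_product( get_queried_object_id() );

	// get_catalog_visibility() returns 'visible', 'catalog', 'search' or 'hidden'.
	if ( $product && 'hidden' === $product->get_catalog_visibility() ) {
		$robots['noindex'] = true;
	}

	return $robots;
}
add_filter( 'wp_robots', 'myprefix_noindex_hidden_products' );
```

A noindex directive only works if crawlers can still fetch the page, and it does not remove the URL from the sitemap by itself, so it would normally be paired with one of the sitemap filters discussed in this thread.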

@pedantic1

pedantic1 commented Mar 2, 2022

My hidden products exist only as components of composite products. They are forbidden from being purchased individually. This is why they are hidden. In fact, I redirect any hit on them to the composite product they are used to assemble. I absolutely do not want them in my sitemap.

To use product_visibility vs. listing each hidden product by ID, I wrote the following:

function remove_hidden_products( $args, $post_type ) {
	// Leave the sitemap queries for other post types untouched.
	if ( 'product' !== $post_type ) {
		return $args;
	}

	// Keep any tax_query clauses that are already set.
	$args['tax_query'] = $args['tax_query'] ?? array();

	// Exclude products that carry the exclude-from-catalog visibility term.
	$args['tax_query'][] = array(
		'taxonomy' => 'product_visibility',
		'field'    => 'name',
		'terms'    => 'exclude-from-catalog',
		'operator' => 'NOT IN',
	);

	return $args;
}
add_filter( 'wp_sitemaps_posts_query_args', 'remove_hidden_products', 10, 2 );
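
One caveat with keying on exclude-from-catalog alone: WooCommerce also assigns that term to products whose catalog visibility is "Search results only", so those would be dropped from the sitemap as well. A variant that excludes only products carrying both visibility terms (the "Hidden" setting) might look like the sketch below. The function name myprefix_remove_fully_hidden_products is hypothetical, and it is meant as an alternative to the filter above, not an addition to it.

```php
/**
 * Sketch only: drop products from the core sitemap when they are hidden from
 * both the catalog and search. `myprefix_remove_fully_hidden_products` is a
 * hypothetical name.
 *
 * @param array  $args      WP_Query arguments used by the posts sitemap provider.
 * @param string $post_type Post type the sitemap is being built for.
 * @return array
 */
function myprefix_remove_fully_hidden_products( $args, $post_type ) {
	if ( 'product' !== $post_type ) {
		return $args;
	}

	$args['tax_query'] = $args['tax_query'] ?? array();

	// Keep a product if it lacks at least one of the two terms, i.e. drop it
	// only when it has both: NOT (A AND B) is (NOT A) OR (NOT B).
	$args['tax_query'][] = array(
		'relation' => 'OR',
		array(
			'taxonomy' => 'product_visibility',
			'field'    => 'slug',
			'terms'    => array( 'exclude-from-catalog' ),
			'operator' => 'NOT IN',
		),
		array(
			'taxonomy' => 'product_visibility',
			'field'    => 'slug',
			'terms'    => array( 'exclude-from-search' ),
			'operator' => 'NOT IN',
		),
	);

	return $args;
}
add_filter( 'wp_sitemaps_posts_query_args', 'myprefix_remove_fully_hidden_products', 10, 2 );
```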

Projects
None yet
Development

No branches or pull requests

8 participants