Empty GET request causes errors and creates duplicates with different canonical URLs #8043

Closed
optimlab opened this issue Jun 25, 2020 · 2 comments


optimlab commented Jun 25, 2020

OpenCart 3.x.x.x

Empty GET request causes errors and creates duplicates with different canonical URLs.
If the URL has an empty parameter such as:

  1. page=
  2. limit=

it produces duplicates with different canonical URLs:

<link href="https://demo.opencart.com/index.php?route=product/category&amp;path=20" rel="canonical" />
<link href="https://demo.opencart.com/index.php?route=product/category&amp;path=20&amp;page=" rel="canonical" /> (this URL gets indexed!)
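The duplicate comes from the controller treating a present-but-empty parameter the same as a real value. A self-contained illustration in plain PHP (an assumed simplification of the controller logic, not verbatim OpenCart core code):

<?php
// Illustration only (assumed simplification): a present-but-empty "page"
// parameter still passes the isset() check, so it gets appended to the
// canonical URL.
parse_str('route=product/category&path=20&page=', $get);

$canonical = 'https://demo.opencart.com/index.php?route=product/category&path=' . $get['path'];

if (isset($get['page'])) {              // true even though the value is ''
    $canonical .= '&page=' . $get['page'];
}

echo $canonical . PHP_EOL;
// https://demo.opencart.com/index.php?route=product/category&path=20&page=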

Fix instructions

  1. Disable Display Errors.
  2. Add to robots.txt:
Disallow: /*?page=$
Disallow: /*&page=$
Disallow: /*?sort=
Disallow: /*&sort=
Disallow: /*?order=
Disallow: /*&order=
Disallow: /*?limit=
Disallow: /*&limit=
Disallow: /*?filter_name=
Disallow: /*&filter_name=
Disallow: /*?filter_sub_category=
Disallow: /*&filter_sub_category=
Disallow: /*?filter_description=
Disallow: /*&filter_description=

Fix for future versions of OpenCart
Search:
if (isset($this->request->get['page'])) {
Replace:
if (!empty($this->request->get['page'])) {

etc.
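A quick demonstration of why the replacement matters (plain PHP, no OpenCart needed; an empty page= is parsed as an empty string):

<?php
// With "?page=" the parameter exists but is an empty string.
parse_str('path=20&page=', $get);

var_dump(isset($get['page']));    // bool(true)  - old check: empty page still treated as a page
var_dump(!empty($get['page']));   // bool(false) - new check: empty page falls back to the default

// With a real value both checks agree.
parse_str('path=20&page=2', $get);

var_dump(isset($get['page']));    // bool(true)
var_dump(!empty($get['page']));   // bool(true)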

condor2 added a commit to condor2/Opencart_30xx that referenced this issue Jul 19, 2020
optimlab (Author) commented

If you're going to do this, then do it all the way. A complete robots.txt:

User-agent: *
Disallow: /*route=account/
Disallow: /*route=affiliate/
Disallow: /*route=checkout/
Disallow: /*route=product/search
Disallow: /index.php?route=product/product*&manufacturer_id=
Disallow: /admin
Disallow: /catalog
Disallow: /system
Disallow: /*?page=$
Disallow: /*&page=$
Disallow: /*?sort=
Disallow: /*&sort=
Disallow: /*?order=
Disallow: /*&order=
Disallow: /*?limit=
Disallow: /*&limit=
Disallow: /*?filter_name=
Disallow: /*&filter_name=
Disallow: /*?filter_sub_category=
Disallow: /*&filter_sub_category=
Disallow: /*?filter_description=
Disallow: /*&filter_description=
Disallow: /*?tracking=
Disallow: /*&tracking=
Disallow: /*compare-products
Disallow: /*search
Disallow: /*cart
Disallow: /*checkout
Disallow: /*login
Disallow: /*logout
Disallow: /*vouchers
Disallow: /*wishlist
Disallow: /*my-account
Disallow: /*order-history
Disallow: /*newsletter
Disallow: /*return-add
Disallow: /*forgot-password
Disallow: /*downloads
Disallow: /*returns
Disallow: /*transactions
Disallow: /*create-account
Disallow: /*recurring
Disallow: /*address-book
Disallow: /*reward-points
Disallow: /*affiliate-forgot-password
Disallow: /*create-affiliate-account
Disallow: /*affiliate-login
Disallow: /*affiliates
Allow: /catalog/view/javascript/
Allow: /catalog/view/theme/*/

ADDCreative (Contributor) commented Jul 22, 2020

A bit of a warning about using only a robots.txt to fix this issue: remember that robots.txt only controls crawling, not indexing.

From https://support.google.com/webmasters/answer/6062608?hl=en.

You should not use robots.txt as a means to hide your web pages from Google Search results. This is because, if other pages point to your page with descriptive text, your page could still be indexed without visiting the page. If you want to block your page from search results, use another method such as password protection or a noindex directive.

If pages with empty GET parameters are getting crawled, that could be caused by bad links on your site, so adding robots.txt rules does not solve the root cause of the problem. Or it could just be an undesirable bot, which will probably ignore robots.txt anyway.
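If the goal is to keep these parameterised variants out of the index (not just uncrawled), a noindex signal is the more reliable option. A minimal sketch, assuming an OpenCart 3.x catalog controller where $this->request and $this->response are available; the parameter list is only an example:

<?php
// Hedged sketch (not core OpenCart behaviour): send a noindex header for
// sorted/filtered/limited variants so crawlers drop them from the index
// while still following links on the page.
$noindex_params = array('sort', 'order', 'limit', 'filter_name', 'filter_sub_category', 'filter_description');

foreach ($noindex_params as $param) {
    if (isset($this->request->get[$param])) {
        $this->response->addHeader('X-Robots-Tag: noindex, follow');
        break;
    }
}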
