duplicate canonical URLs for paths with and without trailing slashes #9128

Closed
7 tasks done
ori-shalom opened this issue Jul 7, 2023 · 1 comment · Fixed by #9130
Labels
bug An error in the Docusaurus core causing instability or issues with its execution status: needs triage This issue has not been triaged by maintainers

Comments

@ori-shalom
Contributor

ori-shalom commented Jul 7, 2023

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

When the same page can be accessed from URLs both with and without a trailing slash, crawlers can index it twice without recognizing that it is the same page.
The way to improve SEO and mark these URLs as a single page is to use a <link rel="canonical" ...> element in the page head.

Docusaurus is adding this element but the href value it receives is being set dynamically based on the current location pathname.

This makes the canonical URL different between the pages that end with a trailing slash and those that don't.

I checked whether the trailingSlash option in docusaurus.config.js affects this behavior, but it doesn't seem to change how the canonical link is generated.

Looking at the code, I can see that the canonical URL simply takes the location pathname as is and passes it to useBaseUrl.

function useDefaultCanonicalUrl() {
  const {
    siteConfig: {url: siteUrl},
  } = useDocusaurusContext();
  const {pathname} = useLocation();
  return siteUrl + useBaseUrl(pathname);
}

A small experiment calling useBaseUrl with two paths, one with a trailing slash and one without, shows that it does nothing to make them canonical:
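
To make the effect concrete outside of React, here is a minimal framework-free sketch of the same concatenation. The function name naiveCanonicalUrl and the example.com site URL are hypothetical, and the sketch assumes the pathname is passed through unchanged, as described above.

```typescript
// Hypothetical stand-in for the behavior described in this issue:
// the site URL is concatenated with the requested pathname as-is.
function naiveCanonicalUrl(siteUrl: string, pathname: string): string {
  return siteUrl + pathname;
}

// The same page, requested without and with a trailing slash.
const withoutSlash = naiveCanonicalUrl("https://example.com", "/docs/intro");
const withSlash = naiveCanonicalUrl("https://example.com", "/docs/intro/");

// The two "canonical" URLs differ, so crawlers may treat them as two pages.
console.log(withoutSlash === withSlash); // false
```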

https://stackblitz.com/edit/github-gwvqkk?file=docs%2Findex.mdx,src%2Fcomponents%2FTestUseBaseURLBehavior%2Findex.tsx

Reproducible demo

https://stackblitz.com/edit/github-gwvqkk?file=docs%2Findex.mdx,src%2Fcomponents%2FTestUseBaseURLBehavior%2Findex.tsx

Steps to reproduce

  1. Navigate to any page of a Docusaurus site without a trailing slash in the URL
  2. Use DevTools to inspect the received HTML and look for the <link rel="canonical" ...> element and see it has the current URL without a trailing slash
  3. Navigate to the same page with a trailing slash
  4. Find the <link rel="canonical" ...> again and see it now has a URL with a trailing slash.

Expected behavior

The <link rel="canonical" ...> element should produce the same URL for the same page no matter if accessed with or without a trailing slash.

Preferably, this should also respect the trailingSlash option from docusaurus.config.js.
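
One possible shape for the expected behavior, as a rough sketch only and not the actual Docusaurus fix: normalize the pathname according to the trailingSlash setting before building the canonical URL. The helpers normalizePathname and canonicalUrl are hypothetical names, example.com is a placeholder, and treating an undefined trailingSlash as "no trailing slash" is an assumption made here just so that both variants map to one URL.

```typescript
// Hypothetical helper: force a pathname into one trailing-slash form.
// Assumption: an undefined setting is treated like `false` (slash removed).
function normalizePathname(pathname: string, trailingSlash?: boolean): string {
  // Remove every trailing slash first so both variants start from the same form.
  const stripped = pathname.replace(/\/+$/, "");
  // The site root always stays "/".
  if (stripped === "") {
    return "/";
  }
  return trailingSlash === true ? stripped + "/" : stripped;
}

// Hypothetical helper: build the canonical URL from the normalized pathname.
function canonicalUrl(
  siteUrl: string,
  pathname: string,
  trailingSlash?: boolean,
): string {
  // Avoid a double slash if siteUrl itself ends with "/".
  return siteUrl.replace(/\/+$/, "") + normalizePathname(pathname, trailingSlash);
}

// Both variants of the same page now produce one canonical URL.
const a = canonicalUrl("https://example.com", "/docs/intro", true);
const b = canonicalUrl("https://example.com", "/docs/intro/", true);
console.log(a === b); // true
```

This sketch ignores cases such as pathnames ending in .html, which a real implementation would likely need to handle.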

Actual behavior

The <link rel="canonical" ...> element produces different URLs.

Your environment

  • Public source code: N/A
  • Public site URL: https://docs.piiano.com/
  • Docusaurus version used: 2.4.1
  • Environment name and version: Chrome Version 114.0.5735.198 (Official Build) (arm64), Node.js v20.1.0
  • Operating system and version: macOS 13.4.1 Ventura, Apple M1

Self-service

  • I'd be willing to fix this bug myself.
@ori-shalom ori-shalom added bug An error in the Docusaurus core causing instability or issues with its execution status: needs triage This issue has not been triaged by maintainers labels Jul 7, 2023
@ori-shalom ori-shalom changed the title duplicate canonical URLs with and without trailing slashes duplicate canonical URLs for paths with and without trailing slashes Jul 7, 2023
@slorber
Collaborator

slorber commented Jul 7, 2023

Will look into it when I'm back from holiday, but yes, the 2 pages are expected to have the same canonical URL.

2 participants