Think about rel=canonical linking #251

Closed
bkil opened this issue May 30, 2023 · 3 comments · Fixed by #266
Labels
A-archive-room-view The view to look at a room day by day in the archive help wanted PR's welcome to fix this issue. It probably has a potential solution documented in the issue. T-Enhancement New feature or request

Comments


bkil commented May 30, 2023

Spawning from #238 (comment),

It is not obvious how to apply it here, but basically: if multiple pages differ only by a query argument and contain the exact same set of messages with only tiny changes (such as in highlighting or preview metadata), they should all link back to a single canonical URL. A search engine crawler is then free to discard every alternative link that folds back to the same canonical one instead of indexing each of them separately.

Valid for link, it defines the preferred URL for the current document, which helps search engines reduce duplicate content.

See here for detailed explanation:

For example, StackExchange offers path-based routing for individual answers to a given question, but marks up each of these nearly identical documents so that it refers back to the question's path as its canonical link:

https://stackoverflow.com/a/482129/796832 -> https://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered/482129#482129

<link rel="canonical" href="https://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered" />
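
To make the idea concrete, here is a minimal sketch of how a page URL could be folded back to its canonical form by dropping presentation-only query arguments. Everything in it is an assumption for illustration: the helper names, the `archive.example` host, and the choice of `at` as the only ignored parameter are hypothetical and not taken from the matrix-public-archive codebase.

```typescript
// Hypothetical helpers, not part of matrix-public-archive.
// ASSUMPTION: these query parameters only change presentation
// (highlighting, preview metadata), never the set of messages shown.
const PRESENTATION_ONLY_PARAMS: string[] = ["at"];

// Strip presentation-only query parameters and any fragment from a page URL.
function canonicalUrlFor(
  pageUrl: string,
  ignoredParams: string[] = PRESENTATION_ONLY_PARAMS,
): string {
  const url = new URL(pageUrl);
  for (const name of ignoredParams) {
    url.searchParams.delete(name);
  }
  url.hash = "";
  return url.toString();
}

// Render the <link> element, escaping the href for use in an HTML attribute.
function canonicalLinkTag(pageUrl: string): string {
  const href = canonicalUrlFor(pageUrl)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return `<link rel="canonical" href="${href}" />`;
}

console.log(canonicalLinkTag("https://archive.example/room/date/2023/05/30?at=%24abc"));
// prints: <link rel="canonical" href="https://archive.example/room/date/2023/05/30" />
```

Every variant of the page that differs only in an ignored parameter then advertises the same `href`, which is what lets a crawler fold them together.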
@MadLittleMods MadLittleMods added the A-archive-room-view The view to look at a room day by day in the archive label May 30, 2023

MadLittleMods commented May 30, 2023

Ahhh, based on your explanation of <link rel="canonical" href="..."> here I think I misunderstood the purpose.

I was thinking that <link rel="canonical" href="..."> pointed to the main document where you would find the permalinked item and search engines would consider the current URL as a special cased individual view of the event.

If search engines typically just use this to deduplicate results and avoid wasteful crawling, that's not necessarily a bad thing. I'm sure a search engine would still highlight the relevant thing you're searching for in the search result and use the scroll-to-text-fragment syntax (#:~:text=foo) when you visit the page, but it seems like it may not use our ?at=$abc query parameter to link exactly to the relevant message.

It seems to work out for Reddit and StackExchange, which both do this 🤷. I think I'm in favor of adding this ⏩
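
For reference, a rough sketch of what a text-fragment link to the canonical page looks like, as opposed to a link carrying the `at` query parameter. The function name and the `archive.example` URL are hypothetical; the only real pieces are the `#:~:text=` prefix and the rule that `-`, `,` and `&` inside the quoted text must be percent-encoded because they are part of the fragment directive syntax.

```typescript
// Hypothetical helper for illustration only.
// Builds a scroll-to-text-fragment link onto an (already canonical) page URL.
function withTextFragment(canonicalUrl: string, text: string): string {
  // encodeURIComponent already escapes "," and "&"; "-" has to be escaped by hand
  // because it separates prefix/suffix terms in the text directive.
  const encoded = encodeURIComponent(text).replace(/-/g, "%2D");
  return `${canonicalUrl}#:~:text=${encoded}`;
}

console.log(withTextFragment("https://archive.example/room/date/2023/05/30", "hello world"));
// prints: https://archive.example/room/date/2023/05/30#:~:text=hello%20world
```

A link of this shape scrolls to matching text on the page rather than to a specific event, so it is only an approximation of what `?at=$abc` does.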

Relevant links:

@MadLittleMods MadLittleMods added T-Enhancement New feature or request help wanted PR's welcome to fix this issue. It probably has a potential solution documented in the issue. labels May 30, 2023
@MadLittleMods MadLittleMods changed the title Think about rel=canonical linking Think about rel=canonical linking May 30, 2023

bkil commented May 30, 2023

A bit better explanation:
https://en.wikipedia.org/wiki/Canonical_link_element

@jonaharagon

I don't know whether this should be a separate issue, but I would also like rel=canonical to be used to deduplicate matrix-public-archive instances, as discussed at matrix-org/matrix-spec-proposals#4021 (comment):

If we wanted something specific to the Matrix Public Archive URL format, we could use an event type scoped to the sub-domain like org.matrix.archive.canonical to convey this information.

👍 Maybe this is something that should be implemented specifically for this client in the way you stated, as opposed to in an MSC. The more I think about it, the more sense it makes.

(The use-case for this is the same self-hosted community situation we talked about at #234 (comment))
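
A very rough sketch of how that could feed into the canonical link, assuming the room carries an `org.matrix.archive.canonical` event whose content names the preferred instance. The content shape (`base_url`), the function name, and both hostnames are invented here for illustration; the thread only proposes the event type name.

```typescript
// ASSUMPTION: this content shape is hypothetical. Only the event type name
// `org.matrix.archive.canonical` appears in the discussion.
interface CanonicalArchiveContent {
  base_url: string; // hypothetical field: origin of the preferred archive instance
}

// Keep the path and query of the current page, but swap in the canonical origin.
function rebaseOntoCanonicalInstance(
  pageUrl: string,
  content: CanonicalArchiveContent,
): string {
  const page = new URL(pageUrl);
  const canonicalOrigin = new URL(content.base_url).origin;
  return new URL(page.pathname + page.search, canonicalOrigin).toString();
}

console.log(
  rebaseOntoCanonicalInstance("https://mirror.example/room/date/2023/05/30", {
    base_url: "https://archive.example",
  }),
);
// prints: https://archive.example/room/date/2023/05/30
```

The result would be used as the `href` of the page's `<link rel="canonical">`, so that mirrors point search engines at the instance the community prefers.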


3 participants