SEO improvements #1080

Closed
ghost opened this issue Sep 17, 2018 · 19 comments

Comments

@ghost

ghost commented Sep 17, 2018

SEO - Search Engine Optimization

The thing is that almost none of the instances/video titles/content is discoverable by SEs (Search Engines).
For example, there is an instance called peertube.social, and let's say we need to find a channel called "chriswere" so if we search for:
"chriswere peertube social" - we get no results.

Let's be more exact and add 2 more keywords from this video of chriswere:
"chriswere peertube social Chromium BSU" - zero results.

To fix this (at least for now), this article should do: https://backlinko.com/on-page-seo (this is a general purpose article, not everything applies to peertube, but most of it I'd say)

@rigelk
Collaborator

rigelk commented Sep 17, 2018

We're essentially missing custom video urls, if I get the gist of this page.

@ghost
Author

ghost commented Sep 17, 2018

@rigelk pressing Ctrl+U on a video page reveals that there are no heading tags such as H1, H2 or H3, so there could be more missing steps. I found another one: images have no alt="some text" attribute. If you want to generate them automatically (instead of asking users to do that when they upload a video), the alt text for a video thumbnail could IMO be the video title plus the keywords specified by the user, or only the title, or only the keywords.
An example of alt text, based on this video:
"What is peertube | Peertube, presentation"
"What is peertube"
"Peertube, presentation"
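
The three variants above can be produced by one small helper. This is only a sketch in Python: the function name `build_alt_text` and the `" | "` separator are assumptions made for illustration, not how PeerTube builds its alt attributes.

```python
from __future__ import annotations


def build_alt_text(title: str, tags: list[str] | None = None) -> str:
    """Combine a video title and its tags into a single alt string."""
    title = title.strip()
    # Drop empty or whitespace-only tags.
    cleaned = [tag.strip() for tag in (tags or []) if tag.strip()]
    if title and cleaned:
        return f"{title} | {', '.join(cleaned)}"
    # Fall back to whichever part is available.
    return title or ", ".join(cleaned)


print(build_alt_text("What is peertube", ["Peertube", "presentation"]))
print(build_alt_text("What is peertube"))
print(build_alt_text("", ["Peertube", "presentation"]))
```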

Also this file is weird to me https://peertube.social/robots.txt
I don't know how crawlers interpret this:

User-agent: *
Disallow: ' '

After Disallow there is usually a path or nothing at all: https://moz.com/learn/seo/robotstxt

So perhaps bots aren't even allowed to enter and crawl these websites?
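
One way to get a feel for this is to feed the file to a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser`; it shows how that one parser reads the rule, and real crawlers may behave differently. The video path is a made-up example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if this parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical video URL, used only for the comparison.
VIDEO_URL = "https://peertube.social/videos/watch/example"

quoted = "User-agent: *\nDisallow: ' '\n"   # the file as found on the instance
empty = "User-agent: *\nDisallow:\n"        # the conventional "allow everything" form
block_all = "User-agent: *\nDisallow: /\n"  # what a real crawl ban looks like

for name, rules in [("quoted", quoted), ("empty", empty), ("block_all", block_all)]:
    print(name, can_crawl(rules, VIDEO_URL))
```

With this parser the quoted value is treated as a literal path prefix that no real URL starts with, so the video page stays crawlable, just as it does with an empty Disallow. Only `Disallow: /` blocks it.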

@rigelk
Collaborator

rigelk commented Sep 18, 2018

@Zig-03 I fixed the robots.txt - not that it should have any impact though.

@rigelk
Collaborator

rigelk commented Sep 22, 2018

I have changed the HTML tag of the video title to <h1> on the video view, and video thumbnails now include the title in their alt attribute, as of f845c68.

@rigelk rigelk closed this as completed Sep 22, 2018
@Chocobozzz
Owner

I'm reopening, because we still have SEO issues. For example, we need to index only local videos/accounts/video channels.

Moreover, Google does not index our videos because "Google chose different canonical than user".

@ghost
Author

ghost commented Nov 13, 2018

Non-video pages are missing OpenGraph/Twitter tags and a canonical URL.
Also, URLs that are not found do not return a 404 status code.
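
A quick way to check a page for the missing tags is to parse its HTML and look for them. The following is a rough sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and the sample page is invented. It does not cover the 404 status code, which needs a real HTTP request.

```python
from __future__ import annotations

from html.parser import HTMLParser


class HeadTagAudit(HTMLParser):
    """Collect the canonical link and the meta property/name keys of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: str | None = None
        self.meta_keys: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")
        elif tag == "meta":
            key = attributes.get("property") or attributes.get("name")
            if key:
                self.meta_keys.add(key.lower())


def missing_seo_tags(html: str) -> list[str]:
    """Return which of canonical / opengraph / twitter are absent from `html`."""
    audit = HeadTagAudit()
    audit.feed(html)
    missing = []
    if audit.canonical is None:
        missing.append("canonical")
    if not any(key.startswith("og:") for key in audit.meta_keys):
        missing.append("opengraph")
    if not any(key.startswith("twitter:") for key in audit.meta_keys):
        missing.append("twitter")
    return missing


# Invented sample page that only carries an OpenGraph title.
page = '<html><head><title>About</title><meta property="og:title" content="About"></head></html>'
print(missing_seo_tags(page))
```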

@ghost
Author

ghost commented Nov 13, 2018

a nice thing to add would be another opengraph tag

video:tag - string array - Tag words associated with this movie.
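
In Open Graph, an array value is written by repeating the same meta tag once per item. A minimal Python sketch of that rendering, assuming the tags arrive as a plain list of strings (the helper name `video_tag_meta` is hypothetical):

```python
from __future__ import annotations

from html import escape


def video_tag_meta(tags: list[str]) -> str:
    """Render one <meta property="video:tag"> element per non-empty tag."""
    lines = [
        f'<meta property="video:tag" content="{escape(tag.strip(), quote=True)}" />'
        for tag in tags
        if tag.strip()
    ]
    return "\n".join(lines)


print(video_tag_meta(["Peertube", "presentation"]))
```

Escaping the tag text matters because it is user input that ends up inside an HTML attribute.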

@ghost
Author

ghost commented Dec 2, 2018

We need H tags as well. <H1> for each video title and <H2> for each video description.

BTW right now comments aren't displayed when viewing the page source. Is this intended?

@fflorent
Contributor

fflorent commented Jan 27, 2019

I can confirm that's an issue for video makers who want to be present in Peertube (mostly because Youtube strikes them). SEO will lower the barrier for those who don't see the advantages of Peertube (yet).

@Chocobozzz
Owner

PeerTube should be properly indexed now. Closing this issue.

If you notice bugs, or ways to improve peertube SEO please create another issue.

@ghost

ghost commented May 20, 2020

SEO is still very bad. I was thinking that we might need to use rel=canonical.

The idea is simple: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at it. This solves the duplicate content problem where search engines don’t know which version of the content to show in their results.

https://yoast.com/rel-canonical/

Cross-domain canonical URLs
Perhaps you have the same piece of content on several domains. There are sites or blogs that republish articles from other websites on their own, as they feel the content is relevant for their users. In the past, we had websites republishing articles from Yoast.com as well (with express permission), but if you had looked at the HTML of every one of those articles you’d found a rel=canonical link pointing right back to our original article. This means all the links pointing to their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, and we get a clear benefit from it too. Everybody wins.

In our case, each video is indexed by the many instances that are available and I believe this confuses search engines. Search engines don't know who should have the most "priority".

So how should we do this? If instance A has a certain video, then all the other instances that have that video on their instance should add a rel=canonical which will point to the video hosted on instance A.

We could do this for other things which might be perceived by search engines as duplicates, such as user accounts, channel description, etc.
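
As a rough sketch of the idea in Python: every instance that shows a video would render the same tag, built from the video's origin instance rather than from the host serving the page. The `Video` class, its field names, the URL path and the host names are assumptions for illustration only.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Video:
    uuid: str
    origin_host: str  # instance that originally published the video


def canonical_link(video: Video) -> str:
    """Build the rel=canonical tag; it depends only on the origin instance."""
    # The /videos/watch/ path is assumed here for the example.
    href = f"https://{video.origin_host}/videos/watch/{video.uuid}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'


# Hypothetical video published on instance A and mirrored elsewhere.
video = Video(uuid="00000000-0000-0000-0000-000000000000", origin_host="instance-a.example")
print(canonical_link(video))
```

Because the serving host never enters the function, instance B and instance C would emit exactly the same tag as instance A.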

I'm not a master in SEO, but this seems right for our use case.

@Chocobozzz
Owner

@k09i71 It's already the case:

curl https://peertube.cpy.re/videos/watch/04af977f-4201-4697-be67-a8d8cae6fa7a

...
<link rel="canonical" href="https://peertube2.cpy.re/videos/watch/04af977f-4201-4697-be67-a8d8cae6fa7a" />
...

@ghost

ghost commented May 20, 2020

@Chocobozzz sorry my bad.

After taking another look at it, I can see that you don't have any h1 and h2 (which are critical) in the HTML version of the page. Someone already pointed this out above, and this is how Vimeo does it as well:

  • H1 for the video title
  • H2 for description
  • H3 for channel name and account name
  • H4 or <p> for the section which includes this info:

Privacy Public
Origin instance video.latavernedejohnjohn.fr
Originally published 25 February 2018
etc..

  • and <p> for everything else, including comments (if you wanna index them)

Check this video's page source out https://vimeo.com/88385603 and you'll see that their html source is much richer than our pages.
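
To compare pages on this point, the heading outline of the raw HTML (what a crawler sees without running JavaScript) can be listed with a few lines of Python. This is a sketch using the standard `html.parser`; both sample pages and their texts are invented.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4")


class HeadingOutline(HTMLParser):
    """Record (tag, text) for every h1-h4 found in the markup."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.outline.append((tag, "".join(self._buffer).strip()))
            self._current = None


def heading_outline(html):
    parser = HeadingOutline()
    parser.feed(html)
    return parser.outline


# Invented examples: a page with server-rendered headings and an empty SPA shell.
static_page = "<body><h1>What is PeerTube</h1><h2>A short presentation</h2><h3>Example channel</h3></body>"
spa_shell = "<body><my-app></my-app><script src='main.js'></script></body>"

print(heading_outline(static_page))
print(heading_outline(spa_shell))
```

The second call returns an empty list, which is what a tool that does not execute JavaScript would report for a single-page application.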

@ghost

ghost commented May 20, 2020

Google gave this page https://peertube.pcservice46.fr/videos/watch/809a0fa0-4960-412c-8421-db64cae00d0e (peertube 2.2) 15/100 points for mobile and 70/100 for desktop.

/!\ Be aware that Google is using this tool to promote their own harmful solutions (e.g. they might recommend using Google AMP to speed up mobile pages, but Google AMP is harming the free internet, and it should be avoided!). So I wouldn't recommend blindly implementing everything they say. Use their suggestions only to understand what the weak spots are, and then find better ways to mitigate them.

To see what SEO issues Google has found, enter a URL (this one for example) here.

@Chocobozzz
Owner

Chocobozzz commented May 20, 2020

We have a SPA so yes, you won't see HTML using curl. Google bot should interpret JS and correctly render the page. We already use h1 for video title and inject the video title and description in the meta tags of the HTML page.

I'm not sure I understand what you mean by "SEO is still very bad".

@ghost

ghost commented May 20, 2020

I'm not sure I understand what you mean by "SEO is still very bad".

If such a specific search can't find this video, then we have huge SEO problems. Try searching for the same phrase using google, you won't find that video either.

We have a SPA so yes, you won't see HTML using curl. Google bot should interpret JS and correctly render the page. We already use h1 for video title and inject the video title and description in the meta tags of the HTML page.

This is not ideal, because not all search engines are as advanced as Google and Bing. Also, most major websites and web journals I've analyzed have all these tags in their HTML.

Even https://rankmath.com/tools/seo-analyzer/ can't find the H1 and H2 tags.

@Chocobozzz also, please see my comment from above.

@creasysee

creasysee commented Aug 19, 2022

curl https://peertube.cpy.re/videos/watch/04af977f-4201-4697-be67-a8d8cae6fa7a

Hi, all!
I found that Peertube instance has two (more?) pages for each video:

So why does the rel=canonical tag point to the first link?

I understand that everything is correct now: the duplicates have an identical canonical tag, which is fine. The only question is why the first link and not the second.

Search Engines will give the first link in search results and links in the search results will not match the links on the pages of the PeerTube instance. Users usually copy a video URL from a browser address bar (for social networks), but this isn't a canonical URL, the share button also shows the second link.

@creasysee

creasysee commented Aug 19, 2022

I found that Peertube instance has two (more?) pages for each video:

I found the third link:
https://peertube.cpy.re/videos/embed/04af977f-4201-4697-be67-a8d8cae6fa7a
This page doesn't have a rel=canonical link (and it is also indexed by SEs).

The fourth link:
https://peertube.cpy.re/w/04af977f-4201-4697-be67-a8d8cae6fa7a
The canonical rel link is ok.

Note: I saw this: #3406

@Chocobozzz
Owner

Hello,

For historical reasons we use the first URL (videos/watch) as the canonical URL/ActivityPub ID, but in the web interface we prefer shorter URLs. That is why the SPA changes the first URL to the second one.

I'll add the canonical tag in embed too.
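
The mapping described here, where several URL forms of one video share the videos/watch form as canonical, can be sketched in Python as below. This is an illustration only: it handles just the three UUID-based paths quoted in this thread, keeps the same host (so it assumes a local video, not a federated one), and the function name is made up.

```python
from __future__ import annotations

import re
from urllib.parse import urlsplit

# Paths seen in this thread: /videos/watch/<uuid>, /videos/embed/<uuid>, /w/<uuid>
_VIDEO_PATH = re.compile(r"^/(?:videos/watch|videos/embed|w)/(?P<id>[0-9a-fA-F-]{36})/?$")


def canonical_video_url(url: str) -> str | None:
    """Map a known video URL form to its videos/watch form, else None."""
    parts = urlsplit(url)
    match = _VIDEO_PATH.match(parts.path)
    if match is None:
        return None
    return f"{parts.scheme}://{parts.netloc}/videos/watch/{match.group('id')}"


for url in (
    "https://peertube.cpy.re/videos/watch/04af977f-4201-4697-be67-a8d8cae6fa7a",
    "https://peertube.cpy.re/videos/embed/04af977f-4201-4697-be67-a8d8cae6fa7a",
    "https://peertube.cpy.re/w/04af977f-4201-4697-be67-a8d8cae6fa7a",
):
    print(canonical_video_url(url))
```

All three inputs print the same videos/watch URL, which is the property a crawler relies on to merge the duplicates.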
