Undefined title #134

Closed
chrisspen opened this issue Nov 5, 2022 · 6 comments
Labels
improvement (Not a bug), Pagefind CLI (The CLI responsible for indexing content)

Comments

@chrisspen

Let me just say great job on Pagefind's performance.

Although it was difficult for me to index all my pages (took about an hour), the resulting index is very fast to search and looks great.

However, all my search results are showing up with an "undefined" title. How do I correct this?

My HTML has a valid <title> tag on every page. Is this not automatically pulled as the title?

@bglw
Contributor

bglw commented Nov 5, 2022

Hello! 👋

First a quick Q: how many pages are you indexing that takes an hour? (Also, what system are you running on? [1])

As for the titles, Pagefind pulls the h1 element from the page for the automatic title, not the title element [2] (docs). The best fix for Pagefind (and general web accessibility) would be to make sure your pages have valid h1 heading tags.

Pagefind UI uses the value of the title metadata for the results, so if having h1 elements is not possible for some reason, you can tag any element as the title to use that instead. For example, to use the title element:

<title data-pagefind-meta="title">My Title</title>

Footnotes

  1. The reason I ask is that right now, if you're running on an M1 MacBook, the npx pagefind release will be running through Rosetta and might be artificially slow.

  2. The reason for this default is that the title element on a website usually includes redundant information (like the name of the site). For example, the h1 of this page is "Undefined title #134", and the title is "Undefined title · Issue #134 · CloudCannon/pagefind".

@chrisspen
Author

> First a quick Q: how many pages are you indexing that takes an hour? (Also, what system are you running on?)

I have about 100k, and they're compressed. I mirrored my files using named pipes that feed each file into zcat when read, so that pagefind can read them.

I'm running on Ubuntu 20.

> As for the titles, Pagefind pulls the h1 element from the page for the automatic title, not the title element

Yeah, that's what I figured. I understand your rationale, although it would be nice to be able to configure that in Pagefind. In my case, it would be a huge hassle to regenerate every single HTML document (each is large and takes about 5 minutes to render). I could write a script to parse and edit them in place, but that's also cumbersome. Being able to modify Pagefind's default title lookup would save me hours, or possibly days, of coding.

Having redundant stuff in the title isn't a big deal in my case. I don't have much redundant text, and what little I do have can easily be cleaned up on demand using the result processor callback.
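For readers weighing the in-place edit mentioned above, here is a minimal sketch of what such a script might look like. It assumes the pages are plain, uncompressed .html files under one directory (unlike the gzip-compressed setup described in this comment) and that the first <title> tag in each file is the page title. The names tag_title and tag_tree are hypothetical; only the data-pagefind-meta="title" attribute comes from the thread.

```python
import re
from pathlib import Path

# Matches an opening <title> tag and captures any attributes it already has.
TITLE_OPEN = re.compile(r"<title\b([^>]*)>", re.IGNORECASE)


def tag_title(html: str) -> str:
    """Add data-pagefind-meta="title" to the first <title> tag if it is missing."""

    def add_attr(match: re.Match) -> str:
        attrs = match.group(1)
        if "data-pagefind-meta" in attrs:
            # Already tagged: leave the tag untouched so the script is idempotent.
            return match.group(0)
        return f'<title{attrs} data-pagefind-meta="title">'

    # count=1: only the first <title> is rewritten (assumed to be the page title).
    return TITLE_OPEN.sub(add_attr, html, count=1)


def tag_tree(root: str) -> int:
    """Rewrite every .html file under root in place; return how many files changed."""
    changed = 0
    for path in Path(root).rglob("*.html"):
        original = path.read_text(encoding="utf-8")
        updated = tag_title(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Calling tag_tree on a copy of the site first would be a sensible precaution, since the rewrite is destructive. A regex is used here only because the change is a single-attribute insertion; anything more involved would be better served by a real HTML parser.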

@bglw
Contributor

bglw commented Nov 6, 2022

Sounds good. I'll whip up exposing that as an option in the same PR as the gzip handling (#135), which together should provide some great QOL improvements for your setup.

@bglw
Contributor

bglw commented Nov 6, 2022

Actually, in the interest of keeping the available options lean I'll get Pagefind to automatically fall back to <title> if no <h1> is present.
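To make the proposed behaviour concrete, the sketch below shows one way an "h1 first, then <title>" lookup could work. It is an illustration in Python using the standard library's html.parser, not Pagefind's actual implementation; the names _TitleFinder and pick_title are hypothetical, and treating an empty <h1> as "not present" is an assumption rather than something stated in this thread.

```python
from html.parser import HTMLParser


class _TitleFinder(HTMLParser):
    """Collects the text of the first <h1> and the first <title> in a document."""

    def __init__(self):
        super().__init__()
        self.found = {}        # maps "h1" / "title" to their collected text
        self._current = None   # tag currently being captured, if any
        self._buffer = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        # Start capturing only for the first occurrence of each tag of interest.
        if tag in ("h1", "title") and tag not in self.found and self._current is None:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace so nested markup does not leave gaps.
            self.found[tag] = " ".join("".join(self._buffer).split())
            self._current = None


def pick_title(html):
    """Return the <h1> text if there is any, else the <title> text, else None."""
    finder = _TitleFinder()
    finder.feed(html)
    finder.close()
    # Assumption: an empty <h1> counts as missing, so the <title> is used instead.
    return finder.found.get("h1") or finder.found.get("title")
```

With this ordering, a page that has both elements keeps the shorter h1 text, while a page with only a <title> no longer ends up with an undefined title.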

@chrisspen
Author

Thanks. That gzip support looks like it would help tremendously.

@bglw
Contributor

bglw commented Nov 6, 2022

Hi @chrisspen 👋

A fallback to the title element has been implemented in Pagefind v0.9.2 — let me know how that goes!

@bglw closed this as completed Nov 6, 2022
@bglw added the "improvement" and "Pagefind CLI" labels Nov 6, 2022