HTML entities in search results #24

Closed
fanf2 opened this issue Jul 22, 2022 · 14 comments

@fanf2

fanf2 commented Jul 22, 2022

I have a prototype search feature for my website using pagefind, https://dotat.at/search.html. It's nice and whizzy, and it fits in well with my Rust static site generator. Thanks for making pagefind!

The only significant problem is that HTML entities in page titles are escaped, so my results page displays them like

 2022-04-20 – really divisionless random numbers 

Entities in page bodies are not escaped, so if you search (for example) for nbsp, you get a lot of highlighted spaces in the results. This is probably a bug but it isn't a showstopper for me.

@fanf2
Author

fanf2 commented Jul 22, 2022

I guess the fix is to change {data.meta?.title} to {@html data.meta?.title} in pagefind_ui/svelte/result.svelte
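
To illustrate the difference between the two forms: in Svelte, a plain `{expression}` is inserted as text, so an entity that is still encoded in the string is shown literally, while `{@html expression}` inserts the string as raw HTML and lets the browser decode it. Below is a rough sketch of that effect in plain JavaScript. `escapeHtml` and the sample title are hypothetical stand-ins, not Pagefind's or Svelte's actual code.

```javascript
// Hypothetical helper: mimics the visible effect of inserting a string as
// text, where markup characters are displayed literally, not interpreted.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Assumed example: a title stored with its entity still encoded.
const storedTitle = "Tom &amp; Jerry";

// Roughly what {data.meta?.title} produces: the reader sees "Tom &amp; Jerry".
const asText = escapeHtml(storedTitle);

// Roughly what {@html data.meta?.title} produces: the reader sees "Tom & Jerry".
const asHtml = storedTitle;

console.log(asText);
console.log(asHtml);
```

The trade-off is that raw HTML insertion also interprets any tags in the string, so it is only appropriate when the stored title is trusted markup.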

@fanf2
Author

fanf2 commented Jul 23, 2022

For now I have fixed this for myself by monkeypatching the JS bundle, so you will not see the issue on my site as I described above. (I was keen to get it working!)

@bglw
Contributor

bglw commented Jul 25, 2022

Ah, yep good spot. I'll get that released for you this week.

bglw added this to the v0.5.0 milestone Jul 25, 2022

@wgroeneveld

wgroeneveld commented Jul 25, 2022

I have a similar issue but I'm not sure if it's 100% the same.
I have a blog post that starts with "I'm joining" and gets displayed via Pagefind as "I'm Joining". I'm having trouble debugging the issue, but could this also be resolved with the {@html ...} fix mentioned by @fanf2? It's also a title problem.
Thanks!
Heard about the tool in HugoConf and keen to replace Lunr.js as the index file started to clog up! Cheers!

@bglw
Contributor

bglw commented Jul 25, 2022

Hi @wgroeneveld — yes that looks like the same issue, but I'll make sure to test it directly before the next release.

@bglw
Contributor

bglw commented Jul 26, 2022

Hi @fanf2 and @wgroeneveld 👋

Fixes for HTML entities have been released in Pagefind v0.5.0 🎉

bglw closed this as completed Jul 26, 2022

@wgroeneveld

> Hi @fanf2 and @wgroeneveld 👋
>
> Fixes for HTML entities have been released in Pagefind v0.5.0 🎉

Awesome thanks!

@mrjbq7

mrjbq7 commented Sep 19, 2023

I still have this happen in Pagefind 1.0.3.

@mrjbq7

mrjbq7 commented Sep 19, 2023

For example, using this markup:

<h2 data-pagefind-meta="title"><span class="title">Faster &#34;shuffle&#34;</span></h2>

@bglw
Contributor

bglw commented Sep 19, 2023

Ah, my mistake, I did regress this in 1.0. 

We need to avoid rendering some things as HTML (for example, on MDN, where pages have titles like "the <aside> element"), but in resolving that I have reintroduced the title display error from this issue.

I'll get this re-fixed up for the next release — sorry about that.

(The indexing bug has not been reintroduced, just the display bug.)

@bglw
Contributor

bglw commented Sep 19, 2023

In the meantime, PagefindUI's processResult hook could be used to normalize the titles before display.

@mrjbq7

mrjbq7 commented Sep 19, 2023

This seems to work, though there might be a more elegant way to unescape:

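      new PagefindUI({
          element: "#search",
          showSubResults: true,
          resetStyles: false,
          // Workaround: decode HTML entities in the title before it is displayed.
          processResult: function (result) {
              if (result.meta && result.meta.title) {
                  // Parsing as HTML decodes entities; textContent returns plain text.
                  var doc = new DOMParser().parseFromString(result.meta.title, "text/html");
                  result.meta.title = doc.documentElement.textContent;
              }
              return result;
          }
      });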
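
As one possible alternative to the DOMParser approach above, here is a minimal sketch of a DOM-free unescape that could be called from the same processResult hook. `decodeEntities`, `NAMED_ENTITIES`, and `normalizeTitle` are hypothetical names introduced for illustration, and the named-entity table is deliberately tiny, so this is not a complete HTML entity decoder.

```javascript
// Hypothetical lookup table: only a handful of named entities are covered.
const NAMED_ENTITIES = new Map([
    ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'],
    ["apos", "'"], ["nbsp", "\u00a0"], ["ndash", "\u2013"],
]);

// Hypothetical helper: decodes numeric entities (&#34; and &#x27;) and the
// named entities listed above, in a single pass. Anything unrecognised is
// left untouched.
function decodeEntities(text) {
    return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, function (match, body) {
        if (body[0] === "#") {
            const isHex = body[1] === "x" || body[1] === "X";
            const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
            // Leave out-of-range code points as they were.
            return code <= 0x10ffff ? String.fromCodePoint(code) : match;
        }
        return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
    });
}

// Intended to be passed as the processResult option shown above.
function normalizeTitle(result) {
    if (result.meta && typeof result.meta.title === "string") {
        result.meta.title = decodeEntities(result.meta.title);
    }
    return result;
}

console.log(decodeEntities("Faster &#34;shuffle&#34;"));
```

Unlike parsing the title as HTML and reading textContent, this does not drop anything that looks like a tag, so a title such as "the &lt;aside&gt; element" decodes to the literal text "the <aside> element". That only helps if the UI then inserts the title as text, not as HTML.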

@bglw
Contributor

bglw commented Sep 19, 2023

I was about to write a larger note about how I cannot reproduce this, but it only seems to apply if you're using a custom data-pagefind-meta="title" attribute. The automatic h1 title capture works fine — so that will make the fix simpler (and also means this doesn't affect most sites).

(example: searching greater than on https://mdn.pagefind.app/)

That processResult looks fine to me! At least as a temporary stop-gap 🙂

@mrjbq7

mrjbq7 commented Sep 19, 2023

Thank you so much for the quick response, and this pagefind thing is awesome. ❤️
