-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Upgrade pagefind #2048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade pagefind #2048
Conversation
To see this in action, direct your web browser to https://dscho.github.io/git-scm.com/pagefind/playground/. |
4097e80
to
43171c4
Compare
Okay, I did have to work on this a bit more than I thought because of a "link checker report" caused by enabling the Pagefind Playground. Turns out that lychee (the link checker) posed some challenges that can only be resolved using options that are now available, but had not been available when I introduced those link checks. So now this is ready to be merged! |
This version most notably sports an optional Playground (see https://pagefind.app/docs/playground/) that _should_ help optimize the search parameters better. Release notes: https://github.com/Pagefind/pagefind/releases/tag/v1.4.0 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a HTML file in a subdirectory references a file in the same subdirectory via a relative path (e.g. Pagefind Playground's `<link rel="stylesheet" href="./pagefind-playground.css" />` in `pagefind/playground/index.html`), lychee's `--base-url` option causes unwanted transformations: It pretends that that file is actually at the base URL. For example, running with `--base-url https://dscho.github.io/git-scm.com` will pretend that that CSS is at https://dscho.github.io/git-scm.com/pagefind-playground.css, skipping the `pagefind/playground/` altogether. This is a known issue, and lychee introduced a `--root-dir` option to accommodate. However, there remain problems, see e.g. lycheeverse/lychee#1718. In our concrete case, we cannot easily use `--root-dir` because when we deploy to forks, the absolute paths have a `/git-scm.com/` prefix yet that is not the suffix of the absolute path of the `public/` folder! Even though it is not easily used, it _is_ possible: by constructing a separate path and adding a symbolic link whose name _is_ `git-scm.com` and which points to the `public/` directory. Then the parent directory of that symlink can be used as `--root-dir`. There is still one caveat: lychee ignores symbolic links when populating the initial set of files from a given base directory. So let's pass the path to that symlink, with trailing slash, as the base directory parameter. That seems to work. This change is needed because we are about to enable Pagefind's shiny new Playground, which has an `index.html` file in `/pagefind/playground/` that references a `.css` in the same directory via a relative link (which would be mishandled when using `--base-url`). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This makes experimenting with Pagefind's search parameters easier, allowing to fiddle with the parameters in real-time by directing the browser to the `/pagefind/playground` URL. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
43171c4
to
a348237
Compare
@dscho Fancy, but I don't have any idea what to do with this. About the lychee changes, they seem to work well when you deploy to your fork, but how can we ensure they also work on git-scm.com? I did not test this locally, and to be honest, I don't have any idea how to. But I trust you, so I'll approve already then you can merge if you feel it's safe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine.
You can see how the search results change with changed search parameters, such as: how much weight to give word frequency? How much weight on page length? Etc Just type in a search term and change the sliders.
I did test lightly, in particular what would happen if In any case, I'll see directly after merging if it breaks. And if it does, I'll revert, that's what the Revert button is for 😁 |
Seems to have worked all right: Summary
For comparison, the previous deployment reported these numbers: Summary
I have little idea why the new run's Total is so much lower, but it excites me that the Excluded is drastically lower! Which means that the previous way to run lychee excluded waaaay too much, and I fixed a bug (that thankfully did not uncover broken links). |
FWIW these numbers are consistent with the numbers of the "broken link" issue at dscho#22 (comment) which was opened because of the false positive I merely missed the dramatic change of the Excluded number because they're both long and both start with a |
True. But I didn't have time to babysit the deploy, so I didn't merge yet. And about the link report, I don't know how to read it. But if you're happy, I'm too! |
The numbers refer to unique URLs that were link-checked. That is, the site has about 1.5 million distinct URLs, previously 1.4 million of them were not checked for one reason or another, and only about a bit more than one hundred thousand were checked. All external URLs are excluded from checking. That is, the link to https://gitlab.com/gitlab-org/gitlab-ce/tree/master is not verified. But a link to Now, there are a bit fewer distinct URLs, but only some sixty thousand of them are excluded, and the rest of them (which is now 96% of them) are checked. I recreated the So I inadvertently (and unintentionally!) snuck in a rather important bug fix in this here PR... |
Changes
Context
The recently-released new version of Pagefind sports a pretty neat Playground for experimenting with the search parameters (to optimize them for better search results).
I mentioned this here, and don't want to derail the highly desirable effort to refresh the website, therefore I'll not even mention it there that I opened this PR.
@jvns @To1ne FYI this is the PR to upgrade Pagefind and enable the Playground.