Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lazy config #695

Merged
merged 6 commits into from Mar 22, 2024
Merged

lazy config #695

merged 6 commits into from Mar 22, 2024

Conversation

mbostock
Copy link
Member

@mbostock mbostock commented Feb 7, 2024

This defers loading of the config until a request is made, such that when the page is reloaded, we can always grab the latest config. This means you don’t have to restart the preview server in order to see config changes.

Fixes #645.
Fixes #646 (although, doesn’t push the change down — you still have to reload).

@mbostock mbostock requested a review from Fil February 7, 2024 03:31
@mbostock
Copy link
Member Author

mbostock commented Feb 7, 2024

Hmm, so a downside of this approach is that if you don’t have any config file, every request requires materializing the default config (which includes crawling the docs folder to find all pages). That’s kind of a bummer. I’m not sure how to solve that… probably recording the current time before you load the config, and then invalidating the previously-cached config if any Markdown file within the root is newer than the last load time?

@mbostock mbostock marked this pull request as draft February 7, 2024 03:48
@Fil Fil mentioned this pull request Feb 7, 2024
@Fil
Copy link
Contributor

Fil commented Feb 7, 2024

I think #702 is enough to solve the performance issue you mention. What's slow is not reading the files, but parsing them. In my tests at least, using #702 with this branch makes it go from "noticeably slow" to "instant" (← scientific precision).

Copy link
Contributor

@Fil Fil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

want!

@Fil Fil mentioned this pull request Feb 13, 2024
@Fil
Copy link
Contributor

Fil commented Feb 13, 2024

tested with #765 this works very well

@mbostock mbostock marked this pull request as ready for review March 22, 2024 17:09
@mbostock
Copy link
Member Author

I no longer notice any slowness when there’s no config file, probably because we rewrote the parser in #843 which means we do a lot less within parseMarkdown than we used to (less rendering). What do you think @Fil? Is this good enough as-is or should we add the memoization from #702 into this PR?

@mbostock mbostock requested a review from Fil March 22, 2024 17:12
Copy link
Contributor

@Fil Fil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my machine the preview server needs 1.1s to read the 300 pages of pangea. So that gives me an estimate of about 3ms per page. That's probably fine for medium-sized projects (of course for a large project you wouldn't rely on the default config).

However I noticed that a page that loads 20 _esm.js resources will pay this cost 20 times, so that's now 60ms per page. Even for a medium project this might add up to too much time.

But, I think it points to a simple optimization—pay this cost only once every few seconds. Adding a simple memoization of the result of readConfig completely fixes the issue instantly.

 async _readConfig() {
    const t = +new Date();
    if (_mem.until > t) return _mem.config;
    _mem.until = t + 12000;
    return _mem.config = readConfig(this._config, this._root)
  }

I don't think it's a blocker, and can be worked on as a follow-up.

@mbostock
Copy link
Member Author

mbostock commented Mar 22, 2024

I can take a look at memoizing just readPages, too… I think we could make that cheap by just checking if any of the files have been modified, or if the set of files has changed. Scanning the directory should be cheap.

I’m surprised that _esm.js resources make a difference, though? parseMarkdown doesn’t do any import resolution — that happens in getResolvers and so we shouldn’t be paying for that inside readPages. Is it possible that you were measuring the rendering cost rather than the parsing cost?

@Fil
Copy link
Contributor

Fil commented Mar 22, 2024

I don't know exactly as I am only looking at the network tab in the inspector. Each resource served has the same ~1s latency.

@mbostock
Copy link
Member Author

mbostock commented Mar 22, 2024

a page that loads 20 _esm.js resources will pay this cost 20 times

Ah, sure, that’s because each request to the preview server, including the linked resources, will load the config.

@mbostock
Copy link
Member Author

mbostock commented Mar 22, 2024

The latest commit makes the default config.pages (and config.sidebar by extension) a getter, such that it’s only computed on-demand, and recomputed as necessary. This fixes two problems: it avoids computing pages when not needed (e.g., serving _esm.js), and it recomputes pages as needed so that it reflects the contents of the source root at the time it is needed (which could change independently of the config file being edited!).

This feels nice! I think we could further add caching to readPages, but I don’t think it’s a blocker.

@mbostock mbostock merged commit 3962950 into main Mar 22, 2024
4 checks passed
@mbostock mbostock deleted the mbostock/lazy-config branch March 22, 2024 22:14
@Fil
Copy link
Contributor

Fil commented Mar 22, 2024

This seems to have removed the memoization of the minisearch index.

@mbostock
Copy link
Member Author

I don’t see that… but it is loading pages because that’s destructured within searchIndex before it checks the cache.

@mbostock
Copy link
Member Author

It does break the linkCache in the page’s findLink when pages is undefined…

@Fil
Copy link
Contributor

Fil commented Mar 22, 2024

I've confirmed the problem with a brand-new clone of the repo: after this commit minisearch.json is now recreated every time, which makes the search slow in preview.

Because it's memoized in a WeakMap indexed with the config:

const indexCache = new WeakMap();
...
indexCache.set(config, …)

To fix we could just memoize it independently of the config. But for now… good night!

@mbostock
Copy link
Member Author

Fix in #1129!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants