Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: middleware to short-circuit on 304 #15586

Merged
merged 5 commits into from
Jan 19, 2024
Merged

perf: middleware to short-circuit on 304 #15586

merged 5 commits into from
Jan 19, 2024

Conversation

patak-dev
Copy link
Member

Description

I'm checking the work needed in the server to answer on 304 once it is warmed up. We return 304 when:

  • On a forced reload from the user
  • On HMR when there is a full reload for all code except for the modified modules
  • On second startup (both warm and cold), half or more of the modules could already be warmed up on large apps and they will be also hitting 304.

There isn't much work we do to reach the current condition, but we still do a few things.

  1. We add CORS headers. If they are the same, we could skip this, but probably not worth the complexity to take into account the headers when computing the etag. See https://httpwg.org/specs/rfc7234.html#freshening.responses, the CORS headers will be refreshed on 304 if changed.
  2. We have a few middlewares: proxy, open-in-editor, hmr-ping, public. The public one in particular has the most expensive check
  3. Then in the transform middleware, we run a few regexes (CSS, JS, etc) and then need to run a getModuleByUrl that also adds up.

This PR implements a fast path for 304, with a new middleware after CORS that kicks in when there is an etag, and uses a etagToModuleMap.

I'm seeing a ~3-4% improvement when forcing a reload on my M1. I would say the extra map is ok here. It would be interesting to see the effect on slower machines too.


What is the purpose of this pull request?

  • Bug fix
  • New Feature
  • Documentation update
  • Other

@patak-dev patak-dev added the performance Performance related enhancement label Jan 12, 2024
Copy link

stackblitz bot commented Jan 12, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Comment on lines +60 to +65
// For direct CSS requests, if the same CSS file is imported in a module,
// the browser sends the request for the direct CSS request with the etag
// from the imported CSS module. We ignore the etag in this case.
const mixedEtag =
!req.headers.accept?.includes('text/css') &&
isDirectRequest(moduleByEtag.url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand this part. Do you mean that the browser would send the same etag for text/css requests and *.css imports? That sounds strange. Is it a bug that also happens before?

I think it would still be nice to make the request work instead of bailing here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this guard, this test was failing. The issue is that in that test we are both adding global.css as a link in the html and then importing it. I tried to find a reason why we do so and I couldn't. But I think we should leave it as is because it actually helped to spot this bug.

I think our current system for .css isn't ideal. What we do is adding a ?direct flag when we detect that global.css was requested as a link. But that ?direct flag is internal, for the browser, it sees two global.css requests with the same URL. We return different etags for each. When the browser requests the imported global.css module the second time, it uses the etag of the linked one. I thought it could be a bug in chrome but I tested in Firefox and Safari and in all the browsers is the same.

This playground was still working because we did a getModuleByUrl before this PR, and it was done after adding ?direct when it was a link. If the etag was mixed for imported css modules, then it will not match with the etag of the module and a regular 200 will be issued instead of a 304. So this PR is doing the same for this case, I don't know how we could avoid bailing out here. I think it is ok as the setup is an edge case.

An alternative would be to switch from ?direct for linked CSS to using ?import for imported CSS modules instead like we do with other assets (rewriting during import analysis). I think this would actually be cleaner. If we do so, then a bare .css request will always correspond to a link request, and a .css?import will always be a module and these URLs are seen by the browser (and not like ?direct that we add internally) so the requests are for different URLs. But I don't know if we can do this change given that it may have a big effect on the ecosystem, and it is hard to justify it when it is only an issue if the same .css is linked and imported at the same time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks. Sounds like a weird browser bug, but maybe for good reasons. Yeah in that case if it's rare, then I think we can simply bail then. Maybe we could add a debug message to detect this kind of requests if it later turns out to be more common than expected.

The ?import idea does make more sense to me, but yeah maybe we can leave that until we need to make a bigger change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @sapphi-red, this may be interesting to you as I think you mentioned we could remove ?import at one point, no? If we do so, we may hit this same etag issues for that assets.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess Vary header (Vary: Accept) is what we should use.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... this looks great, thanks @sapphi-red! I tried it out and it seems it fixes safari but I still see the same issue in Chrome and Firefox. So maybe there are browsers bugs here? I think we should keep the guard for now and aim to use Vary header in the future (for when we'll try to remove ?import)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems I misunderstood the Vary header. Vary header doesn't affect conditional requests.
Also browsers seems to only use a single E-Tag in If-None-Match header. I guess there isn't a way to tell browsers to send different If-None-Match header for a same URL and a different header.

// check if we can return 304 early
const ifNoneMatch = req.headers['if-none-match']
if (ifNoneMatch) {
const moduleByEtag = server.moduleGraph.getModuleByEtag(ifNoneMatch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a file is added in public directory, I think we need to delete a entry from server.moduleGraph.etagToModuleMap.
For example, if /src/foo.ts existed and later on /public/src/foo.ts is added, then /public/src/foo.ts should be returned instead of the transpileed /src/foo.ts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

More context for others. We are serving public files using sirv, and I see that the way it computes its etag is:

if (isEtag) headers['ETag'] = `W/"${stats.size}-${stats.mtime.getTime()}"`;

There shouldn't be any overlap between public etags and processed etags.

But as this two URLs are the same, the browser may request the public file using the etag of processed module because of the same issue discussed here.

But if the source file existed but was removed, the etag is no longer going to be there. The only case where this is an issue is when there are both a public and a source file at the same time (and the public file was added later). If we find the module by id and remove its etag, then the browser may still request the url with the wrong etag but it should still pick up the public file.

I think it should be fixed by e9a6011

@sapphi-red
Copy link
Member

/ecosystem-ci run

@vite-ecosystem-ci
Copy link

📝 Ran ecosystem CI on e9a6011: Open

suite result latest scheduled
analogjs success success
astro success success
histoire success success
ladle success success
laravel failure failure
marko success success
nuxt failure success
nx failure failure
previewjs success success
qwik failure failure
rakkas success success
remix failure failure
sveltekit success success
unocss success success
vike failure failure
vite-plugin-pwa success success
vite-plugin-react success success
vite-plugin-react-pages success success
vite-plugin-react-swc success success
vite-plugin-svelte success success
vite-plugin-vue success success
vite-setup-catalogue success success
vitepress success success
vitest success success

@patak-dev
Copy link
Member Author

I saw the same Nuxt fail in other PRs when running ecosystem-ci, I think it isn't related to this PR

@sapphi-red
Copy link
Member

sapphi-red commented Jan 19, 2024

I noticed that this breaks vite-plugin-static-copy (similar edge case with #15586 (comment)). That plugin injects a middleware just before the serve public middleware. I will add something similar to https://github.com/vitejs/vite/pull/15586/files#diff-abb3345b6e3b2ec6d297c2ebc54cca85ae4487a31bac3cc9e78457f5114adb26R687-R697 on the plugin side as I guess the middlewares.stack is not part of the public API.

@patak-dev
Copy link
Member Author

injects a middleware just before the serve public middleware

The names in middlewares.stack are part of the public API, but we have added and removed middlewares in minors in the past. In this case, IIUC, vite-plugin-static-copy will still work and only break when a static file is added with the same name as a source file (and that source file isn't deleted). I think it is an edge case worth solving in the plugin, but we could still move forward with this one in a minor.

Copy link
Member

@bluwy bluwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 I'm not sure how confident are you with the new module graph API, and whether we should mark them as internal if we ever need to change it, but otherwise we can give this a shot!

@patak-dev
Copy link
Member Author

I'm not sure if we should mark it as internal yet, because if not @sapphi-red won't be able to fix the edge case he talked about here 🤔

Let's merge it without marking as internal, but we can revisit this before releasing Vite 5.1.
I think the API will be useful for plugins in general.

@patak-dev patak-dev merged commit 35ae4f8 into main Jan 19, 2024
15 checks passed
@patak-dev patak-dev deleted the perf/fast-304-path branch January 19, 2024 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance related enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants