Skip to content

Caching issues (leading to Webpack crashing or component templates not being picked up) #1619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
chancancode opened this issue Oct 2, 2023 · 8 comments

Comments

@chancancode
Copy link
Contributor

chancancode commented Oct 2, 2023

Setup

  1. ember new --embroider --yarn zomg (yarn optional)
  2. JOBS=0 ember s (JOBS=0 optional, but may get better stack traces/easier to step through the code without the workers)

Scenario 1

  1. touch app/components/foo.hbs
  2. observe it rebuilds successfully
  3. observe that .../rewritten-app/components/foo.hbs exists (as expected)
  4. observe that .../rewritten-app/components/foo.js exists (in addition to foo.hbs)
  5. observe that .../rewritten-app/assets/zomg.js has an entry for components/foo.js
  6. rm app/components/foo.hbs
  7. observe that the build crashes
  8. observe that .../rewritten-app/components/foo.hbs does not exists (as expected)
  9. observe that .../rewritten-app/components/foo.js exists (unexpected)
  10. observe that .../rewritten-app/assets/zomg.js has an entry for components/foo.js (unexpected)

The problem here is that the code the reflects the FS changes (deletion, in this case) into rewritten-app, in response to a sole .hbs file being deleted, should have also deleted the "virtual" js file (which it added in the first place), but did not.

While it is hard to "see" (not sure if there is a way to get the babel output written to disk), but because of the content-based caching (see the scenario below), the babel-transformed output of the leftover foo.js has a reference to foo.hbs, leading to the crash.

Scenario 2

(This scenario is to illustrate an independent issue to the previous one, so make sure you git reset the changes, restart the server, etc, to being from a fresh state)

  1. create app/components/foo.js with the following content:
    import Component from "@ember/component";
    export default Component.extend({ tagName: "span" });
  2. observe it rebuilds successfully
  3. observe that the expected files and entries are present in rewritten-app
  4. observe that if you invoke the component (from application.hbs) it renders an empty span in the browser
  5. create app/components/foo.hbs with any non-empty content (e.g. hello world!)
  6. observe it rebuilds successfully
  7. observe that the expected files and entries are present in rewritten-app
  8. observe that if you invoke the component it renders an empty span without the template's content (unexpected)
  9. edit the component js file to have tagName: "p"
  10. observe it rebuilds successfully
  11. observe that if you invoke the component it does render an p tag with the template's content (as expected)

I believe the issue is the babel transform for the co-located components plug-in only gets re-run solely if (and only if) the content of the .js file changes, which is semantically incorrect. The presence or absence of the adjacent .hbs file should effectively be part of the logical cache key.

Notably, simply touching the mtime of the .js file (either in the actual app, or in the rewritten-app) is not sufficient.

Scenario 3

I added this one later, and I think fundamentally it is just a variant of Scenario 2, but this results in a persistent crash, which shows that the content-based caching behavior survives rebuilds.

  1. before starting the server, create app/components/foo.js and app/components/foo.hbs, the content doesn't matter
  2. start the server
  3. observe that it builds successfully
  4. delete app/components/foo.hbs
  5. observes that the build crashes
  6. exit the server
  7. start the server again
  8. observe that the build is still crashing

You can fix the build by making any content changes to the .js file. However, if you then revert to the previous content at any later time, so long as the underlying cache entry hasn't been evicted yet, it will bring back the crash! So effectively that exact filename-content pair is semi-permanently poisoned to expect a corresponding .hbs file. Fun!

chancancode added a commit to chancancode/embroider that referenced this issue Oct 3, 2023
This fixes the first scenario detailed in embroider-build#1619, where a synthesized
template-only component .js file is not cleaned up when the corresponding
.hbs file is deleted.

There are actually two bugs here.

The first issue is we never attempt to clean up these unneeded files,
which is fixed by keeping track of which of the emitted files are still
needed after the main loop.

The second issue is that, even with that fix, it still does not work,
because we are incorrectly using the async version of the fs API here
without awaiting for the promise.

With the way the promisify code is written, the fs delete is actually
scheduled to happen on the microtask queue, so by the time the
`mergeTrees` code runs, the deletion hasn't actually happened yet.

Since the synthesized tree appears after the appTree, I believe (but
haven't tested) this has also been causing a different bug, where if you
started with just a `.hbs` file, then add a `.js` file later, the new file
does not actually take effect until one more build later, since the build
triggered by the `.js` file creation would get shadowed by the previously
synthesized template-only component file.

Perhaps no caught this issue because the typical workflow involves
creating either an empty file or generating a stub `.js` file, then filling
it in with the actual content. By the "first meaningful save", the deletion
would probably have gone through and this would self-resolve.
@chancancode
Copy link
Contributor Author

I am not sure that there is a Good™ way to fix the babel (?) caching issue – fundamentally the co-location babel plugin breaks Webpack's (or at least, babel-loader's) assumption that transforms are strictly based on the input source.

I think this is basically the same kind of problem that something like babel-plugin-inline-import has, and it seems like their solution is a custom alternative to babel-loader (which probably wraps it under-the-hood). Perhaps we could do that too, but that would only work for Webpack.

I came up with an alternative that may work. In synthesize-template-only-components.js, we can also attempt to detect the cases where there is a lone .js file, and only in those cases, we emit a rewritten .js file that appends a special magic comment, something like //# embroiderTemplate=false.

Alternatively, we can flip it around so that we add //# embroiderTemplate=true to .js files with colocated .hbs files. Either way, the point is to encode the hidden dependency (of whether a .hbs file is present) into the input source file, so that Webpack/babel-loader/etc knows that a change happened.

I'm not sure which way is better. On the one hand, lone .js files are rare, so doing the first option would probably be more performant on big legacy apps which are the ones suffering the most from slow builds. One the other hand, .gjs pretty much transforms into a lone .js files so the other option may be penalizing new apps.

I am not sure how much of a performance impact this really has, but if we have to worry about source maps and stuff then it may start to add up?

@chancancode
Copy link
Contributor Author

Actually: https://dev.to/goatandsheep/how-to-use-babels-new-addexternaldependency-api-338h

This seems promising, I'll look into it

@chancancode
Copy link
Contributor Author

I'm not sure if addExternalDependency is a good fit. From my read so far, it seems like the list is per-plugin-instance, rather than per-plugin-instance-per-file, we could probably massage it to make it work still, but it would require creating some awkward machinery...

Come to think of it, doesn't @embroider/macros have some of the same problems? I'll try to make the meeting tomorrow so I can ask about it.

@chancancode
Copy link
Contributor Author

Yeah. I don't think we can use addExternalDependency for this purpose. It really is just intended for per-plugin-instance things. We can perhaps use it for @embroider/macros, though.

@ef4
Copy link
Contributor

ef4 commented Oct 3, 2023

I came up with an alternative that may work. In synthesize-template-only-components.js,

Yes, I think this class of solution is probably best. By which I mean, incorporating co-location details directly into the JS in a dedicated plugin that happens before babel.

Ideally, this plugin would simply do everything and produce reasonable javascript output with no hidden dependencies. The challenge is doing that without paying the parse cost twice. A special comment is probably a good compromise, so that parse-aware work can still happen in babel but all the filesystem concerns got handled already.

An alternative design for the comment would be to actually make the comment contain the inlined contents of the colocated template. Then the colocation babel plugin would only be responsible for converting the comment to the proper equivalent javascript.

@chancancode
Copy link
Contributor Author

chancancode commented Oct 3, 2023

That is basically the const __COLOCATED_TEMPLATE__ = ... strategy from the old days which seems fine to me. That inlining used to be done in a broccoli pass and the babel pass just transforms it into setComponentTemplate

@chancancode
Copy link
Contributor Author

Regarding the approach of inlining the content into the .js file:

  • Do we have to worry about source maps at all?
    • Are we expecting to process any .js files with pre-existing source maps at all? It seems unlikely to me as we should be talking about hand-written source files here, but I suppose you could have vendored a library into your app folder or something.
    • If there are source existing maps, is it generally safe to append stuff to the end of the file as long as you do it before the //# sourceMappingURL? It seems alright to me, in that it doesn't shift any existing lines downwards, but IIRC source maps spans does not have an endpoint, so whichever span that runs until the previous EOF will now slurp up the template as well. Is that acceptable?
  • Interaction with .gjs files:
    • Do we have plans to make that a first-class feature in Embroider?
    • By the time the colocation plugin runs, currently .gjs files are already compiled into .js with source maps. Usually we wouldn't have to touch .gjs files anyway with this approach, but there is the temporary cases where you are in the middle of refactoring, renamed the .js into .gjs but haven't deleted the .hbs yet. We will have to do something during the incoherent state. I think the correct semantics are ignore the .hbs file but I am not sure there is enough information to do that because of the ordering of the transformations.
  • Inlining vs importing:
    • Currently, I believe we import the the .hbs file into the .js, which means Babel can cache the output as long as the .js source didn't change (which is, a bit too aggressive, but with this approach the missing information is just "whether there is a template", but we won't have to recompile a .js file when the .hbs content changes).
    • If we inline, we will bust the Babel cache whenever the .hbs source changes. Is that acceptable? Is that already happening somehow because of the need to add imports for discovered hbs dependencies?
    • If we do inline, we should probably drop the .hbs file from rewritten-app, but tactically I am not sure we have a easy way to do it, since we rely on mergeTree to smoosh things together, so we can easily add files but not remove them, unfortunately

TL;DR I think retrofitting "has co-located template: Y/N" bit is easy enough, but actually inlining it requires more redesign

chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
These tests are currently written to exhibit the buggy behavior.
When I fix the issue I'll adjust the tests to the correct/expected
behavior.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
These tests are currently written to exhibit the buggy behavior.
When I fix the issue I'll adjust the tests to the correct/expected
behavior.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
These tests are currently written to exhibit the buggy behavior.
When I fix the issue I'll adjust the tests to the correct/expected
behavior.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
These tests are currently written to exhibit the buggy behavior.
When I fix the issue I'll adjust the tests to the correct/expected
behavior.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
This fixes the first scenario detailed in embroider-build#1619, where a synthesized
template-only component .js file is not cleaned up when the corresponding
.hbs file is deleted.

There are actually two bugs here.

The first issue is we never attempt to clean up these unneeded files,
which is fixed by keeping track of which of the emitted files are still
needed after the main loop.

The second issue is that, even with that fix, it still does not work,
because we are incorrectly using the async version of the fs API here
without awaiting for the promise.

With the way the promisify code is written, the fs delete is actually
scheduled to happen on the microtask queue, so by the time the
`mergeTrees` code runs, the deletion hasn't actually happened yet.

Since the synthesized tree appears after the appTree, I believe (but
haven't tested) this has also been causing a different bug, where if you
started with just a `.hbs` file, then add a `.js` file later, the new file
does not actually take effect until one more build later, since the build
triggered by the `.js` file creation would get shadowed by the previously
synthesized template-only component file.

Perhaps no caught this issue because the typical workflow involves
creating either an empty file or generating a stub `.js` file, then filling
it in with the actual content. By the "first meaningful save", the deletion
would probably have gone through and this would self-resolve.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
As discussed in embroider-build#1619, tools like babel caches the ouput based on
the source content of the input files. For component javascript
files, whether there is a co-located template file is an extra bit
of information that doesn't show up in the source file, but that
information does get used in producing the output. This causes the
caches to not invalidate when a co-located tempalte file is added
or deleted.

This fixes the problem by ensuring we include that information in
the input source file. For now, it is just an inert comment, but
we can actually adjust our babel plugin to rely on this information
rather than doing its own filesystem probing again, which should
have some performance benefit.
chancancode added a commit to chancancode/embroider that referenced this issue Oct 8, 2023
As discussed in embroider-build#1619, tools like babel caches the ouput based on
the source content of the input files. For component javascript
files, whether there is a co-located template file is an extra bit
of information that doesn't show up in the source file, but that
information does get used in producing the output. This causes the
caches to not invalidate when a co-located tempalte file is added
or deleted.

This fixes the problem by ensuring we include that information in
the input source file. For now, it is just an inert comment, but
we can actually adjust our babel plugin to rely on this information
rather than doing its own filesystem probing again, which should
have some performance benefit.
ef4 added a commit that referenced this issue Oct 10, 2023
Fix co-located components regressions (#1619)
@chancancode
Copy link
Contributor Author

This can probably be closed now, but @void-mAlex @mansona @NullVoxPopuli do you think the vite integration has the same bug? In particular the more subtle case 2. Just realized that it has the same general set up so it's probably worth checking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants