Explore creating a 'reverse engineered' records.json / stats.json file from a webpack build #9

Open
0xdevalias opened this issue Mar 2, 2024 · 3 comments

Comments

@0xdevalias
Owner

0xdevalias commented Mar 2, 2024

This is an idea I've had in passing a few times, but keep forgetting to document it:

  • https://medium.com/@songawee/long-term-caching-using-webpack-records-9ed9737d96f2
    • there are many factors that go into getting consistent filenames. Using Webpack records helps generate longer lasting filenames (cacheable for a longer period of time) by reusing metadata, including module/chunk information, between successive builds. This means that as each build runs, modules won’t be re-ordered and moved to another chunk as often which leads to less cache busting.

    • The first step is achieved by a Webpack configuration setting: recordsPath: path.resolve(__dirname, './records.json')
      This configuration setting instructs Webpack to write out a file containing build metadata to a specified location after a build is completed.

    • It keeps track of a variety of metadata including module and chunk ids which are useful to ensure modules do not move between chunks on successive builds when the content has not changed.

    • With the configuration in place, we can now enjoy consistent file hashes across builds!

    • In the following example, we are adding a dependency (superagent) to the vendor-two chunk.

      We can see that all of the chunks change. This is due to the module ids changing. This is not ideal as it forces users to re-download content that has not changed.

      The following example adds the same dependency, but uses Webpack records to keep module ids consistent across the builds. We can see that only the vendor-two chunk and the runtime changes. The runtime is expected to change because it has a map of all the chunk ids. Changing only these two files is ideal.

  • https://webpack.js.org/configuration/other-options/#recordspath
    • recordsPath: Use this option to generate a JSON file containing webpack "records" – pieces of data used to store module identifiers across multiple builds. You can use this file to track how modules change between builds.

  • https://github.com/search?q=path%3A%22webpack.records.json%22&type=code
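
To make the "track how modules change between builds" part concrete, here is a minimal sketch of comparing the module ids recorded by two builds. It assumes a records file exposes a `modules.byIdentifier` map from module identifier to module id (the shape I've seen in records files, but treat it as an assumption rather than a documented contract). `diffModuleIds` and the sample data are hypothetical.

```javascript
// Sketch only: assumes records look like
//   { modules: { byIdentifier: { "<module identifier>": <module id> } } }
// `diffModuleIds` is a hypothetical helper, not part of webpack.
function diffModuleIds(oldRecords, newRecords) {
  const before = new Map(Object.entries((oldRecords.modules || {}).byIdentifier || {}));
  const after = new Map(Object.entries((newRecords.modules || {}).byIdentifier || {}));

  const added = [];
  const removed = [];
  const changed = [];

  for (const [identifier, id] of after) {
    if (!before.has(identifier)) {
      added.push(identifier);
    } else if (before.get(identifier) !== id) {
      // Same module, different id: this is what busts caches of unrelated chunks.
      changed.push({ identifier, from: before.get(identifier), to: id });
    }
  }
  for (const identifier of before.keys()) {
    if (!after.has(identifier)) removed.push(identifier);
  }
  return { added, removed, changed };
}

// Made-up records from two builds, the second one adding a dependency.
const buildA = {
  modules: { byIdentifier: { "./src/index.js": 0, "./node_modules/react/index.js": 1 } },
};
const buildB = {
  modules: {
    byIdentifier: {
      "./src/index.js": 0,
      "./node_modules/react/index.js": 2,
      "./node_modules/superagent/lib/client.js": 1,
    },
  },
};

const moduleIdDiff = diffModuleIds(buildA, buildB);
console.log(JSON.stringify(moduleIdDiff, null, 2));
```

In this made-up data the new dependency takes id 1 and pushes an existing module to id 2, which is the kind of shift that records are meant to prevent.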

I'm not 100% sure whether this would be useful, or only partially useful, but I'm thinking of it tangentially in relation to things like:

@0xdevalias
Owner Author

0xdevalias commented Mar 2, 2024

Even more tangentially related to this, I've pondered how much we could 're-construct' the files necessary to use tools like bundle analyzer, without having access to the original source (or if there would even be any benefit to trying to do so):

My gut feel is that we can probably figure out most of what we need for it. We probably just can't give accurate sizes for the original pre-minified code, and the module names might not be mappable to their originals unless we have module identification features (see pionxzh/wakaru#41).
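
As a rough sketch of what that reconstruction could look like: given module sources already extracted from a bundle, we could emit a stats-like object with placeholder names and minified sizes. The field names below are loosely modelled on the `assets` / `chunks` / `modules` arrays in a webpack stats file, but I haven't verified which fields any particular analyzer requires, so treat the shape as an assumption. `buildStatsLike` and the sample modules are hypothetical.

```javascript
// Sketch only: `modules` is assumed to be a map of module id -> minified module
// source that has already been extracted from a chunk by some other tool.
// `buildStatsLike` is a hypothetical helper; the output shape is an assumption.
function buildStatsLike(chunkId, chunkFile, modules) {
  const moduleEntries = Object.entries(modules).map(([id, source]) => ({
    id,
    // Original paths are unknown without module identification, so use a placeholder.
    name: `./unknown/module-${id}.js`,
    // Size of the minified code, not of the original pre-minified source.
    size: Buffer.byteLength(source, "utf8"),
    chunks: [chunkId],
  }));

  // Approximation: ignores the webpack runtime and wrapper overhead in the real file.
  const totalSize = moduleEntries.reduce((sum, m) => sum + m.size, 0);

  return {
    assets: [{ name: chunkFile, size: totalSize, chunks: [chunkId] }],
    chunks: [{ id: chunkId, files: [chunkFile], size: totalSize, modules: moduleEntries }],
    modules: moduleEntries,
  };
}

// Made-up module sources standing in for code pulled out of a bundle.
const extractedModules = {
  12: "function(e,t,n){e.exports=n(34)+1}",
  34: "function(e,t){e.exports=41}",
};

const statsLike = buildStatsLike(0, "main.js", extractedModules);
console.log(JSON.stringify(statsLike, null, 2));
```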

@0xdevalias 0xdevalias changed the title from "Explore creating a 'reverse engineered' records.json file from a webpack build" to "Explore creating a 'reverse engineered' records.json / stats.json file from a webpack build" on Mar 2, 2024
@pionxzh

pionxzh commented Mar 2, 2024

You want a re-constructed stats.json or records.json which can be put back into an analyzer plugin, right? This can be useful to understand the shape and code size distribution in chunks.

I just did some research on it. I feel it's possible to generate stats.json, but it requires a deep understanding of webpack's bundling details. And the module graph would be a must for us to do this.

This is a sample that I found on Google:
https://gist.github.com/TheLarkInn/577d6a8896b4553d4b2865fe1c8db7fa
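
On the module graph part, a very rough sketch of how edges could be recovered from minified module wrappers: read the name of the third wrapper parameter (the renamed require function) and collect the numeric ids it is called with. This assumes webpack 4-style `function(module, exports, require){...}` wrappers with numeric module ids. Arrow-function wrappers, string ids and dynamic requires are not handled, and regex matching can be fooled by strings or shadowed names, so a real tool would want an AST. `findDependencies`, `buildModuleGraph` and the sample modules are hypothetical.

```javascript
// Sketch only: regex-based and fragile; assumes wrappers of the form
// `function(e,t,n){...}` where the third parameter is the require function.
function findDependencies(moduleSource) {
  const params = /^\s*function\s*\(([^)]*)\)/.exec(moduleSource);
  if (!params) return [];

  const requireName = params[1].split(",").map((p) => p.trim())[2];
  if (!requireName) return []; // wrapper takes no require parameter

  // Escape regex metacharacters in the parameter name (e.g. `$`).
  const escaped = requireName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Match `n(123)` but not `foo.n(123)` or `an(123)`.
  const callPattern = new RegExp(`(?<![\\w$.])${escaped}\\((\\d+)\\)`, "g");

  const deps = new Set();
  for (const match of moduleSource.matchAll(callPattern)) {
    deps.add(match[1]);
  }
  return [...deps];
}

// Build an adjacency list: module id -> ids of the modules it requires.
function buildModuleGraph(modules) {
  const graph = {};
  for (const [id, source] of Object.entries(modules)) {
    graph[id] = findDependencies(source);
  }
  return graph;
}

// Made-up module sources standing in for code pulled out of a bundle.
const moduleGraph = buildModuleGraph({
  12: "function(e,t,n){var r=n(34),o=n(56);e.exports=r+o}",
  34: "function(e,t){e.exports=41}",
  56: "function(e,t,n){e.exports=n(34)+1}",
});
console.log(moduleGraph);
```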

@0xdevalias
Owner Author

You want a re-constructed stats.json or records.json which can be put back into an analyzer plugin, right?

@pionxzh nods yeah, that was what I was originally thinking about. I was also thinking that there might be some crossover between the parts used for this and figuring out how to identify module changes, etc.

Here's another search that should pull up a bunch more samples:
