Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webpack Clean Task #1744

Merged
merged 6 commits into from
Sep 18, 2019
Merged

Webpack Clean Task #1744

merged 6 commits into from
Sep 18, 2019

Conversation

ericboehs
Copy link
Contributor

@ericboehs ericboehs commented Oct 11, 2018

Resolves #1410

Goal

Add a task to clean old packs so that our packs directory doesn't get bloated with hundreds of old, unused assets.

Approach

Create a new webpacker:clean task (which enhances the assets:clean task) to delete unused assets by detecting old assets (using the Webpack-generated manifest.json to find alternative versions and by using File.mtime to determine file age).

Context

Users' browsers of a Webpacker-enabled app will request packs from the public packs directory. During a deploy the packs are replaced with a new version (at a new location) that the browser may not yet know about which results in 404s or other errors to the end user. Best practices recommend to keep a version or two of your assets around to avoid this scenario. Currently, if the packs directory is persisted across deploys, all compiled packs are kept forever. Sprockets has an assets:clean task which removes unused assets. This pull request intends to add the equivalent task to Webpacker.

Notes

I'm not 100% confident with how I find the older versions but it's the best I can come up with. I'm open to suggestions.

Related

@ericboehs ericboehs changed the title Webpack Clean Webpack Clean Task Oct 11, 2018
count_to_keep = 2
versions_to_keep = files_in_manifest.flat_map do |file_in_manifest|
file_prefix, file_ext = file_in_manifest.scan(/(.*)[0-9a-f]{20}(.*)/).first
versions_of_file = Dir.glob("#{file_prefix}*#{file_ext}").grep(/#{file_prefix}[0-9a-f]{20}#{file_ext}/)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the grep here to help filter out mismatches that Dir.glob might match. I'm thinking of something like packs/users-abc123.js and packs/users-profile-def456.js.

I'm trying to think of something that could still match. Maybe users-baddadbabebaddadbabe-abc123.js (i.e. the dev named the file something that matches the digest character space and length).

@ericboehs
Copy link
Contributor Author

In a previous commit, I removed all files in the directory that we didn't explicitly know about. I think it's better to build up a list of files that we know are ours and remove those. I left the previous commit intact for historical purposes.

Previously all files were gathered in the packs output dir and then
files that shouldn't be there were removed from the list. This additive
process will leave files intact that webpack doesn't manage (for
example, system files).
@gauravtiwari
Copy link
Member

Thanks @ericboehs Will review this tomorrow :)

@ericboehs
Copy link
Contributor Author

I think I also need to delete the.js.map and .js.gz files. Doing more testing now.

@ericboehs
Copy link
Contributor Author

ericboehs commented Oct 11, 2018

Looks like the current code I have also deletes the .js.map and .js.gz files.

Copy link
Contributor

@swrobel swrobel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no expert here, but I've attached my comments, since you asked!

@@ -2,6 +2,7 @@ tasks = {
"webpacker:info" => "Provides information on Webpacker's environment",
"webpacker:install" => "Installs and setup webpack with Yarn",
"webpacker:compile" => "Compiles webpack bundles based on environment",
"webpacker:clean" => "Remove old compiled webpacks",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be "Removes" to be consistent with other task descriptions

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also would be great if this took a keep arg like the assets:clean rake task does : https://github.com/rails/sprockets/blob/master/lib/sprockets/manifest.rb#L245

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mimicking rails assets:clean for my wording.

I've added the keep arg to the task. Good idea!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the verbs of webpacker's tasks should follow singular verbs of rails assets. Send a separate PR?


count_to_keep = 2
files_in_manifest = manifest.refresh.values.map { |f| File.join config.root_path, "public", f }
files_to_be_removed = files_in_manifest.flat_map do |file_in_manifest|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth comparing to the logic that Sprockets uses: https://github.com/rails/sprockets/blob/master/lib/sprockets/manifest.rb#L245

Copy link
Contributor Author

@ericboehs ericboehs Oct 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've compared the logic and I believe it's fairly identical (I used it as a guide when I wrote this). They're going about getting the files differently though since they keep old version names in the manifest whereas Webpack does not.

To expand, they're rejecting if the current file is the same as the active asset (I'm using next); they're sorting by the mtime, I'm returning the mtime as the last element and sorting by that; and they're reverse.drop_whileing on age or count, I'm reverse.droping on just count.

@@ -5,6 +5,23 @@ def initialize(webpacker)
@webpacker = webpacker
end

def clean(count_to_keep = 2)
return if !config.public_output_path.exist? || !config.public_manifest_path.exist?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ericboehs I am wondering if we do this other way around for readability.

if config.public_output_path.exist? && config.public_manifest_path.exist?
  # do the cleaning
end

Copy link
Contributor Author

@ericboehs ericboehs Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RuboCop usually barks at wrapping a whole method in a conditional and recommends using a guard clause such as this. I thought the way I wrote it was clear enough (I avoided using unless). Would it be easier to understand if I expanded it to two guard clauses?

return unless config.public_output_path.exist?
return unless config.public_manifest_path.exist?

# ...

https://github.com/rubocop-hq/ruby-style-guide#no-nested-conditionals

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return unless config.public_output_path.exist? && config.public_manifest_path.exist?. Reads clearly to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually stay away from compound unless statements but if I'm the oddball on thinking those are confusing I'd be glad to change it to your suggestion @swrobel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ericboehs and @swrobel

The reason mainly is that we don't use/prefer early returns inside Webpacker or Rails codebase. Rubocop shouldn't complain if it's multi-line conditional right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I updated it. I didn't realize the Rails style guide differed from the Ruby style guide here.

@ericboehs
Copy link
Contributor Author

@gauravtiwari Would you have time to review this soon?

Thanks!

@gauravtiwari
Copy link
Member

Thanks, @ericboehs for this, looks great. I am wondering if you would be happy to add some test for this? Something along this line: https://github.com/rails/sprockets/blob/master/test/test_manifest.rb#L415

# Run clean if the assets:clean is run
if Rake::Task.task_defined?("assets:clean")
Rake::Task["assets:clean"].enhance do
Rake::Task["webpacker:clean"].invoke

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a very nice touch if we could pass on the same arguments given to assets:clean; I'm not sure Rake provides a particularly clean way of doing this though; my best approach so far has been this:

Rake::Task["assets:clean"].enhance do |_, args|
  Rake::Task["webpacker:clean"].invoke(*args.to_a)
end

@lazyatom
Copy link

Another potential issue here is that compressed files (generated in production) aren't included in the manifest, so if we want them to be cleaned, we'll also need to check explicitly for them.

@grk grk mentioned this pull request Jan 30, 2019
@phlegx
Copy link

phlegx commented Feb 21, 2019

This is a part of my manifest file:

{
  "entrypoints": {
    "application": {
      "css": [
        "/packs/css/application-34645e16.css"
      ],
      "js": [
        "/packs/js/application-c7b3dd68a6ac50abc8ca.js"
      ],
      "css.map": [
        "/packs/css/application-34645e16.css.map"
      ],
      "js.map": [
        "/packs/js/application-c7b3dd68a6ac50abc8ca.js.map"
      ]
    }
  },
  "frontend.css": "/packs/css/application-34645e16.css",
  "frontend.css.map": "/packs/css/application-34645e16.css.map",
  "frontend.js": "/packs/js/application-c7b3dd68a6ac50abc8ca.js",
  "frontend.js.map": "/packs/js/application-c7b3dd68a6ac50abc8ca.js.map",
  ...

And I see that only JS files are cleaned and all other files not, because the JS files has a 20 hash length and all other files (css, images, fonts, etc.) only a 8 hash length. Rake task assets:clean does not clean files in /public/packs directory, because the files are generated by webpacker.

Here my clean code:

    def clean(count_to_keep = 1)
      if config.public_output_path.exist? && config.public_manifest_path.exist?
        files_in_manifest = manifest.refresh.values.map { |f| File.join config.root_path, "public", f if f.is_a?(String) }.compact
        files_to_be_removed = files_in_manifest.flat_map do |file_in_manifest|
          file_prefix, file_ext = file_in_manifest.scan(/(.*)(?:[0-9a-f]{8}|[0-9a-f]{20})(.*)/).first
          versions_of_file = Dir.glob("#{file_prefix}*#{file_ext}").grep(/#{file_prefix}(?:[0-9a-f]{8}|[0-9a-f]{20})#{file_ext}/)
          versions_of_file.map do |version_of_file|
            next if version_of_file == file_in_manifest

            [version_of_file, File.mtime(version_of_file).utc.to_i]
          end.compact.sort_by(&:last).reverse.drop(count_to_keep).map(&:first)
        end

        files_to_be_removed.each { |f| File.delete f }
      end
    end

The regular expression is the important part: (?:[0-9a-f]{8}|[0-9a-f]{20})

@drewB
Copy link

drewB commented May 11, 2019

I am curious if there has been anymore movement on this.

@beauraF
Copy link

beauraF commented Jun 26, 2019

We are very interested in that, here too.

@jakeNiemiec
Copy link
Member

Webpack has changed since the creation of this PR.

@drewB @beauraF If you want this functionality, you can use https://github.com/johnagan/clean-webpack-plugin#about

That plugin would be passed into webpacker according to https://github.com/rails/webpacker/blob/0feb1474a7e06ac2a6bff745475bee192643a29f/docs/webpack.md#plugins

🤔 Is there any reason not to include clean-webpack-plugin by default?

@hibachrach
Copy link

@jakeNiemiec I would vote against inclusion by default. The behavior of clean-webpack-plugin is closer to webpacker:clobber. One of the nice things about assets:clean (and what this PR is suggesting) is it maintains a limited number of (recent) old versions of bundled assets. This is great when your web servers/workers have non-trivial cutover time and users could possibly fetch the previous deploy's assets.

@gauravtiwari gauravtiwari merged commit d2b26d6 into rails:master Sep 18, 2019
@gauravtiwari
Copy link
Member

Yes, makes sense. Merging....

Thanks @ericboehs 🙏 sorry it took ages.

@viniciusgama
Copy link

@ericboehs I was just using this but I noticed it does not remove files that are not referenced in the manifest file. I saw your comment above mentioning you remove this option but can you explain why is that? Looks like a desired behaviour for me.

Thanks,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add equivalent task to "rake assets:clean"