collectstatic benchmark

James Bligh edited this page Jun 9, 2026 · 1 revision

Evaluating ManifestStaticFilesStorage's experimental JavaScript modules support against real projects.

The main benefit of collectstatic is collecting static files from third-party apps into one central location. A secondary benefit is that ManifestStaticFilesStorage can create cache-friendly file names. To make this work, collectstatic rewrites the file names inside CSS url() statements. It also supports sourcemaps and JavaScript modules, rewriting the paths in import/export statements. The JavaScript case is tricky to handle with regular expression replacements alone, but we can get to good enough. Below is an exploration of how it performs against real projects.

There are 700+ third party apps on djangopackages.com that ship a static folder. That's our main source of test data. Searching madewithdjango.com and similar sites gets us another 50+ sites that are using ManifestStaticFilesStorage. A search for top css/js libs gets another 50+ libraries to test with.

To test them all, we ran collectstatic with JavaScript support and recorded pass/fail rates, the percentage of import/export statements found, and how long it took. Acorn is used to determine how many import/export statements should be found, and speed is measured against a baseline run with CSS support only. You can see the detailed results of this output.

The benchmarks folder has the scripts that do this. They were all generated by LLM coding agents. They are based on simpler scripts and I have reviewed the output, but I have not reviewed the code. They are provided for anyone looking to replicate or extend this work.

The results

Fail/pass - does collectstatic run without error?

  • The vast majority of projects pass fine, 751 out of 877.

  • 100 of the projects are shipping files in /static that could never be processed as is

    • 41 of the djangopackages apps ship a static folder that can never be processed, either because a CSS file references an image or font file that doesn't exist in the project, or because a JS file is not UTF-8.
    • 41 of the djangopackages apps ship a file referencing a sourcemap that doesn't exist.
    • 18 packages/sites ship the JavaScript they use for building in the /static folder. This is never served to the browser, but collectstatic still has to spend time processing it and, worse, sometimes fails on it because of circular references or because it finds imports that it can't replace.
  • That leaves 26 projects that fail collectstatic due to its regex limitations.

    • Tested against Django 4.2 as a baseline (the release where the feature was added), 26 of the packages/sites/libs in our test data failed because the regular expressions tried to replace something that was not a real import.
    • Versions 5.0 through 6.0 make no improvement on that tally.
    • 6.1 fixes a long-standing issue of matching inside comments in JS and CSS. That fixes 22 cases.
    • That leaves 2 failures caused by circular references and 2 caused by an import statement inside a string.
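
The comment-matching failure described above can be reproduced with a small sketch. The pattern below is a deliberately simplified stand-in, not Django's actual regular expression, and the file names are made up for illustration. It shows a commented-out import being picked up, and how removing block comments before matching avoids that.

```python
import re

# Simplified illustration only; Django's real pattern is more involved.
IMPORT_RE = re.compile(r"""import\s+.*?\s+from\s+["'](?P<url>[^"']+)["'];""")

# Matches /* ... */ comments, including ones that span several lines.
BLOCK_COMMENT_RE = re.compile(r"/\*.*?\*/", re.DOTALL)

# Hypothetical JS file: the first import only exists inside a comment.
SOURCE = '''\
/* Example usage:
   import { helper } from "./missing.js";
*/
import { real } from "./real.js";
'''


def find_imports(js):
    """Return every module path the naive pattern matches."""
    return [match["url"] for match in IMPORT_RE.finditer(js)]


def find_imports_skipping_comments(js):
    """Drop block comments first, then apply the same pattern."""
    return find_imports(BLOCK_COMMENT_RE.sub("", js))


print(find_imports(SOURCE))
print(find_imports_skipping_comments(SOURCE))
```

The first call also reports ./missing.js, which is the kind of match that makes collectstatic fail when the file does not exist. The sketch ignores // line comments and strings, so it is not a complete fix.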

Correctness - does collectstatic find all the import statements?

  • Django expects a semicolon at the end of the line. Some packages (django-aces in particular) do not use one, which causes import statements to be missed.
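
A minimal sketch of the semicolon problem, using two hypothetical patterns (STRICT_RE and RELAXED_RE are names invented here, and neither is Django's real expression). The strict one requires a trailing semicolon; the relaxed one also accepts the end of the line, which is roughly what relying on automatic semicolon insertion looks like in source code.

```python
import re

# Hypothetical pattern that insists on a terminating semicolon.
STRICT_RE = re.compile(
    r"""^import\s+.*?\s+from\s+["'](?P<url>[^"']+)["'];""",
    re.MULTILINE,
)

# Hypothetical pattern that accepts a semicolon or the end of the line.
RELAXED_RE = re.compile(
    r"""^import\s+.*?\s+from\s+["'](?P<url>[^"']+)["']\s*(?:;|$)""",
    re.MULTILINE,
)

# The second import has no semicolon, which is valid JavaScript.
SOURCE = (
    'import { a } from "./a.js";\n'
    'import { b } from "./b.js"\n'
    "console.log(a, b)\n"
)


def urls(pattern, js):
    """Return the module paths matched by the given pattern."""
    return [match["url"] for match in pattern.finditer(js)]


print(urls(STRICT_RE, SOURCE))
print(urls(RELAXED_RE, SOURCE))
```

The strict pattern finds only ./a.js, so ./b.js would keep its unhashed name in the collected output.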

Performance - how much slower is supporting javascript?

  • import/export support is always a little slower than plain CSS support; the median is 1.6x for 6.1. All variants suffer from rare worst cases that take 20, 50 or, in one case, 70 times longer than the CSS baseline.

The solutions implemented in this package

  • Instead of failing and stopping on sourcemaps, we warn and continue.
  • We do the same for things that look like build JS. We already ignore bare imports such as import * from bare, because ES module imports have to look like a URL starting with ./ or /. But plenty of build scripts use import * from ./jquery as well. In a real browser ES module, the import will pretty much always use the jquery.js form, with the extension. A warn and continue pattern would be enough here too.
  • Uses a topological sort to avoid the circular dependency problem, which also helps with worst-case performance.
  • Handles ASI (automatic semicolon insertion) and ignores import-like syntax inside strings.
  • Correctness jumps to 100%.
  • Performance of the package improves to 1.3x.
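
The topological sort mentioned above can be sketched with Python's standard graphlib module. This is an illustration of the general idea, not the package's code: the processing_order helper and the file names are hypothetical. The assumption is that a file's hashed name depends on its rewritten content, so the files it imports should be processed first, and a cycle should be reported rather than retried.

```python
from graphlib import CycleError, TopologicalSorter


def processing_order(imports):
    """Order files so that each file comes after the files it imports.

    `imports` maps a file name to the set of files it imports.
    Returns (order, None) on success or (None, cycle) if a cycle exists.
    """
    sorter = TopologicalSorter(imports)
    try:
        return list(sorter.static_order()), None
    except CycleError as exc:
        # graphlib puts the offending cycle in the second exception argument.
        return None, exc.args[1]


# Hypothetical dependency chain: app.js imports utils.js imports vendor.js.
chain = {"app.js": {"utils.js"}, "utils.js": {"vendor.js"}, "vendor.js": set()}
print(processing_order(chain))

# Hypothetical circular import between two files.
loop = {"a.js": {"b.js"}, "b.js": {"a.js"}}
print(processing_order(loop))
```

With an explicit order, each file needs only one pass, and a cycle becomes something that can be warned about rather than a hard failure.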

Regex weaknesses

There are still theoretical weaknesses in the regex approach. Parsing around comments and regex literals requires more state than regexes can track. Patterns like the ones below will fail.

import(/*comment*/"./module.js");

import /*comment*/ "./module.js";
import { lexerOnlyConst } from /*comment*/ "./module.js";

const re1 = /test"pattern/; import("./module.js");
const re2 = /[a-z//]/; import("./module.js");

const re3 = /test`pattern/;
import(`./module.js`);

None of the real-world projects tested here contain these patterns in practice, so it's fair to conclude they would be very rare failure cases. The package also contains a lexer alternative that could be used if any of these ever became common enough in real-world packages to need addressing.

Future work

  • There are PRs to Django for the topological sort and for the ASI and string-matching improvements.
  • Seek community feedback on changing the sourcemap/build JS failures to warnings.