Performance: Marshal metadata #3706
/cc @alfredxing since this is yours!
To make this more backwards-compatible, I could add some code that migrates the file over (i.e. reads with YAML if reading with Marshal fails). The next write will then "fix" the metadata file automatically.
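A minimal sketch of the fallback described above, not the code in this PR. The helper names `read_metadata_file` and `write_metadata_file` are hypothetical, and the sketch assumes the legacy file is plain YAML containing only basic types.

```ruby
require "yaml"

# Hypothetical helper: read a metadata file written by Marshal, falling back
# to YAML for files produced by an older build.
def read_metadata_file(path)
  return {} unless File.file?(path)

  content = File.binread(path)
  begin
    Marshal.load(content)
  rescue TypeError, ArgumentError
    # Not a Marshal dump (wrong header or too short); assume legacy YAML.
    yaml = content.force_encoding(Encoding::UTF_8)
    YAML.safe_load(yaml, permitted_classes: [Time, Symbol]) || {}
  end
end

# Hypothetical helper: always write Marshal, so the next write migrates the file.
def write_metadata_file(path, metadata)
  File.binwrite(path, Marshal.dump(metadata))
end
```

Reading an old YAML file and writing it back once is enough to migrate it. Note that `Marshal.load` should only be used on files the build itself produced, since it is not safe on untrusted input.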
At the end of the day I think this needs to be left up to @alfredxing because it is his work. The backwards-compatibility part isn't a worry, since this is a 3.0.0 feature that hasn't landed on stable, so I don't think that's a big problem to begin with.
OK, then I guess it comes down to: how important is it for the file to be human-readable?
Well, we include Sass, and sass-cache isn't exactly human-readable (in that unless you try hard it's just going to make you mad). So from my point of view (and others might not agree) that doesn't matter, because I think we store the metadata in a dot folder anyway.
I'm totally okay with this. I stuck to YAML in the first place simply because we were making extensive use of it already, but I can see how this would be a large improvement in terms of performance. I would leave any thoughts/considerations on compatibility to @parkr though.
Agreed. 👍
This would be good with the backwards-compatibility fix! We prefer to have an upgrade/migration path whenever possible.
Pending fixes, it'll merge!
Updated.
To test this manually: build the "site" folder locally on master, check out this branch, build it again (incrementally), and check that the site/.jekyll-metadata file is now Marshal.
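One way to do the final check in that manual procedure is to look at the file header, since a Marshal dump begins with two version bytes. This is only a sketch; the helper name `marshal_file?` is hypothetical and not part of Jekyll.

```ruby
# Hypothetical helper: true if the file starts with the Marshal version header.
def marshal_file?(path)
  header = File.binread(path, 2).to_s
  header.bytes == [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]
end

# Example: check the metadata file from the manual test, if it exists.
metadata_path = "site/.jekyll-metadata"
puts marshal_file?(metadata_path) if File.file?(metadata_path)
```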
@fw42 maybe add a test that ensures that it works?
Added a test.
Performance: Marshal metadata
When doing some profiling, I noticed that for large sites, a considerable amount of time is spent in `read_metadata` and `write_metadata` of `Jekyll::Regenerator`. Most of that is in Psych (the YAML library that Jekyll uses).

You are probably not going to like this PR, since it's not backwards compatible, but I want to propose it anyway: Marshal is considerably faster and I don't really see why the metadata file needs to be human-readable.
My naive measurements: on a site with ~6000 pages (which has a ~1 MB metadata file of about 15k lines), this reduces the build time for incremental builds by about 400-500 ms on my machine (from a total of about 5 s).
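For anyone who wants to get a rough feel for the difference, here is a small sketch that times loading the same hash from YAML and from Marshal. The data is synthetic and its shape is an assumption, not the real structure of `.jekyll-metadata`; `synthetic_metadata` and `compare_load_times` are hypothetical names, and the numbers will vary by machine.

```ruby
require "benchmark"
require "yaml"

# Build a synthetic stand-in for a metadata hash (assumed shape, illustration only).
def synthetic_metadata(pages)
  (1..pages).each_with_object({}) do |i, meta|
    meta["/site/page-#{i}.md"] = {
      "mtime" => 1_400_000_000 + i,
      "deps"  => ["/site/_layouts/layout-#{i % 5}.html"]
    }
  end
end

# Time how long it takes to load the same data from each serialization.
def compare_load_times(metadata)
  yaml_blob = YAML.dump(metadata)
  marshal_blob = Marshal.dump(metadata)
  {
    yaml: Benchmark.realtime { YAML.safe_load(yaml_blob) },
    marshal: Benchmark.realtime { Marshal.load(marshal_blob) }
  }
end

times = compare_load_times(synthetic_metadata(2_000))
puts format("YAML: %.4fs, Marshal: %.4fs", times[:yaml], times[:marshal])
```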
@parkr @envygeeks thoughts?