New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: Marshal metadata #3706

Merged
merged 1 commit into from May 18, 2015

Conversation

Projects
None yet
5 participants
@fw42
Contributor

fw42 commented May 16, 2015

When doing some profiling, I noticed that for large sites, a considerable amount of time is spend in read_metadata and write_metadata of Jekyll::Regenerator. Most of that is in Psych (the YAML library that Jekyll uses).

You are probably not going to like this PR, since it's not backwards compatible, but I want to propose it anyway: Marshal is considerably faster and I don't really see why the metadata file needs to be human-readable.

My naive measurements: On a ~6000 pages site (which has a ~1MB metadata file with about 15k lines), this reduces the build time for incremental builds by about 400-500ms on my machine (from a total of about 5s).

@parkr @envygeeks thoughts?

@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks May 16, 2015

Contributor

/cc @alfredxing since this is yours!

Contributor

envygeeks commented May 16, 2015

/cc @alfredxing since this is yours!

@fw42

This comment has been minimized.

Show comment
Hide comment
@fw42

fw42 May 16, 2015

Contributor

To make this more backwards-compatible, I could add some code that migrates the file over (i.e. reads with YAML if reading with Marshal fails. The next write will then "fix" the metadata file automatically.)

Contributor

fw42 commented May 16, 2015

To make this more backwards-compatible, I could add some code that migrates the file over (i.e. reads with YAML if reading with Marshal fails. The next write will then "fix" the metadata file automatically.)

@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks May 16, 2015

Contributor

At the end of the day I think this need be left up to @alfredxing because it is his work, the backwards compatible part isn't a worry since this is a 3.0.0 feature that hasn't landed on stable so I don't even think that's a big problem to begin with.

Contributor

envygeeks commented May 16, 2015

At the end of the day I think this need be left up to @alfredxing because it is his work, the backwards compatible part isn't a worry since this is a 3.0.0 feature that hasn't landed on stable so I don't even think that's a big problem to begin with.

@fw42

This comment has been minimized.

Show comment
Hide comment
@fw42

fw42 May 16, 2015

Contributor

Ok then I guess it comes down to: How important is it for the file to be human-readable?

Contributor

fw42 commented May 16, 2015

Ok then I guess it comes down to: How important is it for the file to be human-readable?

@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks May 16, 2015

Contributor

Well, we include sass and sass-cache isn't exactly human readable (in that unless you try hard it's gonna just make you mad.) So from my point of view (and others might not agree) that doesn't matter because I think we store the meta in a dot folder anyways.

Contributor

envygeeks commented May 16, 2015

Well, we include sass and sass-cache isn't exactly human readable (in that unless you try hard it's gonna just make you mad.) So from my point of view (and others might not agree) that doesn't matter because I think we store the meta in a dot folder anyways.

@alfredxing

This comment has been minimized.

Show comment
Hide comment
@alfredxing

alfredxing May 16, 2015

Member

I'm totally okay with this. I stuck to YAML in the first place simply because we were making extensive use of it already, but I can see how this would be a large improvement in terms of performance.

I would leave any thoughts/considerations on compatibility to @parkr though.

Member

alfredxing commented May 16, 2015

I'm totally okay with this. I stuck to YAML in the first place simply because we were making extensive use of it already, but I can see how this would be a large improvement in terms of performance.

I would leave any thoughts/considerations on compatibility to @parkr though.

@parkr

This comment has been minimized.

Show comment
Hide comment
@parkr

parkr May 18, 2015

Member

Marshal is considerably faster and I don't really see why the metadata file needs to be human-readable.

Agreed. 👍

To make this more backwards-compatible, I could add some code that migrates the file over (i.e. reads with YAML if reading with Marshal fails. The next write will then "fix" the metadata file automatically.)

This would be good! We prefer to have an upgrade/migration path whenever possible.

:shipit: with the backwards-compatibility fix.

Member

parkr commented May 18, 2015

Marshal is considerably faster and I don't really see why the metadata file needs to be human-readable.

Agreed. 👍

To make this more backwards-compatible, I could add some code that migrates the file over (i.e. reads with YAML if reading with Marshal fails. The next write will then "fix" the metadata file automatically.)

This would be good! We prefer to have an upgrade/migration path whenever possible.

:shipit: with the backwards-compatibility fix.

@parkr parkr added the Enhancement label May 18, 2015

@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks May 18, 2015

Contributor

Pending fixes it'll merge!

Contributor

envygeeks commented May 18, 2015

Pending fixes it'll merge!

@envygeeks envygeeks added :shipit: labels May 18, 2015

@fw42

This comment has been minimized.

Show comment
Hide comment
@fw42

fw42 May 18, 2015

Contributor

updated

Contributor

fw42 commented May 18, 2015

updated

@fw42

This comment has been minimized.

Show comment
Hide comment
@fw42

fw42 May 18, 2015

Contributor

to test this manually: build the "site" folder locally on master, check out this branch, build it again (incrementally), check that the site/.jekyll-metadata file is now Marshal.

Contributor

fw42 commented May 18, 2015

to test this manually: build the "site" folder locally on master, check out this branch, build it again (incrementally), check that the site/.jekyll-metadata file is now Marshal.

@envygeeks

This comment has been minimized.

Show comment
Hide comment
@envygeeks

envygeeks May 18, 2015

Contributor

@fw42 maybe add a test into the tests that ensures that it works?

Contributor

envygeeks commented May 18, 2015

@fw42 maybe add a test into the tests that ensures that it works?

@fw42

This comment has been minimized.

Show comment
Hide comment
@fw42

fw42 May 18, 2015

Contributor

added a test

Contributor

fw42 commented May 18, 2015

added a test

@fw42 fw42 changed the title from Marshal metadata to Performance: Marshal metadata May 18, 2015

@envygeeks envygeeks removed :shipit: labels May 18, 2015

envygeeks added a commit that referenced this pull request May 18, 2015

@envygeeks envygeeks merged commit 498ea4f into jekyll:master May 18, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@DirtyF DirtyF referenced this pull request Jan 16, 2017

Merged

New rubocop security checks #5768

0 of 1 task complete

@jekyll jekyll locked and limited conversation to collaborators Feb 27, 2017

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.