New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take CR/LF into account when separating content from YAML metadata #471

Merged
merged 1 commit into from Aug 31, 2014

Conversation

Projects
None yet
4 participants
@gpakosz
Member

gpakosz commented Aug 21, 2014

Fixes issue #470.

@gpakosz

This comment has been minimized.

Show comment
Hide comment
@gpakosz

gpakosz Aug 21, 2014

Member

@Fjan the fix is indeed to add \r? to the regular expression.

Note that the last parameter to split limits the number of pieces so there is no need to join them later, which could potentially mess up a file that has '---' somewhere in the middle.

How not limiting the split could mess up a file that has --- somewhere in the middle. It's tested and already working. However, for the sake of the discussion, I added a limit of 3 and removed the join.

@ddfreyne which do you prefer?

  • pieces = data.split(/^(-{5}|-{3})[ \t]*\r?\n/) and content = pieces[4..-1].join ?

or

  • pieces = data.split(/^(-{5}|-{3})[ \t]*\r?\n/, 3) and content = pieces[4] ?

Tell me and I'll update the PR accordingly.

Member

gpakosz commented Aug 21, 2014

@Fjan the fix is indeed to add \r? to the regular expression.

Note that the last parameter to split limits the number of pieces so there is no need to join them later, which could potentially mess up a file that has '---' somewhere in the middle.

How not limiting the split could mess up a file that has --- somewhere in the middle. It's tested and already working. However, for the sake of the discussion, I added a limit of 3 and removed the join.

@ddfreyne which do you prefer?

  • pieces = data.split(/^(-{5}|-{3})[ \t]*\r?\n/) and content = pieces[4..-1].join ?

or

  • pieces = data.split(/^(-{5}|-{3})[ \t]*\r?\n/, 3) and content = pieces[4] ?

Tell me and I'll update the PR accordingly.

@Fjan

This comment has been minimized.

Show comment
Hide comment
@Fjan

Fjan Aug 21, 2014

Contributor

@gpakosz You are right, I see the content will not be messed up because although it might split unnecessarily it will store the separator and then join it back in. So hacky, but it works. If we choose to avoid the unnecessary split then we can further improve that code by not capturing and storing the split in the first place:

pieces = data.split(/^-{3,5}[ \t]*\r?\n/,3)

(and then adjust the size check and the indexes of pieces accordingly).

Contributor

Fjan commented Aug 21, 2014

@gpakosz You are right, I see the content will not be messed up because although it might split unnecessarily it will store the separator and then join it back in. So hacky, but it works. If we choose to avoid the unnecessary split then we can further improve that code by not capturing and storing the split in the first place:

pieces = data.split(/^-{3,5}[ \t]*\r?\n/,3)

(and then adjust the size check and the indexes of pieces accordingly).

@gpakosz

This comment has been minimized.

Show comment
Hide comment
@gpakosz

gpakosz Aug 21, 2014

Member

@Fjan if the split is not captured, then you need to inject separator before calling YAML.load

Member

gpakosz commented Aug 21, 2014

@Fjan if the split is not captured, then you need to inject separator before calling YAML.load

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Aug 21, 2014

Member

I did not know about the second argument to #split. Use it!

Member

ddfreyne commented Aug 21, 2014

I did not know about the second argument to #split. Use it!

@Fjan

This comment has been minimized.

Show comment
Hide comment
@Fjan

Fjan Aug 21, 2014

Contributor

@gpakosz Ok, let's leave the split. However, I just realised the split does not capture the line ending, so if '---' does occur in the rest of the code somewhere the line ending for that line might change, so my initial comment was correct (although a line ending change does not typically cause issues). Anyway, if we simply limit the split to 3, as I see @ddfreyne agrees with, we avoid it altogether.

Contributor

Fjan commented Aug 21, 2014

@gpakosz Ok, let's leave the split. However, I just realised the split does not capture the line ending, so if '---' does occur in the rest of the code somewhere the line ending for that line might change, so my initial comment was correct (although a line ending change does not typically cause issues). Anyway, if we simply limit the split to 3, as I see @ddfreyne agrees with, we avoid it altogether.

@gpakosz

This comment has been minimized.

Show comment
Hide comment
@gpakosz

gpakosz Aug 21, 2014

Member

Alright, then this PR is good as is. It does the split at 3, and avoids joining.

I would love more eyes on this though, just in case. (cc @bobthecow @lifepillar)

Member

gpakosz commented Aug 21, 2014

Alright, then this PR is good as is. It does the split at 3, and avoids joining.

I would love more eyes on this though, just in case. (cc @bobthecow @lifepillar)

@gpakosz gpakosz added the bug label Aug 21, 2014

@gpakosz gpakosz added this to the 3.7.3 milestone Aug 21, 2014

@bobthecow

This comment has been minimized.

Show comment
Hide comment
@bobthecow
Member

bobthecow commented Aug 21, 2014

:shipit:

ddfreyne added a commit that referenced this pull request Aug 31, 2014

Merge pull request #471 from gpakosz/fix-metadata-crlf-bug
Take CR/LF into account when separating content from YAML metadata

@ddfreyne ddfreyne merged commit b30a718 into nanoc:release-3.7.x Aug 31, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details

@ddfreyne ddfreyne removed the to review label Aug 31, 2014

@gpakosz gpakosz deleted the gpakosz:fix-metadata-crlf-bug branch Sep 6, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment