Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not indent multiline string values #54

Merged
merged 7 commits into from
Dec 29, 2020

Conversation

tboyko
Copy link
Contributor

@tboyko tboyko commented Nov 30, 2020

The current version of plist erroneously adds indentation to multiline string values, effectively changing the values. This PR attempts to rectify the issue by inspecting each line of plist/xml (and multiline string value, as a result) and only indents lines that start with an XML character "<".

Tests still pass "bundle exec ruby test/test_parser.rb` though I would appreciate having some additional eyes on this to verify that new issues are not being introduced here.

Before PR

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>data</key>
	<dict>
		<key>a_multiline_string</key>
		<string>This is a string with 
		multiple line 
		breaks.</string>
		<key>a_normal_string</key>
		<string>This is a string without a line break.</string>
		<key>integer</key>
		<integer>100</integer>
	</dict>
</dict>
</plist>

After PR

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>data</key>
	<dict>
		<key>a_multiline_string</key>
		<string>This is a string with 
multiple line 
breaks.</string>
		<key>a_normal_string</key>
		<string>This is a string without a line break.</string>
		<key>integer</key>
		<integer>100</integer>
	</dict>
</dict>
</plist>

@tas50
Copy link
Contributor

tas50 commented Dec 1, 2020

Thanks for fixing this @tboyko. This is going to be very handy for us in Chef

@tboyko
Copy link
Contributor Author

tboyko commented Dec 2, 2020

In my commit above, I failed to realize that there were multiple test files in the test/ folder. I ran the additional tests, found that there was a newline bug in the PR, and have fixed it.

Please note that another issue arose that I would appreciate feedback on. Specifically, the TestDataElements#TestData method loads a plist, parses it, dumps it, and then checks that the dump is identical to the original plist. This test was initially failing because the original plist file has a data object with contents that was indented. When this PR dumps the plist, it does not include the indentation.

The indentation is being removed by the parser process and not the dump process. I tried to find a plist specification document to clarify the treatment of data contents and indentation, but could not find any. My assumption is that indentation is irrelevant to the plist specification, and only has meaning when part of a string contents.

@tboyko
Copy link
Contributor Author

tboyko commented Dec 2, 2020

If data contents should indeed be indented, this PR will need a more thorough rewrite. Currently, the plist is built in a recursive fashion, starting with descendants and moving upward to the common ancestor (root node), all the while adding indentation to the document. I believe a fix would require something along the lines of a tree traversal so that the indentation level of a given line does not need to be revisited (and an already-processed line doesn't need to be determined to be string related or not).

…lds but not multiline strings. Move much of generator code to class for code simplicity. General code cleanup.
@tboyko
Copy link
Contributor Author

tboyko commented Dec 2, 2020

I made another attempt at this tonight by following the idea in my last comment. Data is now indented as it was previously and I've made an attempt to clean up the code where possible.

It'd be great if someone else could review this.

Copy link
Collaborator

@mattbrictson mattbrictson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for PR! Could you add a test that illustrates the bug that is being fixed, sort of like what you explained in the PR description? Maybe like this:

  def test_string_containing_newlines
    source = {
      data: {
        a_multiline_string: "This is a string with\nmultiple line\nbreaks.",
        a_normal_string: "This is a string without a line break.",
        integer: 100
      }
    }
    plist_emit_dump = Plist::Emit.dump(source, true)
    assert_equal(<<-EXPECTED, plist_emit_dump.gsub(/\t/, "  "))
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>data</key>
  <dict>
    <key>a_multiline_string</key>
    <string>This is a string with
multiple line
breaks.</string>
    <key>a_normal_string</key>
    <string>This is a string without a line break.</string>
    <key>integer</key>
    <integer>100</integer>
  </dict>
</dict>
</plist>
EXPECTED
  end

@mattbrictson mattbrictson self-assigned this Dec 27, 2020
@tboyko
Copy link
Contributor Author

tboyko commented Dec 28, 2020

Thanks @mattbrictson. Added that test.

@patsplat
Copy link
Owner

Thank you @tboyko and @mattbrictson

@mattbrictson mattbrictson mentioned this pull request Dec 30, 2020
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 14, 2021
3.6.0 (2020-12-30)

New features and important changes:

* Do not indent multiline string values
  (github.com/patsplat/plist/pull/54)
* Add `Plist::UnimplementedElementError`
  (github.com/patsplat/plist/pull/51)
* Add support for text inside `<![CDATA[ … ]]>`
  (github.com/patsplat/plist/pull/49)

Housekeeping:

* Add Ruby 2.7 and 3.0 to CI (github.com/patsplat/plist/pull/55)
* add docker-compose for development (github.com/patsplat/plist/pull/47)
* Replace `require` with
  `require_relative`(github.com/patsplat/plist/pull/52)
sanssecours added a commit to sanssecours/bundle-support.tmbundle that referenced this pull request Aug 30, 2021
Before this update converting data structures that contain multiline
strings with the method `to_plist` would result in output that contains
additional leading whitespace.

See also: patsplat/plist#54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

String fields that contain line breaks end up with tabs inserted at the start of every new line
4 participants