Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option mltabs to emit m-lines when string contains tabs #15

Merged
merged 3 commits into from Dec 16, 2016

Conversation

hmalphettes
Copy link
Contributor

Many thanks for a great project!

This PR add support for generating triple quoted multiple lines even if the string value contains \t characters:

Hjson.stringify(value, { mltabs: true }

It is exposed on the command line as the flag -mltab

hjson-mltabs

I have not figured out why the tab character would be a problem for triple quotes multiple lines.
Let me know if I missed the obvious.

Thanks!

@laktak
Copy link
Member

laktak commented Dec 12, 2016

Thanks!

Tabs are not treated like spaces to help the user avoid mistakes (by making tabs visible).

Also the parser is not able to properly filter out whitespace (used by indentation) if there are tabs because there is no standard that says n spaces == 1 tab.

Can you tell me your use case for the tab output and how you would handle the parsing issue?

@hmalphettes
Copy link
Contributor Author

hmalphettes commented Dec 14, 2016

Hi!

Usecase:
Some user input is serialised in HJSON and committed in a source control.
I am using HJSON for the multi-line formatting; even when a tab is part of the input.

Bottom-line:
It works and the parser supports tabs inside values just as much as it supports spaces at the beginning of a line.

Background:
When a user input contains a tab, using HJSON with this patch in place enables the diff to be readable.

imf-hjson-vs-json

imf-hjson-diff

I dont think the parser is confused between spaces internally used for the multi-lines indentation and tabs or spaces entered by the user.

The test unit I added has some tabs at the beginning of a line and it is round-tripped correctly.

Please find below some more manual checks for the round-trip:
mltab-round-trip

The parser computes the indent of a triple quoted multi-line by counting the spaces once for each multi-line block. For each new line it removes those spaces.
It is even less likely to confuse a tab with a space as it specifically looks for the space character alone: I take that back - sorry for the edit.
We are still ok as indent skip spaces and non printable characters (<= ' ') for the size of the indent alone. Everything after ends-up in the value including tabs and spaces.

  function mlString() {
    // Parse a multiline string value.
    // we are at ''' +1 - get indent
    var indent = 0;
    while (true) {
      var c=peek(-indent-5);
      if (!c || c === '\n') break;
      indent++;
    }

    function skipIndent() {
      var skip = indent;
      while (ch && ch <= ' ' && ch !== '\n' && skip-- > 0) next();
    }

So if the actual data starts with spaces or tabs the parser will do the right thing.

This time I actually got to review the parser instead of only trusting the unit test.

I also tested adding a space or a tab at the very beginning of a multi-line string and the round-trip is also working. Here is is the same file with a value that starts with a tab.

mltab-first-char-round-trip

Thanks again for your attention!

@laktak
Copy link
Member

laktak commented Dec 15, 2016

Thanks for the detailed analysis! I'd just like to refactor the regex definitions to be easier to read first (by removing the duplication).

@laktak
Copy link
Member

laktak commented Dec 15, 2016

Could you please take a look at b6b021b and 8e5652b and rebase your PR?

Since all tests are shared across all implementations and pass5_mltabs_test.json requires a special option, could you move it to a new extra folder and add a pass5_mltabs_testmeta.hjson with

{ 
  options: {
    mltabs: true
  }
}

? That would make it easier to only run the test on implementations that support it. Thanks!

@hmalphettes
Copy link
Contributor Author

Rebased.
Now working on refactoring the test as suggested.
Will squash after that.

`Hjson.stringify(value, { mltabs: true }` generate multiple lines
even if the string value contains `\t` characters.

Exposed on the command-line as the flag `-mltab`
@hmalphettes
Copy link
Contributor Author

@laktak ok I have followed your instructions regarding the tests as best as I understood them.

Please let me know if that is what you expected.

If you find it easier, feel free to edit my fork - I added you as a collaborator.
I obviously don't mind trying again with your instructions if that is not what you expected.

@laktak
Copy link
Member

laktak commented Dec 16, 2016

Please let me know if that is what you expected.

Yes that's great! Thanks!

I only tweaked the test and changed the option to a string so it can be used for #16 as well.

@laktak laktak merged commit 7b8f0d1 into hjson:master Dec 16, 2016
@AurelienRibon
Copy link

Thanks for the merge! I was also confused by this distinction. I had a hard time figuring out why you had to look for tab characters in the user input.

Basically, expect if I'm missing something, but you just have to count the number of characters before the beginning ''' flag, and remove them from the start of each following lines until the closing tag. If the input is corrupted (sequence not found at the start of one of the lines) then throw. I still don't understand what tabs have to do in that process. But that's not important. Again, thanks a bunch for the merge :)

@laktak
Copy link
Member

laktak commented Dec 19, 2016

@hmalphettes @AurelienRibon would it help if this were the default option? I think this can be changed without having a negative impact.

@hmalphettes
Copy link
Contributor Author

@laktak certainly helps if it is the default.
I also think it does not have a negative impact.
Cheers!

@hmalphettes
Copy link
Contributor Author

@laktak I'll make a PR to make this the default.
I'll also update the readme file where we still document mltabs instead of multiline.

hmalphettes added a commit to hmalphettes/hjson-go that referenced this pull request Apr 13, 2017
hmalphettes added a commit to hmalphettes/hjson-go that referenced this pull request Apr 13, 2017
laktak pushed a commit to hjson/hjson-go that referenced this pull request Jun 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants