Skip to content

Improve handling of lines containing only indentation#162

Merged
jimhester merged 2 commits intotidyverse:masterfrom
alandipert:trim-fix-blank-lines
Feb 4, 2020
Merged

Improve handling of lines containing only indentation#162
jimhester merged 2 commits intotidyverse:masterfrom
alandipert:trim-fix-blank-lines

Conversation

@alandipert
Copy link
Contributor

@alandipert alandipert commented Jan 8, 2020

Previously, after a '\n' was encountered while copying, min_indent number of characters are trimmed from the beginning of lines by seeking i characters ahead in the input string.

However, after setting i += min_indent, i was incremented again right before the while loop continued via str[j++] = xx[i++]. This caused i to point to the character after the '\n' that terminated the line, and so leading indentation on the following line was not recognized as such, and so was improperly copied to the output string.

While working on this, I also ran into a few other possible bugs:

  1. Lines containing only indentation might be shorter than min_indent and so could cause the succeeding line to contain spurious leading indentation. Currently, trim() only works properly if there's either no whitespace on the blank line, or if there is an amount of whitespace of at least min_indent.
  2. Tabs are not converted to spaces, and so output may appear to be improperly indented even after trimming if different lines contain different distributions of leading tab and space characters.

Let me know if it would be helpful for me to create issues for those possible bugs. I didn't create issues because maybe these are features 😄

It also occurred to @wch and I while working on this that we could probably be smarter about the size of the output string we allocate. Since we know that min_indent will removed from every line after the first, in the min_indent calculation pass we could tally newlines and multiply that tally by the size of min_indent to determine how much smaller the output string needs to be.

TODO

  • Update NEWS

@jimhester
Copy link
Collaborator

jimhester commented Jan 8, 2020

Thanks!

  1. seems to be a bug.
  2. Is by design, it is assumed that people will use tabs or space consistently, converting tabs to spaces would require glue to know how many spaces a tab is worth, which I didn't want to do.

@alandipert
Copy link
Contributor Author

OK I updated the NEWS and created #163, which I should be to make another PR for today. Please let me know if there's anything else I can do on this one 👍

@jimhester
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants