Skip to content

Conversation

dinamic
Copy link
Contributor

@dinamic dinamic commented Apr 11, 2013

CND data was not correctly parsed while using Windows OS.
The issue was caused by the different line endings between the operating systems.

I came up with a fix and updated the unit tests to test both Windows and Unix line endings.

I've found some discrepancies inside the FileReader class, so it is updated as well.

This commit fixes: https://github.com/symfony-cmf/symfony-cmf-standard/issues/21

$reader->rewind();

return true;
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else is not needed here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Just removed it.

@lsmith77
Copy link
Member

thank you! can you have a look at the failing test on Travis? @dbu should have a look at you method rename

@dinamic
Copy link
Contributor Author

dinamic commented Apr 12, 2013

Sorry, didn't noticed the failing test.

@dinamic
Copy link
Contributor Author

dinamic commented Apr 12, 2013

While checking the failed tests I realized there's an issue with my current implementation.
It has to use look-ahead to determine the new line sequence, rather than look-behind.

public function isEol()
{
$current = $this->current();
$marker = $this->getEolMarker();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think we need a field and getter for the eol, rather use the PHP_EOL directly below. end-of-line is handled by php for us, i think that is good enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue is that PHP_EOL is different depending on which OS you are running the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, you wanted the test to test both. then lets just have a member variable but not use a getter method each time.

and lets have the constructor accept an optional $eolMarker that defaults to PHP_EOL. it could be useful for somebody in edge cases.

@dbu
Copy link
Member

dbu commented Apr 12, 2013

thanks for this PR. i commented on some naming things.
can you solve the look-ahead issue you have or do you need input on that?

is the reason you ran into this issue in the first place that your git client converted the line endings? i fear that we still can see problems after this fix. what would happen if we always check for both types of line endings, and maybe add the apple OS endings as well? might be the most flexible approach.

btw, impressive that after 30 years of PC we still see this stupid problem :-(

@dinamic
Copy link
Contributor Author

dinamic commented Apr 12, 2013

The look-ahead is solved.

The issue I came upon was invoked because of the hardcoded EOL. It was set like "\n" and for Windows is "\r\n". Later on the process of parsing there was the assumption that the EOL is a single character, which obviously is not true, so the code failed to parse the string.

I'm struggling at the GenericScanner now. There are some naming wtfs that are pretty confusing.

Once I'm done with the GenericScanner, I believe we would be able to handle whatever EOL we like. LF, CR, CR+LF, LF+CR, you name it. IMHO, it is a design fail that caused the issue.

I had the idea of converting the EOL within the reader constructor. Any thoughts on that?

@dbu
Copy link
Member

dbu commented Apr 12, 2013

maybe we could simply string replace all possible formats to \n and then do the char comparison on \n rather than the php constant. the LF char is not valid anywhere else in the cnd document anyway than as end of line character... this could make the code as simple as it was before.
we load the whole string anyways, and cnd files are not expected to be huge, so doing the replace upfront sounds save to me from a memory perspective.

@dbu
Copy link
Member

dbu commented Apr 12, 2013

the constructor then would be

$buffer = str_replace("\r\n", "\n", $buffer);
$buffer = str_replace("\r", "\n", $buffer);

@dinamic
Copy link
Contributor Author

dinamic commented Apr 15, 2013

Exactly my point. I was not aware whether the CRLF would be used anywhere inside the document and couldn't decide by myself to do such conversion.

@@ -54,7 +88,7 @@ public function testGetNextChar()
break;
}

//var_dump('Expected:' . $this->chars[$i] . ', found: ' . $peek);
// var_dump('Expected:' . $this->chars[$i] . ', found: ' . $peek);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets remove commented out code

@dbu
Copy link
Member

dbu commented Apr 15, 2013

i don't expect any problems, lets convert everything to the one-char \n in the constructor of BufferReader, and simplify the code again (we won't need peek and multi-char EOL anymore then)

@dbu
Copy link
Member

dbu commented Apr 19, 2013

@dinamic can you do the cleanup and convert all cnd files to unix line endings in the constructor, before further handling them?

@relo-san
Copy link

@dbu In Windows (at least Windows 7) in PHP global constant PHP_EOL by default equal CRLF, not LF. Accordingly, any comparison between currentChar() and PHP_EOL not work, because PHP_EOL contains 2 chars. In this case, it does not matter what kind of line breaks in the cnd files.

@relo-san
Copy link

I see the most reasonable solution to convert all line breaks in cnd to the LF and replace PHP_EOL in all comparisons also to LF.
P.S. Sorry, I casually read discussion, it's an already known.

@dbu
Copy link
Member

dbu commented Apr 20, 2013

yes, thats exactly what we should do. i hope dinamic can do it - if you don't want to wait you could fork his repo and finish it and create a new PR

@lsmith77
Copy link
Member

superseded by #51

@lsmith77 lsmith77 closed this Apr 20, 2013
@dinamic
Copy link
Contributor Author

dinamic commented Apr 22, 2013

Thanks for contributing @relo-san. I didn't had the time to finish my fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants