ASCII 160 (non-breaking space) is not recognised as a space character #16

Open
rekado opened this Issue Nov 2, 2012 · 2 comments


@rekado
rekado commented Nov 2, 2012

Only ASCII 32 (space) and '\t' are currently matched as Spacechar. Character 160 (U+00A0, the non-breaking space, which lies outside the 7-bit ASCII range) and other characters in the Unicode whitespace category should also be recognised as Spacechar tokens.
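
To make the request concrete, here is a minimal Python sketch contrasting the rule described above with a wider, Unicode-aware one. Both function names are hypothetical and this is not peg-markdown's grammar; "Unicode whitespace category" is assumed here to mean the space separator category (Zs).

```python
import unicodedata

NBSP = "\u00a0"  # code point 160, the non-breaking space


def is_spacechar_current(ch: str) -> bool:
    """Rule as described in the report: only U+0020 and tab match."""
    return ch in (" ", "\t")


def is_spacechar_proposed(ch: str) -> bool:
    """Hypothetical wider rule: tab, or any Unicode space separator (Zs)."""
    return ch == "\t" or unicodedata.category(ch) == "Zs"


if __name__ == "__main__":
    for ch in (" ", "\t", NBSP, "a"):
        print(repr(ch), is_spacechar_current(ch), is_spacechar_proposed(ch))
```

Under these assumptions the two rules disagree only on characters such as U+00A0, which is exactly the case this issue is about.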

@jgm
Owner
jgm commented Nov 7, 2012

peg-markdown operates on strings of bytes, and knows nothing of Unicode. The Unicode nonbreaking space (U+00A0, code point 160) is encoded in UTF-8 as two bytes, 0xC2 0xA0. So there are some difficulties here.
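
The byte-level difficulty can be illustrated with a short Python sketch. The helper `is_space_byte` is a hypothetical stand-in for a matcher that looks at one byte at a time; it is not taken from peg-markdown.

```python
NBSP = "\u00a0"

# UTF-8 encodes U+00A0 as two bytes; Latin-1 uses the single byte 0xA0.
encoded = NBSP.encode("utf-8")
latin1 = NBSP.encode("latin-1")


def is_space_byte(b: int) -> bool:
    """Hypothetical single-byte matcher: space (0x20) or tab (0x09) only."""
    return b in (0x20, 0x09)


# Neither byte of the UTF-8 sequence looks like a space on its own.
matches = [is_space_byte(b) for b in encoded]

if __name__ == "__main__":
    print(encoded, latin1, matches)
```

A matcher of this kind would have to recognise the two-byte sequence as a unit, which also ties it to one particular input encoding.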

@jgm
Owner
jgm commented Nov 15, 2012

Thanks for the peg upgrade. However, I don't agree that the nonbreaking space should be counted as a space character in peg-markdown. It is very handy to be able to insert spaces that should not be treated just as normal spaces. For example, if you want a paragraph to start with four spaces, you can start with four 160's, and the paragraph won't be treated as a code block. Furthermore, spaces are treated as word boundaries in the writers, and lines may break on them.
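
A rough Python sketch of the two behaviours mentioned here, under simplifying assumptions: `is_indented_code` and `words` are hypothetical helpers that only approximate the four-space code block rule and word splitting on ordinary spaces, not the actual parser or writers.

```python
import re

NBSP = "\u00a0"


def is_indented_code(line: str) -> bool:
    """Hypothetical check: four literal U+0020 spaces or a tab start a code block."""
    return line.startswith("    ") or line.startswith("\t")


def words(line: str) -> list:
    """Split on ordinary spaces and tabs only, so U+00A0 stays inside a word."""
    return [w for w in re.split(r"[ \t]+", line) if w]


if __name__ == "__main__":
    print(is_indented_code("    x = 1"))                       # ordinary spaces
    print(is_indented_code(NBSP * 4 + "Indented paragraph"))   # four 160's
    print(words("100" + NBSP + "km away"))
```

With this reading, four 160's give a visually indented paragraph that is not a code block, and "100 km" joined by a nonbreaking space stays a single unit that a line break cannot split.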
