Plugin is very fragile with respect to UTF-8 #4

Closed
kittylyst opened this issue Feb 4, 2014 · 6 comments

Comments

@kittylyst

If an .adoc file contains invalid UTF-8, the plugin throws exceptions and the document does not display properly. Could this be made less of a hard crash? Getting some output, so that I can find the bad UTF-8 in the document, would be useful.
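For anyone hitting the same problem, here is a rough sketch of one way to locate the bad bytes outside the plugin. It is plain Python and not part of the plugin; the function name `find_invalid_utf8` is made up for illustration. It reports the line number and byte offset of each invalid run it finds.

```python
def find_invalid_utf8(data: bytes):
    """Yield (line_number, byte_offset, offending_bytes) for each invalid run.

    Hypothetical helper, not part of the plugin or of Asciidoctor.
    """
    pos = 0
    while pos < len(data):
        try:
            # If the remainder decodes cleanly, there is nothing more to report.
            data[pos:].decode("utf-8")
            return
        except UnicodeDecodeError as exc:
            start = pos + exc.start
            end = pos + exc.end
            # Line numbers are 1-based: count the newlines before the bad byte.
            line = data.count(b"\n", 0, start) + 1
            yield line, start, data[start:end]
            # Resume scanning just after the invalid run.
            pos = end


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for line, offset, raw in find_invalid_utf8(handle.read()):
            print(f"line {line}, byte offset {offset}: {raw!r}")
```

Run against a suspect file, this should print one line per invalid run, which is usually enough to find the spot in an editor.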

@bodiam
Contributor

bodiam commented Nov 8, 2014

@kittylyst Do you have a testcase for this? I'm not sure how to reproduce this.

@kittylyst
Author

@bodiam It's been months since I looked at this, but if memory serves, the problem was caused by a \n character being incorrectly inserted by stupid line-wrapping software partway through a multibyte UTF-8 sequence.

I don't have the test case to hand, as it came up during some client work which I don't have access to any more. A probable minimal test case is:

  1. Take a 3-byte UTF-8 sequence
  2. Insert a \n in between offset 1 and offset 2 in the 3-byte sequence
  3. Profit (?)
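The steps above can be sketched in a few lines of Python. This is a guess at the original failure, not the actual client document: the choice of "€" as the 3-byte sequence, the surrounding AsciiDoc text, and the name `build_broken_sample` are all assumptions for illustration.

```python
def build_broken_sample() -> bytes:
    """Return AsciiDoc-like bytes with a newline splitting a 3-byte sequence."""
    euro = "€".encode("utf-8")  # 3-byte UTF-8 sequence: E2 82 AC
    assert len(euro) == 3
    # Insert \n between offset 1 and offset 2 of the sequence.
    broken = euro[:2] + b"\n" + euro[2:]
    return b"= Title\n\nPrice: " + broken + b" per unit\n"


sample = build_broken_sample()

# A strict decoder rejects the payload outright.
try:
    sample.decode("utf-8")
    strict_ok = True
    error_offset = None
except UnicodeDecodeError as exc:
    strict_ok = False
    error_offset = exc.start  # byte offset where decoding first failed

# A lenient decoder substitutes U+FFFD and keeps going.
lenient = sample.decode("utf-8", errors="replace")
```

Writing `sample` to a `.adoc` file with `open(path, "wb")` gives a candidate file to open in the plugin.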

@bodiam
Contributor

bodiam commented Nov 15, 2014

Hi @kittylyst, I've tried a simple test by opening this document: http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

I get an NPE at apple.awt.CAccessible.getAccessibleContext(CAccessible.java:74) and only partial rendering, but I don't see a crash. Would this be good enough for you?

@kittylyst
Author

That is a significant torture test - if you're not seeing a crash and are getting partial rendering from this, feel free to close this bug.

@bodiam
Contributor

bodiam commented Nov 15, 2014

It was the worst I could find ;-). It doesn't crash on my Mac; I only get partial rendering, but that seems good enough to fix the document. I'm in the process of releasing 0.2, which should be approved in 1-2 days. If you're on a different OS than Mac, would you care to retest it?

@mojavelinux
Member

Thanks for the torture test. That's super useful. I'm going to see if I can get Asciidoctor core to drop invalid characters and only issue a warning so that it at least processes the file.
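As a rough illustration of the "drop invalid characters and only issue a warning" idea, here is a minimal Python sketch. Asciidoctor core is written in Ruby, so this is not its implementation; the function name `decode_dropping_invalid` and the warning text are assumptions.

```python
import warnings


def decode_dropping_invalid(data: bytes, source: str = "<input>") -> str:
    """Decode UTF-8, dropping invalid bytes and warning instead of failing.

    Hypothetical helper showing the behaviour, not Asciidoctor's code.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        warnings.warn(
            f"{source}: invalid UTF-8 at byte offset {exc.start}; "
            "dropping invalid bytes and continuing"
        )
        # errors="ignore" silently skips every byte that cannot be decoded.
        return data.decode("utf-8", errors="ignore")
```

Dropping bytes loses information, so the warning carries the first failing offset to help track down the source of the damage.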

3 participants