Value of TextContent disappears (empty string) upon add? #17
Comments
Now I can't reproduce the above debug output anymore (the text content shows fine), but the serialisation to XML still has an empty text.
Ok, the following debug shows the problem, still no idea why though:
Debug code is committed: https://github.com/LanguageMachines/foliautils/blob/wordtranslate/src/FoLiA-wordtranslate.cxx#L143
It also fails on the following other input words, which probably get mangled to asterisks too by my tool (incorrectly, but that's not the issue here):
Hah, there seem to be two 0x00 bytes in front of the string! That would explain things. I should have counted the characters better :)
Conclusion: this seems to happen if there are invalid characters in the string. I think it would be helpful if this could be caught and a warning emitted when appending text, provided it's not too expensive.
Checking whether a string is valid UTF-8 is quite expensive.
Ok. The problem occurred in a program that used the libicu API incorrectly, yielding invalid Unicode strings.
Something goes wrong when I add TextContent with the value `eologico*phijsico*metaphijsicum`: libfolia adds an empty text content element instead! I've no idea what triggers this (special meaning for the asterisk perhaps?); other words process fine. I add TextContent as follows:
https://github.com/LanguageMachines/foliautils/blob/wordtranslate/src/FoLiA-wordtranslate.cxx#L134
Debug output, I explicitly check if I'm not passing an empty string (after trimming even):