Parsing Latin1 characters #158
Comments
Note that PetitParser has never supported any encoding other than the standard UTF-16 code units of a Dart String. I am not aware of any change in how characters are read in a long time. Could you provide a short reproducible test case that passes with PetitParser 4.0.2 but fails with a newer version? I agree that the built-in predicates such as
Thanks for your quick response. I now think the problem is not with PetitParser, but was caused by a change in the way I store files, which I made so that I could deploy the app as a web app. I am now using the Hive NoSQL database. Printing the output from the database before I try to parse it with PetitParser shows that it is corrupting (some?) non-ASCII characters. The characters returned are not ones handled by the grammar, so I get a parse error. I will see whether this problem can be fixed and hope that this will solve the parsing problem as well.
I'd like to be able to help you with extending the letter() implementation, but I'm afraid that's over my head.
I found the problem. Hive encodes strings using UTF-8. I just needed to convert them into UTF-16 and everything works as it should.
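For anyone hitting the same symptom, here is a minimal sketch of the conversion described above, using only `dart:convert`. It assumes the stored value is available as raw UTF-8 bytes; the helper names `decodeStored` and `misread` are hypothetical, and `misread` only illustrates one plausible way the corruption could arise (treating each byte as its own code unit), not necessarily what happened in this app.

```dart
import 'dart:convert';

/// Decodes UTF-8 bytes into a Dart String, which holds UTF-16 code units.
/// (Hypothetical helper name, for illustration only.)
String decodeStored(List<int> bytes) => utf8.decode(bytes);

/// Hypothetical buggy path: every byte becomes one code unit, so a
/// multi-byte UTF-8 sequence turns into several unrelated characters.
String misread(List<int> bytes) => String.fromCharCodes(bytes);

void main() {
  // 'ä' (U+00E4) is two bytes in UTF-8: 0xC3 0xA4.
  final bytes = utf8.encode('\u00E4');

  final good = decodeStored(bytes); // one code unit: U+00E4
  final bad = misread(bytes); // two code units: U+00C3 U+00A4

  print(good.codeUnits);
  print(bad.codeUnits);
}
```

A grammar built from `letter() | char('ä')` would see the two unexpected characters in the misread string instead of a single `ä`, which is consistent with the parse error reported above.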
I need to parse Latin-1 characters that are not ASCII. My parser was working with version 4.0.2, but I now need to use a newer version of PetitParser because of dependencies of the pdf Flutter package, which I also need.
Here's a simplified code snippet which fails:
    // letter() extended with Latin-1 characters to cover most Western European languages
    final Parser extChar = letter() | char('ä');
I've also tried using pattern, like this:
    final Parser extChar = letter() | pattern("À-ÿ");
This also fails.
Would it be easiest to extend letter() to cover all Latin-1 alphabetic characters?