Full unicode support #50
Comments
textX uses Python
Assuming you are trying to parse a file with multiple records like the one you provided, you could do something like this:
# -*- coding: utf-8 -*-
from textx import metamodel_from_str
grammar = r'''
Model: records*=Record;
Record: num=/\d+-\d+/ name=/\w+\s+\w+/;
'''
mm = metamodel_from_str(grammar)
input = '''
2785-599 São Domingo
2785-599 São Domingo
'''
model = mm.model_from_str(input)
print(model.records)
print(model.records[0].name)
print(model.records[0].num)
Note that
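The Python 3 behaviour this example relies on can be checked with the standard `re` module alone, without textX. The snippet below is only a minimal sketch: `NAME`, `line`, `unicode_match` and `ascii_match` are names made up for illustration, and `re.ASCII` is used here merely to imitate a pattern that is not Unicode-aware.

```python
import re

# Same pattern as the `name` match in the grammar above.
NAME = r"\w+\s+\w+"
line = "São Domingo"

# Python 3 str patterns are Unicode-aware by default, so \w matches "ã".
unicode_match = re.match(NAME, line)

# re.ASCII restricts \w to [a-zA-Z0-9_]; the match then stops at "ã" and fails.
ascii_match = re.match(NAME, line, re.ASCII)

print(unicode_match.group(0))
print(ascii_match)
```

If the second match is `None` while the first one covers the whole name, the regex itself is fine and any remaining problem is about how the pattern or the input string is encoded.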
First of all, thanks for your quick response! I forgot to tell you that I am using Python 2.7. With this version, using:
makes the
For Python 2 you can do this:
# -*- coding: utf-8 -*-
from textx import metamodel_from_str
grammar = r'''
Model: records*=Record;
Record: num=/\d+-\d+/ name=/(?u)\w+\s+\w+/;
'''
mm = metamodel_from_str(grammar)
input = u'''
2785-599 São Domingo
2785-599 São Domingo
'''
model = mm.model_from_str(input)
print(model.records[0].name)
print(model.records[0].num)
Notice
There was a slight problem with printing exceptions on syntax errors with unicode chars in Python 2. It should be fixed now on the
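For anyone who wants to see what the `(?u)` flag and the unicode input string do outside of textX, here is a rough equivalent using only the standard `re` module. It is a sketch under assumptions, not how textX works internally: `RECORD`, `text` and `records` are hypothetical names, and the `__future__` import is there only so the same file should behave the same on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import re

# (?u) turns on Unicode matching for \w, \d and \s. It is required on
# Python 2 and harmless on Python 3, where it is already the default for str.
RECORD = re.compile(r"(?u)(?P<num>\d+-\d+)\s+(?P<name>\w+\s+\w+)")

# With unicode_literals this is a unicode string on Python 2 as well,
# which plays the same role as the u''' prefix in the example above.
text = """
2785-599 São Domingo
2785-599 São Domingo
"""

# Collect one dict per record found in the input.
records = [m.groupdict() for m in RECORD.finditer(text)]

print(len(records))
print(records[0]["num"], records[0]["name"])
```

Both pieces matter on Python 2: the flag makes `\w` accept "ã", and the unicode input keeps the match from being attempted on raw UTF-8 bytes.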
Great! Great job! Thank you so much for your help!
I'd like to know how to parse a full Unicode string. \w is not enough for parsing lines like
I've been trying to use /'regEx'/u or even \p{L} like in Python, but it doesn't seem to work.
Using string match, it works:
address = '2785-599 São Domingo'
But when the instance is created a new error arises:
'ascii' codec can't encode character
Appreciate your help.
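Two of the symptoms described here can be reproduced with plain Python 3, which may help narrow things down. This is only a sketch under assumptions: the standard `re` module (in the versions I am aware of) rejects `\p{L}`, so `[^\W\d_]` is used as a stand-in for "any Unicode letter", and the explicit `encode("ascii")` call imitates the implicit conversion that Python 2 performs. `LETTERS`, `letters`, `error_message` and `utf8_bytes` are names invented for the example.

```python
import re

address = "2785-599 São Domingo"

# \p{L} is not understood by the standard re module; compiling it fails.
try:
    re.compile(r"\p{L}+")
    supports_p_l = True
except re.error:
    supports_p_l = False

# Stand-in for "letters only": word characters minus digits and underscore.
LETTERS = re.compile(r"[^\W\d_]+")
letters = LETTERS.findall(address)

# Encoding the text as ASCII fails on "ã" with the error quoted above.
try:
    address.encode("ascii")
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# Encoding as UTF-8 works and round-trips without loss.
utf8_bytes = address.encode("utf-8")

print(supports_p_l)
print(letters)
print(error_message)
```

The error text suggests that the string is being converted to ASCII somewhere along the way, rather than the regex failing to match.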