Multibyte characters break positions #25
Comments
One more file: probably because of
OK, never mind, I reproduced the issue. The native driver returns JS offsets in UTF-8 runes, not bytes.
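To illustrate the mismatch described in the comment above, here is a minimal sketch of converting an offset counted in runes (Unicode code points) into a UTF-8 byte offset. The helper name `rune_to_byte_offset` and the sample source are hypothetical; this is not the driver's actual code, only one way such a conversion could look.

```python
def rune_to_byte_offset(source: str, rune_offset: int) -> int:
    """Convert an offset counted in Unicode code points into a UTF-8 byte offset.

    Hypothetical helper for illustration; not part of any bblfsh driver.
    """
    if not 0 <= rune_offset <= len(source):
        raise ValueError("rune offset out of range")
    # Python strings are indexed by code point, so slicing up to the rune
    # offset and encoding the prefix yields the number of bytes before it.
    return len(source[:rune_offset].encode("utf-8"))


# Assumed sample: a comment containing a multibyte character, then code.
source = "// 🤔\nvar x = 1;"
rune_offset = source.index("var")          # offset counted in code points
byte_offset = rune_to_byte_offset(source, rune_offset)

# The two offsets differ by the extra bytes the emoji occupies in UTF-8.
print(rune_offset, byte_offset)
```

Every position after the multibyte character is shifted by the same amount, which would match positions looking correct up to the comment and wrong afterwards.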
@dennwc do you plan to fix it not only in the latest v2 driver but also in the latest v1?
I don't think we can backport it down to v1, but all new drivers should be able to run in v1 compatibility mode.
Oh, I did not know that was possible. Sorry for the off-topic question, but how can I enable it? From my experiments, I remember that simply replacing the driver gives a different UAST structure, and that breaks the ML team's code.
@dennwc what about v1 compatibility mode and my last question?
@zurk Yeah, sorry, missed the notification. I checked the code, v1 compatibility is still enabled in the new But, since we fixed lots of things in new drivers, you might see some slight changes in the AST. Those are mostly bug fixes, but I don't know if they are severe enough to break assumptions in your pipeline. I would say you should give it a try and feel free to ping me if you find any differences that break the pipeline. Some might be fixable upstream.
OK, thank you, @dennwc! I will check it.
See also bblfsh/documentation#228, which was motivated by issues like this. I think we will need to make a clearer contract about what constitutes "acceptable" changes to the UAST structure as a result of bug fixes and natural evolution.
Hi,
I used the visualization tool to check positions and found that they are broken in this file.
and its visualization
We can see that the offset is off by one symbol after this comment.
Could it be because of the special character 🤔?
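As a rough check of that suspicion, the sketch below measures one string in the three units in which offsets are commonly counted. The function name `offset_units` is hypothetical, and which unit a given driver or tool actually reports is an assumption that has to be verified separately.

```python
def offset_units(text: str) -> dict:
    """Return the length of `text` in three common offset units.

    Hypothetical helper for illustration only.
    """
    return {
        # Unicode code points ("runes")
        "code_points": len(text),
        # UTF-16 code units: two bytes each in the little-endian encoding
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # UTF-8 bytes
        "utf8_bytes": len(text.encode("utf-8")),
    }


print(offset_units("a"))    # an ASCII character has the same length in all three
print(offset_units("🤔"))   # a character outside the Basic Multilingual Plane does not
```

If two components count offsets in different units, every position after such a character shifts by the difference between the two counts.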