All 10 files in benchmark/testdata can now be oheap-encoded.
- Use const.NO_INTEGER for more span IDs.
- Don't encode to utf-8 while serializing to oheap. The encoder should just pass bytes straight through.

The issue is that the lexer will read individual bytes of utf-8 characters and split them into Lit_Other tokens. This is fine since normally we just re-concatenate the bytes. But it's not OK to encode them one at a time!

Make note of a possible fix in osh/lex.py and spec/unicode.sh.
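To illustrate why per-byte encoding breaks while pass-through is safe, here is a minimal Python 3 sketch. It is not the code from asdl/encode.py: the helper names are hypothetical, and the "encode each token" step is modeled as a Latin-1 decode followed by a UTF-8 encode, which is an assumption about how a one-byte token might get re-encoded.

```python
from typing import List


def split_into_byte_tokens(data: bytes) -> List[bytes]:
    """Hypothetical stand-in for a byte-oriented lexer: one token per byte."""
    return [data[i:i + 1] for i in range(len(data))]


def serialize_passthrough(tokens: List[bytes]) -> bytes:
    """Concatenate the token bytes unchanged."""
    return b"".join(tokens)


def serialize_encoding_each(tokens: List[bytes]) -> bytes:
    """Wrong: treat every one-byte token as its own character and re-encode it.

    Assumption: the single byte is interpreted as Latin-1 before encoding.
    """
    return b"".join(t.decode("latin-1").encode("utf-8") for t in tokens)


original = "μ".encode("utf-8")            # two bytes: 0xCE 0xBC
tokens = split_into_byte_tokens(original)  # [b'\xce', b'\xbc']

passthrough = serialize_passthrough(tokens)
reencoded = serialize_encoding_each(tokens)

# Pass-through reassembles the original character.
assert passthrough == original
assert passthrough.decode("utf-8") == "μ"

# Per-byte encoding turns 2 bytes into 4 and corrupts the character.
assert reencoded == b"\xc3\x8e\xc2\xbc"
assert reencoded != original
```

Each half of the two-byte sequence is a valid character on its own only under a single-byte encoding, so encoding the halves separately produces four bytes that decode to two unrelated characters.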
Showing 5 changed files with 46 additions and 26 deletions.
- +23 −15 asdl/encode.py
- +9 −9 osh/cmd_parse.py
- +2 −0 osh/lex.py
- +4 −2 osh/word_parse.py
- +8 −0 spec/unicode.sh
0 comments on commit de19b3d