Resolve unicode escape sequences not being processed correctly #293

Merged 1 commit on Oct 23, 2022


chrisjbillington
Contributor

In process_unicode_escape_sequences(), any backslash escape sequences in the original string are themselves escaped by the first .encode('unicode-escape'), so they come out of the .encode('unicode-escape').decode('unicode-escape') round trip unchanged.
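
The round trip described above can be reproduced with a minimal sketch. It assumes Python 3 `str` input; the helper name `round_trip` and the sample string are hypothetical and not taken from the patch.

```python
def round_trip(text):
    """Encode then decode with 'unicode-escape' (hypothetical helper)."""
    # The encoder escapes every backslash already present in the text,
    # so an existing sequence such as backslash + 't' becomes
    # two backslashes + 't'.
    encoded = text.encode('unicode-escape')
    # Decoding undoes exactly that escaping, restoring the original text
    # instead of interpreting the sequence as a tab character.
    return encoded.decode('unicode-escape')


sample = 'a\\tb'  # four characters: a, backslash, t, b
result = round_trip(sample)
print(repr(result))
```

The escape sequence is never interpreted here: the result is identical to the input.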

That is not what we want: these sequences should pass through the .encode() step unchanged, so that .decode() converts them to the characters they represent.

This patch changes the .encode() step to pass all ASCII characters through unchanged and escape only non-ASCII characters. This ensures that any existing backslash escape sequences are interpreted as the characters they represent upon .decode().
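
One way to get the behaviour described above is to encode to ASCII with the `backslashreplace` error handler, which leaves ASCII characters (including backslashes) alone and escapes only non-ASCII ones. This is a sketch under that assumption, written for Python 3 `str` input; the function name `process_escapes` is hypothetical, and the code in the actual patch may differ.

```python
def process_escapes(text):
    """Interpret backslash escape sequences in text (hypothetical helper)."""
    # ASCII characters, including existing backslashes, pass through
    # unchanged; only non-ASCII characters are turned into \x, \u or \U
    # escape sequences.
    encoded = text.encode('ascii', 'backslashreplace')
    # Every escape sequence, whether it was in the original text or was
    # produced by the step above, is now converted to its character.
    return encoded.decode('unicode-escape')


print(repr(process_escapes('a\\tb')))       # the backslash-t becomes a real tab
print(repr(process_escapes('caf\u00e9')))   # non-ASCII text survives the round trip
```

Note that malformed input, such as a string ending in a single backslash, would raise `UnicodeDecodeError` in the decode step of this sketch.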

Tested on Python 2.7, 3.6 and 3.10

Owner

frej commented Oct 23, 2022

You are an ideal contributor @chrisjbillington, thanks a lot!

frej merged commit 6700b16 into frej:master on Oct 23, 2022