Support python 2's almost-raw ur'' literals #1594

Closed
wants to merge 1 commit into from

habnabit commented Jan 26, 2017

Python 2 allows ur'' literals to contain \u and \U escapes which are actually parsed as unicode escapes instead of literally '\' 'u'.
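For readers without a Python 2 interpreter, the behaviour described above can be approximated in Python 3 with the standard `raw_unicode_escape` codec. This is only a sketch: `py2_ur_literal` is a hypothetical helper name, and it assumes the literal body is ASCII-only source text.

```python
def py2_ur_literal(body: str) -> str:
    """Approximate how Python 2 reads the body of a ur'' literal.

    Hypothetical helper for illustration; assumes ASCII-only input.
    Only \\uXXXX and \\UXXXXXXXX are decoded, every other backslash
    sequence is kept exactly as written.
    """
    return body.encode("ascii").decode("raw_unicode_escape")


# Python 3 raw strings keep every backslash, so these arguments are the
# characters as they would appear between the quotes in the source file.
print(repr(py2_ur_literal(r"\u00e9\n")))
print(repr(py2_ur_literal(r"\U0001F600")))
print(repr(py2_ur_literal(r"\\u00e9")))
```

In the first call the `\u00e9` turns into a single character while `\n` stays a backslash followed by `n`. In the last call the backslash in front of `u` is itself preceded by a backslash, so nothing is decoded.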

I was going to write test cases but I'm not sure where to put them! This does actually fix the problem I was having where code relied on this, though.

@robertwb
Contributor

robertwb commented Feb 7, 2017

Tests?

@habnabit
Author

habnabit commented Feb 7, 2017 via email

@robertwb
Contributor

robertwb commented Feb 7, 2017

Sorry, I should have answered your question. tests/run/unicodeliterals.pyx might be a good spot to add some examples.

@scoder
Contributor

scoder commented Feb 11, 2017

I'm not sure if we should really support this. It is a legacy feature and has the potential to break existing code in subtle ways.

@scoder
Contributor

scoder commented Feb 11, 2017

The change for latest master would be this, BTW:

            if is_raw and (is_python3_source or kind != 'u' or systr[1] not in u'Uu'):
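To make the condition above easier to read, here is a standalone restatement of it as a predicate. This is a sketch, not Cython's parser code: the function name `keep_escape_verbatim` is hypothetical, and it assumes that `systr` is the text of one backslash escape token (backslash plus at least one character), that `kind` is the literal's prefix kind, and that a true result means the token is kept as written rather than decoded.

```python
def keep_escape_verbatim(is_raw, is_python3_source, kind, systr):
    """Restate the quoted condition as a predicate (illustrative only).

    Assumption: systr is an escape token such as '\\n' or '\\u00e9',
    so systr[1] is the character right after the backslash.
    """
    return is_raw and (
        is_python3_source      # Python 3 raw strings never decode escapes
        or kind != 'u'         # only unicode literals get the special case
        or systr[1] not in u'Uu'  # only \u and \U are decoded
    )


# Python 2 source, ur'' literal, \u escape: the one case that gets decoded.
print(keep_escape_verbatim(True, False, 'u', '\\u00e9'))
# Same literal, but a \n escape: stays verbatim.
print(keep_escape_verbatim(True, False, 'u', '\\n'))
```

Under these assumptions, the only raw-string case that falls through to escape decoding is a `\u` or `\U` token inside a unicode literal in Python 2 source.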

@habnabit
Author

habnabit commented Feb 11, 2017 via email

@scoder
Contributor

scoder commented Feb 11, 2017

A deviation from Python behaviour is (almost) always a bug, and this one was definitely an oversight rather than a deliberate decision. I agree that it's a trap that users might run into when compiling or converting Python 2 code to Cython code. But changing the current behaviour might break existing Cython code, and people already face the same problem when migrating code from Python 2 to Python 3. Thus, implementing this does not seem an obvious choice.

@habnabit
Author

habnabit commented Feb 11, 2017 via email

@robertwb
Contributor

I would find it really surprising (a bug) if someone were relying on ur"\uXXXX" not following Python 2 semantics. Thus I think it makes a lot of sense to fix this bug (it is a bit legacy, but at least supporting it doesn't add non-trivial complexity).

@scoder scoder closed this in 71ec1a4 Feb 12, 2017
@habnabit habnabit deleted the python2-raw-unicode branch February 13, 2017 19:36

3 participants