Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refs #23919 -- Removed re.U and re.UNICODE (default on Python 3). #7879

Merged
merged 1 commit into from Jan 21, 2017

Conversation

Projects
None yet
4 participants
@felixxm
Copy link
Member

commented Jan 18, 2017

No description provided.

@felixxm felixxm force-pushed the felixxm:refs-23919-re-unicode branch 2 times, most recently from 9603ddb to 7de73ad Jan 18, 2017

@@ -246,11 +247,13 @@ class RegexSerializer(BaseSerializer):
def serialize(self):
imports = {"import re"}
regex_pattern, pattern_imports = serializer_factory(self.value.pattern).serialize()
regex_flags, flag_imports = serializer_factory(self.value.flags).serialize()
# re.UNICODE is implicit flag, hence we need to turn it off
flags = self.value.flags ^ re.UNICODE

This comment has been minimized.

Copy link
@claudep

claudep Jan 19, 2017

Member

What is the problem if we let the default flags value?

This comment has been minimized.

Copy link
@felixxm

felixxm Jan 19, 2017

Author Member

All compiled regexes will be serialized with re.UNICODE flag e.g.:

re.compile(r'^\w+$') -> re.compile(r'^\w+$', 32)

instead of re.compile(r'^\w+$').

That breaks tests with assertSerializedEqual (migrations.test_writer.WriterTests) e.g.
test_serialize_compiled_regex, test_serialize_class_based_validators, test_serialize_lazy_objects

This comment has been minimized.

Copy link
@claudep

claudep Jan 19, 2017

Member

What about defining default_flags = re.compile('').flags and using that value below in the if comparison to decide if regex_flags have to be added?

This comment has been minimized.

Copy link
@felixxm

felixxm Jan 19, 2017

Author Member

We can use re.compile('').flags instead of re.UNICODE but xor is still needed. Flags is int value e.g. re.UNICODE -> 32, re.S -> 16, re.S | re.UNICODE -> 48, so if we get self.value.flags==48 we need to return 16.

It seems that regexes with the same implicit and explicit flags are not the same:

>>> re.compile(r'^\w+$', flags=re.UNICODE)==re.compile(r'^\w+$')
False
>>> re.compile(r'^\w+$', flags=re.UNICODE).flags==re.compile(r'^\w+$').flags
True

@felixxm felixxm force-pushed the felixxm:refs-23919-re-unicode branch from 7de73ad to 53544b1 Jan 19, 2017

@felixxm

This comment has been minimized.

Copy link
Member Author

commented Jan 19, 2017

@claudep Thanks for review. I updated PR.

@claudep

This comment has been minimized.

Copy link
Member

commented Jan 19, 2017

Thanks. Will that change produce new migrations for each field with regexes?

regex_flags, flag_imports = serializer_factory(self.value.flags).serialize()
# We turn off default implicit flags, because regexes with the same
# implicit and explicit flags are not equal.
flags = self.value.flags ^ re.compile('').flags

This comment has been minimized.

Copy link
@MarkusH

MarkusH Jan 19, 2017

Member

Would it make sense to set implicit flags along with implicit flags or is that exactly what's not working?

This comment has been minimized.

Copy link
@felixxm

felixxm Jan 19, 2017

Author Member

Without xor default flags (i.e. re.UNICODE) are always set explicit after serialization. I think that is something that we want to avoid.

Before this PR:

>>> serializer_factory(re.compile(r'^\w+$')).serialize()
("re.compile('^\\\\w+$', 32)", {'import re'})
>>> serializer_factory(re.compile(r'^\w+$', 32)).serialize()
("re.compile('^\\\\w+$', 32)", {'import re'})

After this PR:

>>> serializer_factory(re.compile(r'^\w+$')).serialize()
("re.compile('^\\\\w+$')", {'import re'})
>>> serializer_factory(re.compile(r'^\w+$', 32)).serialize()
("re.compile('^\\\\w+$')", {'import re'})

This comment has been minimized.

Copy link
@claudep

claudep Jan 20, 2017

Member

Of course, I like a lot more the result after the PR. It's just a bit unfortunate if it will trigger some "fake" migrations (which I'm not sure it will, it was just a guess).

This comment has been minimized.

Copy link
@timgraham

timgraham Jan 21, 2017

Member

It doesn't trigger an extra migration in my testing.

@timgraham timgraham force-pushed the felixxm:refs-23919-re-unicode branch from 53544b1 to c222122 Jan 21, 2017

regex_flags, flag_imports = serializer_factory(self.value.flags).serialize()
# We turn off default implicit flags, because regexes with the same
# implicit and explicit flags are not equal.
flags = self.value.flags ^ re.compile('').flags

This comment has been minimized.

Copy link
@timgraham

timgraham Jan 21, 2017

Member

It doesn't trigger an extra migration in my testing.

@timgraham timgraham merged commit c222122 into django:master Jan 21, 2017

17 checks passed

docs Build #14553 ended
Details
flake8 Build #14665 ended
Details
isort Build #14692 succeeded in 13 sec
Details
pull-requests-javascript Build #11053 ended
Details
pull-requests-trusty/database=mysql,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=mysql,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-trusty/database=mysql_gis,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=mysql_gis,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-trusty/database=postgis,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=postgis,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-trusty/database=postgres,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=postgres,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-trusty/database=spatialite,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=spatialite,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-trusty/database=sqlite3,label=trusty-pr,python=python3.4 Build #10433 ended
Details
pull-requests-trusty/database=sqlite3,label=trusty-pr,python=python3.6 Build #10433 ended
Details
pull-requests-windows/database=sqlite3,label=windows,python=Python35 Build #6765 ended
Details
@claudep

This comment has been minimized.

Copy link
Member

commented Jan 21, 2017

Sorry for bringing confusion. And thanks for the patch!

@felixxm felixxm deleted the felixxm:refs-23919-re-unicode branch Jan 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.