
<regex>: regex_traits<_Elem> uses an inadmissible value of type char_class_type to represent character class "w" #5242

@muellerj2

Description


regex_traits<_Elem> uses static_cast<ctype_base::mask>(-1) to represent the character class "w":

_REGEX_CHAR_CLASS_NAME("w", static_cast<ctype_base::mask>(-1)),

This is an inadmissible choice, because it violates [re.grammar]/9:

The results from multiple calls to traits_inst.lookup_classname can be bitwise or'ed together and subsequently passed to traits_inst.isctype.

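Specifically, static_cast<ctype_base::mask>(-1) has every bit set, so or'ing the char_class_type for "w" with the char_class_type for any other character class always produces the value for "w" again, even if the combination should match more characters. For example, "w" or'ed with "s" should also match whitespace, but the result is indistinguishable from "w" alone.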
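
The absorbing behavior can be illustrated with a small standalone model. This is only a sketch: `w_class`, `combine` and `model_isctype` are hypothetical names made up for illustration, and the special-casing of the all-ones value in `model_isctype` is an assumption about how a traits class might interpret the sentinel, not a copy of the actual `regex_traits` implementation.

```cpp
#include <cassert>
#include <locale>

using mask = std::ctype_base::mask;

// Assumption: the all-bits-set sentinel from the issue stands in for "w".
const mask w_class = static_cast<mask>(-1);

// Hypothetical helper: bitwise-or two class values, as [re.grammar]/9 permits.
inline mask combine(mask a, mask b) {
    return static_cast<mask>(a | b);
}

// Hypothetical model of isctype that special-cases the sentinel:
// "w" means alphanumeric or underscore; anything else is a plain ctype mask.
inline bool model_isctype(char ch, mask m) {
    const auto& ct = std::use_facet<std::ctype<char>>(std::locale::classic());
    if (m == w_class) {
        return ch == '_' || ct.is(std::ctype_base::alnum, ch);
    }
    return ct.is(m, ch);
}
```

In this model, `combine(w_class, std::ctype_base::space)` compares equal to `w_class`, so `model_isctype(' ', ...)` rejects a space for the combined value even though the space class on its own accepts it. A representation that reserves a dedicated bit for the underscore part of "w" would not have this problem, but that would change the values of char_class_type.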

Additional remarks

I think resolving this issue will break ABI. However, it should be possible to mitigate the problems caused by this issue.

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

Metadata

    Labels

    bug, regex, vNext
