"Invalid range end in character class" for dash after character range #199

Open
cai-lw opened this issue Nov 1, 2023 · 0 comments

cai-lw commented Nov 1, 2023

Summary

In most other regex engines, [X-Y-Z] is a valid character class consisting of the range X to Y, a literal dash -, and the character Z. For example, you can verify at https://regex101.com/ that it is accepted by every regex flavor the site supports.

Boost currently rejects this under the default syntax, and the documentation does not make clear whether it is meant to be valid. It is easy to work around, e.g. by moving the dash to the end as in [X-YZ-], but I would still like to know whether the rejection is intentional.

I encountered this when migrating from std::regex. This is explicitly valid for std::regex:
https://en.cppreference.com/w/cpp/regex/ecmascript:

The character - is treated literally if it

  • immediately follows a dash-separated range specification.
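
To make the comparison concrete, here is a minimal sketch of the std::regex side. It assumes the default ECMAScript grammar and a standard library that follows the rule quoted above; full_match is a hypothetical helper written for this illustration, not part of std::regex or Boost.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not a library function): returns true when the whole
// subject matches the pattern under std::regex's default ECMAScript grammar.
bool full_match(const std::string& pattern, const std::string& subject) {
    const std::regex re(pattern);  // throws std::regex_error if the pattern is rejected
    return std::regex_match(subject, re);
}

// Spelling from this report: range 0-9, then a literal '-', then '#'.
bool dash_after_range_matches() {
    return full_match("[0-9-#]+", "12#34-56");
}

// Workaround spelling: the dash moved to the end of the class.
bool trailing_dash_matches() {
    return full_match("[0-9#-]+", "12#34-56");
}
```

With a conforming std::regex, both functions should return true, which is the behavior I was relying on before migrating. Only the second spelling is currently accepted by Boost under the default syntax.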

Minimal reproducible example

Code

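#include <boost/regex.hpp>

int main() {
    // Constructing the regex is what throws: the dash after the range 0-9 is rejected.
    const boost::regex re("[0-9-#]+");
    // Not reached in practice; a successful match would give exit status 1.
    return boost::regex_match("12#34-56", re);
}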

Expected behavior

Returns 1

Actual behavior

terminate called after throwing an instance of 'boost::wrapexcept<boost::regex_error>'
  what():  Invalid range end in character class  The error occurred while parsing the regular expression: '[0-9->>>HERE>>>#]+'.
Program terminated with signal: SIGSEGV

Proposed fixes

  • Determine whether Boost intends to accept or reject this syntax.
  • If it should be accepted, change the parsing code accordingly.
  • In either case, state the behavior explicitly in the documentation.