Lex: Avoid casting `min_token_id` to `id_type` in lexer ctor #420

Merged
Kojoley merged 1 commit into boostorg:develop from Kojoley:fix-lexer-constructor-ub on Nov 21, 2018

@Kojoley
Collaborator

commented Nov 20, 2018

When `min_token_id` is not a valid value for `id_type` and `first_id` is not provided, the lexer constructor triggers UB, because its default argument casts `min_token_id` to `id_type`. To fix this, the constructor that omits `first_id` now initializes `next_token_id` directly, so there is no UB when the unique ids feature is not used.
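To make the change concrete, here is a minimal sketch of the before/after constructor shape. It is not the Boost.Spirit source: `small_ids`, `lexer_before`, `lexer_after`, the `flags_` member and the helper functions are hypothetical names, and it assumes `next_token_id` is a plain integer counter and that `min_token_id` is 65536 (the value in the UBSan report).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user enum: its enumerators need only two bits, so 65536
// is outside the range of values the type can represent.
enum small_ids { ID_WORD = 1, ID_NUMBER = 2 };

// Assumed value, taken from the UBSan message (65536).
const std::size_t min_token_id = 0x10000;

// Before (sketch): the default argument casts min_token_id to IdType.
// Using that default with small_ids produces an invalid enum value,
// which is what UBSan flags. The default is never used in this sketch.
template <typename IdType>
struct lexer_before {
    explicit lexer_before(unsigned flags = 0,
                          IdType first_id = IdType(min_token_id))
      : flags_(flags), next_token_id(first_id) {}
    unsigned flags_;
    std::size_t next_token_id;
};

// After (sketch): the overload without first_id initializes the counter
// directly, so no IdType value is ever created from min_token_id.
template <typename IdType>
struct lexer_after {
    explicit lexer_after(unsigned flags = 0)
      : flags_(flags), next_token_id(min_token_id) {}
    lexer_after(unsigned flags, IdType first_id)
      : flags_(flags), next_token_id(first_id) {}
    unsigned flags_;
    std::size_t next_token_id;
};

// Small helpers that exercise both overloads of the fixed shape.
inline std::size_t default_start() {
    return lexer_after<small_ids>().next_token_id;
}
inline std::size_t explicit_start() {
    return lexer_after<small_ids>(0, ID_NUMBER).next_token_id;
}

inline void demo() {
    assert(default_start() == 65536);  // counter starts at min_token_id
    assert(explicit_start() == 2);     // counter starts at the given id
}
```

Splitting the constructor into two overloads keeps the explicit `first_id` path unchanged while removing the cast from the default path, which is the only place the out-of-range value appeared.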

Found the problem in a lex test with UBSan.

boost/spirit/home/lex/lexer/lexer.hpp:382:27: runtime error: load of value 65536, which is not a valid value for type 'boost::spirit::lex::lexer<boost::spirit::lex::lexertl::actor_lexer<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t *, std::__cxx11::basic_string<wchar_t> >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t>, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mpl_::bool_<true>, tokenids>, __gnu_cxx::__normal_iterator<wchar_t *, std::__cxx11::basic_string<wchar_t> >, boost::spirit::lex::lexertl::functor<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t *, std::__cxx11::basic_string<wchar_t> >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t>, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::
    #0 0x44b481 in boost::spirit::lex::lexer<boost::spirit::lex::lexertl::actor_lexer<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mpl_::bool_<true>, tokenids>, __gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::spirit::lex::lexertl::functor<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mpl_::bool_<true>, tokenids>, boost::spirit::lex::lexertl::detail::data, __gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, mpl_::bool_<true>, mpl_::bool_<true> > > >::lexer(unsigned int, tokenids) ~/boost-root/./boost/spirit/home/lex/lexer/lexer.hpp:382:27
    #1 0x43a5cd in mega_tokens<boost::spirit::lex::lexertl::actor_lexer<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mpl_::bool_<true>, tokenids>, __gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::spirit::lex::lexertl::functor<boost::spirit::lex::lexertl::token<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, boost::mpl::vector<wchar_t, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >, double, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, mpl_::bool_<true>, tokenids>, boost::spirit::lex::lexertl::detail::data, __gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >, mpl_::bool_<true>, mpl_::bool_<true> > > >::mega_tokens() ~/boost-root/libs/spirit/test/lex/regression_wide.cpp:89:5
    #2 0x437bd5 in main ~/boost-root/libs/spirit/test/lex/regression_wide.cpp:122:29
    #3 0x7f078b0752e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #4 0x40cff9 in _start (~/boost-root/bin.v2/libs/spirit/test/lex/lex_regression_wide.test/undefined/clang-linux-7/debug/visibility-hidden/lex_regression_wide+0x40cff9)
Lex: Avoid casting `min_token_id` to `id_type` in lexer ctor
When `min_token_id` is not a valid value for `id_type` and `first_id` is not
provided, the lexer constructor triggers UB. To fix this problem, the
constructor with omitted `first_id` initializes `next_token_id` directly, so
if the unique ids feature is not used there is no UB.
@hkaiser
Collaborator

left a comment

LGTM, thanks!

@Kojoley Kojoley merged commit 8e74087 into boostorg:develop Nov 21, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

@Kojoley Kojoley deleted the Kojoley:fix-lexer-constructor-ub branch Nov 21, 2018
