-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
subrule: Fixed a huge use after scope bug #284
Conversation
a4b5c74
to
e357244
Compare
It took me 3 days of pulling my hair to debug and fix the bug. The subrule's author put a big bomb under it and I caught the exposition only in a single test, with modern versions of Clang/GCC and enabled optimizations (most of boost regression test runners does not use `variant=release`). Valgrind did not catch any problems. Enabling address sanitizer causes the bug to disappear (now I understand why, it places big gaps between stack pointers). `-fsanitize-address-use-after-scope` catches the bug, but I have discovered that it is turned off in Clang by default after already having a repro. The bug could be easily reproduced if you use any parser that invalidates it's state in the destructor. I used `literal_char` with added `~literal_char() { ch = char_type(0); }`. Reproduction code: ```cpp rule<char const*> r; subrule<0> entry; r = (entry = 'a'); BOOST_TEST(test("a", r)); ``` It will fail because after assignment to `r` a temporary `subrule_group` is destroyed and rule is binded to the already destroyed object. The cause is in usage of `reference` parser. I am 100% sure the code was copy-pasted from `rule`/`grammar`. They store an expression within their body, while `subrule` actually used only as a placeholder. I have split `subrule_group` into an actual parser/generator that stores the expression and a Proto terminal that contains the parser/generator. Tested on: - VS 2008, 2010, 2015, 2017 - Clang 3.8, 3.9, 4.0, 5.0 - GCC 4.8, 7
e357244
to
b174d1f
Compare
I am going to ping the original |
Oh my... Hmmm... do you think the subrule is still worth keeping? I wonder if anyone is using it. I don't. |
Ouch... Indeed, much of the code was copied from grammar, and also from classic's subrule. I haven't worked with Spirit for years, and I'm afraid I don't remember enough about Spirit and Proto to give any useful feedback, sorry. I remember @djowel already mentioning years ago that the subrule mechanism was probably no longer useful in Qi/Karma (it was there in Classic for performance reasons, which no longer apply in Qi/Karma)... no objection here if you want to retire it. |
I have made some benchmarks on VS 2017.
|
Subrules were broken for about 2 years and never worked on c++11 (#259) so will not expect they are exist, but I think many people tried:
I just found these links and can confirm the problems described in 4 and 5 before this patch, while with it they are fixed. |
Very thoughtful of you to do all these benchmarks! I wonder how X3 fares in comparison and I think for performance concerns such as call overhead, I'd go for X3 instead. The design of X3 is based on the subrules anyway. |
-----------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------
BM_qi_rule/8 366 ns 350 ns 1869646 195.936M items/s qi_rule
BM_qi_rule/64 2753 ns 2753 ns 249286 199.499M items/s qi_rule
BM_qi_rule/512 21786 ns 21903 ns 32051 200.638M items/s qi_rule
BM_qi_rule/4096 171558 ns 172102 ns 4079 204.276M items/s qi_rule
BM_qi_rule/32768 1371057 ns 1344297 ns 499 209.217M items/s qi_rule
BM_qi_rule/65536 2742350 ns 2756644 ns 249 204.052M items/s qi_rule
BM_qi_rule_BigO 2.32 N 2.33 N qi_rule
BM_qi_rule_RMS 0 % 2 % qi_rule
BM_qi_subrule/8 105 ns 106 ns 7478585 645.438M items/s qi_subrule
BM_qi_subrule/64 755 ns 765 ns 897430 718.196M items/s qi_subrule
BM_qi_subrule/512 6001 ns 5980 ns 112179 734.9M items/s qi_subrule
BM_qi_subrule/4096 47801 ns 47281 ns 11218 743.553M items/s qi_subrule
BM_qi_subrule/32768 381998 ns 382398 ns 1795 735.49M items/s qi_subrule
BM_qi_subrule/65536 761447 ns 765222 ns 897 735.081M items/s qi_subrule
BM_qi_subrule_BigO 0.65 N 0.65 N qi_subrule
BM_qi_subrule_RMS 0 % 0 % qi_subrule
BM_x3/8 100 ns 100 ns 7478585 685.778M items/s x3
BM_x3/64 748 ns 747 ns 897430 734.898M items/s x3
BM_x3/512 5941 ns 5980 ns 112179 734.9M items/s x3
BM_x3/4096 47482 ns 47978 ns 14957 732.76M items/s x3
BM_x3/32768 379852 ns 383746 ns 1870 732.907M items/s x3
BM_x3/65536 760884 ns 765222 ns 897 735.081M items/s x3
BM_x3_BigO 0.64 N 0.65 N x3
BM_x3_RMS 0 % 0 % x3 The numbers are different, because for previous benchmark I used Benchmark code#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_subrule.hpp>
#include <benchmark/benchmark.h>
#include <string>
struct Helper
{
Helper(benchmark::State & state_, std::string && label)
: state(state_)
, data()
{
for (auto i = 0; i < state.range(0); ++i)
data = data + "abcbcacab";
state.SetLabel(std::move(label));
}
~Helper()
{
// complexity for "abc" in compare (`lit`) calls is
// a: 1, b: 2, c: 3; "abc" repeats 3 times
state.SetComplexityN(state.range(0) * (1 + 2 + 3) * 3);
state.SetItemsProcessed(state.iterations() * data.size());
}
benchmark::State & state;
std::string data;
};
static void BM_qi_subrule(benchmark::State & state)
{
using boost::spirit::qi::rule;
using boost::spirit::repository::qi::subrule;
using boost::spirit::qi::parse;
Helper helper(state, "qi_subrule");
auto const iter = helper.data.begin();
auto const end = helper.data.end();
using Iterator = decltype(helper.data.end());
subrule<99> entry;
subrule<42> a;
subrule<48> b;
subrule<16> c;
rule<Iterator> start;
start = (
entry = *(a | b | c)
, a = 'a'
, b = 'b'
, c = 'c'
);
for (auto _ : state) {
benchmark::DoNotOptimize(parse(iter, end, start));
}
}
static void BM_qi_rule(benchmark::State & state)
{
using boost::spirit::qi::rule;
using boost::spirit::qi::parse;
Helper helper(state, "qi_rule");
auto const iter = helper.data.begin();
auto const end = helper.data.end();
using Iterator = decltype(helper.data.end());
rule<Iterator> a, b, c, start;
a = 'a';
b = 'b';
c = 'c';
start = *(a | b | c);
for (auto _ : state) {
benchmark::DoNotOptimize(parse(iter, end, start));
}
}
static void BM_x3(benchmark::State & state)
{
using boost::spirit::x3::lit;
using boost::spirit::x3::rule;
using boost::spirit::x3::parse;
auto a = lit('a');
auto b = lit('b');
auto c = lit('c');
rule<class r> r;
auto start =
r = *(a | b | c);
Helper helper(state, "x3");
auto const iter = helper.data.begin();
auto const end = helper.data.end();
using Iterator = decltype(helper.data.end());
for (auto _ : state) {
benchmark::DoNotOptimize(parse(iter, end, start));
}
}
#define MAKE_BENCH(name) \
BENCHMARK(name) \
/*->RangeMultiplier(2)*/ \
->Range(8, 8 << 13) \
->Complexity(benchmark::oN) \
/**/
MAKE_BENCH(BM_qi_rule);
MAKE_BENCH(BM_qi_subrule);
MAKE_BENCH(BM_x3);
BENCHMARK_MAIN() Note: Code will note compile because of #257, I will open a PR for it on the week. |
Very nice! You should add that in the spirit/workbench directory along with the other tests. |
It uses google benchmark library. Is this okay? And what about this PR? I think it is safe to merge because tests pass and even if somehow the PR introduces a regression no one will notice this because subrules are currently in a broken state (what PR should fix) and no-one uses them 😃. |
Hmmm.. probably not. Boost should not require any dependencies. |
It took me 3 days of pulling my hair to debug and fix the bug. The subrule's author put a big bomb under it and I caught the exposition only in a single test, with modern versions of Clang/GCC and enabled optimizations (most of boost regression test runners does not use
variant=release
). Valgrind did not catch any problems. Enabling address sanitizer causes the bug to disappear (now I understand why, it places big gaps between stack pointers).-fsanitize-address-use-after-scope
catches the bug, but I have discovered that it is turned off in Clang by default after already having a repro.The bug could be easily reproduced if you use any parser that invalidates it's state in the destructor.
I used
literal_char
with added~literal_char() { ch = char_type(0); }
.Reproduction code:
It will fail because after assignment to
r
a temporarysubrule_group
is destroyed and rule is binded to the already destroyed object.The cause is in usage of
reference
parser. I am 100% sure the code was copy-pasted fromrule
/grammar
. They store an expression within their body, whilesubrule
actually used only as a placeholder.I have split
subrule_group
into an actual parser/generator that stores the expression and a Proto terminal that contains the parser/generator.Tested on: