Assertion '__b != memory_order_release' failed #864

elfringham · 2021-03-05T18:39:58Z

The ResNet50 test aborts on an AArch64 machine with an A64FX CPU.

$ ./run_local.sh tf resnet50 cpu
INFO:main:Namespace(accuracy=False, backend='tensorflow', cache=0, count=None, data_format=None, dataset='imagenet', dataset_list=None, dataset_path='/home/builder/CK-TOOLS/dataset-imagenet-ilsvrc2012-val-min', find_peak_performance=False, inputs=['input_tensor:0'], max_batchsize=32, max_latency=None, mlperf_conf='../../mlperf.conf', model='/home/builder/mlperf/resnet50_v1.pb', model_name='resnet50', output='/home/builder/1/mlperf_build/mlperf/v0.7/mlperf/vision/classification_and_detection/output/tf-cpu/resnet50', outputs=['ArgMax:0'], profile='resnet50-tf', qps=None, samples_per_query=None, scenario='SingleStream', threads=48, time=None, user_conf='user.conf')
INFO:imagenet:reduced image list, 49500 images not found
INFO:imagenet:loaded 500 images, cache=0, took=2.3sec
INFO:main:starting TestScenario.SingleStream
/usr/local/include/c++/8.4.0/bits/atomic_base.h:393: std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memory_order]: Assertion '__b != memory_order_release' failed.
./run_local.sh: line 13: 3689 Aborted (core dumped) python python/main.py --profile $profile $common_opt --model $model_path $dataset --output $OUTPUT_DIR $EXTRA_OPS $@

Running under gdb gives:
/usr/local/include/c++/8.4.0/bits/atomic_base.h:393: std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memory_order]: Assertion '__b != memory_order_release' failed.

Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000ffffbdeb07a8 in __GI_abort () at abort.c:79
#2 0x0000ffffbd54e5a0 in std::__replacement_assert (
__file=__file@entry=0xffffbd5ea928 "/usr/local/include/c++/8.4.0/bits/atomic_base.h", __line=__line@entry=393,
__function=__function@entry=0xffffbd5e9e48 <std::__atomic_base::load(std::memory_order) const::PRETTY_FUNCTION> "std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memo"...,
__condition=__condition@entry=0xffffbd5ea908 "__b != memory_order_release")
at /usr/local/include/c++/8.4.0/aarch64-unknown-linux-gnu/bits/c++config.h:447
#3 0x0000ffffbd589220 in std::__atomic_base::load (__m=std::memory_order_release, this=)
at /usr/local/include/c++/8.4.0/bits/atomic_base.h:390
#4 mlperf::logging::AsyncLog::GetMaxLatencySoFar (this=) at logging.cc:363
#5 0x0000ffffbd5894b8 in mlperf::logging::Logger::GetMaxLatencySoFar (this=) at logging.cc:646
#6 0x0000ffffbd584fac in mlperf::loadgen::IssueQueries<(mlperf::TestScenario)0, (mlperf::TestMode)2> (
sut=sut@entry=0xaaaacd4dd3b0, settings=..., loaded_sample_set=..., sequence_gen=sequence_gen@entry=0xffffffffcbf0)
at logging.h:638
#7 0x0000ffffbd5851b0 in mlperf::loadgen::RunPerformanceMode<(mlperf::TestScenario)0> (sut=0xaaaacd4dd3b0,
qsl=0xaaaaae31cb50, settings=..., sequence_gen=0xffffffffcbf0)
at /usr/local/include/c++/8.4.0/bits/stl_iterator.h:783
#8 0x0000ffffbd55c888 in mlperf::StartTest (sut=sut@entry=0xaaaacd4dd3b0, qsl=qsl@entry=0xaaaaae31cb50,
requested_settings=..., log_settings=...) at loadgen.cc:1344
#9 0x0000ffffbd59fc08 in mlperf::py::StartTest (sut=sut@entry=187650565591984, qsl=qsl@entry=187650043661136,
test_settings=...) at bindings/python_api.cc:203
#10 0x0000ffffbd5d1950 in pybind11::detail::argument_loader<unsigned long, unsigned long, mlperf::TestSettings>::call_impl<void, void (&)(unsigned long, unsigned long, mlperf::TestSettings), 0ul, 1ul, 2ul, pybind11::detail::void_type> (
f=, this=0xffffffffe1a8) at ../third_party/pybind/include/pybind11/cast.h:1930
#11 pybind11::detail::argument_loader<unsigned long, unsigned long, mlperf::TestSettings>::call<void, pybind11::detail::void_type, void (&)(unsigned long, unsigned long, mlperf::TestSettings)>(void (&)(unsigned long, unsigned long, mlperf::TestSettings)) && (f=, this=0xffffffffe1a8) at ../third_party/pybind/include/pybind11/cast.h:1913
#12 pybind11::cpp_function::initialize<void (&)(unsigned long, unsigned long, mlperf::TestSettings), void, unsigned long, unsigned long, mlperf::TestSettings, pybind11::name, pybind11::scope, pybind11::sibling, char [95]>(void (&)(unsigned long, unsigned long, mlperf::TestSettings), void ()(unsigned long, unsigned long, mlperf::TestSettings), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [95])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (call=..., this=0x0)
at ../third_party/pybind/include/pybind11/pybind11.h:155
#13 pybind11::cpp_function::initialize<void (&)(unsigned long, unsigned long, mlperf::TestSettings), void, unsigned long, unsigned long, mlperf::TestSettings, pybind11::name, pybind11::scope, pybind11::sibling, char [95]>(void (&)(unsigned long, unsigned long, mlperf::TestSettings), void (*)(unsigned long, unsigned long, mlperf::TestSettings), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [95])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) () at ../third_party/pybind/include/pybind11/pybind11.h:133
#14 0x0000ffffbd5cab98 in pybind11::cpp_function::dispatcher (self=,
args_in=(187650565591984, 187650043661136, <mlperf_loadgen.TestSettings at remote 0xffffbc47e030>), kwargs_in=0x0)
at ../third_party/pybind/include/pybind11/pybind11.h:620
#15 0x0000ffffbe2e81e4 in _PyCFunction_FastCallDict (kwargs=, nargs=,
args=, func_obj=<built-in method StartTest of PyCapsule object at remote 0xffffbd64dc00>)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pystate.c:533
#16 _PyCFunction_FastCallKeywords (kwnames=, nargs=, stack=,
func=) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Objects/methodobject.c:294
#17 call_function (pp_stack=pp_stack@entry=0xffffffffe618, oparg=, kwnames=kwnames@entry=0x0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4851
#18 0x0000ffffbe2e8f2c in _PyEval_EvalFrameDefault (f=, throwflag=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:3335
#19 0x0000ffffbe22f984 in PyEval_EvalFrameEx (throwflag=0, f=
Frame 0xaaaaab024348, for file python/main.py, line 523, in main (args=<Namespace(dataset='imagenet', dataset_path='/home/builder/CK-TOOLS/dataset-imagenet-ilsvrc2012-val-min', dataset_list=None, data_format=None, profile='resnet50-tf', scenario='SingleStream', max_batchsize=32, model='/home/builder/mlperf/resnet50_v1.pb', output='/home/builder/1/mlperf_build/mlperf/v0.7/mlperf/vision/classification_and_detection/output/tf-cpu/resnet50', inputs=['input_tensor:0'], outputs=['ArgMax:0'], backend='tensorflow', model_name='resnet50', threads=48, qps=None, cache=0, accuracy=False, find_peak_performance=False, mlperf_conf='../../mlperf.conf', user_conf='user.conf', time=None, count=None, max_latency=None, samples_per_query=None) at remote 0xffffb9ce5a20>, backend=<BackendTensorflow(inputs=[...], outputs=[...], sess=<Session(_graph=<Graph(_lock=<_thread.RLock at remote 0xffffa9ac2900>, _group_lock=<GroupLock(_ready=<Condition(_lock=<_thread.lock at remote 0xffffa9a19828>, acquire=<built-in method acquire of _thread....(truncated))
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#20 _PyEval_EvalCodeWithName (_co=, globals=, locals=,
args=, argcount=, kwnames=, kwargs=,
kwcount=, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='main', qualname='main')
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#21 0x0000ffffbe2c1218 in fast_function (func=func@entry=<function at remote 0xffffb9d9e6a8>, stack=0xaaaaaab32778,
nargs=nargs@entry=0, kwnames=kwnames@entry=0x0) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4984
#22 0x0000ffffbe2e7f54 in call_function (pp_stack=pp_stack@entry=0xffffffffe9d8, oparg=,
kwnames=kwnames@entry=0x0) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4872
#23 0x0000ffffbe2e8f2c in _PyEval_EvalFrameDefault (f=, throwflag=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:3335
#24 0x0000ffffbe22f240 in PyEval_EvalFrameEx (throwflag=0,
f=Frame 0xaaaaaab325f8, for file python/main.py, line 545, in ())
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#25 _PyEval_EvalCodeWithName (_co=, globals=, locals=,
args=, argcount=, kwnames=, kwargs=,
kwcount=, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#26 0x0000ffffbe230d78 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0,
argcount=0, args=0x0, locals=, globals=, _co=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4187
#27 PyEval_EvalCode (co=, globals=, locals=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:731
#28 0x0000ffffbe3817a4 in run_mod (mod=mod@entry=0xaaaaaabb8060, filename=filename@entry='python/main.py',
globals=globals@entry={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
locals=locals@entry={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
flags=flags@entry=0xffffffffed20, arena=arena@entry=0xffffbda61330)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:1025
#29 0x0000ffffbe2115a8 in PyRun_FileExFlags (fp=0xaaaaaaad03c0, filename_str=, start=,
globals={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
locals={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated), closeit=1, flags=0xffffffffed20)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:978
#30 0x0000ffffbe2141e4 in PyRun_SimpleFileExFlags (fp=0xaaaaaaad03c0, filename=0xffffbd910ce0 "python/main.py",
closeit=1, flags=0xffffffffed20) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:419
#31 0x0000ffffbe389f4c in run_file (p_cf=0xffffffffed20, filename=0xaaaaaaad5920 L"python/main.py", fp=0xaaaaaaad03c0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Modules/main.c:344
#32 Py_Main (argc=, argv=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Modules/main.c:814
#33 0x0000aaaaaaaa0d08 in main (argc=12, argv=0xffffffffef58)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Programs/python.c:102
(gdb)

christ1ne · 2021-03-15T22:46:28Z

Is this issue present on Ubuntu for you?

tjablin · 2021-03-15T23:01:39Z

loadgen/logging.cc:463 says:

QuerySampleLatency AsyncLog::GetMaxLatencySoFar() {
  return max_latency_.load(std::memory_order_release);
}

max_latency_ is defined as:
std::atomic<QuerySampleLatency> max_latency_{0};

The C++17 specification 32.6.1 says:

The order argument shall not be memory_order_release nor memory_order_acq_rel.

This code has always been wrong. I think we got away with it in the past because most people are either using the clang headers or are compiling out the assertions.
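
For illustration, here is a minimal sketch of a well-formed load, assuming QuerySampleLatency is a 64-bit integer (the failing instantiation in the backtrace uses long int). LatencyTracker and SetMaxLatency are hypothetical names, not loadgen's, and this is not necessarily the change that was eventually merged. A load may use memory_order_relaxed, memory_order_consume, memory_order_acquire, or memory_order_seq_cst; memory_order_acquire is the one that pairs with a release store.

```cpp
#include <atomic>
#include <cstdint>

// Assumption: stand-in for loadgen's QuerySampleLatency typedef.
using QuerySampleLatency = std::int64_t;

// Hypothetical class used only to illustrate the ordering rules.
class LatencyTracker {
 public:
  // Loads must not use release or acq_rel. Acquire makes writes that
  // happened before the matching release store visible to this thread.
  QuerySampleLatency GetMaxLatencySoFar() const {
    return max_latency_.load(std::memory_order_acquire);
  }

  // Stores are where memory_order_release belongs.
  void SetMaxLatency(QuerySampleLatency value) {
    max_latency_.store(value, std::memory_order_release);
  }

 private:
  std::atomic<QuerySampleLatency> max_latency_{0};
};
```

If the value is only a statistic and no other data is published through it, memory_order_relaxed would also be a valid choice for the load.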

tjablin · 2021-03-15T23:05:27Z

The bug was introduced by #238.

guschmue · 2021-03-16T00:23:17Z

I think I ran into this on Windows debug builds and have a local fix. I could send a PR for it.

tjablin · 2021-03-16T00:41:13Z

The code for computing atomic max also seems busted.
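
For reference, an atomic max is usually written as a compare-exchange loop. The sketch below shows that general pattern; AtomicMax is a hypothetical helper, not the loadgen code and not necessarily what #878 does.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: raise `target` to at least `candidate`.
inline void AtomicMax(std::atomic<std::int64_t>& target,
                      std::int64_t candidate) {
  std::int64_t current = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak writes the value it observed into
  // `current`, so each retry compares against the latest stored value.
  // The loop ends once the stored value is already >= candidate or the
  // exchange succeeds.
  while (current < candidate &&
         !target.compare_exchange_weak(current, candidate,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
  }
}
```

The failure ordering is relaxed because a failed exchange is only a load, and a load may not use memory_order_release.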

tjablin · 2021-03-16T00:59:12Z

I wrote #878 to address this issue.

tjablin · 2021-03-24T21:54:33Z

Since #878 was merged, I think it is safe to close this.

christ1ne added the inference v2.1 and backlog labels Mar 15, 2021

tjablin closed this as completed Mar 24, 2021
