Assertion '__b != memory_order_release' failed #864

Closed
elfringham opened this issue Mar 5, 2021 · 7 comments

Comments

@elfringham

Running the ResNet50 test on an AArch64 machine with an A64FX CPU aborts with a failed assertion.

$ ./run_local.sh tf resnet50 cpu
INFO:main:Namespace(accuracy=False, backend='tensorflow', cache=0, count=None, data_format=None, dataset='imagenet', dataset_list=None, dataset_path='/home/builder/CK-TOOLS/dataset-imagenet-ilsvrc2012-val-min', find_peak_performance=False, inputs=['input_tensor:0'], max_batchsize=32, max_latency=None, mlperf_conf='../../mlperf.conf', model='/home/builder/mlperf/resnet50_v1.pb', model_name='resnet50', output='/home/builder/1/mlperf_build/mlperf/v0.7/mlperf/vision/classification_and_detection/output/tf-cpu/resnet50', outputs=['ArgMax:0'], profile='resnet50-tf', qps=None, samples_per_query=None, scenario='SingleStream', threads=48, time=None, user_conf='user.conf')
INFO:imagenet:reduced image list, 49500 images not found
INFO:imagenet:loaded 500 images, cache=0, took=2.3sec
INFO:main:starting TestScenario.SingleStream
/usr/local/include/c++/8.4.0/bits/atomic_base.h:393: std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memory_order]: Assertion '__b != memory_order_release' failed.
./run_local.sh: line 13: 3689 Aborted (core dumped) python python/main.py --profile $profile $common_opt --model $model_path $dataset --output $OUTPUT_DIR $EXTRA_OPS $@

Running under gdb gives:
/usr/local/include/c++/8.4.0/bits/atomic_base.h:393: std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memory_order]: Assertion '__b != memory_order_release' failed.

Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x0000ffffbdeb07a8 in __GI_abort () at abort.c:79
#2 0x0000ffffbd54e5a0 in std::__replacement_assert (
__file=__file@entry=0xffffbd5ea928 "/usr/local/include/c++/8.4.0/bits/atomic_base.h", __line=__line@entry=393,
__function=__function@entry=0xffffbd5e9e48 <std::__atomic_base::load(std::memory_order) const::PRETTY_FUNCTION> "std::__atomic_base<_IntTp>::__int_type std::__atomic_base<_IntTp>::load(std::memory_order) const [with _ITp = long int; std::__atomic_base<_IntTp>::__int_type = long int; std::memory_order = std::memo"...,
__condition=__condition@entry=0xffffbd5ea908 "__b != memory_order_release")
at /usr/local/include/c++/8.4.0/aarch64-unknown-linux-gnu/bits/c++config.h:447
#3 0x0000ffffbd589220 in std::__atomic_base::load (__m=std::memory_order_release, this=)
at /usr/local/include/c++/8.4.0/bits/atomic_base.h:390
#4 mlperf::logging::AsyncLog::GetMaxLatencySoFar (this=) at logging.cc:363
#5 0x0000ffffbd5894b8 in mlperf::logging::Logger::GetMaxLatencySoFar (this=) at logging.cc:646
#6 0x0000ffffbd584fac in mlperf::loadgen::IssueQueries<(mlperf::TestScenario)0, (mlperf::TestMode)2> (
sut=sut@entry=0xaaaacd4dd3b0, settings=..., loaded_sample_set=..., sequence_gen=sequence_gen@entry=0xffffffffcbf0)
at logging.h:638
#7 0x0000ffffbd5851b0 in mlperf::loadgen::RunPerformanceMode<(mlperf::TestScenario)0> (sut=0xaaaacd4dd3b0,
qsl=0xaaaaae31cb50, settings=..., sequence_gen=0xffffffffcbf0)
at /usr/local/include/c++/8.4.0/bits/stl_iterator.h:783
#8 0x0000ffffbd55c888 in mlperf::StartTest (sut=sut@entry=0xaaaacd4dd3b0, qsl=qsl@entry=0xaaaaae31cb50,
requested_settings=..., log_settings=...) at loadgen.cc:1344
#9 0x0000ffffbd59fc08 in mlperf::py::StartTest (sut=sut@entry=187650565591984, qsl=qsl@entry=187650043661136,
test_settings=...) at bindings/python_api.cc:203
#10 0x0000ffffbd5d1950 in pybind11::detail::argument_loader<unsigned long, unsigned long, mlperf::TestSettings>::call_impl<void, void (&)(unsigned long, unsigned long, mlperf::TestSettings), 0ul, 1ul, 2ul, pybind11::detail::void_type> (
f=, this=0xffffffffe1a8) at ../third_party/pybind/include/pybind11/cast.h:1930
#11 pybind11::detail::argument_loader<unsigned long, unsigned long, mlperf::TestSettings>::call<void, pybind11::detail::void_type, void (
&)(unsigned long, unsigned long, mlperf::TestSettings)>(void (&)(unsigned long, unsigned long, mlperf::TestSettings)) && (f=, this=0xffffffffe1a8) at ../third_party/pybind/include/pybind11/cast.h:1913
#12 pybind11::cpp_function::initialize<void (
&)(unsigned long, unsigned long, mlperf::TestSettings), void, unsigned long, unsigned long, mlperf::TestSettings, pybind11::name, pybind11::scope, pybind11::sibling, char [95]>(void (&)(unsigned long, unsigned long, mlperf::TestSettings), void ()(unsigned long, unsigned long, mlperf::TestSettings), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [95])::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (call=..., this=0x0)
at ../third_party/pybind/include/pybind11/pybind11.h:155
#13 pybind11::cpp_function::initialize<void (&)(unsigned long, unsigned long, mlperf::TestSettings), void, unsigned long, unsigned long, mlperf::TestSettings, pybind11::name, pybind11::scope, pybind11::sibling, char [95]>(void (&)(unsigned long, unsigned long, mlperf::TestSettings), void (*)(unsigned long, unsigned long, mlperf::TestSettings), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, char const (&) [95])::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) () at ../third_party/pybind/include/pybind11/pybind11.h:133
#14 0x0000ffffbd5cab98 in pybind11::cpp_function::dispatcher (self=,
args_in=(187650565591984, 187650043661136, <mlperf_loadgen.TestSettings at remote 0xffffbc47e030>), kwargs_in=0x0)
at ../third_party/pybind/include/pybind11/pybind11.h:620
#15 0x0000ffffbe2e81e4 in _PyCFunction_FastCallDict (kwargs=, nargs=,
args=, func_obj=<built-in method StartTest of PyCapsule object at remote 0xffffbd64dc00>)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pystate.c:533
#16 _PyCFunction_FastCallKeywords (kwnames=, nargs=, stack=,
func=) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Objects/methodobject.c:294
#17 call_function (pp_stack=pp_stack@entry=0xffffffffe618, oparg=, kwnames=kwnames@entry=0x0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4851
#18 0x0000ffffbe2e8f2c in _PyEval_EvalFrameDefault (f=, throwflag=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:3335
#19 0x0000ffffbe22f984 in PyEval_EvalFrameEx (throwflag=0, f=
Frame 0xaaaaab024348, for file python/main.py, line 523, in main (args=<Namespace(dataset='imagenet', dataset_path='/home/builder/CK-TOOLS/dataset-imagenet-ilsvrc2012-val-min', dataset_list=None, data_format=None, profile='resnet50-tf', scenario='SingleStream', max_batchsize=32, model='/home/builder/mlperf/resnet50_v1.pb', output='/home/builder/1/mlperf_build/mlperf/v0.7/mlperf/vision/classification_and_detection/output/tf-cpu/resnet50', inputs=['input_tensor:0'], outputs=['ArgMax:0'], backend='tensorflow', model_name='resnet50', threads=48, qps=None, cache=0, accuracy=False, find_peak_performance=False, mlperf_conf='../../mlperf.conf', user_conf='user.conf', time=None, count=None, max_latency=None, samples_per_query=None) at remote 0xffffb9ce5a20>, backend=<BackendTensorflow(inputs=[...], outputs=[...], sess=<Session(_graph=<Graph(_lock=<_thread.RLock at remote 0xffffa9ac2900>, _group_lock=<GroupLock(_ready=<Condition(_lock=<_thread.lock at remote 0xffffa9a19828>, acquire=<built-in method acquire of _thread....(truncated))
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#20 _PyEval_EvalCodeWithName (_co=, globals=, locals=,
args=, argcount=, kwnames=, kwargs=,
kwcount=, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name='main', qualname='main')
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#21 0x0000ffffbe2c1218 in fast_function (func=func@entry=<function at remote 0xffffb9d9e6a8>, stack=0xaaaaaab32778,
nargs=nargs@entry=0, kwnames=kwnames@entry=0x0) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4984
#22 0x0000ffffbe2e7f54 in call_function (pp_stack=pp_stack@entry=0xffffffffe9d8, oparg=,
kwnames=kwnames@entry=0x0) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4872
#23 0x0000ffffbe2e8f2c in _PyEval_EvalFrameDefault (f=, throwflag=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:3335
#24 0x0000ffffbe22f240 in PyEval_EvalFrameEx (throwflag=0,
f=Frame 0xaaaaaab325f8, for file python/main.py, line 545, in ())
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#25 _PyEval_EvalCodeWithName (_co=, globals=, locals=,
args=, argcount=, kwnames=, kwargs=,
kwcount=, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4166
#26 0x0000ffffbe230d78 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0,
argcount=0, args=0x0, locals=, globals=, _co=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:4187
#27 PyEval_EvalCode (co=, globals=, locals=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/ceval.c:731
#28 0x0000ffffbe3817a4 in run_mod (mod=mod@entry=0xaaaaaabb8060, filename=filename@entry='python/main.py',
globals=globals@entry={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
locals=locals@entry={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
flags=flags@entry=0xffffffffed20, arena=arena@entry=0xffffbda61330)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:1025
#29 0x0000ffffbe2115a8 in PyRun_FileExFlags (fp=0xaaaaaaad03c0, filename_str=, start=,
globals={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated),
locals={'name': 'main', 'doc': '\nmlperf inference benchmarking tool\n', 'package': None, 'loader': <SourceFileLoader(name='main', path='python/main.py') at remote 0xffffbda029b0>, 'spec': None, 'annotations': {}, 'builtins': <module at remote 0xffffbdac9638>, 'file': 'python/main.py', 'cached': None, 'division': <_Feature(optional=(2, 2, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=8192) at remote 0xffffbd84fb70>, 'print_function': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=65536) at remote 0xffffbd84fc18>, 'unicode_literals': <_Feature(optional=(2, 6, 0, 'alpha', 2), mandatory=(3, 0, 0, 'alpha', 0), compiler_flag=131072) at remote 0xffffbd84fc50>, 'argparse': <module at remote 0xffffbd8517c8>, 'array': <module at remote 0xffffbd85c458>, 'collections': <module at remote 0xffffbd957458>, 'json': <module at remote 0xffffbd6f1868>, 'logging': <module at remote 0xffffbd6fa778>, 'os': <module at remote 0xffffbd992...(truncated), closeit=1, flags=0xffffffffed20)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:978
#30 0x0000ffffbe2141e4 in PyRun_SimpleFileExFlags (fp=0xaaaaaaad03c0, filename=0xffffbd910ce0 "python/main.py",
closeit=1, flags=0xffffffffed20) at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Python/pythonrun.c:419
#31 0x0000ffffbe389f4c in run_file (p_cf=0xffffffffed20, filename=0xaaaaaaad5920 L"python/main.py", fp=0xaaaaaaad03c0)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Modules/main.c:344
#32 Py_Main (argc=, argv=)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Modules/main.c:814
#33 0x0000aaaaaaaa0d08 in main (argc=12, argv=0xffffffffef58)
at /usr/src/debug/python3-3.6.8-31.el8.aarch64/Programs/python.c:102
(gdb)

@christ1ne
Contributor

Is this issue present on Ubuntu for you?

@tjablin
Contributor

tjablin commented Mar 15, 2021

loadgen/logging.cc:463 says:

QuerySampleLatency AsyncLog::GetMaxLatencySoFar() {
  return max_latency_.load(std::memory_order_release);
}

max_latency_ is defined as:
std::atomic<QuerySampleLatency> max_latency_{0};

The C++17 specification, section 32.6.1, says of atomic load operations:

The order argument shall not be memory_order_release nor memory_order_acq_rel.

This code has always been wrong. I think we got away with it in the past because most people are either using the clang headers or are compiling out the assertions.
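
As a minimal sketch of what a conforming accessor could look like (not necessarily the fix that was eventually merged), the load can use memory_order_acquire, which is one of the orderings the standard permits for a load. The class name AsyncLogSketch, the SetMaxLatency helper, and the int64_t alias for QuerySampleLatency are assumptions made for illustration; only GetMaxLatencySoFar and max_latency_ come from the quoted code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption: a 64-bit integer stands in for loadgen's QuerySampleLatency
// (the backtrace shows the atomic holding a long int).
using QuerySampleLatency = std::int64_t;

// Hypothetical stand-in for the relevant part of AsyncLog.
class AsyncLogSketch {
 public:
  // Hypothetical writer: a store may legitimately use release ordering.
  void SetMaxLatency(QuerySampleLatency latency) {
    max_latency_.store(latency, std::memory_order_release);
  }

  // A load may use relaxed, consume, acquire, or seq_cst ordering,
  // but not release or acq_rel. Acquire pairs with the release store above.
  QuerySampleLatency GetMaxLatencySoFar() const {
    return max_latency_.load(std::memory_order_acquire);
  }

 private:
  std::atomic<QuerySampleLatency> max_latency_{0};
};

// Small usage example: the reader observes the value the writer stored.
inline void ExampleUsage() {
  AsyncLogSketch log;
  log.SetMaxLatency(1500);
  assert(log.GetMaxLatencySoFar() == 1500);
}
```

With libstdc++, the original memory_order_release argument only aborts when the library's assertions are compiled in, which is consistent with the observation that the bug went unnoticed elsewhere.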

@tjablin
Contributor

tjablin commented Mar 15, 2021

The bug was introduced by #238.

@guschmue
Contributor

I think I ran into this for Windows debug builds and have a local fix. I could send a PR for it.

@tjablin
Contributor

tjablin commented Mar 16, 2021

The code for computing atomic max also seems busted.
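
The thread does not show the atomic max code itself, so the following is only a generic sketch of the usual compare-and-swap pattern for maintaining an atomic maximum, not the loadgen implementation or the change in #878. The function name AtomicMax and the int64_t alias for QuerySampleLatency are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumption: a 64-bit integer stands in for loadgen's QuerySampleLatency.
using QuerySampleLatency = std::int64_t;

// Hypothetical helper: raises `target` to `candidate` if `candidate` is larger.
inline void AtomicMax(std::atomic<QuerySampleLatency>& target,
                      QuerySampleLatency candidate) {
  QuerySampleLatency current = target.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak writes the freshly observed value into
  // `current`, so the loop re-checks whether the update is still needed.
  while (current < candidate &&
         !target.compare_exchange_weak(current, candidate,
                                       std::memory_order_release,
                                       std::memory_order_relaxed)) {
  }
}

// Small usage example: smaller candidates never lower the stored maximum.
inline void AtomicMaxExample() {
  std::atomic<QuerySampleLatency> max_latency{0};
  AtomicMax(max_latency, 7);
  AtomicMax(max_latency, 3);
  assert(max_latency.load(std::memory_order_acquire) == 7);
}
```

The failure ordering is relaxed because a failed exchange performs only a load, and a load cannot take release ordering, which is the same rule the assertion in this issue enforces.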

@tjablin
Contributor

tjablin commented Mar 16, 2021

I wrote #878 to address this issue.

@tjablin
Contributor

tjablin commented Mar 24, 2021

Since #878 was merged, I think it is safe to close this.

@tjablin tjablin closed this as completed Mar 24, 2021