Linux AArch64 RuntimeDyld relocation overflows (#8567 specific to Linux only) #9001

Closed
gmarkall opened this issue Jun 5, 2023 · 9 comments

@gmarkall
Member

gmarkall commented Jun 5, 2023

I'm breaking out this issue from #8567 to avoid spamming everyone who participated in that issue with notifications as I add notes and debugging info. In summary, on Linux AArch64, running the following script:

test_standalone.py
import numpy as np
from nvidia.dali import pipeline_def
import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as dali_types
from nvidia.dali.plugin.numba.fn.experimental import numba_function


def set_all_values_to_255_batch(out0, in0):
    out0[0][:] = 255


def set_all_values_to_255_sample(out0, in0):
    out0[:] = 255


def set_all_values_to_float_batch(out0, in0):
    out0[0][:] = 0.5


def set_all_values_to_float_sample(out0, in0):
    out0[:] = 0.5


def setup_change_out_shape(out_shape, in_shape):
    out0_shape = out_shape[0]
    in0_shape = in_shape[0]
    perm = [1, 2, 0]
    for sample_idx in range(len(out0_shape)):
        for d in range(len(perm)):
            out0_shape[sample_idx][d] = in0_shape[sample_idx][perm[d]]


def change_out_shape_batch(out0, in0):
    for sample_id in range(len(out0)):
        out0[sample_id][:] = 42


def change_out_shape_sample(out0, in0):
    out0[:] = 42


def get_data(shapes, dtype):
    return [np.empty(shape, dtype=dtype) for shape in shapes]


def get_data_zeros(shapes, dtype):
    return [np.zeros(shape, dtype=dtype) for shape in shapes]

@pipeline_def
def numba_func_pipe(shapes, dtype, run_fn=None, out_types=None, in_types=None,
                    outs_ndim=None, ins_ndim=None, setup_fn=None, batch_processing=None):
    data = fn.external_source(lambda: get_data(shapes, dtype), batch=True, device="cpu")
    return numba_function(
        data, run_fn=run_fn, out_types=out_types, in_types=in_types,
        outs_ndim=outs_ndim, ins_ndim=ins_ndim, setup_fn=setup_fn,
        batch_processing=batch_processing)


def _testimpl_numba_func(shapes, dtype, run_fn, out_types, in_types,
                         outs_ndim, ins_ndim, setup_fn, batch_processing, expected_out):
    batch_size = len(shapes)
    pipe = numba_func_pipe(
        batch_size=batch_size, num_threads=1, device_id=0,
        shapes=shapes, dtype=dtype,
        run_fn=run_fn, setup_fn=setup_fn, out_types=out_types,
        in_types=in_types, outs_ndim=outs_ndim, ins_ndim=ins_ndim,
        batch_processing=batch_processing)
    pipe.build()
    for _ in range(3):
        outs = pipe.run()
        for i in range(batch_size):
            out_arr = np.array(outs[0][i])
            assert np.array_equal(out_arr, expected_out[i])

def test_numba_func():
    # shape, dtype, run_fn, out_types,
    # in_types, out_ndim, in_ndim, setup_fn, batch_processing,
    # expected_out
    args = [
        ([(10, 10, 10)], np.uint8, set_all_values_to_255_batch, [dali_types.UINT8],
         [dali_types.UINT8], [3], [3], None, True,
         [np.full((10, 10, 10), 255, dtype=np.uint8)]),
        ([(10, 10, 10)], np.uint8, set_all_values_to_255_sample, [dali_types.UINT8],
         [dali_types.UINT8], [3], [3], None, None,
         [np.full((10, 10, 10), 255, dtype=np.uint8)]),
        ([(10, 10, 10)], np.float32, set_all_values_to_float_batch, [dali_types.FLOAT],
         [dali_types.FLOAT], [3], [3], None, True,
         [np.full((10, 10, 10), 0.5, dtype=np.float32)]),
        ([(10, 10, 10)], np.float32, set_all_values_to_float_sample, [dali_types.FLOAT],
         [dali_types.FLOAT], [3], [3], None, None,
         [np.full((10, 10, 10), 0.5, dtype=np.float32)]),
        ([(10, 20, 30), (20, 10, 30)], np.int64, change_out_shape_batch, [dali_types.INT64],
         [dali_types.INT64], [3], [3], setup_change_out_shape, True,
         [np.full((20, 30, 10), 42, dtype=np.int32),
          np.full((10, 30, 20), 42, dtype=np.int32)]),
        ([(10, 20, 30), (20, 10, 30)], np.int64, change_out_shape_sample, [dali_types.INT64],
         [dali_types.INT64], [3], [3], setup_change_out_shape, None,
         [np.full((20, 30, 10), 42, dtype=np.int32),
          np.full((10, 30, 20), 42, dtype=np.int32)]),
    ]

    for shape, dtype, run_fn, out_types, in_types, outs_ndim, ins_ndim, \
            setup_fn, batch_processing, expected_out in args:
        _testimpl_numba_func(
            shape, dtype, run_fn, out_types, in_types, outs_ndim, ins_ndim, \
            setup_fn, batch_processing, expected_out
        )

test_numba_func()

results in:

$ python test_standalone.py 
python: /root/miniconda3/envs/buildenv/conda-bld/llvmdev_1680642098205/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:507: void llvm::RuntimeDyldELF::resolveAArch64Relocation(const llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion `isInt<33>(Result) && "overflow check failed for relocation"' failed.
Aborted (core dumped)
@gmarkall
Member Author

gmarkall commented Jun 5, 2023

It turns out that on Linux, in a Docker container, I can hit the assertion with an even smaller test than the original reproducer from #8567:

$ NUMBA_OPT=0 python -m unittest -vb numba.tests.test_stencils.TestManyStencils.test_basic40 

test_basic40 (numba.tests.test_stencils.TestManyStencils)
2 args! ... python: /root/miniconda3/envs/buildenv/conda-bld/llvmdev_1680642098205/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:507: void llvm::RuntimeDyldELF::resolveAArch64Relocation(const llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion `isInt<33>(Result) && "overflow check failed for relocation"' failed.
Fatal Python error: Aborted

Current thread 0x0000ffffb74d4010 (most recent call first):
  File "/usr/local/lib/python3.8/dist-packages/llvmlite/binding/ffi.py", line 152 in __call__
  File "/usr/local/lib/python3.8/dist-packages/llvmlite/binding/executionengine.py", line 92 in finalize_object
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 1060 in wrapper
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 999 in _finalize_specific
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 797 in _finalize_final_module
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 765 in finalize
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 567 in _ensure_finalized
  File "/usr/local/lib/python3.8/dist-packages/numba/core/codegen.py", line 989 in get_pointer_to_function
  File "/usr/local/lib/python3.8/dist-packages/numba/core/cpu.py", line 236 in get_executable
  File "/usr/local/lib/python3.8/dist-packages/numba/core/typed_passes.py", line 495 in run_pass
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler_machinery.py", line 273 in check
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler_machinery.py", line 311 in _runPass
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler_lock.py", line 35 in _acquire_compile_lock
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler_machinery.py", line 356 in run
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler.py", line 494 in _compile_core
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler.py", line 528 in _compile_bytecode
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler.py", line 460 in compile_extra
  File "/usr/local/lib/python3.8/dist-packages/numba/core/compiler.py", line 742 in compile_extra
  File "/usr/local/lib/python3.8/dist-packages/numba/tests/test_stencils.py", line 96 in _compile_this
  File "/usr/local/lib/python3.8/dist-packages/numba/tests/test_stencils.py", line 105 in compile_parallel
  File "/usr/local/lib/python3.8/dist-packages/numba/tests/test_stencils.py", line 749 in check_against_expected
  File "/usr/local/lib/python3.8/dist-packages/numba/tests/test_stencils.py", line 1844 in test_basic40
  File "/usr/lib/python3.8/unittest/case.py", line 633 in _callTestMethod
  File "/usr/lib/python3.8/unittest/case.py", line 676 in run
  File "/usr/lib/python3.8/unittest/case.py", line 736 in __call__
  File "/usr/lib/python3.8/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.8/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.8/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.8/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.8/unittest/runner.py", line 176 in run
  File "/usr/lib/python3.8/unittest/main.py", line 271 in runTests
  File "/usr/lib/python3.8/unittest/main.py", line 101 in __init__
  File "/usr/lib/python3.8/unittest/__main__.py", line 18 in <module>
  File "/usr/lib/python3.8/runpy.py", line 87 in _run_code
  File "/usr/lib/python3.8/runpy.py", line 194 in _run_module_as_main

@gmarkall
Member Author

gmarkall commented Jun 7, 2023

Witnessed while running the whole test suite (python -m numba.runtests -m) in a desperate attempt to trigger the bug on the Ampere 80 core machine:

test_2d_shape_dtypes (numba.tests.test_dyn_array.TestNdZeros) ... ok
test_1d_dtype_str_alternative_spelling (numba.tests.test_dyn_array.TestNdZeros) ... ok
test_alloc_size (numba.tests.test_dyn_array.TestNdZeros) ... ok
test_1d_dtype_str (numba.tests.test_dyn_array.TestNdZeros) ... ok
test_hash (numba.tests.test_enums.TestIntEnum) ... ok
test_1d_with_dtype (numba.tests.test_dyn_array.TestNpArray) ... ok
test_1d_with_str_dtype (numba.tests.test_dyn_array.TestNpArray) ... ok
test_entrypoint_extension_sequence (numba.tests.test_entrypoints.TestEntrypoints) ... ok
python: /root/miniconda3/envs/buildenv/conda-bld/llvmdev_1680642098205/work/llvm/include/llvm/ADT/ArrayRef.h:255: const T& llvm::ArrayRef<T>::operator[](size_t) const [with T = llvm::MachineInstr*; size_t = long unsigned int]: Assertion `Index < Length && "Invalid index!"' failed.
Fatal Python error: Aborted

Current thread 0x0000ffff943b2750 (most recent call first):
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/llvmlite/binding/ffi.py", line 152 in __call__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/llvmlite/binding/executionengine.py", line 92 in finalize_object
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 1060 in wrapper
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 999 in _finalize_specific
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 797 in _finalize_final_module
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 765 in finalize
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 567 in _ensure_finalized
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/codegen.py", line 989 in get_pointer_to_function
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/cpu.py", line 236 in get_executable
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typed_passes.py", line 495 in run_pass
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 273 in check
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 311 in _runPass
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_lock.py", line 35 in _acquire_compile_lock
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 356 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 494 in _compile_core
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 528 in _compile_bytecode
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 460 in compile_extra
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 742 in compile_extra
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 152 in _compile_core
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 139 in _compile_cached
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 125 in compile
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 965 in compile
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 363 in get_call_template
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/types/functions.py", line 541 in get_call_type
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/templates.py", line 817 in _build_impl
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/templates.py", line 713 in _get_impl
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/templates.py", line 614 in generic
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/templates.py", line 351 in apply
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/types/functions.py", line 308 in get_call_type
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/context.py", line 248 in _resolve_user_function_type
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typing/context.py", line 196 in resolve_function_type
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typeinfer.py", line 1557 in resolve_call
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typeinfer.py", line 601 in resolve
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typeinfer.py", line 578 in __call__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typeinfer.py", line 155 in propagate
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typeinfer.py", line 1078 in propagate
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typed_passes.py", line 88 in type_inference_stage
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/typed_passes.py", line 110 in run_pass
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 273 in check
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 311 in _runPass
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_lock.py", line 35 in _acquire_compile_lock
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler_machinery.py", line 356 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 494 in _compile_core
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 528 in _compile_bytecode
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 460 in compile_extra
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/compiler.py", line 742 in compile_extra
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 152 in _compile_core
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 139 in _compile_cached
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 125 in compile
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 965 in compile
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/core/dispatcher.py", line 420 in _compile_for_args
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/tests/test_dyn_array.py", line 1421 in test_2d
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/case.py", line 633 in _callTestMethod
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/case.py", line 676 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/case.py", line 736 in __call__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 701 in __call__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/pool.py", line 125 in worker
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/process.py", line 108 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/process.py", line 315 in _bootstrap
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/popen_fork.py", line 75 in _launch
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/popen_fork.py", line 19 in __init__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/context.py", line 277 in _Popen
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/process.py", line 121 in start
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/pool.py", line 326 in _repopulate_pool_static
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/pool.py", line 303 in _repopulate_pool
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/pool.py", line 212 in __init__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/multiprocessing/context.py", line 119 in Pool
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 778 in _run_inner
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/runner.py", line 176 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 831 in run
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/main.py", line 271 in runTests
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 362 in run_tests_real
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 377 in runTests
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/unittest/main.py", line 101 in __init__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/main.py", line 204 in __init__
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/__init__.py", line 54 in run_tests
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/testing/_runtests.py", line 25 in _main
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/site-packages/numba/runtests.py", line 9 in <module>
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/runpy.py", line 87 in _run_code
  File "/raid/mambaforge/envs/numba-9001-pip/lib/python3.8/runpy.py", line 194 in _run_module_as_main

@gmarkall
Member Author

gmarkall commented Jun 7, 2023

One failure mode seems to be an attempt to resolve a relocation for an ADRP instruction that references the GOT entry for _Py_NoneStruct, which is more than 4GB away. Assembly from the Numba dump, followed by the GDB disassembly of a matching function:

        .type   _ZN7cpython5numba5tests23test_array_manipulation19numpy_fill_diagonalB3v96B38c8tJTIcFKzyF2ILShI4CrgQElQb6HczSBAA_3dE5ArrayIdLi2E1C7mutable7alignedE5ArrayIxLi2E1F7mutable7alignedE28omitted_28default_3dFalse_29,@function
_ZN7cpython5numba5tests23test_array_manipulation19numpy_fill_diagonalB3v96B38c8tJTIcFKzyF2ILShI4CrgQElQb6HczSBAA_3dE5ArrayIdLi2E1C7mutable7alignedE5ArrayIxLi2E1F7mutable7alignedE28omitted_28default_3dFalse_29:
        .cfi_startproc
        str     x29, [sp, #-32]!
        stp     x30, x19, [sp, #16]
        sub     sp, sp, #528
        .cfi_def_cfa_offset 560
        .cfi_offset w19, -8
        .cfi_offset w30, -16
        .cfi_offset w29, -32
        mov     x0, x1
        adrp    x8, :got:_Py_NoneStruct
        ldr     x8, [x8, :got_lo12:_Py_NoneStruct]
        str     x8, [sp, #336]
Dump of assembler code for function _ZN7cpython5numba5tests23test_array_manipulation19numpy_fill_diagonalB4v101B38c8tJTIcFKzyF2ILShI4CrgQElQb6HczSBAA_3dE5ArrayIdLi2E1C7mutable7alignedE5ArrayIxLi2E1A7mutable7alignedE28omitted_28default_3dFalse_29:
   0x0000fffe3396c1d0 <+0>:	str	x29, [sp, #-32]!
   0x0000fffe3396c1d4 <+4>:	stp	x30, x19, [sp, #16]
   0x0000fffe3396c1d8 <+8>:	sub	sp, sp, #0x210
   0x0000fffe3396c1dc <+12>:	mov	x0, x1
   0x0000fffe3396c1e0 <+16>:	adrp	x8, 0xfffe3396c000 <_ZN5numba5tests23test_array_manipulation19numpy_fill_diagonalB4v101B38c8tJTIcFKzyF2ILShI4CrgQElQb6HczSBAA_3dE5ArrayIdLi2E1C7mutable7alignedE5ArrayIxLi2E1A7mutable7alignedE28omitted_28default_3dFalse_29>
   0x0000fffe3396c1e4 <+20>:	ldr	x8, [x8]
   0x0000fffe3396c1e8 <+24>:	str	x8, [sp, #336]

The target address at which the relocation is written:

(gdb) p TargetPtr
$1 = (uint32_t *) 0xfffe3396c1e0 <cpython::numba::tests::test_array_manipulation::numpy_fill_diagonal[abi:v101][abi:c8tJTIcFKzyF2ILShI4CrgQElQb6HczSBAA_3d](Array<double, 2, C, mutable, aligned>, Array<long long, 2, A, mutable, aligned>, omitted_28default_3dFalse_29)+16>

The value to insert:

(gdb) p/x Value
$3 = 0xffffa7921c20

The distance between them, in MiB:

In [24]: (0xffffa7921c20 - 0xfffe3396c1e0) / 1048576
Out[24]: 5951.709533691406

That is almost 6GB apart, which is too far even for the large code model; see the table in https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#code-models
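The failed check can be reproduced by hand from the two GDB values. The following is a minimal sketch, assuming (as the assertion text `isInt<33>(Result)` suggests) that the ADRP relocation result is the difference between the 4 KiB page of the value and the 4 KiB page of the instruction, and that this difference must fit in a signed 33-bit integer. The helper names `is_int` and `adrp_page_delta` are illustrative and are not LLVM APIs.

```python
PAGE_MASK = ~0xFFF  # ADRP addresses 4 KiB pages, so the low 12 bits are dropped


def is_int(bits, value):
    """Signed range check, mirroring what LLVM's isInt<bits>() tests."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def adrp_page_delta(value, place):
    """Page-granular distance from the instruction at `place` to `value`."""
    return (value & PAGE_MASK) - (place & PAGE_MASK)


target_ptr = 0xFFFE3396C1E0  # TargetPtr from the GDB session above
value = 0xFFFFA7921C20       # Value from the GDB session above

delta = adrp_page_delta(value, target_ptr)
print(hex(delta), is_int(33, delta))  # prints: 0x173fb5000 False
```

The page delta exceeds 2**32 bytes (4 GiB), so it cannot be encoded and the assertion fires.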

@gmarkall
Member Author

I now have a simpler reproducer, which works with Numba and llvmlite main built on my Jetson AGX Xavier against the Numba llvmdev package, running on a "plain" Ubuntu 20.04 (i.e. no docker, no additional tools / libraries needed):

from numba import njit

@njit
def f(x, y):
    return x + y

i = 0

while True:
    print(i)
    t = tuple(range(i))
    f(t, (1j,))
    i += 1

executed with:

$ ulimit -s 1048576
$ python repro.py

gives:

0
1
2
3
4
5
6
7
8
9
python: /opt/conda/conda-bld/llvmdev_1684517249134/work/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:507: void llvm::RuntimeDyldELF::resolveAArch64Relocation(const llvm::SectionEntry&, uint64_t, uint64_t, uint32_t, int64_t): Assertion `isInt<33>(Result) && "overflow check failed for relocation"' failed.
Aborted (core dumped)
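Because the failure kills the interpreter with SIGABRT, it can be convenient to drive the reproducer from a parent process that survives the abort. The following is a rough sketch using only the standard library; `run_child` is a hypothetical helper, not part of Numba or llvmlite. It applies the equivalent of `ulimit -s` in the child before exec, and it also disables core files, which is a choice made for this sketch rather than something the reproducer requires.

```python
import resource
import signal
import subprocess
import sys


def run_child(argv, stack_kib=None):
    """Run argv in a child process; return (aborted, returncode)."""

    def apply_limits():
        # Runs in the child just before exec. Core files are disabled so
        # that repeated aborts do not fill the disk.
        resource.setrlimit(resource.RLIMIT_CORE, (0, 0))
        if stack_kib is not None:
            _, hard = resource.getrlimit(resource.RLIMIT_STACK)
            # `ulimit -s` takes KiB, whereas setrlimit takes bytes.
            resource.setrlimit(resource.RLIMIT_STACK, (stack_kib * 1024, hard))

    proc = subprocess.run(argv, preexec_fn=apply_limits, stderr=subprocess.PIPE)
    # On POSIX, a return code of -N means the child was killed by signal N.
    return proc.returncode == -signal.SIGABRT, proc.returncode


if __name__ == "__main__":
    aborted, code = run_child([sys.executable, "repro.py"], stack_kib=1048576)
    print("aborted" if aborted else f"exited with {code}")
```

If the requested stack size exceeds the hard limit, `setrlimit` fails in the child and `subprocess.run` raises an exception in the parent.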

@gmarkall
Member Author

Repository for reproducer work: https://github.com/gmarkall/numba-issue-9001

I now have a repro that only uses NumPy and llvmlite, not Numba, which is the present state recorded in that repo.

@zansibal

zansibal commented Nov 2, 2023

Hi, I don't know if this is helpful, but I thought I'd share my experience of this issue. I have come across the issue with my code both on the Ampere 80 core machine running Ubuntu and on AWS Graviton arm64 instances running Amazon Linux 2. On smaller AWS instances (like 8 cores) the problem is less frequent than on larger instances (like 32/64 cores or the Ampere 80). Perhaps this is because smaller instances have less RAM (8 GB vs 128 GB)?

I can reproduce the numba-issue-9001 code on AWS m7gd.2xlarge after only 5 iterations.

@gmarkall
Member Author

gmarkall commented Nov 3, 2023

Thanks for the input. In our experience too, the issue does seem to manifest more easily when there's a lot of RAM.

There's also the start of a discussion on the LLVM Discourse regarding a potential fix: https://discourse.llvm.org/t/llvm-rtdyld-aarch64-abi-relocation-restrictions/74616

gmarkall added a commit to gmarkall/llvmlite that referenced this issue Nov 10, 2023
Based on the Impala memory manager:

MikaelSmith/impala@ac8561b

This allows the test suite to pass but still does not fix
numba/numba#9001.
@gmarkall
Member Author

@zansibal It's early work at the moment, but it would be good feedback if you're able to build llvmlite from source with the PR numba/llvmlite#1009 and let me know whether you still observe this issue with it (or observe any other issues). Hopefully it resolves the issue, but there's a lot of testing and review to be done before we can have confidence in the strategy.

@gmarkall
Member Author

This was fixed by numba/llvmlite#1009.
