M1 LLVM Runtimedyld Invalid page reloc value assertion error #8567
Dear users, I'm trying to run a Python 3 script and I get this error about 9 times out of 10. My script works, but only intermittently: when the code is long, sometimes the error appears and sometimes it doesn't, so I have to run the script multiple times in the hope that it reaches the end. This is the error I get: Assertion failed: (isInt<33>(Addend) && "Invalid page reloc value."), function encodeAddend, file /Users/ci/miniconda3-arm64/conda-bld/llvmdev_1643905487494/work/lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldMachOAArch64.h, line 210. I'm running on a MacBook Pro with an M1 Pro. |
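For context on what the failing assertion means, here is a hedged, illustrative Python sketch (this is not LLVM's actual code, and the helper names are hypothetical): the addend of an AArch64 page-relative (ADRP-style) relocation must fit in a signed 33-bit integer, i.e. the relocated instruction and its target must lie within roughly +/-4 GiB of each other.

```python
# Illustrative only: mimics the isInt<33>(Addend) check from the assertion
# message; function names here are hypothetical, not LLVM's API.
def is_int_n(bits, value):
    """True if `value` fits in a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def page(addr):
    """ADRP addresses 4 KiB pages, so relocations work page-to-page."""
    return addr & ~0xFFF

def page_reloc_addend(pc, target):
    """Page delta between the relocated instruction and its target."""
    return page(target) - page(pc)

# A target within 4 GiB of the instruction: the check passes.
assert is_int_n(33, page_reloc_addend(0x1000, 0x7FFF_F000))
# A target more than 4 GiB away: the check fails, which is what the
# "Invalid page reloc value" assertion reports.
assert not is_int_n(33, page_reloc_addend(0x1000, 0x2_0000_0000))
```

In other words, when the JIT's allocations for code and the data they reference drift more than 4 GiB apart, this is the check that fires.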
@Francyrad thank you for asking about this. You are indeed encountering the same error reported in this issue. You may consider Numba (more precisely, LLVM) to be broken on M1. There is currently no known fix or workaround, and we are not sure whether this has been reported upstream to LLVM or whether a fix is in progress. IIRC @sklam also checked LLVM 14 and it appears this has not been fixed there either. My only remaining suggestion would be to try running your script in a Docker container on the M1. TL;DR: Running Numba on an M1 may cause the segfaults you see above, and the only known workaround is to use different hardware. |
@esc thank you for your answer, I hope someone will be able to fix this. Please let me know when it's fixed by commenting on this issue. Thank you again |
Yes, we hope so too. If you subscribe to this issue, you will receive updates regarding this quest. |
Hi, is there any update on this? I'm on Python 3.9, LLVM 11.1.0, and an M1 Mac, and am hitting the same issue right now when running multiprocessing of a forecast model (AutoCES) from the statsforecast package. The temporary fix in #8583 doesn't seem to work for me. I have tested other models without issues, but they all use Numba in the backend to speed up the computation. The only difference I can think of is that this specific model uses complex values rather than real values. With numba (0.46) and llvmlite (0.39), exactly the same error is raised when running. However, with the dev version of numba (0.57.0.dev0+1257.gce69f3010) and llvmlite (0.40.0.dev0+70.ge6901e0), the multiprocessing just gets stuck in the terminal without any error raised (but I'm pretty sure it's still the same issue). |
I still have the issue. Sometimes I waste more of my time trying to run my scripts than actually working. |
#8583 only disables the tests so that we can complete the test suite and ship the package, so it won't actually help with the issue. No, unfortunately there is no known workaround; it's broken in LLVM 11 and LLVM 14 (supported by the next Numba/llvmlite release). I am not aware of anyone working on a fix at present, so your best bet for now will be to use non-M1/Apple silicon, i.e. change hardware. So sorry I don't have better news for you. @sklam for reference, was this ever reported to the LLVM issue tracker, and if so, can you post the issue ID please? Thank you. |
Just wanted to mention that I'm having the same issue on Mac M1, llvm-openmp 16.0.2 and llvmlite 0.40.0! I run into this issue when solving systems of PDEs using py-pde. I've subscribed to this issue and fingers crossed that it will get fixed in the near future. |
@iamlll another bug that I don't understand is that parallelisation with OpenMP doesn't work with the following chips: M1 Pro, M1 Max and M1 Ultra. It works only with the plain M1. Is there an LLVM tracker where it's possible to file a report? |
@iamlll The reason you are seeing this with llvmlite 0.40.0 is that it is based on LLVM 14, and that is indeed buggy. |
So how can we solve the problem? |
This is a problem of the LLVM JIT that we are using (MCJIT); we need to migrate to OrcJIT (numba/llvmlite#919) so we can use JITLink, and hopefully that will fix it. |
This issue is about |
The issue is not limited to Apple M1 or macOS. We're seeing it on Neoverse-N1 running Ubuntu 20.04 ever since we upgraded to Numba 0.57. This is a server machine, and not just one. Unfortunately, we cannot downgrade Numba because we need CUDA 12.1 support. Error message:
System info:
|
I can confirm being able to reproduce a similar issue on a non-M1 AArch64 system. In general we can overflow relocations; the assertion is a little different because Linux on AArch64 uses RuntimeDyldELF and not RuntimeDyldMachO, but I think the principle (and the root cause) is the same. I need to investigate further to be sure. At present I'm reproducing with DALI like:
and I need to figure out how to make a Numba-only reproducer. I'm working on a system very similar to the one reported by @mzient in #8567 (comment), with just some minor OS / kernel version differences. |
I couldn't trigger this issue with @sklam's script from #8567 (comment), even after hundreds of runs on a Linux AArch64 system. However, the following (still using DALI, but without needing a test harness) does reproduce the issue pretty reliably:

test_standalone.py:

```python
import numpy as np
from nvidia.dali import pipeline_def
import nvidia.dali as dali
import nvidia.dali.fn as fn
import nvidia.dali.types as dali_types
from nvidia.dali.plugin.numba.fn.experimental import numba_function


def set_all_values_to_255_batch(out0, in0):
    out0[0][:] = 255


def set_all_values_to_255_sample(out0, in0):
    out0[:] = 255


def set_all_values_to_float_batch(out0, in0):
    out0[0][:] = 0.5


def set_all_values_to_float_sample(out0, in0):
    out0[:] = 0.5


def setup_change_out_shape(out_shape, in_shape):
    out0_shape = out_shape[0]
    in0_shape = in_shape[0]
    perm = [1, 2, 0]
    for sample_idx in range(len(out0_shape)):
        for d in range(len(perm)):
            out0_shape[sample_idx][d] = in0_shape[sample_idx][perm[d]]


def change_out_shape_batch(out0, in0):
    for sample_id in range(len(out0)):
        out0[sample_id][:] = 42


def change_out_shape_sample(out0, in0):
    out0[:] = 42


def get_data(shapes, dtype):
    return [np.empty(shape, dtype=dtype) for shape in shapes]


def get_data_zeros(shapes, dtype):
    return [np.zeros(shape, dtype=dtype) for shape in shapes]


@pipeline_def
def numba_func_pipe(shapes, dtype, run_fn=None, out_types=None, in_types=None,
                    outs_ndim=None, ins_ndim=None, setup_fn=None, batch_processing=None):
    data = fn.external_source(lambda: get_data(shapes, dtype), batch=True, device="cpu")
    return numba_function(
        data, run_fn=run_fn, out_types=out_types, in_types=in_types,
        outs_ndim=outs_ndim, ins_ndim=ins_ndim, setup_fn=setup_fn,
        batch_processing=batch_processing)


def _testimpl_numba_func(shapes, dtype, run_fn, out_types, in_types,
                         outs_ndim, ins_ndim, setup_fn, batch_processing, expected_out):
    batch_size = len(shapes)
    pipe = numba_func_pipe(
        batch_size=batch_size, num_threads=1, device_id=0,
        shapes=shapes, dtype=dtype,
        run_fn=run_fn, setup_fn=setup_fn, out_types=out_types,
        in_types=in_types, outs_ndim=outs_ndim, ins_ndim=ins_ndim,
        batch_processing=batch_processing)
    pipe.build()
    for _ in range(3):
        outs = pipe.run()
        for i in range(batch_size):
            out_arr = np.array(outs[0][i])
            assert np.array_equal(out_arr, expected_out[i])


def test_numba_func():
    # shape, dtype, run_fn, out_types,
    # in_types, out_ndim, in_ndim, setup_fn, batch_processing,
    # expected_out
    args = [
        ([(10, 10, 10)], np.uint8, set_all_values_to_255_batch, [dali_types.UINT8],
         [dali_types.UINT8], [3], [3], None, True,
         [np.full((10, 10, 10), 255, dtype=np.uint8)]),
        ([(10, 10, 10)], np.uint8, set_all_values_to_255_sample, [dali_types.UINT8],
         [dali_types.UINT8], [3], [3], None, None,
         [np.full((10, 10, 10), 255, dtype=np.uint8)]),
        ([(10, 10, 10)], np.float32, set_all_values_to_float_batch, [dali_types.FLOAT],
         [dali_types.FLOAT], [3], [3], None, True,
         [np.full((10, 10, 10), 0.5, dtype=np.float32)]),
        ([(10, 10, 10)], np.float32, set_all_values_to_float_sample, [dali_types.FLOAT],
         [dali_types.FLOAT], [3], [3], None, None,
         [np.full((10, 10, 10), 0.5, dtype=np.float32)]),
        ([(10, 20, 30), (20, 10, 30)], np.int64, change_out_shape_batch, [dali_types.INT64],
         [dali_types.INT64], [3], [3], setup_change_out_shape, True,
         [np.full((20, 30, 10), 42, dtype=np.int32),
          np.full((10, 30, 20), 42, dtype=np.int32)]),
        ([(10, 20, 30), (20, 10, 30)], np.int64, change_out_shape_sample, [dali_types.INT64],
         [dali_types.INT64], [3], [3], setup_change_out_shape, None,
         [np.full((20, 30, 10), 42, dtype=np.int32),
          np.full((10, 30, 20), 42, dtype=np.int32)]),
    ]
    for (shape, dtype, run_fn, out_types, in_types, outs_ndim, ins_ndim,
         setup_fn, batch_processing, expected_out) in args:
        _testimpl_numba_func(
            shape, dtype, run_fn, out_types, in_types, outs_ndim, ins_ndim,
            setup_fn, batch_processing, expected_out
        )


test_numba_func()
```

which gives this on almost every run:
|
Yes please! |
Please write to me at francyrad.info@gmail.com. The script and the file that you will read are quite big. |
Alright, that means we have two large cases to reproduce, then. We will focus on reducing them as much as possible. |
Another thought I think worth sharing: it should be possible to get to a reproducer that doesn't depend on Numba at all. If it's minimised as much as possible, it would just involve calls to llvmlite. (Or, even simpler than that, a small C++ source that links to LLVM only, to take llvmlite out of the loop as well; but I think the "just llvmlite" case would already be a good starting point.) |
It might take a while to get there, as our developers naturally have a strong Python background. We will start with a minimal nixtla setup, which is where this popped up for us, and from there we will work our way down. |
Bump. I am consistently seeing this on M1 Pro and M2. It's a bit involved, but it occurs with ~30% probability in my code. Are you still looking for a reproducer @gmarkall ? |
FYI, by googling I noticed that when porting Julia to ARM they also hit the same bug. Look at JuliaLang/julia#36617 and search in the page for "Assertion failed: (isInt<33>(Addend) && "Invalid page reloc value."),". Apparently, if this can help at all, the PR that fixed the issue was JuliaLang/julia#43664 ... |
The problem is still present |
Luckily, and coincidentally, I was working on this today, and I now have a pretty good one, which I'm going to add to #9001 because I'm tackling the issue on Linux AArch64 at present. In case you want to try it, it's:

```python
from numba import njit


@njit
def f(x, y):
    return x + y


i = 0
while True:
    print(i)
    t = tuple(range(i))
    f(t, (1j,))
    i += 1
```

executed with:

gives:
It'd be interesting to know if that also triggers the error on your Mac. You might need to do something similar to my |
I can't set the ulimit: value exceeds hard limit The largest ulimit I can set is but it is not crashing for now... |
What number did it get to before you stopped it? |
340. I can let it run the whole night if you tell me it can be useful for you. |
That would be great if you could give it a go! |
@PhilipVinc Is it still running? :-) |
@gmarkall it crashes at 1001, but I think this is due to some check in Numba itself?

```
999
1000
1001
Traceback (most recent call last):
  File "/Users/filippo.vicentini/Dropbox/Ricerca/Codes/Python/netket/repro.py", line 12, in <module>
  File "/Users/filippo.vicentini/Documents/pythonenvs/netket/python-3.11.2/lib/python3.11/site-packages/numba/core/dispatcher.py", line 471, in _compile_for_args
    error_rewrite(e, 'unsupported_error')
  File "/Users/filippo.vicentini/Documents/pythonenvs/netket/python-3.11.2/lib/python3.11/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.UnsupportedError: Failed in nopython mode pipeline (step: ensure features that are in use are in a valid form)
Tuple 'x' length must be smaller than 1000.
Large tuples lead to the generation of a prohibitively large LLVM IR which causes excessive memory pressure and large compile times.
As an alternative, the use of a 'list' is recommended in place of a 'tuple' as lists do not suffer from this problem.

File "repro.py", line 3:
<source missing, REPL/exec in use?>
```

EDIT: This is with |
@PhilipVinc Thanks - indeed, that was a Numba limitation. I think in #9001 and https://github.com/gmarkall/numba-issue-9001 we're getting close to a really good reproducer now, so there's probably no need for additional testing here - thanks for everything you've looked into so far :-) |
An LLVM Discourse discussion has been started about a potential fix: https://discourse.llvm.org/t/llvm-rtdyld-aarch64-abi-relocation-restrictions/74616 |
@Francyrad @PhilipVinc @carstenr It's early work at the moment, but if you're able to build llvmlite from source with the PR numba/llvmlite#1009 and let me know whether you still observe the issue with it (or observe any other issues), that would be good feedback. Hopefully this resolves the issue, but there's a lot of testing / review to be done to have confidence in the strategy. |
I have experienced this issue repeatedly over the past month, getting errors similar to the following for my ~150 line code for solving a specific PDE:
@gmarkall, I'm not quite sure how to build from source but am happy to try and test it out. |
@jacobjivanov Thanks for sharing this info - fortunately you don't need to build from source to test the fix now, as it's part of the llvmlite 0.42 / Numba 0.59 release candidates. You can follow the instructions here to install the Numba and llvmlite release candidates: https://numba.discourse.group/t/ann-numba-0-59-0rc1-and-llvmlite-0-42-0rc1/2329 If you try this, I'd really appreciate if you can let me know whether it appears to have solved the issue for you. |
@gmarkall, I can't confirm whether it'll ever fail, but it no longer fails for the particular script that would fail roughly 50% of the time previously. Ran it ~20 times with different initial conditions. |
@gmarkall Your work is greatly appreciated! Switching to the release candidate also solved the issue for one of our packages which would occasionally fail. |
With llvmlite now at 0.42.0 and the new memory manager merged, can we close this? |
I've not heard of any reports of this issue manifesting in llvmlite 0.42, so I think so. |
Alright, let's put a proverbial checkmark behind this issue. We always have the option to re-open in case. @gmarkall thank you again for the fix for this, it is much appreciated! |
We are seeing an LLVM assertion error occurring randomly in our build farm.
The error message is:
The earliest report is from Gitter on July 15, 2022.
The error can be triggered with the below script on bdb2384. The error usually occurs within 10 iterations.
The error occurs in both LLVM 11 and LLVM 14.
The current hypothesis is that LLVM RuntimeDyld is mishandling far jumps. To relate this to the reproducer above, the situation can be created by:
- test_too_big_to_freeze (the compilation and execution bits in the tests can be commented out and it will still trigger the error)
- test_fill_diagonal_basic. The assertion error occurs here. The guess is that JITed code emitted for the stencil tests is reused here. The large allocation in between helps make sure there is a gap/fragmentation in the memory space, such that the fill_diagonal functions are JITed somewhere far away.

Julia devs are pointing to a broken large code model in LLVM RuntimeDyld for MachO AArch64. See JuliaLang/julia#42295 (comment), JuliaLang/julia#43664.
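This fragmentation hypothesis can be sketched in plain Python (the addresses and helper names below are hypothetical and purely illustrative, not LLVM's): code JITed early sits in one region, a large allocation leaves a gap, a later function lands more than 4 GiB away, and the page-relative delta between the two regions overflows the signed 33-bit addend that RuntimeDyld asserts on.

```python
PAGE = 4096

def page_of(addr):
    # ADRP works on 4 KiB pages, so relocations are computed page-to-page.
    return addr & ~(PAGE - 1)

def fits_page_reloc(pc, target, bits=33):
    # Mirrors the isInt<33>(Addend) check on the page delta.
    delta = page_of(target) - page_of(pc)
    return -(1 << (bits - 1)) <= delta < (1 << (bits - 1))

# Hypothetical layout: code JITed early in the process...
early_code = 0x0000_0001_0000_0000
# ...then a multi-GiB allocation leaves a gap, so a later function
# lands more than 4 GiB away from the earlier code.
late_code = 0x0000_0003_4000_0000

assert fits_page_reloc(early_code, early_code + 0x10_0000)  # nearby reference: fine
assert not fits_page_reloc(late_code, early_code)           # far reference: asserts
```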