BLD: Occasional "internal compiler error" with gcc on coverage CI runs #18529

Closed
mattip opened this issue Mar 2, 2021 · 16 comments
Labels
26 - Compiler 36 - Build Build related PR 57 - Close? Issues which may be closable unless discussion continued

Comments

@mattip
Member

mattip commented Mar 2, 2021

I thought we had an issue open for this but cannot find it now. Occasionally, CI jobs fail like this one with

gcc: build/src.linux-x86_64-3.7/numpy/core/src/multiarray/einsum_sumprod.c
during IPA pass: profile
numpy/core/src/multiarray/einsum_sumprod.c.src: In function ‘longdouble_sum_of_products_contig_three’:
numpy/core/src/multiarray/einsum_sumprod.c.src:1264:1: internal compiler error: in coverage_begin_function, at coverage.c:656
 1264 | }
      | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-9/README.Bugs> for instructions.

Note that the failure occurs in the coverage compilation of the generated code, in the "full" CI run with gcc 9.3.0.

@mattip mattip added 26 - Compiler 36 - Build Build related PR labels Mar 2, 2021
@mattip mattip changed the title Occasional "internal compiler error" with gcc on CI runs BLD: Occasional "internal compiler error" with gcc on CI runs Mar 2, 2021
@mattip mattip changed the title BLD: Occasional "internal compiler error" with gcc on CI runs BLD: Occasional "internal compiler error" with gcc on coverage CI runs Mar 2, 2021
@seberg
Member

seberg commented Mar 11, 2021

So, I wanted to run coverage locally, and ran into this... And I actually found a "workaround": if I remove all #line directives from the file, it compiles fine even with coverage enabled! But I have no clue whether there is an error in the #line directives or whether this is a gcc bug; I guess the latter is totally reasonable. I am on gcc (Debian 10.2.1-6) 10.2.1 20210110.

Might be time to report upstream with this info, although I have no clue if it's possible to make this "minimal" :(.
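
The workaround mentioned above (dropping the #line directives from the generated C file before compiling) could be sketched roughly as follows. This is only an illustration: the helper name strip_line_directives is hypothetical and is not part of NumPy's build system.

```python
import re

# Matches "#line 123" and '#line 123 "file"', allowing whitespace
# before and after the '#'.
_LINE_DIRECTIVE = re.compile(r'^\s*#\s*line\b')


def strip_line_directives(source: str) -> str:
    """Return `source` with every `#line` preprocessor directive removed.

    Hypothetical helper for illustration only.
    """
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if not _LINE_DIRECTIVE.match(line)
    ]
    return "".join(kept)
```

The trade-off is that, without the directives, compiler diagnostics point at the generated .c file rather than at the .c.src template.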

@charris
Member

charris commented Mar 11, 2021

I think it is worth reporting upstream if only to discover if it is intentional.

@seberg
Member

seberg commented Mar 11, 2021

Hmmm, was about to, but then noticed this one which might be a duplicate: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95847 I am wondering if our #line directives are somehow inconsistent.

EDIT: NVM. If it is the same bug (and it matches practically perfectly, aside from being C rather than Fortran), then it is fairly likely already fixed in all gcc major versions, but not released yet :(.

@charris
Member

charris commented Mar 11, 2021

We do have repeated line numbers in the generated C code because they refer to lines in the template file.
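
To illustrate why the line numbers repeat, here is a toy template expander. It is a simplified sketch, not NumPy's actual template processor; the function name expand_template, the file name example.c.src, and the single @name@ placeholder are assumptions made for the example.

```python
def expand_template(body_lines, start_line, names, filename="example.c.src"):
    """Emit one copy of `body_lines` per entry in `names`.

    Every copy is preceded by the same `#line` directive, because every
    copy originates from the same lines of the template file.
    """
    out = []
    for name in names:
        # Point the compiler back at the template, not the generated file.
        out.append(f'#line {start_line} "{filename}"')
        out.extend(line.replace("@name@", name) for line in body_lines)
    return out
```

Expanding a three-line function body for two type names yields two `#line` directives with the same number, so both generated functions claim to live on the same template lines.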

@seberg
Member

seberg commented Mar 11, 2021

Yeah, but we do that everywhere. I thought for a bit we may be doing something subtly wrong, but I now doubt it.

The next thing I would try is whether the development 11.x or 10.x versions of gcc fix it, but I don't feel like digging into getting a dev version of gcc running.

@seberg
Member

seberg commented Mar 11, 2021

Ah OK, we can actually learn one other thing from it, but I guess we knew that before: If CI starts to fail again, we probably just have to make sure to use a GCC 8.x (until a gcc 10.3 or 11 is available).

@seberg
Member

seberg commented May 26, 2021

@charris since you are already on GCC 11, could you try if:

python3.9 runtests.py --gcov

doesn't fail to compile for you?

@charris
Member

charris commented May 27, 2021

@seberg Yes, it fails for me also.

numpy/core/src/multiarray/einsum_sumprod.c.src:1264:1: internal compiler error: in coverage_begin_function, at coverage.c:662
 1264 | }
      | ^

There are also a bunch of warnings.

@seberg
Member

seberg commented May 27, 2021

:(, I guess that means we have to report it as a gcc bug if it's not fixed in the 11.x branch.

@seberg
Member

seberg commented May 27, 2021

Created a bug report here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100788. Not sure what the GCC folks need with respect to a minimal reproducer, but let's see.

@seberg
Member

seberg commented Jun 11, 2021

It seems GCC will fix this (by removing the assert that leads to the failure). In theory, there should be something wrong with the #line directives that we are adding: apparently they cause the function longdouble_sum_of_products_contig_three to end before it starts (based on the line numbering). (I think I can make sense of it if #line directives are ignored when enclosed by #if 0? But I am not sure.)

So there may be ways to work around it in NumPy as well, if we run into it more at some point.
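
As a rough illustration of how a function can appear to "end before it starts", the sketch below computes the line number a compiler would report for each physical line. It does not reproduce the GCC failure. It relies on two standard C preprocessor rules: `#line N` sets the number of the following line, and directives inside a skipped group such as `#if 0` are ignored. The function name presumed_lines is hypothetical, and the model is deliberately simplified (no #else/#elif, no conditions other than a literal `#if 0`).

```python
import re

_LINE = re.compile(r'^\s*#\s*line\s+(\d+)')
_IF0 = re.compile(r'^\s*#\s*if\s+0\b')
_ANY_IF = re.compile(r'^\s*#\s*if(def|ndef)?\b')
_ENDIF = re.compile(r'^\s*#\s*endif\b')


def presumed_lines(lines):
    """Map each physical line to the line number the compiler would report."""
    result = []
    current = 1       # presumed number of the line being read
    skip_depth = 0    # > 0 while inside an `#if 0` group
    for text in lines:
        result.append(current)
        current += 1
        if skip_depth:
            # Inside a skipped group only conditional nesting is tracked;
            # a #line directive here has no effect.
            if _ANY_IF.match(text):
                skip_depth += 1
            elif _ENDIF.match(text):
                skip_depth -= 1
            continue
        if _IF0.match(text):
            skip_depth = 1
            continue
        match = _LINE.match(text)
        if match:
            current = int(match.group(1))  # applies to the NEXT line
    return result
```

For the input `['#line 100 "t.c.src"', 'static void f(void)', '{', '#line 50 "t.c.src"', '}']` this returns `[1, 100, 101, 102, 50]`: the function starts on presumed line 100 but its closing brace is reported on line 50. Wrapping the second directive in `#if 0` / `#endif` leaves the numbering increasing.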

@seberg
Member

seberg commented Aug 13, 2021

This should be fixed now in the next releases of GCC 10 and 11 (whenever those are). So hopefully, whenever our CI runs into the issue more often, we can skip gcc 9 and jump to one of those fixed versions directly.

@seberg seberg added the 57 - Close? Issues which may be closable unless discussion continued label Aug 13, 2021
@charris
Member

charris commented Aug 13, 2021

python3.9 runtests.py --gcov still fails with gcc 11.2, but that may not be recent enough.

@seberg
Member

seberg commented Aug 13, 2021

Yeah, it was only just backported; 11.3 and whatever the next 10.x release is (if one comes) should be fine.
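
If CI ever needs to pick a compiler based on this, a version check along the following lines might do. It is a sketch under stated assumptions: the function name likely_has_fix is hypothetical, the 11.3 threshold comes from the comment above, treating 12 and later as fixed is an assumption, and the fixed 10.x release is not named in this thread, so that case is reported as unknown.

```python
from typing import Optional


def likely_has_fix(version: str) -> Optional[bool]:
    """Guess whether an upstream gcc release contains the backported fix.

    Returns True or False where this thread gives a basis for a guess and
    None where it does not. Distribution patches can change the answer
    for any given build.
    """
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if major >= 12:
        return True           # assumption: newer majors carry the fix
    if major == 11:
        return minor >= 3     # "11.3 ... should be fine"
    return None               # 10.x and older: not stated in this thread
```

A return value of None should be read as "test it", not as "broken".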

@charris
Member

charris commented Aug 13, 2021

I also note new "mismatched bound" warnings with ordinary compiles that also seem gcc 11 related.

@seberg
Member

seberg commented Nov 6, 2021

Closing, as it now works for me. Oddly enough, that is on gcc 11.2.0 (but with a few Debian patches, so who knows). Hopefully we can avoid doing some gcc version dance on CI, though.
