-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
need to decide on how to support FLOAT*16 for Fortran. #5
Comments
From the Fortran point of view: if we have access to binary32 and binary64 floating-point types, not having larger types is not a show-stopper, because:
|
Here is the current situation confirmed with a build of 6363793:
Both the front-end and libgfortran correctly detect and build for those kinds. From my point of view, this is standard-conforming under the Fortran standard. I note that we could in theory expose |
Given that gcc on x86_64/linux and aarch64/linux supports 128-bit floats, I think it would be very important to offer them on aarch64/darwin. Plus, there are a couple of big codes that I use/develop that rely on quad precision at certain points in the execution path. |
(as a writer of scientific and signal processing code) I am inclined to agree, there are times when intermediates for a double calc are easier to handle as quads than some numerical gymnastic. (As a toolchain developer) I have to point out that this is not just something that requires us to write a bit of library code and/or support in the compiler. Adding a different-size type requires ABI changes, C++ mangling, C++ library support and (in some way) libc support [think, printf]. GCC's Fortran is part of a suite of compilers, that share much common infrastructure. While we could adopt the [128b long double] ABI as per AAPCS (the aarch64 'official' ABI) .. in fact, Darwin/macOS has a different ABI (darwinpcs) that specifies 64b long double. It would be prudent, at least, to discuss the way forward with the Apple folks. |
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128' #0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77 #1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190 #2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338 #3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370 #4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f) #5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24) #6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd) Differential Revision: https://reviews.llvm.org/D97263 Cherry-pick from 16ede0956cb1f4b692dfa619ccfa6ab1de28e19b.
I'll talk to some folks within Apple and see if we can add an explicit C ABI for _Float128 in the darwinpcs. Speaking personally, double-double is something that no language should foist on users unless they explicitly say they want it. Binding it to FLOAT*16 would be quite unfortunate. |
(some of) the Apple folks are already aware of the interest. what would be great is to minimise any additional divergence between the AAPCS and darwinpcs - every such difference is a maintenance burden (my guess is that's equally true for those responsible for maintaining the aarch64 LLVM backend). At least in terms of CC and type mangling etc that's very highly desirable (of course, if already have [64] libc++ symbols mangled as 'long double' then we will be obliged to use a different mangling for __float128, _Float128 (however it is eventually spelt). The transition from mlong-double-64 => mlong-double-128 was handled once before at the OS level with libc symbols with $LDBL128 appended. That worked (and continues to work) well on powerpc-darwin (the fact that the underlying implementation of the 128b version is double-double there is academic to the overall presentation of an API). |
note that if one did use double-double as a stop-gap, it would be mangled __ibm128 for compatibility - so would not actually tread on the toes of a final solution. It might be instructive to look at how fortran handles the two possible 128 long doubles for powerpc ... (there are more pressing code-gen issues on my agenda before this one tho, sadly). |
Given that aarch64-linux gcc supports TFmode (which it calls long double, as I understand), we should ideally use the same ABI as that (except in our case it would be called _float128). Nobody wants double-double/ibm128. |
gcc/ada/ * raise-gcc.c (__gnat_others_value): Remove const qualifier. (__gnat_all_others_value): Likewise. (__gnat_unhandled_others_value): Likewise. (GNAT_OTHERS): Cast to Exception_Id instead of _Unwind_Ptr. (GNAT_ALL_OTHERS): Likewise. (GNAT_UNHANDLED_OTHERS): Likewise. (Is_Handled_By_Others): Change parameter type to Exception_Id. (Language_For): Likewise. (Foreign_Data_For): Likewise. (is_handled_by): Likewise. Adjust throughout, remove redundant line and fix indentation. * libgnat/a-exexpr.adb (Is_Handled_By_Others): Remove pragma and useless qualification from parameter type. (Foreign_Data_For): Likewise. (Language_For): Likewise.
The fixed error is: ==21166==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60300000d900 #0 0x7367d7 in operator delete(void*, unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:172 #1 0x3b82e6e in pointer_equiv_analyzer::~pointer_equiv_analyzer() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:161 #2 0x3b83387 in hybrid_folder::~hybrid_folder() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:517 #3 0x3b83387 in execute_early_vrp /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:686 #4 0x1790611 in execute_one_pass(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2567 #5 0x1792003 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2656 #6 0x1792029 in execute_pass_list_1 /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2657 #7 0x179209f in execute_pass_list(function*, opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:2667 #8 0x178a5f3 in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:1773 #9 0x1792fac in do_per_function_toporder(void (*)(function*, void*), void*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/plugin.h:191 #10 0x1792fac in execute_ipa_pass_list(opt_pass*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/passes.c:3001 #11 0xc525fc in ipa_passes /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2154 #12 0xc525fc in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2289 #13 0xc5a096 in symbol_table::compile() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2269 #14 0xc5a096 in symbol_table::finalize_compilation_unit() /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/cgraphunit.c:2537 #15 0x1a7a17c in compile_file /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:482 #16 0x69c758 in do_compile /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2210 #17 0x69c758 in toplev::main(int, char**) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/toplev.c:2349 #18 0x6a932a in main /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/main.c:39 #19 0x7ffff7820b34 in __libc_start_main ../csu/libc-start.c:332 #20 0x6aa5fd in _start (/home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/objdir/gcc/cc1+0x6aa5fd) 0x60300000d900 is located 0 bytes inside of 32-byte region [0x60300000d900,0x60300000d920) allocated by thread T0 here: #0 0x735ab7 in operator new[](unsigned long) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/libsanitizer/asan/asan_new_delete.cpp:102 #1 0x3b82dac in pointer_equiv_analyzer::pointer_equiv_analyzer(gimple_ranger*) /home/marxin/BIG/buildbot/buildworker/marxinbox-gcc-asan/build/gcc/gimple-ssa-evrp.c:156 gcc/ChangeLog: * gimple-ssa-evrp.c (pointer_equiv_analyzer::~pointer_equiv_analyzer): Use delete[].
…atch.pd All of the optimizations/transformations mentioned in bugzilla for PR tree-optimization/40210 are already implemented in mainline GCC, with one exception. In comment #5, there's a suggestion that (bswap64(x)>>56)&0xff can be implemented without the bswap as (unsigned char)x, or equivalently x&0xff. This patch implements the above optimization, and closely related variants. For any single bit, (bswap(X)>>C1)&1 can be simplified to (X>>C2)&1, where bit position C2 is the appropriate permutation of C1. Similarly, the bswap can eliminated if the desired set of bits all lie within the same byte, hence (bswap(x)>>8)&255 can always be optimized, as can (bswap(x)>>8)&123. Previously, int foo(long long x) { return (__builtin_bswap64(x) >> 56) & 0xff; } compiled with -O2 to foo: movq %rdi, %rax bswap %rax shrq $56, %rax ret with this patch, it now compiles to foo: movzbl %dil, %eax ret 2021-07-08 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog PR tree-optimization/40210 * match.pd (bswap optimizations): Simplify (bswap(x)>>C1)&C2 as (x>>C3)&C2 when possible. Simplify bswap(x)>>C1 as ((T)x)>>C2 when possible. Simplify bswap(x)&C1 as (x>>C2)&C1 when 0<=C1<=255. gcc/testsuite/ChangeLog PR tree-optimization/40210 * gcc.dg/builtin-bswap-13.c: New test. * gcc.dg/builtin-bswap-14.c: New test.
Hi all. Are there any updates on this issue? |
no, the priority at the moment is on getting the base port ready for inclusion in gcc-12 |
Ok, thanks for the update. All your work on this port is very much appreciated. |
Fixes: ==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178 READ of size 4 at 0x00000666ca5c thread T0 #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855 #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916 #2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887 #3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829 #4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427 #5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346 #6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967 #7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808 #8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146 for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d. gcc/d/ChangeLog: * d-attribs.cc (parse_optimize_options): Check index before accessing cl_options.
Is it possible to an a configure option to tell GCC to use int128 as the ABI for binary128? What doesn't work in that scenario? |
I think the limitation is not “what do we chose” but “how do we make sure the choices made are consistent”. In a way, Apple “owns” this ABI and we probably cannot decide without them. |
Given that Apple doesn't have a Fortran compiler, I don't see how GCC could be inconsistent with their |
|
Can we at least converge on the idea that REAL*16 must be equivalent to REAL(kind=REAL128) and this must be binary128, ie that neither long double nor double double is relevant? That means we can connect directly to C/C++ float128. I think we're mostly here but I figure it's good to be unambiguous. |
" C/C++ float128 " is defined where and with what ABI? The version of the darwinpcs I am working from has no 128b float (C long double is 64 bits). Iff the port transitions to having a 128b float type, then I would expect C long double to be using that ( at least selectively with -mlong-double-128 as is done elsewhere ) .. at some point one would also expect that to become the default and then -mlong-double-64 would be provided as a legacy option. This is the kind of mechanism that has been used in the past. It seems that no-one is interested in using double-double, so that is a moot point (although as can be seen with the current PPC64 work, there are strategies for supporting both that and IEEE128 in the same toolchain) |
I'm not saying Apple has a C/C++ float128 ABI, I'm merely saying that mapping REAL*16 to that simplifies the problem and gives Apple more of a reason to care. I don't think long double is worth thinking about. It's a garbage feature of C that exists to expose x87 80-bit floats, and is beyond useless in any other context, because it's not even a platform stable ABI (Linux yes, but not always). |
My own take on this:
I disagree with that. On most targets, C long double is equal to double, at least with default options. We should keep it that way, as there may be code that expects it. In fact, I think we have to keep
Edit, to add clarification on that sqrtl example above: also, I don't think we should or can switch long double to something other than Apple's choice (long double = double), because then the library functions would not match anymore, and that's a nightmare (it happened on powerpc-darwin at some point, and created lot of trouble. |
Well, I understand that your focus is on Fortran - but some of us transitioned away from Fortran around the same time the VAX 11/785 was retired (not a co-incidence sometimes one has no choice on employer's decisions). We carried on writing numerical analysis, signal processing and simulation code (just now in C) .. if Fortran users have a requirement for 128b float, then so do c-family users ;) .. clearly, most of the use of aarch64 darwin has been under iOS and therefore has not seen any such desires(?) - but I'd think to move forward, the platform would have to adopt an 128b float type and agree ABI before we could plug it into fortran. That would mean that clang and the relevant parts of the system would be updated to deal with this; FWIW (despite that you recall pain associated) the approach used for PPC is solid and functional to this day (completely independent of what the 128bits represent). |
clarification: none of what I've said implies switching from Apple's choice - rather, my thesis is that we have to adhere to Apple's choice - if they elect not to adopt a 128b float (officially) then we might think again - but in the first instance, the right solution is to lobby for an official 128b float type on the platform. I do not think it significant that no Apple comments have been made on this thread (there is no reason for anyone from Apple to read it). |
It's my focus here because it's in the title of the issue, and I came here as a result of not being about to build Fortran applications on my Apple M1 laptop. |
That's not correct (#5 (comment)), although I know Steve isn't speaking with institutional authority here. |
I fear there is an amount of "talking past" going on here...
|
If Apple declines to respond, does that mean we can never fix this issue for users? I contend that Apple has had their chance to respond, and they have not, so GCC should do the right thing for its users and solve the problem using the best available option. I agree with your arguments that the best non-Apple solution is AAPCS. I also believe that the only reasonable choice here is
On the |
This is correct. Furthermore, Tim Northover correctly observed in an email correspondence:
So there's no need to "invent our own ABI (or use that of AAPCS)"; since the Darwin PCS document does not say anything about quad, the behavior is defined by AAPCS. |
@iains unless I misunderstand something, I think that ship has sailed, anyway. Apple's ABI (https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms) explicitly states:
|
which is what we implement .. (and does not make any difference to me as a toolchain maintainer) the comment was from me as a "user" of c-family : I do not see long double as an X87 thing, but then most of my DSP and numerical analysis coding was/is not on X86 platforms, it's still a nuisance to have to edit the source to use a different type for the extra precision. Anyway, we can agree to differ on this ..
|
I don't see why the long double ABI matters here. REAL128 is not long double on most systems. It's certainly not on an x86 or ARM ones. If Apple has no ABI for binary128 then we fall back on AAPCS. |
The libquadmath in GCC is designed to provide quad-precision math and I/O routines on targets for which
I'll try to come up with something. |
I have discussed this with the relevant folks, and the following is an official statement:
We're going to look into where it makes the most sense to document this other than a random post on a GitHub issue. I'll add a reference to that once it's available. |
thanks very much for the info! we are currently looking into (with some success mostly down to FX) in implementing __float128 (which automatically enables C's _Float128 in GCC) this is currently following the existing aarch64 backend ABI (which is AAPCS64). It might be necessary to document how we deal with variadic calls, since there are significant differences between darwinpcs and AAPCS64 there. |
Ok. If you need further clarification on variadic conventions, please let us know. |
I think we should be in a position to close this issue now (technically, the decision is made, and practically the latest sources support __float128 in c-family and IEEE QP in libgfortran). I am sure that there are wrinkles - but those can be dealt with separately.
My intention is that the technical content of the README.md will be added to the GCC port directory as documentation for how the Darwin sub-port differs from AAPCS64.
Unless anyone else has questions or additions I will close this in 24h or so. |
we have an implementation on the branch now. |
…imize or target pragmas [PR103012] The following testcases ICE when an optimize or target pragma is followed by a long line (4096+ chars). This is because on such long lines we can't use columns anymore, but the cpp_define calls performed by c_cpp_builtins_optimize_pragma or from the backend hooks for target pragma are done on temporary buffers and expect to get columns from whatever line they appear on (which happens to be the long line after optimize/target pragma), and we run into: #0 fancy_abort (file=0x3abec67 "../../libcpp/line-map.c", line=502, function=0x3abecfc "linemap_add") at ../../gcc/diagnostic.c:1986 #1 0x0000000002e7c335 in linemap_add (set=0x7ffff7fca000, reason=LC_RENAME, sysp=0, to_file=0x41287a0 "pr103012.i", to_line=3) at ../../libcpp/line-map.c:502 #2 0x0000000002e7cc24 in linemap_line_start (set=0x7ffff7fca000, to_line=3, max_column_hint=128) at ../../libcpp/line-map.c:827 #3 0x0000000002e7ce2b in linemap_position_for_column (set=0x7ffff7fca000, to_column=1) at ../../libcpp/line-map.c:898 #4 0x0000000002e771f9 in _cpp_lex_direct (pfile=0x40c3b60) at ../../libcpp/lex.c:3592 #5 0x0000000002e76c3e in _cpp_lex_token (pfile=0x40c3b60) at ../../libcpp/lex.c:3394 #6 0x0000000002e610ef in lex_macro_node (pfile=0x40c3b60, is_def_or_undef=true) at ../../libcpp/directives.c:601 #7 0x0000000002e61226 in do_define (pfile=0x40c3b60) at ../../libcpp/directives.c:639 #8 0x0000000002e610b2 in run_directive (pfile=0x40c3b60, dir_no=0, buf=0x7fffffffd430 "__OPTIMIZE__ 1\n", count=14) at ../../libcpp/directives.c:589 #9 0x0000000002e650c1 in cpp_define (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2513 #10 0x0000000002e65100 in cpp_define_unused (pfile=0x40c3b60, str=0x2f784d1 "__OPTIMIZE__") at ../../libcpp/directives.c:2522 #11 0x0000000000f50685 in c_cpp_builtins_optimize_pragma (pfile=0x40c3b60, prev_tree=<optimization_node 0x7fffea042000>, cur_tree=<optimization_node 0x7fffea042020>) at ../../gcc/c-family/c-cppbuiltin.c:600 assertion that LC_RENAME doesn't happen first. I think the right fix is emit those predefined macros upon optimize/target pragmas with BUILTINS_LOCATION, like we already do for those macros at the start of the TU, they don't appear in columns of the next line after it. Another possibility would be to force them at the location of the pragma. 2021-12-30 Jakub Jelinek <jakub@redhat.com> PR c++/103012 gcc/ * config/i386/i386-c.c (ix86_pragma_target_parse): Perform cpp_define/cpp_undef calls with forced token locations BUILTINS_LOCATION. * config/arm/arm-c.c (arm_pragma_target_parse): Likewise. * config/aarch64/aarch64-c.c (aarch64_pragma_target_parse): Likewise. * config/s390/s390-c.c (s390_pragma_target_parse): Likewise. gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Perform cpp_define_unused/cpp_undef calls with forced token locations BUILTINS_LOCATION. gcc/testsuite/ PR c++/103012 * g++.dg/cpp/pr103012.C: New test. * g++.target/i386/pr103012.C: New test.
…04617] On #define A(n) int foo1##n(void) { return 1##n; } #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9) #define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) D(n##5) D(n##6) D(n##7) D(n##8) D(n##9) E(0) E(1) E(2) D(30) D(31) C(320) C(321) C(322) C(323) C(324) C(325) B(3260) B(3261) B(3262) B(3263) A(32640) A(32641) A(32642) testcase with ./xgcc -B ./ -c -g -fpic -ffat-lto-objects -flto -O0 -o foo1.o foo1.c -ffunction-sections ./xgcc -B ./ -shared -g -fpic -flto -O0 -o foo1.so foo1.o /tmp/ccTW8mBm.debug.temp.o: file not recognized: file format not recognized (testcase too slow to be included into testsuite). The problem is clearly reported by readelf: readelf: foo1.o.debug.temp.o: Warning: Section 2 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 5 has an out of range sh_link value of 65321 readelf: foo1.o.debug.temp.o: Warning: Section 10 has an out of range sh_link value of 65323 readelf: foo1.o.debug.temp.o: Warning: [ 2]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [ 5]: Link field (65321) should index a symtab section. readelf: foo1.o.debug.temp.o: Warning: [10]: Link field (65323) should index a string section. because simple_object_elf_copy_lto_debug_sections doesn't adjust sh_info and sh_link fields in ElfNN_Shdr if they are in between SHN_{LO,HI}RESERVE inclusive. Not adjusting those is incorrect though, SHN_{LO,HI}RESERVE range is only relevant to the 16-bit fields, mainly st_shndx in ElfNN_Sym where if one needs >= SHN_LORESERVE section number, SHN_XINDEX should be used instead and .symtab_shndx section should contain the real section index, and in ElfNN_Ehdr e_shnum and e_shstrndx fields, where if >= SHN_LORESERVE value is needed it should put those into Shdr[0].sh_{size,link}. But, sh_{link,info} are 32-bit fields which can contain any section index. Note, as simple-object-elf.c mentions, binutils from 2.12 to 2.18 (so before 2011) used to mishandle the > 63.75K sections case and assumed there is a hole in between the sections, but what simple_object_elf_copy_lto_debug_sections does wouldn't help in that case for the debug temp object creation, we'd need to detect the case also in that routine and take it into account in the remapping etc. I think it is not worth it given that it is over 10 years, if somebody needs 63.75K or more sections, better use more recent binutils. 2022-02-22 Jakub Jelinek <jakub@redhat.com> PR lto/104617 * simple-object-elf.c (simple_object_elf_match): Fix up URL in comment. (simple_object_elf_copy_lto_debug_sections): Remap sh_info and sh_link even if they are in the SHN_LORESERVE .. SHN_HIRESERVE range (inclusive).
Here we weren't respecting SFINAE when evaluating a call to a consteval function, which caused us to reject the new testcase below. This patch fixes this by making build_over_call use the SFINAE-friendly version of cxx_constant_value. This change causes us to no longer diagnose ahead of time a couple of non-constant non-dependent consteval calls in consteval-if2.C with -fchecking=2. These errors were apparently coming from the call to fold_non_dependent_expr in build_non_dependent_expr (for the RHS of the +=) despite complain=tf_none being passed. Now that build_over_call respects the value of complain during constant evaluation of a consteval call, the errors are gone. That the errors are also gone without -fchecking=2 is a regression caused by r12-7264-gc19f317a78c0e4 and is the subject of PR104620. As described in comment #5, I think it's basically an accident that we were diagnosing these two calls correctly before r12-7264, so perhaps we can live without these errors for GCC 12. Thus this patch just XFAILs the two tests. PR c++/104620 gcc/cp/ChangeLog: * call.cc (build_over_call): Use cxx_constant_value_sfinae instead of cxx_constant_value to evaluate a consteval call. * constexpr.cc (cxx_constant_value_sfinae): Add decl parameter and pass it to cxx_eval_outermost_constant_expr. * cp-tree.h (cxx_constant_value_sfinae): Add decl parameter. * pt.cc (fold_targs_r): Pass NULL_TREE as decl parameter to cxx_constant_value_sfinae. gcc/testsuite/ChangeLog: * g++.dg/cpp23/consteval-if2.C: XFAIL two dg-error tests where the argument to the non-constant non-dependent consteval call is wrapped by NON_DEPENDENT_EXPR. * g++.dg/cpp2a/consteval30.C: New test.
I noticed that for member class templates of a class template we were unnecessarily substituting both the template and its type. Avoiding that duplication speeds compilation of this silly testcase from ~12s to ~9s on my laptop. It's unlikely to make a difference on any real code, but the simplification is also nice. We still need to clear CLASSTYPE_USE_TEMPLATE on the partial instantiation of the template class, but it makes more sense to do that in tsubst_template_decl anyway. #define NC(X) \ template <class U> struct X##1; \ template <class U> struct X##2; \ template <class U> struct X##3; \ template <class U> struct X##4; \ template <class U> struct X##5; \ template <class U> struct X##6; #define NC2(X) NC(X##a) NC(X##b) NC(X##c) NC(X##d) NC(X##e) NC(X##f) #define NC3(X) NC2(X##A) NC2(X##B) NC2(X##C) NC2(X##D) NC2(X##E) template <int I> struct A { NC3(am) }; template <class...Ts> void sink(Ts...); template <int...Is> void g() { sink(A<Is>()...); } template <int I> void f() { g<__integer_pack(I)...>(); } int main() { f<1000>(); } gcc/cp/ChangeLog: * pt.cc (instantiate_class_template): Skip the RECORD_TYPE of a class template. (tsubst_template_decl): Clear CLASSTYPE_USE_TEMPLATE.
Consider constexpr int VAL = 1; struct foo { template <int B> void bar(typename std::conditional<B==VAL, int, float>::type arg) { } }; template void foo::bar<1>(int arg); where we since r11-291 fail to emit the code for the explicit instantiation. That's because cp_walk_subtrees/TYPENAME_TYPE now walks TYPE_CONTEXT ('conditional' here) as well, and in a template finds the B==VAL template argument. VAL is constexpr, which implies const, which in the global scope implies static. constrain_visibility_for_template then makes "struct conditional<(B == VAL), int, float>" non-TREE_PUBLIC. Then symtab_node::needed_p checks TREE_PUBLIC, sees it's 0, and we don't emit any code. I thought the fix would be some ODR-esque check to not consider constexpr variables/fns that are used just for their value. But it turned out to be tricky. For instance, we can't skip determine_visibility in a template; we can't even skip it for value-dep expressions. For example, no-linkage-expr1.C has using P = struct {}*; template <int N> void f(int(*)[((P)0, N)]) {} where ((P)0, N) is value-dep, but N is not relevant here: we have to ferret out the anonymous type. When instantiating, it's already gone. This patch uses decl_constant_var_p. This is to implement (an approximation) [basic.def.odr]#14.5.1 and [basic.def.odr]#5.2. PR c++/110323 gcc/cp/ChangeLog: * decl2.cc (min_vis_expr_r) <case VAR_DECL>: Do nothing for decl_constant_var_p VAR_DECLs. gcc/testsuite/ChangeLog: * g++.dg/template/explicit-instantiation6.C: New test. * g++.dg/template/explicit-instantiation7.C: New test.
Unfortunately, the current arm64 iOS/macOS long double is 64 bit - we need to figure out some reasonable way to give users access to a larger version.
immediate options:
The text was updated successfully, but these errors were encountered: