Merge upstream, 2024-03-09 #2944

tschwinge · 2024-04-10T12:08:54Z

Merging in several stages, towards #2802 and further.

This must of course not be rebased by GitHub merge queue, but has to become a proper Git merge. (I'll handle that, once ready.)

In some cases exits can lack LC PHI nodes for the virtual operand. We have to create them when the epilog loop requires them which also allows us to remove some only halfway correct fixups. This is the variant triggering for alternate exits. PR tree-optimization/114099 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Create and fill in a needed virtual LC PHI for the alternate exits. Remove code dealing with that missing. * gcc.dg/vect/vect-early-break_120-pr114099.c: New testcase.

* sv.po, zh_CN.po: Update.

The PR complains that for the __builtin_stdc_bit_* "builtins" the diagnostics don't mention the name of the builtin the user used, but __builtin_{clz,ctz,popcount}g instead (which is what the FE immediately lowers it to). The following patch repeats the checks from check_builtin_function_arguments which are there done on BUILT_IN_{CLZ,CTZ,POPCOUNT}G, such that they diagnose it with the name of the "builtin" the user actually used before it is gone. 2024-02-26 Jakub Jelinek <jakub@redhat.com> PR c/114042 * c-parser.cc (c_parser_postfix_expression): Diagnose __builtin_stdc_bit_* argument with ENUMERAL_TYPE or BOOLEAN_TYPE type or if signed here rather than on the replacement builtins in check_builtin_function_arguments. * gcc.dg/builtin-stdc-bit-2.c: Adjust testcase for actual builtin names rather than names of builtin replacements.

…ata section [PR113617] If default_elf_select_rtx_section is called to put a reference to some local symbol defined in a comdat section into memory, which happens more often since the r14-4944 RA change, linking might fail. default_elf_select_rtx_section puts such constants into .data.rel.ro.local etc. sections and if linker chooses comdat sections from some other TU and discards the one to which a relocation in .data.rel.ro.local remains, linker diagnoses error. References to private comdat symbols can only appear from functions or data objects in the same comdat group, so the following patch arranges using .data.rel.ro.local.pool.<comdat_name> and similar sections. 2024-02-26 Jakub Jelinek <jakub@redhat.com> H.J. Lu <hjl.tools@gmail.com> PR rtl-optimization/113617 * varasm.cc (default_elf_select_rtx_section): For references to private symbols in comdat sections use .data.relro.local.pool.<comdat>, .data.relro.pool.<comdat> or .rodata.<comdat> comdat sections. * g++.dg/other/pr113617.C: New test. * g++.dg/other/pr113617.h: New test. * g++.dg/other/pr113617-aux.cc: New test.

…R114012] PR fortran/114012 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_conv_procedure_call): Evaluate non-trivial arguments just once before assigning to an unlimited polymorphic dummy variable. gcc/testsuite/ChangeLog: * gfortran.dg/pr114012.f90: New test.

gcc/ * config/avr/avr.cc (avr_out_compare) [AVR_TINY]: Remove code in an "if avr_adiw_reg_p()" block that's dead for AVR_TINY.

Some options that are pure optimizations were not tagged as such. gcc/ * config/avr/avr.opt (mcall-prologues, mrelax, maccumulate-args) (mstrict-X): Tag as "Optimization".

gcc.dg/attr-weakref-1.c FAILs on 32 and 64-bit Solaris/x86 with the native assembler: FAIL: gcc.dg/attr-weakref-1.c (test for excess errors) UNRESOLVED: gcc.dg/attr-weakref-1.c compilation failed to produce executable Excess errors: Assembler: attr-weakref-1.c "/var/tmp//ccUSaysF.s", line 171 : Multiply defined symbol: "Wv3a" This is a bug in the native as, which isn't seeing fixes recently. Since only a single subtest is affected, this patch omits that one. Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu. 2024-02-24 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR ipa/70582 * gcc.dg/attr-weakref-1.c (dg-additional-options): Define SOLARIS_X86_AS as appropriate. (lv3, Wv3a, pv3a): Wrap in !SOLARIS_X86_AS. (main): Likewise for chk (pv3a).

The following implements manual update for multi-exit loop prologue peeling during vectorization. PR tree-optimization/114081 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Perform manual dominator update for prologue peeling. (vect_do_peeling): Properly update dominators after adding the prologue-around guard. * gcc.dg/vect/vect-early-break_121-pr114081.c: New testcase.

…[PR114044] While it seems a lot of places in various optimization passes fold bit query internal functions with INTEGER_CST arguments to INTEGER_CST when there is a lhs, when lhs is missing, all the removals of such dead stmts are guarded with -ftree-dce, so with -fno-tree-dce those unfolded ifn calls remain in the IL until expansion. If they have large/huge BITINT_TYPE arguments, there is no BLKmode optab and so expansion ICEs, and bitint lowering doesn't touch such calls because it doesn't know they need touching, functions only containing those will not even be further processed by the pass because there are no non-small BITINT_TYPE SSA_NAMEs + the 2 exceptions (stores of BITINT_TYPE INTEGER_CSTs and conversions from BITINT_TYPE INTEGER_CSTs to floating point SSA_NAMEs) and when walking there is no special case for calls with BITINT_TYPE INTEGER_CSTs either, those are for normal calls normally handled at expansion time. So, the following patch adjusts the expansion of these 6 ifns, by doing nothing if there is no lhs, and also, just in case the user disabled all possible passes that would fold this, handles the case of setting lhs to an ifn call with INTEGER_CST argument. 2024-02-27 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/114044 * internal-fn.def (CLRSB, CLZ, CTZ, FFS, PARITY): Use DEF_INTERNAL_INT_EXT_FN macro rather than DEF_INTERNAL_INT_FN. * internal-fn.h (expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): Declare. * internal-fn.cc (expand_bitquery, expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): New functions. (expand_POPCOUNT): Use expand_bitquery. * gcc.dg/bitint-95.c: New test.

When folding a multiply CHRECs are handled like {a, +, b} * c is {a*c, +, b*c} but that isn't generally correct when overflow invokes undefined behavior. The following uses unsigned arithmetic unless either a is zero or a and b have the same sign. I've used simple early outs for INTEGER_CSTs and otherwise use a range-query since we lack a tree_expr_nonpositive_p and get_range_pos_neg isn't a good fit. PR tree-optimization/114074 * tree-chrec.h (chrec_convert_rhs): Default at_stmt arg to NULL. * tree-chrec.cc (chrec_fold_multiply): Canonicalize inputs. Handle poly vs. non-poly multiplication correctly with respect to undefined behavior on overflow. * gcc.dg/torture/pr114074.c: New testcase. * gcc.dg/pr68317.c: Adjust expected location of diagnostic. * gcc.dg/vect/vect-early-break_119-pr114068.c: Do not expect loop to be vectorized.
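The overflow concern can be illustrated with a small standalone sketch. It is not the code in tree-chrec.cc; the function names are hypothetical, and the model only assumes 32-bit `int`. It evaluates the folded form `{a*c, +, b*c}` in unsigned (wrapping) arithmetic and converts back, which agrees with `(a + b*i) * c` whenever the latter does not overflow, even if `a*c` or `b*c` alone would overflow as signed values.

```c
#include <limits.h>

/* Map an unsigned value back to int with two's-complement wraparound,
   without relying on an implementation-defined conversion. */
static int wrap_to_int(unsigned int u)
{
    if (u <= (unsigned int)INT_MAX)
        return (int)u;
    return -(int)(~u) - 1;
}

/* Hypothetical model: value of the folded chrec {a*c, +, b*c} at
   iteration i, with every intermediate computed in unsigned arithmetic
   so that no signed overflow can occur. */
int chrec_mul_folded(int a, int b, int c, unsigned int i)
{
    unsigned int init = (unsigned int)a * (unsigned int)c;
    unsigned int step = (unsigned int)b * (unsigned int)c;
    return wrap_to_int(init + step * i);
}

/* Value of the original expression {a, +, b} * c at iteration i.
   Only call this when (a + b*i) * c does not overflow. */
int chrec_mul_direct(int a, int b, int c, int i)
{
    return (a + b * i) * c;
}
```

With `a = -INT_MAX`, `b = INT_MAX`, `c = 2` and `i = 1`, the direct form is `0 * 2` and never overflows, while `a*c` and `b*c` would both overflow if the folded form were evaluated in signed arithmetic. Here `a` and `b` have opposite signs, which is the case the commit message says is handled with unsigned arithmetic.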

GCC 13's changes file documents that iwmmx is deprecated. Raise the bar by warning when the mmintrin.h header is included by users, but provide a way to suppress the warning. gcc: * config/arm/mmintrin.h: Warn if this header is included without defining __ENABLE_DEPRECATED_IWMMXT.

gcc/analyzer/ChangeLog: PR analyzer/111881 * constraint-manager.cc (bound::ensure_closed): Assert that m_constant has integral type. (range::add_bound): Bail out on floating point constants. gcc/testsuite/ChangeLog: PR analyzer/111881 * c-c++-common/analyzer/conditionals-pr111881.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>

…y_{to,from}_device*} These routines map simply to the C counterpart and are meanwhile defined in OpenACC 3.3. (There are additional routine changes, including the Fortran addition of acc_attach/acc_detach, that require more work than a simple addition of an interface and are therefore excluded.) libgomp/ChangeLog: * libgomp.texi (OpenACC Runtime Library Routines): Document new 3.3 routines that simply map to their C counterpart. * openacc.f90 (openacc): Add them. * openacc_lib.h: Likewise. * testsuite/libgomp.oacc-fortran/acc_host_device_ptr.f90: New test. * testsuite/libgomp.oacc-fortran/acc-memcpy.f90: New test. * testsuite/libgomp.oacc-fortran/acc-memcpy-2.f90: New test. * testsuite/libgomp.oacc-c-c++-common/lib-59.c: Crossref to f90 test. * testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.

This is a regression present on the mainline, 13 and 12 branches. For the attached Ada case, it's a tree checking failure on the mainline at -O: +===========================GNAT BUG DETECTED==============================+ | 14.0.1 20240226 (experimental) [master r14-9171-g4972f97a265] GCC error:| | tree check: expected tree that contains 'decl common' structure, | | have 'component_ref' in tree_could_trap_p, at tree-eh.cc:2733 | | Error detected around /home/eric/cvs/gcc/gcc/testsuite/gnat.dg/opt104.adb: Time is a 10-byte record and Packed_Rec.T is placed at bit-offset 65 because of the packing, so tree-ssa-dse.cc:setup_live_bytes_from_ref has computed a const_size of 88 from ref->offset of 65 and ref->max_size of 80. Then in tree-ssa-dse.cc:compute_trims: 411 int last_live = bitmap_last_set_bit (live); (gdb) next 412 if (ref->size.is_constant (&const_size)) (gdb) 414 int last_orig = (const_size / BITS_PER_UNIT) - 1; (gdb) 418 *trim_tail = last_orig - last_live; (gdb) call debug_bitmap (live) n_bits = 256, set = {0 1 2 3 4 5 6 7 8 9 10 } (gdb) p last_live $33 = 10 (gdb) p const_size $34 = 80 (gdb) p last_orig $35 = 9 (gdb) p *trim_tail $36 = -1 In other words, compute_trims is overlooking the alignment adjustments that setup_live_bytes_from_ref applied earlier. Moreover it reads: /* We use sbitmaps biased such that ref->offset is bit zero and the bitmap extends through ref->size. So we know that in the original bitmap bits 0..ref->size were true. We don't actually need the bitmap, just the REF to compute the trims. */ but setup_live_bytes_from_ref used ref->max_size instead of ref->size. It appears that all the callers of compute_trims assume that ref->offset is byte aligned and that the trimmed bytes are relative to ref->size, so the patch simply adds an early return if either condition is not fulfilled. gcc/ * tree-ssa-dse.cc (compute_trims): Fix description. 
Return early if either ref->offset is not byte aligned or ref->size is not known to be equal to ref->max_size. (maybe_trim_complex_store): Fix description. (maybe_trim_constructor_store): Likewise. (maybe_trim_partially_dead_store): Likewise. gcc/testsuite/ * gnat.dg/opt104.ads, gnat.dg/opt104.adb: New test.
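The numbers in the gdb session can be replayed with a short sketch. It only mirrors the two statements quoted from compute_trims. The helper names are hypothetical, `BITS_PER_UNIT` is assumed to be 8, and the alignment formula in `aligned_size_bits` is a guess that happens to reproduce the quoted value of 88; it is not taken from setup_live_bytes_from_ref.

```c
#define BITS_PER_UNIT 8 /* assumption: 8-bit bytes */

/* Mirrors the quoted lines 414 and 418: index of the last original byte
   minus index of the last live byte. */
int compute_trim_tail(int const_size_bits, int last_live_byte)
{
    int last_orig = (const_size_bits / BITS_PER_UNIT) - 1;
    return last_orig - last_live_byte;
}

/* Assumed reconstruction: round the start down and the end up to byte
   boundaries, then take the difference. */
int aligned_size_bits(int offset_bits, int size_bits)
{
    int start = offset_bits / BITS_PER_UNIT * BITS_PER_UNIT;
    int end = (offset_bits + size_bits + BITS_PER_UNIT - 1)
              / BITS_PER_UNIT * BITS_PER_UNIT;
    return end - start;
}

/* The condition on ref->offset that the early return checks. */
int is_byte_aligned(int offset_bits)
{
    return offset_bits % BITS_PER_UNIT == 0;
}
```

With the unadjusted size of 80 bits and a last live byte of 10, `compute_trim_tail` yields the negative trim of -1 seen in the session. With the adjusted size of 88 bits it yields 0. The offset of 65 fails the byte-alignment check, so the patched function would return early for this reference.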

Also handle V2BF mode. PR target/113871 gcc/ChangeLog: * config/i386/mmx.md (V248FI): Add V2BF mode. (V24FI_32): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr113871-5a.c: New test. * gcc.target/i386/pr113871-5b.c: New test.

…3,PR111802] On e.g. gcc211 the use of "%li" with unsigned HOST_WIDE_INT led to this warning: ../../src/gcc/analyzer/access-diagram.cc: In member function ‘void ana::string_literal_spatial_item::add_column_for_byte(text_art::table&, const ana::bit_to_table_map&, text_art::style_manager&, ana::byte_offset_t, ana::byte_offset_t, int, int) const’: ../../src/gcc/analyzer/access-diagram.cc:1909:40: warning: format ‘%li’ expects argument of type ‘long int’, but argument 3 has type ‘long long unsigned int’ [-Wformat=] byte_idx_within_string.ulow ())); ^ and to all values being erroneously printed as "0". Fixed thusly. gcc/analyzer/ChangeLog: PR analyzer/110483 PR analyzer/111802 * access-diagram.cc (string_literal_spatial_item::add_column_for_byte): Use %wu for printing unsigned HOST_WIDE_INT. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
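`%wu` is specific to GCC's internal pretty-printer. The same class of bug in ordinary C code is usually avoided with the `<inttypes.h>` format macros, as in this minimal sketch (`format_u64` is a hypothetical helper, not part of the patch):

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format a 64-bit unsigned value with a conversion specifier that matches
   its width on every host, instead of assuming "long" is 64 bits wide. */
int format_u64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%" PRIu64, value);
}
```

Using `PRIu64` keeps the format string and the argument type in sync on hosts where `long` is 32 bits, which is where a hard-coded `%li` or `%lu` goes wrong.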

This is a (partial) reversion of r14-8987-gdd9d14f7d53 to return to eagerly emitting inline variables to the middle-end when they are declared. 'import_export_decl' will still continue to accept them, as allowing this is a pure extension and doesn't seem to cause issues with modules, but otherwise deferring the emission of inline variables appears to cause issues on some targets and prevents some code using inline variable templates from correctly linking. There might be a more targeted way to support this, but due to the complexity of handling linkage and emission I'd prefer to wait till GCC 15 to explore our options. PR c++/113970 PR c++/114013 gcc/cp/ChangeLog: * decl.cc (make_rtl_for_nonlocal_decl): Don't defer inline variables. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/inline-var10.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

… large struct I think we have no coverage for the case where structure_value_addr_parm and TYPE_NO_NAMED_ARGS_STDARG_P are both true. The if (type_arg_types != 0) n_named_args = (list_length (type_arg_types) /* Count the struct value address, if it is passed as a parm. */ + structure_value_addr_parm); else if (TYPE_NO_NAMED_ARGS_STDARG_P (funtype)) n_named_args = 0; else /* If we know nothing, treat all args as named. */ n_named_args = num_actuals; code should probably have n_named_args = structure_value_addr_parm; instead of n_named_args = 0;, this testcase is an attempt to see if it is broken on any target. 2024-02-28 Jakub Jelinek <jakub@redhat.com> * gcc.dg/c23-stdarg-6.c: New test.

…ge integral types in memcpy etc. folding [PR113988] The following patch changes the memcpy etc. folding to use bitwise vector types rather than huge INTEGER_TYPEs for copying of > MAX_FIXED_MODE_SIZE lengths. The problem with the huge INTEGER_TYPEs is that they aren't supported very much, usually there are just optabs to handle moves of them, perhaps misaligned moves and that is it, so they pose problems e.g. to BITINT_TYPE lowering. 2024-02-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/113988 * stor-layout.h (bitwise_mode_for_size): Declare. * stor-layout.cc (bitwise_mode_for_size): New function. * gimple-fold.cc (gimple_fold_builtin_memory_op): Use it. Use bitwise_type_for_mode instead of build_nonstandard_integer_type. Use BITS_PER_UNIT instead of 8. * gcc.dg/bitint-91.c: New test.

The following testcases are miscompiled, because graphite ignores boolean, enumerated or _BitInt comparisons, rewrites the code as if the comparisons were always true or always false. The INTEGER_TYPE checks were initially added in r6-2239 but at that point it was both in add_conditions_to_domain and in parameter_index_in_region. Later on the check was also added to stmt_simple_for_scop_p, and finally r8-3931 changed the stmt_simple_for_scop_p check to INTEGRAL_TYPE_P and turned the parameter_index_in_region -> assign_parameter_index_in_region into INTEGRAL_TYPE_P assertion, but the add_conditions_to_domain check for INTEGER_TYPE remained. The following patch uses INTEGRAL_TYPE_P to complete the change. 2024-02-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114041 * graphite-sese-to-poly.cc (add_conditions_to_domain): Check for INTEGRAL_TYPE_P check rather than INTEGER_TYPE. * gcc.dg/graphite/run-id-pr114041-1.c: New test. * gcc.dg/graphite/run-id-pr114041-2.c: New test.
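The difference between the two checks can be modelled with a toy enumeration. This is a simplified sketch, not GCC's tree API: the enum and both function names are hypothetical. The one fact it relies on is that GCC's INTEGRAL_TYPE_P accepts enumerated, boolean and _BitInt types in addition to INTEGER_TYPE.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a handful of GCC tree codes. */
enum toy_type_code {
    TOY_INTEGER,
    TOY_ENUMERAL,
    TOY_BOOLEAN,
    TOY_BITINT,
    TOY_REAL
};

/* Models the old check: only plain integer types qualify. */
bool toy_is_integer_type(enum toy_type_code code)
{
    return code == TOY_INTEGER;
}

/* Models INTEGRAL_TYPE_P: every integer-like type qualifies. */
bool toy_is_integral_type(enum toy_type_code code)
{
    return code == TOY_INTEGER || code == TOY_ENUMERAL
           || code == TOY_BOOLEAN || code == TOY_BITINT;
}
```

A comparison on a boolean, enumerated or _BitInt operand passes the second predicate but not the first, which is the class of conditions the commit message says was being ignored.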

The emulation via word mode tries to perform integer arithmetic on floating point values instead of floating point arithmetic. This leads to mis-compilations. Failure occurred on s390x on these existing test cases: gcc.dg/vect/tsvc/vect-tsvc-s112.c gcc.dg/vect/tsvc/vect-tsvc-s113.c gcc.dg/vect/tsvc/vect-tsvc-s119.c gcc.dg/vect/tsvc/vect-tsvc-s121.c gcc.dg/vect/tsvc/vect-tsvc-s131.c gcc.dg/vect/tsvc/vect-tsvc-s132.c gcc.dg/vect/tsvc/vect-tsvc-s2233.c gcc.dg/vect/tsvc/vect-tsvc-s421.c gcc.dg/vect/vect-alias-check-14.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c gcc/ChangeLog: PR tree-optimization/114075 * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating point vectors. Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
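Why integer arithmetic on floating point bit patterns miscompiles can be shown with a scalar sketch. It is not the vectorizer code, the helper names are hypothetical, and it assumes `float` is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Correct: floating point addition. */
float add_as_float(float x, float y)
{
    return x + y;
}

/* Wrong for floats: adds the raw bit patterns as integers, which is
   what a word-mode integer emulation amounts to. */
float add_via_integer_bits(float x, float y)
{
    uint32_t xb, yb, sum;
    float result;

    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);
    sum = xb + yb;
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

For `1.0f + 1.0f` the integer version adds `0x3f800000` to itself and reinterprets `0x7f000000` as a float, which is a huge value rather than `2.0f`.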

This adds testcase from PR114075 which has been fixed by the r14-9205 change on s390x-linux with -march=z13. 2024-02-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114075 * gcc.dg/gomp/pr114075.c: New test.

…4 [PR91567] gcc.dg/tree-ssa/builtin-snprintf-6.c currently XPASSes on i?86-*-* configurations with -m64: XPASS: gcc.dg/tree-ssa/builtin-snprintf-6.c scan-tree-dump-times optimized "Function test_assign_aggregate" 1 (seen e.g. on i386-pc-solaris2.11, i686-pc-linux-gnu, or i386-apple-darwin*). The problem is that the xfail only handles x86_64, ignoring that i?86 configurations can also be multilibbed. This patch fixes this by handling both forms alike. Tested on i386-pc-solaris2.11, amd64-pc-solaris2.11, sparc-sun-solaris2.11, and sparcv9-sun-solaris2.11. 2024-02-28 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: PR tree-optimization/91567 * gcc.dg/tree-ssa/builtin-snprintf-6.c (scan-tree-dump-times): Treat i?86-*-* like x86_64-*-*.

powerpc64-linux apparently (not very surprisingly) behaves the same way as powerpc64le-linux and has 4 sunk statements rather than 5, so we should xfail it on powerpc64*-*-* rather than just powerpc64le-*-*. powerpc-linux has 3 sunk statements, but the scan pattern is done for lp64 only as the comment explains. 2024-02-28 Jakub Jelinek <jakub@redhat.com> PR testsuite/111462 * gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also on powerpc64.

libstdc++-v3/ChangeLog: * include/std/stacktrace: Add nodiscard attribute to all functions without side effects.

libstdc++-v3/ChangeLog: * include/bits/alloc_traits.h: Include <bits/stl_iterator.h> for __make_move_if_noexcept_iterator.

Cygwin should use std::fwrite, not WriteConsoleW. And the -lstdc++exp library is only needed when running the tests on *-*-mingw*. libstdc++-v3/ChangeLog: * include/std/ostream (vprint_unicode) [__CYGWIN__]: Use POSIX code path for Cygwin instead of Windows. * include/std/print (vprint_unicode) [__CYGWIN__]: Likewise. * testsuite/27_io/basic_ostream/print/1.cc: Only add -lstdc++exp for *-*-mingw* targets. * testsuite/27_io/print/1.cc: Likewise.

Implementing all chrono::from_stream overloads in terms of chrono::sys_time meant that a leap second time like 23:59:60.001 cannot be parsed, because that cannot be represented in a sys_time. The fix to support parsing leap seconds as utc_time is to convert the parsed date to utc_time<days> and then add the parsed time to that, which allows the result to land in a leap second, rather than doing all the arithmetic with sys_time which doesn't have leap seconds. For local_time we also allow %S to parse a 60s value, because doing otherwise might disallow some valid uses. We can't know all use cases users have for treating times as local_time. For all other clocks, we can reject times that have 60 or 60.nnn as the seconds part, because that cannot occur in a valid UNIX, GPS, or TAI time. Since our chrono::file_clock uses sys_time, it can't occur for that clock either. In order to support this a new _M_is_leap_second member is needed in the _Parser type. This can be added at the end, where most targets currently have padding bytes. Similar to what I did recently for formatter _Spec structs, we can also reserve additional padding bits for future expansion. This also fixes bugs in the from_stream overloads for utc_time, tai_time, gps_time, and file_time, which were not using time_point_cast to explicitly convert to the result type. That's needed because the result type might have lower precision than the value returned from from_sys or from_utc, which has a precision no lower than seconds. libstdc++-v3/ChangeLog: PR libstdc++/114279 * include/bits/chrono_io.h (_Parser::_M_is_leap_second): New data member. (_Parser::_M_reserved): Reserve padding bits for future use. (_Parser::operator()): Set _M_is_leap_second if %S reads 60s. (from_stream): Only allow _M_is_leap_second for utc_time and local_time. Adjust arithmetic for utc_time so that leap seconds are preserved. Use time_point_cast to convert to a possibly lower-precision result type. 
* testsuite/std/time/parse.cc: Move to ... * testsuite/std/time/parse/parse.cc: ... here. * testsuite/std/time/parse/114279.cc: New test.

When parsing a std::chrono::sys_days (or a sys_time with an even longer period) we should not require a time-of-day to be present in the input, because we can't represent that in the result type anyway. Rather than trying to decide which specializations should require a time-of-date and which should not, follow the direction of Howard Hinnant's date library, which allows extracting a sys_time of any period from input that only contains a date, defaulting the time-of-day part to 00:00:00. This seems consistent with the intent of the standard, which says it's an error "If the parse fails to decode a valid date" (i.e., it doesn't care about decoding a valid time, only a date). libstdc++-v3/ChangeLog: PR libstdc++/114240 * include/bits/chrono_io.h (_Parser::operator()): Assume hours(0) for a time_point, so that a time is not required to be present. * testsuite/std/time/parse/114240.cc: New test.

…mic_compare_and_swapsi. If the hardware does not support LAMCAS, atomic_compare_and_swapsi needs to be implemented through "ll.w+sc.w". In the implementation of the instruction sequence, it is necessary to determine whether the two registers are equal. Since LoongArch's comparison instructions do not distinguish between 32-bit and 64-bit, the two operand registers that need to be compared must be sign-extended. One of the operand registers is loaded from memory through the "ll.w" instruction, which guarantees that it is sign-extended. However, the value of the other operand register is not guaranteed to be sign-extended. gcc/ChangeLog: * config/loongarch/sync.md (atomic_cas_value_strong<mode>): In loongarch64, a sign extension operation is added when operands[2] is a register operand and the mode is SImode. gcc/testsuite/ChangeLog: * g++.target/loongarch/atomic-cas-int.C: New test.
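The effect of a missing sign extension on a full-width register compare can be sketched in portable C. This models registers as 64-bit bit patterns. The helper names are hypothetical and nothing here is LoongArch-specific code.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit value held in a 64-bit register with the upper half cleared. */
uint64_t zero_extend32(uint32_t v)
{
    return (uint64_t)v;
}

/* The same value with bit 31 copied into the upper half, as a
   sign-extending 32-bit load would leave it. */
uint64_t sign_extend32(uint32_t v)
{
    uint64_t r = v;
    if (v & UINT32_C(0x80000000))
        r |= UINT64_C(0xFFFFFFFF00000000);
    return r;
}

/* A compare that looks at all 64 bits of both registers. */
bool regs_equal64(uint64_t a, uint64_t b)
{
    return a == b;
}
```

For a value with bit 31 set, such as `0xFFFFFFFF`, the sign-extended and zero-extended registers differ in their upper halves, so a 64-bit compare reports a mismatch even though the 32-bit values are equal. Sign-extending both operands first makes the compare agree with the 32-bit semantics.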

When the value of the macro DEFAULT_CFLAGS is set to '-ansi -pedantic-errors', the regname-fp-s9.c test fails. To solve this problem, add the compilation option '-Wno-pedantic -std=gnu90' to this test case. gcc/testsuite/ChangeLog: * gcc.target/loongarch/regname-fp-s9.c: Add compilation option '-Wno-pedantic -std=gnu90'.

When I added the -mnoreturn-no-callee-saved-registers option to i386.opt, I forgot to regenerate i386.opt.urls and Mark's CI kindly reminded me of that. Fixed thusly. 2024-03-09 Jakub Jelinek <jakub@redhat.com> * config/i386/i386.opt.urls: Regenerate.

gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [PLUS]: Determine cost for usum_widenqihi and add_zero_extend1. [MINUS]: Determine costs for udiff_widenqihi, sub+zero_extend, sub+sign_extend. * config/avr/avr.md (*addhi3.sign_extend1, *subhi3.sign_extend2): Compute exact insn lengths. (*usum_widenqihi3): Allow input operands to commute.

…to allow the IE to LE linker relaxation In Binutils we need to make IE to LE relaxation only allowed when there is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12} so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX to allow the relaxation. The IE to LE relaxation does not require the pcalau12i and the ld instruction to be adjacent, so we don't need to limit ourselves to use the macro. For the distro maintainers backporting changes: this change depends on r14-8721, without r14-8721 R_LARCH_RELAX can be emitted mistakenly in the extreme code model. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_print_operand_reloc): Support 'Q' for R_LARCH_RELAX for TLS IE. (loongarch_output_move): Use 'Q' to print R_LARCH_RELAX for TLS IE. * config/loongarch/loongarch.md (ld_from_got<mode>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/loongarch/tls-ie-relax.c: New test. * gcc.target/loongarch/tls-ie-norelax.c: New test. * gcc.target/loongarch/tls-ie-extreme.c: New test.

… MEMs [PR114284] Before the recent PR111267 r14-8319 fwprop changes, fwprop would never try to propagate what was not considered PROFITABLE, where the profitable part actually was partly about profitability, partly about very good reasons not to actually propagate and partly for cases where propagation is completely incorrect. In particular, classify_result has: /* Allow (subreg (mem)) -> (mem) simplifications with the following exceptions: 1) Propagating (mem)s into multiple uses is not profitable. 2) Propagating (mem)s across EBBs may not be profitable if the source EBB runs less frequently. 3) Propagating (mem)s into paradoxical (subreg)s is not profitable. 4) Creating new (mem/v)s is not correct, since DCE will not remove the old ones. */ if (single_use_p && single_ebb_p && SUBREG_P (old_rtx) && !paradoxical_subreg_p (old_rtx) && MEM_P (new_rtx) && !MEM_VOLATILE_P (new_rtx)) return PROFITABLE; and didn't mark any other MEM_P (new_rtx) or rtxes which contain a MEM in their subrtxes as PROFITABLE. Now, since r14-8319 the profitable_p method has been renamed to likely_profitable_p and has just a minor role. Now, rule 4) above is something that isn't about profitability, but about correct behavior, if you propagate mem/v, the code is miscompiled. This particular case has been fixed elsewhere by Haochen in r14-9379. But I think even the 1) and 2) and maybe 3) are a strong don't do it, don't rely solely on rtx costs, increasing the number of loads of the same memory, even when cached, is undesirable, canceling load hoisting can be undesirable as well. So, the following patch restores the previous behavior if src contains any MEMs, in that case likely_profitable_p () is taken as the old profitable_p () as a requirement rather than just a hint. For propagation of something which doesn't load from memory this keeps the r14-8319 behavior. 
2024-03-09 Jakub Jelinek <jakub@redhat.com> PR target/114284 * fwprop.cc (try_fwprop_subst_pattern): Don't propagate src containing MEMs unless prop.likely_profitable_p ().

gcc/ * config/avr/avr.md: Fix typos in comment, indentation glitches and some other nits.


rguenth and others added 30 commits February 26, 2024 15:20

Update gcc sv.po, zh_CN.po

10c73c1

* sv.po, zh_CN.po: Update.

AVR: Dead code removal.

9b0f7ef

gcc/ * config/avr/avr.cc (avr_out_compare) [AVR_TINY]: Remove code in an "if avr_adiw_reg_p()" block that's dead for AVR_TINY.

AVR: Tag optimization options as "Optimization".

96773ce

Some options that are pure optimizations were not tagged as such. gcc/ * config/avr/avr.opt (mcall-prologues, mrelax, maccumulate-args) (mstrict-X): Tag as "Optimization".

Daily bump.

1e2a3b2

Daily bump.

6309ad2

testsuite: Add testcase for recently fixed PR [PR114075]

db46523

This adds testcase from PR114075 which has been fixed by the r14-9205 change on s390x-linux with -march=z13. 2024-02-28 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/114075 * gcc.dg/gomp/pr114075.c: New test.

libstdc++: Add more [[nodiscard]] to <stacktrace>

d59175e

libstdc++-v3/ChangeLog: * include/std/stacktrace: Add nodiscard attribute to all functions without side effects.

libstdc++: Include <bits/stl_iterator.h> in <bits/alloc_traits.h>

3c1e624

libstdc++-v3/ChangeLog: * include/bits/alloc_traits.h: Include <bits/stl_iterator.h> for __make_move_if_noexcept_iterator.

jwakely and others added 27 commits March 9, 2024 00:21

i386: Regenerate i386.opt.urls

e9753f4

When I added the -mnoreturn-no-callee-saved-registers option to i386.opt, I forgot to regenerate i386.opt.urls and Mark's CI kindly reminded me of that. Fixed thusly. 2024-03-09 Jakub Jelinek <jakub@redhat.com> * config/i386/i386.opt.urls: Regenerate.

AVR: Fix typos in comment, indentation glitches in avr.md.

f5a805d

gcc/ * config/avr/avr.md: Fix typos in comment, indentation glitches and some other nits.

Merge commit 'fc59a3995cb46c190c0efb0431ad204e399975c4^' into HEAD

0ba53bf

Merge commit 'fc59a3995cb46c190c0efb0431ad204e399975c4' into HEAD [#2183]

041fef1

Merge commit '7a6906c8d80e437a97c780370a8fec4e00561c7b' into HEAD [#2288]

4966574

Merge commit '4bd09ce06f50d266c992c984cc993384d5e6655e' into HEAD

ca224bd

Merge commit 'a5258f3a11ab577835ef5e93be5cb65ec9e44132^' into HEAD

d2bcecd

Merge commit 'e621b174d7c622aa4b677a4c812e5061e311cc5c' into HEAD

c9e59de

Merge commit '2341df1cb9b3681bfefe29207887b2b3dc271a95^' into HEAD

d1a0609

Merge commit '2341df1cb9b3681bfefe29207887b2b3dc271a95' into HEAD [#2801]

0de2032

Merge commit 'ceed844b5284aeabbdfe25ccf099e7ebeeb14a9b^' into HEAD

e02c6e6

Merge commit '2a9881565c7b48d04cf891666a66a1a2e560bce8' into HEAD

17ee9c6

Merge commit 'f89186f962421f6d972035fc4b4c20490e7b1c5b^' into HEAD

1af2c40

Merge commit 'af3f0482367232d2d655e51bee382e98ddbfb117' into HEAD

1cae91f

Merge commit 'f0b1cf01782ba975cfda32800c91076df78058d6^' into HEAD

9575360

Merge commit 'f0b1cf01782ba975cfda32800c91076df78058d6' into HEAD [#2857]

31fed21

Merge commit '8534cc772def8142379c0e72ab6392d40f3f60f6^' into HEAD

17d389c

Merge commit '767698ff6c8f07047ad90bef89f3dc4c4515f0df' into HEAD

30a67f5

Merge commit 'f5a805d82902fe2d6e0a7af8c0e6519f9d25a8f3' into HEAD

c53dd85

Adjust '.github/bors_log_expected_warnings'

013b520

tschwinge force-pushed the tschwinge/merge-upstream branch from 90bb4d8 to 013b520 Compare April 10, 2024 13:06

tschwinge merged commit 0201fa1 into master Apr 10, 2024
6 of 9 checks passed

tschwinge mentioned this pull request Apr 12, 2024

Emit error on async trait functions #2767

Closed
