-
Notifications
You must be signed in to change notification settings - Fork 185
Merge upstream, 2023-10-18 #2912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This adds a pipeline description for a generic out-of-order core. Latency and units are not based on any real processor but more or less educated guesses what such a processor would look like. In order to account for latency scaling by LMUL != 1, sched_adjust_cost is implemented. It will scale an instruction's latency by its LMUL so an LMUL == 8 instruction will take 8 times the number of cycles the same instruction with LMUL == 1 would take. As this potentially causes very high latencies which, in turn, might lead to scheduling anomalies and a higher number of vsetvls emitted this feature is only enabled when specifying -madjust-lmul-cost. Additionally, in order to easily recognize pre-RA vsetvls this patch introduces an insn type vsetvl_pre which is used in sched_adjust_cost. In the future we might also want a latency adjustment similar to lmul for reductions, i.e. make the latency dependent on the type and its number of units. gcc/ChangeLog: * config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter. * config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): Add generic_ooo. * config/riscv/riscv.cc (riscv_sched_adjust_cost): Implement scheduler hook. (TARGET_SCHED_ADJUST_COST): Define. * config/riscv/riscv.md (no,yes"): Include generic-ooo.md * config/riscv/riscv.opt: Add -madjust-lmul-cost. * config/riscv/generic-ooo.md: New file. * config/riscv/vector.md: Add vsetvl_pre.
Like ARM SVE, RVV is vectorizing these 2 cases in the same way. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-23.c: Add RVV like ARM SVE. * gcc.dg/vect/slp-perm-10.c: Ditto.
RVV vectortizes this case with stride8 load_lanes. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-reduc-4.c: Adapt test for stride8 load_lanes.
This case is vectorized by stride8 load_lanes. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-12a.c: Adapt for stride 8 load_lanes.
These cases are vectorized by vec_load_lanes with strided = 8 instead of SLP with -fno-vect-cost-model. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr97832-2.c: Adapt dump check for target supports load_lanes with stride = 8. * gcc.dg/vect/pr97832-3.c: Ditto. * gcc.dg/vect/pr97832-4.c: Ditto.
RVV vectorize it with stride5 load_lanes. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-perm-4.c: Adapt test for stride5 load_lanes.
Turns out we didnt need this as there is no unordered relations managed by the oracle. * gimple-range-gori.cc (gori_compute::compute_operand1_range): Do not call get_identity_relation. (gori_compute::compute_operand2_range): Ditto. * value-relation.cc (get_identity_relation): Remove. * value-relation.h (get_identity_relation): Remove protyotype.
A floating point equivalence may not properly reflect both signs of zero, so be pessimsitic and ensure both signs are included. PR tree-optimization/111694 gcc/ * gimple-range-cache.cc (ranger_cache::fill_block_cache): Adjust equivalence range. * value-relation.cc (adjust_equivalence_range): New. * value-relation.h (adjust_equivalence_range): New prototype. gcc/testsuite/ * gcc.dg/pr111694.c: New.
gcc/analyzer/ChangeLog: * access-diagram.cc (boundaries::add): Explicitly state "boundaries::" scope for "kind" enum. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Verifier checks have recently been strengthened to check that all counts and probabilities are initialized. The checks fired during autoprofiledbootstrap build and this patch fixes it. Tested on x86_64-pc-linux-gnu. gcc/ChangeLog: * auto-profile.cc (afdo_calculate_branch_prob): Fix count comparisons * tree-vect-loop-manip.cc (vect_do_peeling): Guard against zero count when scaling loop profile
For RVV, we have VLS modes enable according to TARGET_MIN_VLEN from M1 to M8. For example, when TARGET_MIN_VLEN = 128 bits, we enable 128/256/512/1024 bits VLS modes. This patch fixes following FAIL: FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 2 FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: basic block" 2 gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add 256/512/1024
Refurbish add compare patterns: use 'r' constraint, fix identation, and fix pattern to match 'if (a+b) { ... }' constructions. gcc/ * config/arc/arc.cc (arc_select_cc_mode): Match NEG code with the first operand. * config/arc/arc.md (addsi_compare): Make pattern canonical. (addsi_compare_2): Fix identation, constraint letters. (addsi_compare_3): Likewise. gcc/testsuite/ * gcc.target/arc/add_f-combine.c: New test. Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
Here is the reference comparing dump IR between ARM SVE and RVV. https://godbolt.org/z/zqess8Gss We can see RVV has one more dump IR: optimized: basic block part vectorized using 128 byte vectors since RVV has 1024 bit vectors. The codegen is reasonable good. However, I saw GCN also has 1024 bit vector. This patch may cause this case FAIL in GCN port ? Hi, GCN folk, could you check this patch in GCN port for me ? gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-pr65935.c: Add vect1024 variant. * lib/target-supports.exp: Ditto.
The following fixes fallout of r10-7145-g1dc00a8ec9aeba which made us cautionous about CSEing a load to an object that has padding bits. The added check also triggers for BLKmode entities like STRING_CSTs but by definition a BLKmode entity does not have padding bits. PR tree-optimization/111751 * tree-ssa-sccvn.cc (visit_reference_op_load): Exempt BLKmode result from the padding bits check.
Add testcase for PR111751 which has been fixed: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632474.html PR target/111751 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr111751.c: New test.
…ning gcc/ada/ * sem_attr.adb (Analyze_Attribute): Protect the frontend against replacing 'Size by its static value if 'Size is not known at compile time and we are processing pragmas Compile_Time_Warning or Compile_Time_Errors.
The concept of extended nodes was retired at the same time Gen_IL was introduced, but there was a reference to that concept left over in a comment. This patch removes that reference. Also, the description of the field Comes_From_Check_Or_Contract was incorrectly placed in a section for fields present in all nodes in sinfo.ads. This patch fixes this. gcc/ada/ * atree.ads, nlists.ads, types.ads: Remove references to extended nodes. Fix typo. * sinfo.ads: Likewise and fix position of Comes_From_Check_Or_Contract description.
This patch fixes the behavior of Ada.Directories.Search when being requested to filter out regular files or directories. One of the configurations in which that behavior was incorrect was that when the caller requested only the regular and special files but not the directories, the directories would still be returned. gcc/ada/ * libgnat/a-direct.adb: Fix filesystem entry filtering.
This occurs when one of the types has an incomplete declaration in addition to its full declaration in its package. In this case AI05-129 says that the incomplete type is not part of the limited view of the package, i.e. only the full view is. Now, in the GNAT implementation, it's the opposite in the regular view of the package, i.e. the incomplete type is the visible one. That's why the implementation needs to also swap the types on the visibility chain while it is swapping the views when the clauses are either installed or removed. This works correctly for the installation, but does not for the removal, so this change rewrites the code doing the latter. gcc/ada/ PR ada/111434 * sem_ch10.adb (Replace): New procedure to replace an entity with another on the homonym chain. (Install_Limited_With_Clause): Rename Non_Lim_View to Typ for the sake of consistency. Call Replace to do the replacements and split the code into the regular and the special cases. Add debuggging output controlled by -gnatdi. (Install_With_Clause): Print the Parent_With and Implicit_With flags in the debugging output controlled by -gnatdi. (Remove_Limited_With_Unit.Restore_Chain_For_Shadow (Shadow)): Rewrite using a direct replacement of E4 by E2. Call Replace to do the replacements. Add debuggging output controlled by -gnatdi.
This happens when the conditional expression is immediately returned, for example in an expression function. gcc/ada/ * exp_aggr.adb (Is_Build_In_Place_Aggregate_Return): Return true if the aggregate is a dependent expression of a conditional expression being returned from a build-in-place function.
It is only called once. gcc/ada/ * sem_util.ads (Set_Scope_Is_Transient): Delete. * sem_util.adb (Set_Scope_Is_Transient): Likewise. * exp_ch7.adb (Create_Transient_Scope): Set Is_Transient directly.
The purpose of this patch is to work around false-positive warnings emitted by GNAT SAS (also known as CodePeer). It does not change the behavior of the modified subprogram. gcc/ada/ * libgnat/a-direct.adb (Start_Search_Internal): Tweak subprogram body.
…component This is a small bug present on strict-alignment platforms for questionable representation clauses. gcc/ada/ * gcc-interface/decl.cc (inline_status_for_subprog): Minor tweak. (gnat_to_gnu_field): Try harder to get a packable form of the type for a bitfield.
…etation The following ups the limit in fold_view_convert_expr to handle 1024bit vectors as used by GCN and RVV. It also robustifies the handling in visit_reference_op_load to properly give up when constants cannot be re-interpreted. PR tree-optimization/111751 * fold-const.cc (fold_view_convert_expr): Up the buffer size to 128 bytes. * tree-ssa-sccvn.cc (visit_reference_op_load): Special case constants, giving up when re-interpretation to the target type fails.
Richard patch resolve PR111751: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=7c76c876e917a1f20a788f602cc78fff7d0a2a65 which cause ICE in RISC-V regression: FAIL: gcc.dg/torture/pr53144.c -O2 (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328) FAIL: gcc.dg/torture/pr53144.c -O2 (test for excess errors) FAIL: gcc.dg/torture/pr53144.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328) FAIL: gcc.dg/torture/pr53144.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: gcc.dg/torture/pr53144.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328) FAIL: gcc.dg/torture/pr53144.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: gcc.dg/torture/pr53144.c -O3 -g (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328) FAIL: gcc.dg/torture/pr53144.c -O3 -g (test for excess errors) VLS BOOL modes vcond_mask is needed to fix this regression ICE. More details: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 Tested and Committed. PR target/111751 gcc/ChangeLog: * config/riscv/autovec.md: Add VLS BOOL modes.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu> ChangeLog: * MAINTAINERS: Add myself.
This test is testing fold_extract_last pattern so it's more reasonable use vect_fold_extract_last instead of specifying targets. This is the vect_fold_extract_last property: proc check_effective_target_vect_fold_extract_last { } { return [expr { [check_effective_target_aarch64_sve] || [istarget amdgcn*-*-*] || [check_effective_target_riscv_v] }] } include ARM SVE/GCN/RVV. It perfectly matches what we want and more reasonable, better maintainment. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr65947-8.c: Use vect_fold_extract_last.
Like GCN, add -fno-tree-vectorize. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/predcom-2.c: Add riscv.
This patch fixes following 2 FAILs in RVV regression since the check is not accurate. It's inspired by Robin's previous patch: https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/ gcc/testsuite/ChangeLog: * gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern. * gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.
Before the r5-3834 commit for PR63362, GCC 4.8-4.9 refuses to compile cse.cc which contains a variable with rtx_def type, because rtx_def contains a union with poly_uint16 element. poly_int template has defaulted default constructor and a variadic template constructor which could have empty parameter pack. GCC < 5 treated it as non-trivially constructible class and deleted rtunion and rtx_def default constructors. For the cse_insn purposes, all we need is a variable with size and alignment of rtx_def, not necessarily rtx_def itself, which we then memset to 0 and fill in like rtx is normally allocated from heap, so this patch for GCC_VERSION < 5000 uses an unsigned char array of the right size/alignment. 2023-10-18 Jakub Jelinek <jakub@redhat.com> PR bootstrap/111852 * cse.cc (cse_insn): Add workaround for GCC 4.8-4.9, instead of using rtx_def type for memory_extend_buf, use unsigned char arrayy with size of rtx_def and its alignment. (cherry picked from commit bc4bd69)
Reformat the upstream GCC commit f4a2ae2 "Change MODE_BITSIZE to MODE_PRECISION for MODE_VECTOR_BOOL" change to 'gcc/rust/backend/rust-tree.cc' to clang-format's liking. gcc/rust/ * backend/rust-tree.cc (c_common_type_for_mode): Placate clang-format.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Merging in several stages, towards #2802 and further.
This must of course not be rebased by GitHub merge queue, but has to become a proper Git merge. (I'll handle that, once ready.)