forked from llvm/llvm-project
[AutoBump] Merge with 89b34ec9 (Dec 18) (23) [Only tested MLIR] #493
Merged
Do not remove S_CBRANCH_EXECZ if one of the following blocks contains an unconditional branch to a block other than the one immediately following it. This can cause unwanted behavior like infinite loops.
…m#118117)

This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to check that it is safe to fold a store into a node that will expand to a library call that takes output pointers. This requires checking for two (independent) properties:
1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
   * If it is, the expansion would lead to nested call sequences (which is invalid)
2. The node does not appear as a predecessor to the store
   * If it does, attempting to merge the store into the call would result in a cycle in the DAG

These two properties are checked as part of the same traversal in `canFoldStoreIntoLibCallOutputPointers()`
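The two properties can be illustrated on a toy graph. The following is only a sketch in Python, not LLVM's implementation: the `Node` class, the opcode strings, and `can_fold_store` are hypothetical stand-ins, and the assumption is that a `CALLSEQ_END` predecessor marks an already completed call sequence, while reaching `CALLSEQ_START` without first passing a `CALLSEQ_END` means the store sits inside a sequence.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality and hashing, like comparing node pointers
class Node:
    """Toy stand-in for a DAG node (hypothetical, not LLVM's SDNode)."""
    opcode: str
    operands: list = field(default_factory=list)  # predecessors of this node


def can_fold_store(store, libcall):
    """Return True if folding `store` into `libcall` looks safe on the toy graph."""
    # Skip the direct use of the libcall result: that is the edge being folded.
    worklist = [op for op in store.operands if op is not libcall]
    deferred = []  # CALLSEQ_END nodes, i.e. completed call sequences
    visited = set()

    while worklist:
        node = worklist.pop()
        if node in visited:
            continue
        visited.add(node)
        if node is libcall:
            return False  # property 2: folding would create a cycle
        if node.opcode == "CALLSEQ_START":
            return False  # property 1: the store is inside a call sequence
        if node.opcode == "CALLSEQ_END":
            deferred.append(node)  # look through completed sequences later
            continue
        worklist.extend(node.operands)

    # Completed call sequences still matter for the predecessor check.
    worklist = [op for end in deferred for op in end.operands]
    while worklist:
        node = worklist.pop()
        if node in visited:
            continue
        visited.add(node)
        if node is libcall:
            return False
        worklist.extend(node.operands)
    return True


entry = Node("EntryToken")
fsincos = Node("FSINCOS", [entry])
start = Node("CALLSEQ_START", [entry])
end = Node("CALLSEQ_END", [start])
addr = Node("ADD", [fsincos])  # an address computed from the libcall result

store_ok = Node("STORE", [entry, fsincos])
store_in_seq = Node("STORE", [start, fsincos])
store_after_seq = Node("STORE", [end, fsincos])
store_cycle = Node("STORE", [entry, addr, fsincos])
```

In this sketch `store_ok` and `store_after_seq` can be folded, while `store_in_seq` and `store_cycle` are rejected for the first and second reason respectively.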
This adds support for the loongarch64 architecture to the offload host plugin. Similar to llvm#115773

To fix some test issues, I've had to add the LoongArch64 target to:
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)

Reviewed By: jhuber6 Pull Request: llvm#120173
…ethod (llvm#118951) [llvm-debuginfo-analyzer] Fix crash due to un-checked error in LVReaderHandler::handleArchive method. - Added README describing how to generate the binary files used for the test. - A follow up patch to add extra ASSERT_NE Committed on behalf of @aurelien35
…0002) These modules are deprecated and have trivial implementations in modern cmake.
Multiple reentry points may be associated with a single key.
…#119546) This patch implements the following intrinsics: Multi-vector 8-bit floating-point multiply-add long (multiple vectors).

```c
// Only if __ARM_FEATURE_SME_F8F16 != 0
void svmla_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za");
void svmla_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za");

// Only if __ARM_FEATURE_SME_F8F32 != 0
void svmla_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm, fpm_t fpm) __arm_streaming __arm_inout("za");
void svmla_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm, fpm_t fpm) __arm_streaming __arm_inout("za");
```

In accordance with ARM-software/acle#323
…E routines. (llvm#119864)" Avoid issues caused by `.subsections_via_symbols` directive, by using numbered labels instead of named labels for the branch locations. This reverts commit 4032ce3.
Building on top of llvm#115489 extend support for binops with SExt operand. PR: llvm#115879
This patch adds initial matchers for unary and binary SCEV expressions and specializes it for SExt, ZExt and binary add expressions. Also adds matchers for SCEVConstant and SCEVUnknown. This patch only converts a few instances to use the new matchers to make sure everything builds as expected for now. The goal of the matchers is to hopefully make it slightly easier to write code matching SCEV patterns. Depends on llvm#119389 PR: llvm#119390
We still need to check the input pointer, so let this go through BitCastPrim.
Before this patch, there was a calling-convention mismatch between the constructors and the actual call emitted for the entrypoint wrapper. Such a mismatch causes the InstCombine pass to replace this call with an `unreachable`, breaking the whole function. Signed-off-by: Nathan Gauër <brioche@google.com>
llvm#120218) With all the recent versions of Clang that I tested, ObjC forward declarations like

```
@class ForwardObjcClass;
```

don't emit the kind of DWARF that this workaround was put in place for. Also, zero-sized structures are valid in C (and thus Objective-C), so this workaround makes things confusing to reason about when mixing the two languages. This workaround has been in place for at least a decade, and given that recent compilers don't produce this anymore, we think it's a good time to remove it.
Store GEPNoWrapFlags instead of only InBounds and propagate them.
The `llvm-gcc` front-end has been EOL'd at least since 2011 (based on some `git` archeology). And Clang/LLVM has been removing references to it ever since. This patch removes the remaining references to it from LLDB. One benefit of this is that it will allow us to remove the code checking for `DW_AT_decl_file_attributes_are_invalid` and `Supports_DW_AT_APPLE_objc_complete_type`.
…lete_type and DW_AT_decl_file_attributes_are_invalid (llvm#120226) Depends on llvm#120225 With `llvm-gcc` support being removed from LLDB, these APIs are now trivial and can be removed too.
Class BuiltinTypeMethodBuilder has a user-defined destructor, so the compiler-generated special member functions are likely to behave incorrectly. Explicitly delete the copy constructor and copy assignment operator to avoid potential errors.
A necessary AddrSpaceCast was wrongfully deleted in 5c91b28. Recover the AddrSpaceCast. This fixes llvm#86791.
Summary: This message is only confusing and shouldn't have been added in the first place.
Summary: This installs the shared header to the user's installation. I couldn't decide if this should be a standalone thing or use the existing support in `include/`, mostly because this is completely separate from hdrgen stuff and it's C++.
This patch introduces the LLVM components of a type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help. For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected. The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. 
If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime. The instrumentation pass inserts calls to the memset intrinsic to set the memory updated by memset, memcpy, and memmove, as well as allocas/byval (and for lifetime.start/end) to reset the shadow memory to reflect that the type is now unknown. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls. The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated. Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test in the runtime patch. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer. When the sanitizer is active, we disable actually using the TBAA metadata for AA. This way we're less likely to use TBAA to remove memory accesses that we'd like to verify. As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. 
I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work. It goes together with the corresponding clang changes (llvm#76260) and compiler-rt changes (llvm#76261) PR: llvm#76259
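The fast-path check described above can be sketched as a small model. This is an illustration in Python, not the instrumentation the pass emits: the shadow is modeled as a dictionary from byte address to descriptor, type descriptors are hypothetical positive integers, and `check_access` is a made-up name. The sketch also assumes the shadow is only updated when every byte of the access was previously unknown.

```python
UNKNOWN = 0    # shadow value for "type not known"
INTERIOR = -1  # shadow value for a non-first byte of a typed object


def check_access(shadow, addr, size, type_id):
    """Model the fast path; return "ok" or "runtime" (defer to the runtime)."""
    rest = range(addr + 1, addr + size)
    first = shadow.get(addr, UNKNOWN)

    if first == type_id:
        # Exact match on the first byte: the remaining bytes must be interior.
        if all(shadow.get(a, UNKNOWN) == INTERIOR for a in rest):
            return "ok"
        return "runtime"

    if first == UNKNOWN:
        # Unknown first byte: the remaining bytes must be unknown as well.
        if not all(shadow.get(a, UNKNOWN) == UNKNOWN for a in rest):
            return "runtime"
        # Record the type: descriptor on the first byte, -1 on the others.
        shadow[addr] = type_id
        for a in rest:
            shadow[a] = INTERIOR
        return "ok"

    # Neither an exact match nor unknown.
    return "runtime"


INT, FLOAT = 1, 2  # hypothetical descriptor values
shadow = {}
first_write = check_access(shadow, 100, 4, INT)    # unknown memory gets typed
same_type = check_access(shadow, 100, 4, INT)      # exact match
other_type = check_access(shadow, 100, 4, FLOAT)   # mismatch
mid_object = check_access(shadow, 102, 2, INT)     # starts on an interior byte
overlapping = check_access(shadow, 98, 4, INT)     # unknown start, typed tail
```

Only the first two accesses stay on the fast path here; the other three would be handed to the runtime, which applies the full TBAA rules.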
Remove duplicate word.
ptxas needs a proper triplet for 133352f
This change has a long history. It was first attempted naively in https://reviews.llvm.org/D131425, which didn't work because we broke the ability for code to include e.g. <stdio.h> multiple times and get different definitions based on the pre-defined macros. However, in llvm#86843 we managed to simplify <stddef.h> by including the underlying system header outside of any include guards, which worked. This patch applies the same simplification we did to <stddef.h> to the other headers that currently mention __need_FOO macros explicitly.
…ss (llvm#120135) Add some comments that hopefully clarify a few things. This was supposed to be NFC, but there is a difference in the inferred register class for EXTRACT_SUBREG. Pull Request: llvm#120135
This patch introduces the Clang components of type sanitizer: a sanitizer for type-based aliasing violations. It is based on Hal Finkel's https://reviews.llvm.org/D32198. The Clang changes are mostly formulaic, the one specific change being that when the TBAA sanitizer is enabled, TBAA is always generated, even at -O0. It goes together with the corresponding LLVM changes (llvm#76259) and compiler-rt changes (llvm#76261) PR: llvm#76260
Don't unnecessarily clone for a caller that wasn't matched to a call instruction. This necessitated updating a couple of tests that were either unnecessarily cloning or unnecessarily processing an allocation and hinting it not cold.
Some bots are failing with 2916352, likely due to the escapes in the FileCheck pattern. Add extra quotes to try to fix this. E.g. https://lab.llvm.org/buildbot/#/builders/46/builds/9442
Two bugs here. First calling `Inst->getFunction()` has undefined behavior if the instruction is not tracked to a function. I suspect the `replaceAllUsesWith` was leaving the GEPs in a weird ghost parent situation. I switched up the visitor to be able to `eraseFromParent` as part of visiting and then everything started working. The second bug was in `DXILFlattenArrays.cpp`. I was unaware that you can have multidimensional arrays of `zeroinitializer`, and `undef` so fixed up the initializer to handle these two cases. fixes llvm#117273
…lvm#118958) In many cases the emptyTensorElimination cannot transform or eliminate the empty tensor which is being inserted into the `SubsetInsertionOpInterface`. There are two major reasons for that: 1- It fails to find a legal/suitable insertion point for the `subsetExtract` which is about to replace the empty tensor. However, we may try to handle this issue by moving the values needed to build the `subsetExtract` near the empty tensor (which is about to be eliminated), thus increasing the probability of finding a legal insertion point. 2- The EmptyTensorElimination transform replaces all of the tensor.empty's uses at once in one apply, rather than replacing only the specific use which was visited in the use-def chain (when traversing from the tensor.insert_slice). Replacing all the uses of the tensor.empty may lead to additional read effects after bufferization of the specific subset extract/subview, which should not be the case. Both cases may result in many copies in the subsequent bufferization which cannot be canonicalized away. The first case can be noticed when a `tensor.empty` is followed by a `SubsetInsertionOpInterface` (or, in simple words, `tensor.insert_slice`) which has been lowered from `tensor/tosa.concat`. The second case can be noticed when a `tensor.empty` has many uses, which leads to the transformation being applied only once, since all of the uses have been replaced at once. The first commit in the PR only adds the lit tests for the cases shown above (NFC), to emphasize how the transform works; upcoming MRs will upload slight changes to handle these cases. In the second commit in this PR, we want to replace only the specific use which was visited in the `use-def` chain (when traversing from the `tensor.insert_slice`'s source).
This sets up the initial blocks needed to initialize a VPlan directly in the constructor. This will allow tracking of all created blocks directly in VPlan, simplifying block deletion.
Give the properties from tablegen a `predicate` field that holds the predicate that the property needs to satisfy, if one exists, and hook that field up to verifier generation.
This patch undrifts source locations in MemProfRecord before readMemprof starts the matching process. The theory of operation is as follows:
1. Collect the lists of direct calls, one from the IR and the other from the profile.
2. Compute the correspondence (called undrift map in the patch) between the two lists with longestCommonSequence.
3. Apply the undrift map just before readMemprof consumes MemProfRecord.

The new function is gated by a flag that is off by default.
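Step 2 can be illustrated with a textbook longest-common-subsequence computation. The sketch below is in Python and uses the classic dynamic-programming formulation, which is not necessarily the algorithm behind `longestCommonSequence`. Representing a call as a `(location, callee)` pair, and the names `lcs_pairs` and `compute_undrift_map`, are assumptions made for illustration.

```python
def lcs_pairs(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk the table from the front to recover the matched positions.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def compute_undrift_map(ir_calls, profile_calls):
    """Map each matched profile location to the corresponding IR location."""
    ir_names = [callee for _, callee in ir_calls]
    profile_names = [callee for _, callee in profile_calls]
    return {
        profile_calls[j][0]: ir_calls[i][0]
        for i, j in lcs_pairs(ir_names, profile_names)
    }


# Hypothetical direct calls as (line offset, callee name), in source order.
ir_calls = [(3, "malloc"), (5, "foo"), (9, "bar")]
profile_calls = [(2, "malloc"), (4, "foo"), (6, "baz"), (8, "bar")]
undrift = compute_undrift_map(ir_calls, profile_calls)
```

Here the call to `baz` exists only in the stale profile, so it has no entry in the map, while the three shared callees are shifted to their current IR locations.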
…lhs floordiv rhs` (llvm#119245) Fixes an issue where the `SimpleAffineExprFlattener` would simplify `lhs % rhs` to just `-(lhs floordiv rhs)` instead of `lhs - (lhs floordiv rhs)` if `lhs` happened to be equal to `lhs floordiv rhs`. The reported failure case was `(d0, d1) -> (((d1 - (d1 + 2)) floordiv 8) % 8)` from llvm#114654. Note that many paths that simplify AffineMaps (e.g. the AffineApplyOp folder and canonicalization) would not observe this bug because of slightly different paths taken by the code. Slightly different grouping of the terms could also result in avoiding the bug. Resolves llvm#114654.
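The reported failure case can be checked by hand. The sketch below uses Python, whose `//` and `%` round toward negative infinity for a positive divisor, and the standard identity `lhs mod rhs = lhs - rhs * (lhs floordiv rhs)`. The factor `rhs` is written out explicitly here, although the summary above appears to leave it implicit. The function names are hypothetical.

```python
def mod_via_floordiv(lhs, rhs):
    """Compute lhs mod rhs through the floordiv identity."""
    q = lhs // rhs
    return lhs - rhs * q


def mod_with_dropped_lhs(lhs, rhs):
    """Model of the faulty result: the `lhs` term is lost, only -rhs * q remains."""
    q = lhs // rhs
    return -(rhs * q)


def failure_case(d0, d1):
    """(d0, d1) -> (((d1 - (d1 + 2)) floordiv 8) % 8)"""
    inner = (d1 - (d1 + 2)) // 8  # always -2 floordiv 8 = -1
    return inner % 8


lhs = (5 - (5 + 2)) // 8          # -1
q = lhs // 8                      # also -1, so lhs == lhs floordiv rhs
correct = mod_via_floordiv(lhs, 8)
faulty = mod_with_dropped_lhs(lhs, 8)
```

With `lhs = -1` the quotient is also `-1`, which is exactly the coincidence described above; the identity gives 7, while dropping the `lhs` term gives 8.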
Found assertion failures when using EXPENSIVE_CHECKS and running lit tests for APINotes: Assertion `left.first != right.first && "two entries for the same version"' failed. It seems like std::is_sorted is verifying that the comparison function is irreflexive (comp(a,a)=false) when using expensive checks. So we would get callbacks to the lambda used for comparison, even for vectors with a single element in APINotesReader::VersionedInfo<T>::VersionedInfo, with "left" and "right" being the same object. Therefore the assert checking that we never found equal values would fail. The fix makes sure that we skip the check for equal values when "left" and "right" are the same object.
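The interaction can be modeled outside of C++. The following Python sketch is only an analogy: `checked_is_sorted` is a hypothetical helper that imitates a checking mode by probing the comparator with identical arguments, and `version_less` mirrors the shape of the fix by skipping the duplicate-version assertion when both arguments are the same object.

```python
def checked_is_sorted(items, less):
    """Sortedness test that first probes the comparator with (x, x)."""
    for x in items:
        # A strict ordering must never report an element as less than itself.
        assert not less(x, x), "comparator must be irreflexive"
    return all(not less(b, a) for a, b in zip(items, items[1:]))


def version_less(left, right):
    """Order (version, payload) entries by version."""
    # Distinct entries must not share a version, but comparing an entry
    # with itself is allowed, so the check is skipped in that case.
    assert left is right or left[0] != right[0], "two entries for the same version"
    return left[0] < right[0]


single = [((1, 0), "a")]
ordered = [((1, 0), "a"), ((2, 0), "b")]
reversed_entries = [((2, 0), "b"), ((1, 0), "a")]
```

Without the `left is right` guard, even the single-element list would trip the assertion as soon as the comparator is probed with identical arguments.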
This patch also makes the following amendments to core exegesis:
* Added a distinction between the regular register aliasing check and registers used as a memory address in an instruction.
* Added scratch memory space pointer register.
* General exegesis options were amended:
  * mattr - new option to pass a list of enabled target features

The llvm-exegesis RISCV port is the result of a team effort. Everyone involved is listed below.
Co-authored-by: Konstantin Vladimirov <konstantin.vladimirov@syntacore.com>
Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Dmitry Bushev <dmitry.bushev@syntacore.com>
Co-authored-by: Mark Goncharov <mark.goncharov@syntacore.com>
Co-authored-by: Anastasiya Chernikova <anastasiya.chernikova@syntacore.com>
Original pr: llvm#89047
---------
Co-authored-by: Kazu Hirata <kazu@google.com>
Support true16 format for v_cvt_pknorm_i16/u16_f16 in MC.
Support true16 format for v_div_fixup_f16 in MC.
Support true16 format for v_minmax/maxmin_f16 in MC. Since we are replacing `v_minmax/maxmin_f16` with `v_minmax/maxmin_f16_t16 / v_minmax/maxmin_f16_fake16` post-GFX11, we have to update the CodeGen pattern for `v_minmax/maxmin_f16` to get the CodeGen test passing.
The arguments to this are the same as for the 'wait' clause, so this reuses all of that infrastructure. So all this has to do is support a pair of clauses that are already implemented (if and async), plus create an AST node. This patch does so, and adds proper testing.
'-mllvm -ubsan-unique-traps' (llvm#65972) applies to all UBSan checks. This patch introduces -fsanitize-merge (defaults to on, maintaining the status quo behavior) and -fno-sanitize-merge (equivalent to '-mllvm -ubsan-unique-traps'), with the option to selectively apply non-merged handlers to a subset of UBSan checks (e.g., -fno-sanitize-merge=bool,enum). N.B. we do not use "trap" in the argument name since llvm#119302 has generalized -ubsan-unique-traps to work for non-trap modes (min-rt and regular rt). This patch does not remove the -ubsan-unique-traps flag; that will override -f(no-)sanitize-merge.
…lvm#112694 (llvm#120418) I missed that FalseCnt for each Case was used to calculate percentage in the SwitchStmt. At the moment I resurrect them. In `!HasDefaultCase`, the pair of Counters shall be `[CaseCountSum, FalseCnt]`. (Reversal of before llvm#112694) I think it can be considered as the False count on SwitchStmt. FalseCnt shall be folded (same as current impl) in the coming SingleByteCoverage changes, since percentage would not make sense.
…utside functions (llvm#120416) Fixes llvm#119952
…120464)" This reverts commit 7eaf470. Reason: buildbot breakage (e.g., https://lab.llvm.org/buildbot/#/builders/144/builds/14299/steps/6/logs/FAIL__Clang__ubsan-trap-debugloc_c)
…llvm#118242) Scalarize vector FPOWI instead of promoting the type. This allows the scalar FPOWIs to be visited and converted to libcalls before promoting the type. FIXME: This should be done in LegalizeVectorOps/LegalizeDAG, but call lowering needs the unpromoted EVT. Without this patch, in some backends, such as RISCV64 and LoongArch64, the i32 type is illegal and will be promoted. This causes exponent type check to fail when ISD::FPOWI node generates a libcall. Fix llvm#118079
This reverts commit 9af5de3. Reason: buildbot breakage (https://lab.llvm.org/buildbot/#/builders/24/builds/3394/steps/10/logs/stdio) "Unexpectedly Passed Tests (1): llvm-libc++-shared.cfg.in :: libcxx/language.support/support.dynamic/libcpp_deallocate.sh.cpp"
…464)" (llvm#120511) This reverts commit 2691b96. This reapply fixes the buildbot breakage of the original patch, by updating clang/test/CodeGen/ubsan-trap-debugloc.c to specify -fsanitize-merge (the default, which is merge, is applied by the driver but not clang_cc1). This reapply also expands clang/test/CodeGen/ubsan-trap-merge.c. ---- Original commit message: '-mllvm -ubsan-unique-traps' (llvm#65972) applies to all UBSan checks. This patch introduces -fsanitize-merge (defaults to on, maintaining the status quo behavior) and -fno-sanitize-merge (equivalent to '-mllvm -ubsan-unique-traps'), with the option to selectively apply non-merged handlers to a subset of UBSan checks (e.g., -fno-sanitize-merge=bool,enum). N.B. we do not use "trap" in the argument name since llvm#119302 has generalized -ubsan-unique-traps to work for non-trap modes (min-rt and regular rt). This patch does not remove the -ubsan-unique-traps flag; that will override -f(no-)sanitize-merge.
…ice memory (llvm#120485) When emboxing memory that comes from CUFMemAlloc, we need to allocate the descriptor in managed memory as it might be passed to a kernel.