forked from llvm-mirror/llvm
-
Notifications
You must be signed in to change notification settings - Fork 17
roc-1.7.x rc4 updates #125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Introducing Vector Neural Network Instructions, consisting of:
vpdpbusd{s}
vpdpwssd{s}
Differential Revision: https://reviews.llvm.org/D40208
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318746 91177308-0d34-0410-b5e6-96231b3b80d8
Mention the purpose of the BICri tests added by r318398, as requested in post-commit review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318747 91177308-0d34-0410-b5e6-96231b3b80d8
vpopcnt{b,w}
Differential Revision: https://reviews.llvm.org/D40213
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318748 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes instregex for interger vector add/sub instructions Differential revision: https://reviews.llvm.org/D40254 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318749 91177308-0d34-0410-b5e6-96231b3b80d8
It's on all other LWP instruction but I missed it from lwpins, despite similar scheduling behaviour. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318751 91177308-0d34-0410-b5e6-96231b3b80d8
Reviewers: mstorsjo Differential Revision: https://reviews.llvm.org/D40286 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318756 91177308-0d34-0410-b5e6-96231b3b80d8
As pointed out in post-commit review of r318738, `return ReplaceNode(..)` when both ReplaceNode and the current function return void is confusing. This patch moves to using a more obvious early return, and moves to just using an if to catch the one case we currently care about. A future patch that adds further custom instruction selection can introduce a switch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318757 91177308-0d34-0410-b5e6-96231b3b80d8
All match equivalent basic classes (WritePHAdd, WriteFAdd etc.) according to both the AMD 15h SOG and Agner's tables. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318758 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The generated diagnostic by the AsmMatcher isn't always applicable to the AsmOperand. This is because the code will only update the diagnostic if it is more specific than the previous diagnostic. However, when having validated operands and 'moved on' to a next operand (for some instruction/alias for which all previous operands are valid), if the diagnostic is InvalidOperand, than that should be set as the diagnostic, not the more specific message about a previous operand for some other instruction/alias candidate. Reviewers: craig.topper, olista01, rengolin, stoklund Reviewed By: olista01 Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D40011 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318759 91177308-0d34-0410-b5e6-96231b3b80d8
Almost too trivial to worry about, but it seems worth having consistency with upcoming commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318760 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: VOP2b instructions (v_subbrev_u32, v_add_i32 ...) shouldn't support OMod operand in SDWA encoding Reviewers: rampitec, dp Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D40172 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318761 91177308-0d34-0410-b5e6-96231b3b80d8
on floats, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318764 91177308-0d34-0410-b5e6-96231b3b80d8
Differential revision: https://reviews.llvm.org/D39195 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318766 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318768 91177308-0d34-0410-b5e6-96231b3b80d8
This is NFC, as the matcher would continue looping up to the maximum number of operands with no effect, but this should improve performance a bit, and makes the debug trace clearer. Differential revision: https://reviews.llvm.org/D36744 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318769 91177308-0d34-0410-b5e6-96231b3b80d8
- We can still emit this error if the actual instruction has two or more operands missing compared to the expected one. - We should only emit this error once per instruction. Differential revision: https://reviews.llvm.org/D36746 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318770 91177308-0d34-0410-b5e6-96231b3b80d8
This was causing the (invalid) predicated versions of the NEON VRINTX and VRINTZ instructions to be accepted, with the condition code being ignored. Also, there is no NEON VRINTR instruction, so that part of the check was not necessary. Differential revision: https://reviews.llvm.org/D39193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318771 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: First step in adding MemorySSA as dependency for loop pass manager. Adding the dependency under a flag. New pass manager: MSSA pointer in LoopStandardAnalysisResults can be null. Legacy and new pass manager: Use cl::opt EnableMSSALoopDependency. Disabled by default. Reviewers: sanjoy, davide, gberry Subscribers: mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D40274 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318772 91177308-0d34-0410-b5e6-96231b3b80d8
These are pre-UAL syntax, and we don't support any other pre-UAL instructions, with the exception of FLDMX/FSTMX, which don't have a UAL equivalent. Therefore there's no reason to keep them or their AsmParser hacks around. With the AsmParser hacks removed, the FLDMX and FSTMX instructions get the same operand diagnostics as the UAL instructions. Differential revision: https://reviews.llvm.org/D39196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318777 91177308-0d34-0410-b5e6-96231b3b80d8
It works just like __cyg_profile_func_enter but takes no arguments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318783 91177308-0d34-0410-b5e6-96231b3b80d8
The pass was renamed in r318195. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318784 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318786 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318787 91177308-0d34-0410-b5e6-96231b3b80d8
…ects. This partially reverts r298851. The the underlying issue is that we don't currently model the dependency between mrs (read system register) and msr (write system register) instructions. Something like the below should never be reordered: msr TPIDR_EL0, x0 ;; set thread pointer mrs x8, TPIDR_EL0 ;; read thread pointer but was being reordered after r298851. The functional part of the patch that wasn't reverted needed to remain in place in order to not break r299462. PR35317 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318788 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318792 91177308-0d34-0410-b5e6-96231b3b80d8
Segment moves to memory are always 16-bit. Remove invalid 32 and 64 bit variants. Recommiting with missing clang inline assembly test change. Fixes PR34478. Reviewers: rnk, craig.topper Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39847 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318797 91177308-0d34-0410-b5e6-96231b3b80d8
…ctions to icelake CPU. This is based on table 1-1 of the October 2017 revision of Intel® Architecture Instruction Set Extensions and Future Features Programming Reference git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318799 91177308-0d34-0410-b5e6-96231b3b80d8
…ow load folding. The commuting patterns for the AVX version actually still had priority over the new patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318800 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319139 91177308-0d34-0410-b5e6-96231b3b80d8
Additional checks for phi operands: - first operand should be a virtual register def. It should not be tied, implicit, internalread, earlyclobber or a read. - The other operands should be register/mbb operands next to each other - The register operands should not be implicit, internalread, earlyclobber, debug or tied. - We can perform most of the PHI checks even for unreachable blocks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319140 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes cases where we wouldn't perform various register operand checks just because we didn't happen to have a definition in the MCInstrDesc. This changes the code to only skip the tests that actually depend on the MCInstrDesc definition. This makes the machine verifier spot the problem from https://llvm.org/PR33071 after the pass that actually caused it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319141 91177308-0d34-0410-b5e6-96231b3b80d8
Unoptimized IR can have linear sequences of stores to an array, where the initial GEP for the first store is formed from the pointer to the array, and the GEP for each store after the first is formed from the previous GEP with some offset in an inductive fashion. The (large) resulting DAG when analyzed by DAGCombine undergoes an excessive number of combines as each store node is examined every time its' offset node is combined with any child of the offset. One of the transformations is findBetterNeighborChains which assists MergeConsecutiveStores. The former relies on repeated chain walking to do its' work, however MergeConsecutiveStores is disabled at O0 which makes the transformation redundant. Any optimization level other than O0 would invoke InstCombine which would resolve the chain of GEPs into flat base + offset GEP for each store which does not exhibit the repeated examination of each store to the array. Disabling this optimization fixes an excessive compile time issue (30~ minutes for the test case provided) at O0. Reviewers: niravd, craig.topper, t.p.northover Differential Revision: https://reviews.llvm.org/D40193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319142 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319143 91177308-0d34-0410-b5e6-96231b3b80d8
Fast-isel routines need to bail out in the case that fast-isel fails on the operands. This fixes https://bugs.llvm.org/show_bug.cgi?id=35064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319144 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r319073. Bot fails with a mismatch that looks like pygments-generated HTML. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319146 91177308-0d34-0410-b5e6-96231b3b80d8
…pass control flow to successors This is to address a problem similar to those in D37460 for Scalar PRE. We should not PRE across an instruction that may not pass execution to its successor unless it is safe to speculatively execute it. Differential Revision: https://reviews.llvm.org/D38619 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319147 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, we use a set of pairs to cache responces like `CompareSCEVComplexity(X, Y) == 0`. If we had proved that `CompareSCEVComplexity(S1, S2) == 0` and `CompareSCEVComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareSCEVComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` any two values from the same set are equal from point of `CompareSCEVComplexity`. This, in particular, allows us to prove the fact from example above. Differential Revision: https://reviews.llvm.org/D40428 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319149 91177308-0d34-0410-b5e6-96231b3b80d8
The priorities in the section name suffixes are zero padded, allowing the linker to just do a lexical sort. Add zero padding for .ctors sections in ELF as well. Differential Revision: https://reviews.llvm.org/D40407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319150 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, we use a set of pairs to cache responces like `CompareValueComplexity(X, Y) == 0`. If we had proved that `CompareValueComplexity(S1, S2) == 0` and `CompareValueComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareValueComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` that merges Values into equivalence sets so that any two values from the same set are equal from point of `CompareValueComplexity`. This, in particular, allows us to prove the fact from example above. Differential Revision: https://reviews.llvm.org/D40429 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319153 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The PeepholeOptimizer pass calls this function solely based on checking DefMI->isMoveImmediate(), which only checks the MoveImm bit of the instruction description. So it's up to FoldImmediate itself to properly check that DefMI *actually* moves from an immediate. I don't have a separate test case for this, but the next patch introduces a test case which happens to crash without this change. This error is caught by the assertion in MachineOperand::getImm(). Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40342 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319155 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The entire algorithm operates per basic-block, so for cache locality it should be better to re-optimize a basic-block immediately rather than in a separate loop. I don't have performance measurements. Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40344 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319156 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319158 91177308-0d34-0410-b5e6-96231b3b80d8
…operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after *everything* that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319164 91177308-0d34-0410-b5e6-96231b3b80d8
Certain ARM implementations treat icache clear instruction as a memory read,
and CPU segfaults on trying to clear cache on !PROT_READ page.
We workaround this in Memory::protectMappedMemory by adding
PROT_READ to affected pages, clearing the cache, and then setting
desired protection.
This fixes "AllocationTests/MappedMemoryTest.***/3" unit-tests on
affected hardware.
Reviewers: psmith, zatrazz, kristof.beyls, lhames
Reviewed By: lhames
Subscribers: llvm-commits, krytarowski, peter.smith, jgreenhalgh, aemerson,
rengolin
Patch by maxim-kuvrykov!
Differential Revision: https://reviews.llvm.org/D40423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319166 91177308-0d34-0410-b5e6-96231b3b80d8
…ms/prefetch/prefetchw git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319167 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319168 91177308-0d34-0410-b5e6-96231b3b80d8
Author
|
Ping. |
searlmc1
approved these changes
Dec 11, 2017
searlmc1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.